I have a bunch of disagreements with Good Ventures and how they are allocating their funds, but also Dustin and Cari are plausibly the best people who ever lived.
I want to agree, but “best people who ever lived” is a ridiculously high bar! I’d imagine that both of them would be hesitant to claim anything quite that high.
“Plausibly best people who have ever lived” is a much lower bar than “best people who have ever lived”.
If you are like me, this comment will leave you perplexed. After a while, I realized that it should not be read as
but as
fwiw i instinctively read it as the 2nd, which i think is caleb’s intended reading
I was going for the second, adding some quotes to make it clearer.
Yeah, sorry: it was obvious to me that this was the intended meaning, after I realized it could be interpreted this way. I noted it because I found the syntactic ambiguity mildly interesting/amusing.
For example, Norman Borlaug is often called “the father of the Green Revolution”, and is credited with saving a billion people worldwide from starving to death. Stanislav Petrov and Vasily Arkhipov prevented a probable nuclear war from happening.
That’s true, but how many people actually give away so much of their money as they make it?
(Half baked and maybe just straight up incorrect about people’s orientations)
I worry a bit that groups thinking about the post-AGI future (e.g., Forethought) will not want to push for something like super-optimized flourishing, because this will seem weird and possibly uncooperative with factions that don’t like the vibe of super-optimization. This might happen even if these groups do believe in their hearts that super-optimized flourishing is the best outcome.
It is very plausible to me that the situation is “convex”, in the sense that it is better for the super-optimizers to optimize fully with their share of the universe while the other groups do what they want with their share (with rules to prevent extreme suffering, pessimization, etc.). I think this approach might be better for all groups than aiming for a more universal middle ground that leaves everyone disappointed. That bad middle ground might look like a universe that is not very optimized for flourishing but is still super weird and unfamiliar.
It would be very sad if we missed out on optimized flourishing because we were trying not to seem weird or uncooperative.
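One minimal way to cash out the “convex” point, using a toy value function that is my own illustration rather than anything Peter specified: let $x \in [0,1]$ be how strongly a region of the universe is optimized for flourishing, and let $v(x)$ be the value of that region. If $v$ is convex and the super-optimizers fully optimize a share $\lambda$ of the universe while the rest stays at $x = 0$, Jensen’s inequality gives
$$\lambda\, v(1) + (1-\lambda)\, v(0) \;\ge\; v(\lambda),$$
i.e., a fully-optimized share plus an unconstrained share does at least as well as imposing the blended optimization level $\lambda$ everywhere (and strictly better if $v$ is strictly convex).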
Two hours before you posted this, MacAskill posted a brief explanation of viatopianism:
This essay is the first in a series that discusses what a good north star [for post-superintelligence society] might be. I begin by describing a concept that I find helpful in this regard:
Viatopia: an intermediate state of society that is on track for a near-best future, whatever that might look like.
Viatopia is a waystation rather than a final destination; etymologically, it means “by way of this place”. We can often describe good waystations even if we have little idea what the ultimate destination should be. A teenager might have little idea what they want to do with their life, but know that a good education will keep their options open. Adventurers lost in the wilderness might not know where they should ultimately be going, but still know they should move to higher ground where they can survey the terrain. Similarly, we can identify what puts humanity in a good position to navigate towards excellent futures, even if we don’t yet know exactly what those futures look like.
In the past, Toby Ord and I have promoted the related idea of the “long reflection”: a stable state of the world where we are safe from calamity, and where we reflect on and debate the nature of the good life, working out what the most flourishing society would be. Viatopia is a more general concept: the long reflection is one proposal for what viatopia would look like, but it need not be the only one.
I think that some sufficiently-specified conception of viatopia should act as our north star during the transition to superintelligence. In later essays I’ll discuss what viatopia, concretely, might look like; this note will just focus on explaining the concept.
. . .
Unlike utopianism, it cautions against the idea of having some ultimate end-state in mind. Unlike protopianism, it attempts to offer a vision for where society should be going. It focuses on achieving whatever society needs to be able to steer itself towards a truly wonderful outcome.
I think I’m largely on board. I think I’d favor doing some amount of utopian planning (aiming for something like hedonium and acausal trade). Viatopia sounds less weird than utopias like that. I wouldn’t be shocked if Forethought talked relatively more about viatopia because it sounds less weird. I would be shocked if they push us in the direction of anodyne final outcomes. I agree with Peter that stuff is “convex” but I don’t worry that Forethought will have us tile the universe with compromisium. But I don’t have much private info.
I should read that piece. In general, I am very into the Long Reflection and I guess also the Viatopia stuff.
Yeah, agreed on that point. Folks at Forethought aren’t necessarily thinking about what a near-optimal future should look like; they’re thinking about how to get civilisation to a point where we can make the best possible decisions about what to do with the long-term future.
Actually, Jordan, better-than-“pretty ok” futures are explicitly something that folks at Forethought have been thinking about. Just not in the Viatopia piece. Check this out: https://www.forethought.org/research/better-futures
Hmm this is interesting.
Speculatively, I think there could actually just be convergence here, though, once you account for moral uncertainty and for very plausible situations where doing badly by everyone’s lights is as bad as, say, utilitarian nightmares but just easier to get others on board with (i.e., extreme power).
Flipping the Repugnant Conclusion
Imagine a world populated by many, many (trillions of) people. These people’s lives aren’t purely full of joy, and do have a lot of misery as well. But each person thinks that their life is worth living. Their lives might be a bit boring or they might be full of huge ups and downs, but on the whole they are net-positive.
From this view it seems really strange to think that it would be good for every person in this world to die/not exist/never have existed in order to allow a very small number of privileged people to live spectacular lives. It seems bad to stop many people from living a life that they mostly enjoy, in order to allow the flourishing of the few.
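To put the flipped framing in terms of a toy total-welfare calculation (the numbers here are mine and purely illustrative): suppose world A has $10^{12}$ people each with welfare $+1$, while world B has $10^{6}$ privileged people each with welfare $+10^{4}$. Then total welfare is $W_A = 10^{12}$ versus $W_B = 10^{10}$, so on a simple total view trading A for B destroys about 99% of the value, which is the direction the intuition pump above is pointing at.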
I think this hypothetical is a decent intuition pump for why the Repugnant Conclusion isn’t actually repugnant. But I do think it might be a little bit dishonest or manipulative. It frames the situation in terms of fairness and equality; we can sympathize with the many slightly happy people who are maybe being denied the right to exist, and think of the few extremely happy people as the privileged elite. It also takes advantage of status quo bias; by beginning with the many slightly happy people it seems worse to then ‘remove’ them.
I’ve always thought the Repugnant Conclusion was mostly status quo bias, anyway, combined with the difficulty of imagining what such a future would actually be like.
I think the Utility Monster is a similar issue. Maybe it would be possible to create something with a much richer experience set than humans, which should be valued more highly. But any such being would actually be pretty awesome, so we shouldn’t resent giving it a greater share of resources.
Humans seem like (plausible) utility monsters compared to ants, and many religious people have a conception of god that would make Him a utility monster (“maybe you don’t like prayer and following all these rules, but you can’t even conceive of how much grander it is to god (‘joy’ doesn’t even do it justice) if we follow these rules than even the best experiences in our whole lives!”). Anti-utility-monster sentiments seem to largely come from a place where someone imagines a human that’s pretty happy by human standards, thinks the words “orders of magnitude happier than what any human feels”, and then notices that their intuition doesn’t track the words “orders of magnitude”.
I like this perspective. I’ve never really understood why people find the repugnant conclusion repugnant!
Announcing: 2026 MIRI Technical Governance Team Research Fellowship.
MIRI’s Technical Governance Team plans to run a small research fellowship program in early 2026. The program will run for 8 weeks, and include a $1200/week stipend. Fellows are expected to work on their projects 40 hours per week. The program is remote-by-default, with an in-person kickoff week in Berkeley, CA (flights and housing provided). Participants who already live in or near Berkeley are free to use our office for the duration of the program.
Fellows will spend the first week picking out a scoped project from a list provided by our team or designing an independent research project (related to our overall agenda), and then spend the remaining seven weeks working on that project under the guidance of our Technical Governance Team. One of the main goals of the program is to identify full-time hires for the team.
If you are interested in participating, please fill out this application as soon as possible (it should take 45-60 minutes). We plan to set dates for participation based on applicant availability, but we expect the fellowship to begin after February 2, 2026 and end before August 31, 2026 (i.e., some 8-week period in spring/summer 2026).
Strong applicants care deeply about existential risk, have existing experience in research or policy work, and are able to work autonomously for long stretches on topics that merge considerations from the technical and political worlds.
Unfortunately, we are not able to sponsor visas for this program.
See here for examples of potential projects.
Just to clarify, is the 8-week period the same for all participants? And if so, will you still accept some applications after the dates have been decided?
I might apply, but I could only participate if the program were held in July-August. Given that it could occur any time between February and August, I probably won’t apply, since it’s only like a 1-in-7 chance that it will start in July.
We may be running multiple smaller cohorts rather than one big one, if that’s what maximizes the ability of strong candidates to participate.
The single most important factor in deciding the timing is the window in which strong candidates are available, and the target size for the cohort is small enough (5-20 depending on strength of applicants) that the availability of a single applicant is enough to sway the decision. It’s specifically cases like yours that we’re intending to accommodate. Please apply!
AI Generated Podcast for 2021 MIRI Conversations.
I made an AI-generated podcast of the 2021 MIRI Conversations. There are different voices for the different participants, to make it easier and more natural to follow along with.
This was done entirely in my personal capacity, and not as part of my job at MIRI.[1] I did this because I like listening to audio and there wasn’t a good audio version of the conversations.
Spotify link: https://open.spotify.com/show/6I0YbfFQJUv0IX6EYD1tPe
RSS: https://anchor.fm/s/1082f3c7c/podcast/rss
Apple Podcasts: https://podcasts.apple.com/us/podcast/2021-miri-conversations/id1838863198
Pocket Casts: https://pca.st/biravt3t
I do think you probably should (pre-)order If Anyone Builds It, Everyone Dies though.
Updating Moral Beliefs
Imagine there is a box with a ball inside it, and you believe the ball is red. But you also believe that in the future you will update your belief and think that the ball is blue (the ball is a normal, non-color-changing ball). This seems like a very strange position to be in, and you should just believe that the ball is blue now.
This is an example of how we should deal with beliefs in general; if you think in the future you will update a belief in a specific direction then you should just update now.
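For what it’s worth, the Bayesian version of this is just the law of total expectation (often called conservation of expected evidence); the numbers below are my own toy example rather than anything from the post. If you update by conditionalizing on whatever evidence $e$ you observe, then
$$P_{\text{now}}(H) = \sum_e P(e)\, P(H \mid e) = \mathbb{E}\big[P_{\text{later}}(H)\big],$$
so your current credence must equal the expected value of your future credence. If you are certain your future credence in “the ball is blue” will be 0.95, your credence should already be 0.95; and you can only expect, say, a 60% chance of ending up at 0.9 and a 40% chance of ending up at 0.3 if your current credence is $0.6 \times 0.9 + 0.4 \times 0.3 = 0.66$. Whether this carries over to moral beliefs, where the future change may not come from evidence in the Bayesian sense, is the further step the post is arguing for.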
I think the same principle applies to moral beliefs. If you think that in the future you’ll believe that it’s wrong to do something, then you should believe that it’s wrong now.
As an example of this, if you think that in the future you’ll believe eating meat is wrong, then you sort of already believe eating meat is wrong. I was in exactly this position for a while, thinking in the future I would stop eating meat, while also continuing to eat meat. A similar case to this is deliberately remaining ignorant about something because learning would change your moral beliefs. If you’re avoiding learning about factory farming because you think it would cause you to believe eating factory farmed meat is bad, then you already on some level believe that.
Another case of this is in politics when a politician says it’s ‘not the time’ for some political action but in the future it will be. This is ‘fine’ if it’s ‘not the time’ due to political reasons, such as the electorate not reelecting the politician. But I don’t think it’s consistent to say an action is currently not moral, but will be moral in the future. Obviously this only works if the action now and in the future are actually equivalent.