I agree with most of the points in this post (AI timelines might be quite short; probability of doom given AGI in a world that looks like our current one is high; there isn’t much hope for good outcomes for humanity unless AI progress is slowed down somehow). I will focus on one of the parts where I think I disagree and which feels like a crux for me on whether advocating AI pause (in current form) is a good idea.
You write:
But we can still have all the nice things (including a cure for ageing) without AGI; it might just take a bit longer than hoped. We don’t need to be risking life and limb driving through red lights just to be getting to our dream holiday a few minutes earlier.
I think framings like these do a misleading thing where they use the word “we” to ambiguously refer to both “humanity as a whole” and “us humans who are currently alive”. The “we” that decides how much risk to take is the humans currently alive, but the “we” that enjoys the dream holiday might be humans millions of years in the future.
I worry that “AI pause” is not being marketed honestly to the public. If people like Wei Dai are right (and I currently think they are), then AI development may need to be paused for potentially millions of years, and it’s unclear how long it would take unaugmented or only mildly augmented humans to reach longevity escape velocity.
So to a first approximation, the choice available to humans currently alive is something like:
Option A: 10% chance utopia within our lifetime (if alignment turns out to be easy) and 90% human extinction
Option B: ~100% chance death but then our descendants probably get to live in a utopia
For philosophy nerds with low time preference and altruistic tendencies (into which I classify many EA people and also myself), Option B may seem obvious. But I think many humans alive today would rather take the risk and just try to build AGI now than accept any AI pause, and to the extent that they say they prefer a pause, I think they are either being deceived by the marketing, acting under the Caplanian Principle of Normality, or somehow better philosophers than I expected them to be.
(Note: if you are so pessimistic about aligning AI without a pause that your probability on that is lower than the probability of unaugmented present-day humans reaching longevity escape velocity, then Option B does seem like a strictly better choice. But the older and more unhealthy you are, the less this applies to you personally.)
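To make the comparison in that note concrete, here is a minimal sketch with made-up illustrative numbers (these are not anyone’s actual estimates) of the purely self-interested calculation for someone alive today:

```python
# Minimal sketch of the comparison in the note above.
# The two probabilities below are illustrative assumptions, not anyone's real estimates.

p_alignment_easy = 0.10      # assumed P(alignment works out without a pause)
p_lev_without_agi = 0.30     # assumed P(an unaugmented person alive today reaches
                             #   longevity escape velocity during a long pause)

# Rough chance that someone alive today personally makes it to the good outcome:
p_personal_option_a = p_alignment_easy     # Option A: you only get utopia if alignment is easy
p_personal_option_b = p_lev_without_agi    # Option B: you only get there via LEV while AI is paused

if p_personal_option_b > p_personal_option_a:
    print("Option B looks better even on purely self-interested grounds")
else:
    print("The self-interested case for Option A still stands")
```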
Option A: 10% chance utopia within our lifetime (if alignment turns out to be easy) and 90% human extinction
Are you simplifying here, or do you actually believe that “utopia in our lifetime” or “extinction” are the only two possible outcomes given AGI? Do you assign a 0% chance that we survive AGI, but don’t have a utopia in the next 80 years?
What if AGI stalls out at human level, or is incredibly expensive, or is buggy and unreliable like humans are? What if the technology required for utopia turns out to be ridiculously hard even for AGI, or substantially bottlenecked by available resources? What if technology alone can’t create a utopia, and the extra tech just exacerbates existing conflicts? What if AGI access is restricted to world leaders, who use it for their own purposes?
What if we build an unaligned AGI, but catch it early and manage to defeat it in battle? What if early, shitty AGI screws up in a way that causes a worldwide ban on further AGI development? What if we build an AGI, but we keep it confined to a box and can only get limited functionality out of it? What if we build an aligned AGI, but people hate it so much that it voluntarily shuts off? What if the AGI that gets built is aligned to the values of people with awful views, like religious fundamentalists? What if AGI wants nothing to do with us and flees the galaxy? What if [insert X thing I didn’t think of here]?
IMO, extinction and utopia are both unlikely outcomes. The bulk of the probability lies somewhere in the middle.
I was indeed simplifying; e.g. I probably should have said “global catastrophe” instead of “human extinction” to cover cases like permanent totalitarian regimes. I think some of the scenarios you mention could happen, but a bunch of them seem pretty unlikely, and I disagree with your conclusion that “the bulk of the probability lies somewhere in the middle”. I might be up for discussing more specifics, but I don’t get the sense that this disagreement is a crux for either of us, so I’m not sure how much value there would be in continuing down this thread.
I would agree that “utopia in our lifetime” or “extinction” seems like a false dichotomy. What makes you say that you predict the bulk of the probability lies somewhere in the middle?
How about an Option A.1: pause for a few years or a decade to give alignment a chance to catch up? At least stop at the red lights for a bit to check whether anyone is coming, even if you are speeding!
if you are so pessimistic about aligning AI without a pause that your probability on that is lower than the probability of unaugmented present-day humans reaching longevity escape velocity
I think this easily goes through, even at 1-10% p(doom|AGI), since it seems like ageing is either basically a solved problem already or will be within a decade or so (see the David Sinclair video I linked to; there are many other people working in the space with promising research too).