What do the superforecasters say? Well, the most comprehensive effort to ascertain and influence superforecaster opinions on AI risk was the Forecasting Research Institute’s Roots of Disagreement Study.[2] In this study, they found that nearly all of the superforecasters fell into the “AI skeptic” category, with an average P(doom) of just 0.12%. If you’re tempted to say that their number is only so low because they’re ignorant or haven’t taken the time to fully understand the arguments for AI risk, then you’d be wrong; the 0.12% figure was obtained after months of discussions with AI safety advocates, who presented their best arguments for believing in AI x-risks.
I see this cited a bunch, but I think this study is routinely misinterpreted. I have some firsthand knowledge, having participated in it.
The question being posed to forecasters was about literal human extinction, which is pretty different from how I usually see p(doom) interpreted. A lot of the “AI skeptics” were very sympathetic to AI being the biggest deal, but just didn’t see literal extinction as that likely. I also have a moderate p(doom) (20%-30%) while thinking literal extinction is much less likely than that (<5%).
Also, the study ran from April 1 to May 31, 2023, which was right after the release of GPT-4. There has been so much more development since then; my guess is that if you polled the “AI skeptics” now, their p(doom) would have gone up.
Hi Peter,
Relatedly, Table 1 of the report on the Existential Risk Persuasion Tournament (XPT) shows there was much more agreement between superforecasters and experts about catastrophic risk than about extinction risk.
Interesting, I thought p(doom) was about literal extinction? If it also covers unrecoverable collapse, then I’m really surprised that takes up 15-30% of your probability mass! I always saw that part of the existential risk definition as negligible.
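For concreteness, here is a rough sketch of the arithmetic behind that 15-30% range, treating the numbers above (a p(doom) of 20-30% and literal extinction below 5%) as illustrative bounds rather than anyone’s precise estimates:

```python
# Rough bounds on the non-extinction portion of p(doom), using the
# illustrative numbers quoted above (20-30% doom, <5% literal extinction).
p_doom_low, p_doom_high = 0.20, 0.30   # stated p(doom) range
p_extinction_max = 0.05                # literal extinction said to be under 5%

# Non-extinction doom = p(doom) - p(extinction), bounded from both sides:
non_extinction_low = p_doom_low - p_extinction_max   # extinction at its maximum
non_extinction_high = p_doom_high - 0.0              # extinction near zero

print(f"Non-extinction doom: {non_extinction_low:.0%} to {non_extinction_high:.0%}")
# Prints: Non-extinction doom: 15% to 30%
```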
p(doom) is about doom. For AI, I think this can mean a few things:
- Literal human extinction
- Humans lose power over their future but are still alive (and potentially even have nice lives), whether via stable totalitarianism, gradual disempowerment, or other means

The second bucket is pretty big.
I checked parts of the study, and the 0.12% figure is for P(AI-caused existential catastrophe by 2100) according to the “AI skeptics”. Here is what the report says, just before that figure, about how “existential catastrophe” is defined:
Participants made an initial forecast on the core question they disagreed about (we’ll call this U, for “ultimate question”): by 2100, will AI cause an existential catastrophe? We defined “existential catastrophe” as an event in which at least one of the following occurs:
- Humanity goes extinct
- Humanity experiences “unrecoverable collapse,” which means either:
  - <$1 trillion global GDP annually [in 2022 dollars] for at least a million years (continuously), beginning before 2100; or
  - Human population remains below 1 million for at least a million years (continuously), beginning before 2100.
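Purely as an illustration, that operational definition can be encoded as a simple predicate; the function and argument names below are mine, not the study’s:

```python
def is_existential_catastrophe(humanity_extinct: bool,
                               gdp_below_1t_for_1m_years: bool,
                               pop_below_1m_for_1m_years: bool) -> bool:
    """Sketch of the study's definition: extinction, or an 'unrecoverable
    collapse', i.e. global GDP under $1 trillion/year (2022 dollars) or
    population under 1 million, held continuously for at least a million
    years and beginning before 2100."""
    unrecoverable_collapse = gdp_below_1t_for_1m_years or pop_below_1m_for_1m_years
    return humanity_extinct or unrecoverable_collapse

# Under this definition, a permanent AI takeover that keeps more than a
# million humans alive with a >$1 trillion economy would not count.
assert not is_existential_catastrophe(False, False, False)
assert is_existential_catastrophe(False, False, True)
```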
That sounds similar to the classic existential risk definition?
(Another thing that’s important to note is that the study specifically sought out forecasters skeptical of AI risk. So it doesn’t tell us much, if anything, about what a group of randomly chosen superforecasters would actually predict!)
I am very, very surprised that your ‘second bucket’ contains the possibility of humans potentially having nice lives! I suspect that if you had asked me for the definition of p(doom) before I read your initial comment, I would actually have mentioned the definition of existential risk that includes the permanent destruction of future potential. But I simply never took that second part seriously? Hence my initial confusion. I just assumed disempowerment or a loss of control would lead to literal extinction anyway, and that most people shared this assumption. In retrospect, that was probably naive of me. Now I’m genuinely curious how much of people’s p(doom) estimates comes from literal extinction versus other scenarios...
That sounds similar to the classic existential risk definition?
Bostrom defines existential risk as “One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.” There are tons of events that could permanently and drastically curtail our potential without reducing population or GDP that much. For example, AI could very plausibly seize total power and still choose to keep >1 million humans alive. Keeping humans alive seems very cheap on a cosmic scale, so it could be justified by the AI caring about humans a tiny bit, or by the AI thinking that aliens might care about humans and wanting to preserve the option of trading with them, or by something else. It seems very plausible that this could still have curtailed our potential in the relevant sense (e.g., if our potential required us to have control over a non-trivial fraction of resources).
I think this is more likely than extinction, conditional on (what I would call) doom from misaligned AI. You can also compare with Paul Christiano’s more detailed views.