Nice points, Isaac!
I would personally go a little further. I think the concept of existential risk is vague enough (e.g. a suffering-free collapse of all value would be maximally good for negative utilitarians, but would be an existential risk for most people) that it is better to mostly focus on clearer metrics. For example: extinction risk, the probability of a given drop in global population / GDP / democracy index, or the probability of global population / GDP / democracy index remaining smaller than the previous maximum for a certain time.
There’s a new chart template that is better than “P(doom)” for most people.
Counterpoint: many people dismissed longtermism as a kind of mathematical blackmail because even minuscule-probability events could be used to justify devoting infinite resources to them. The biggest change in moving from longtermism to “holy shit x risk” was emphasizing that the probability is not minuscule.
But I agree with your first point, and think that “p(doom)” should be expanded into “p(doom | agi)” and “p(agi)”.
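One way to write that expansion out, as a sketch: it assumes “agi” means “AGI is built within whatever time window the estimate covers”, which the comment above does not specify. By the law of total probability:

```latex
\begin{align*}
p(\text{doom}) &= p(\text{doom} \mid \text{agi})\, p(\text{agi})
               + p(\text{doom} \mid \neg\text{agi})\, p(\neg\text{agi})
\end{align*}
```

If doom without AGI is treated as negligible over the same window (a further assumption), this reduces to p(doom) ≈ p(doom | agi) × p(agi), so two people can report the same headline number while disagreeing sharply on each factor.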
Or to put it more bluntly, the p(doom) estimates tended to rise after the “10^35 future humans so even if p(doom) is really low...” arguments were widely dismissed.
Obviously other stuff happened in the world of AI, and AGI researchers are justified in arguing they simply updated their priors in light of the emergent behaviour exhibited by LLMs[1] (others obviously always had high and near-term expectations of doom anyway). But Bayes’ Theorem also justifies sceptics updating their p(blackmail) estimates[2].
Priors for regular events like sports, market movements and insurable events are easily converted into money so bold predictions are easily put to the test whether they claim to be based on a robust frequentist model of similar events or inside information or pure powers of observation. But doom in most of the outlined scenarios is a one-off and the incentive structure actually works the opposite way round: people arguing we’re seriously underestimating p(doom) don’t expect to be around if they’re right and are asking for resources now to reduce it. I don’t think it’s an isolated demand for rigour to suggest that a probabilistic claim of this nature bears very little resemblance to a probabilistic claim made where there’s some evidence of some base rate and strong incentive not to be overconfident.
So yeah, I agree, p(doom) isn’t persuasive and I’m not sure decomposing it into p(doom | agi) and p(agi) or equivalents for other x-risk fields puts it on a stronger footing. Understanding how researchers believe a development increases or reduces a source of x-risk is a much more convincing argument about their value than incrementing or decrementing an arbitrary-seeming doom number. The “doomsday clock” was an effective rhetorical tool because everyone understood it as asking politicians to reverse course, not because it was accepted as a valid representation of an underlying probability distribution.
[1] though they could also have been justified in updating the other way; [notionally] safety-conscious organisations getting commercially valuable, near-human level outputs from a text transformation matrix arguably gives less reason to believe anyone would deem giving machines agency worth the effort.
[2] also in either direction.
Oh, I agree. Arguments of the form “bad things are theoretically possible, therefore we should worry” are bad and shouldn’t be used. But “bad things are likely” is fine, and seems more likely to reach an average person than “bad things are 50% likely”.
Toby Ord’s existential risk estimates in The Precipice were for risk this century (by 2100) IIRC. That book was very influential in x-risk circles around the time it came out, so I have a vague sense that people were accepting his framing and giving their own numbers, though I’m not sure quite how common that was. But these days most people talking about p(doom) haven’t read The Precipice, given how mainstream that phrase has become.
Also, in some classic hard-takeoff + decisive-strategic-advantage scenarios, p(doom) in the few years after AGI would be close to p(doom) in general, so these distinctions don’t matter that much. But nowadays people are worried about a much greater diversity of threat models.
Yeah, most of the p(doom) discussions I see taking place seem to be focusing on the nearer term of 10 years or less. I believe there are quite a few people (e.g. Gary Marcus, maybe?) who operate under a framework like “current LLMs will not get to AGI, but actual AGI will probably be hard to align”, so they may give a high p(doom before 2100) and a low p(doom before 2030).
Isaac—good, persuasive post.
I agree that p(doom) is rhetorically ineffective—to normal people, it just looks weird, off-putting, pretentious, and depressing. Most folks out there have never taken a probability and statistics course, and don’t know what p(X) means in general, much less p(doom).
I also agree that p(doom) is way too ambiguous, in all the ways you mentioned, plus another crucial way: it isn’t conditioned on anything we actually do about AI risk. Our p(doom) given an effective global AI regulation regime might be a lot lower than p(doom) if we do nothing. And the fact that p(doom) isn’t conditioned on our response to p(doom) creates a sense of fatalistic futility, as if p(doom) is a quantitative fact of nature, like the Planck constant or the Coulomb constant, rather than a variable that reflects our collective response to AI risks, and that could go up or down quite dramatically given human behavior.
Thanks for the writeup! I had a ton of similar feelings for a while, alternating between finding people who say “it’s not worth defending it’s just a meme” and “actually I’ll defend using something like this”.
At one point I was discussing this issue with Rob Miles at Manifest, who told me something like “the default is a bool (some two valued variable)”, the idea being that if people are arguing over an interval then we could’ve done way worse.
Executive summary: The term “p(doom)” is ambiguous, rhetorically ineffective for communicating AI risks, and has become a polarizing ingroup signal that impedes thoughtful discussion on mitigating existential threats from advanced AI.
Key points:
“p(doom)” conflates multiple distinct probabilities like short-term AI catastrophe vs long-term, conditional on AGI vs conditional on superintelligence. This ambiguity fosters miscommunication.
Explicit probabilities meet motivated skepticism and innumeracy. Framing AI risk discussion around numbers backfires rhetorically.
“p(doom)” has become an ingroup shibboleth that outsiders easily ridicule. This entrenches polarization around AI risk.
People should stop using this term and instead discuss specific risks and probabilities when warranted, but focus rhetoric on normal language.
This comment was auto-generated by the EA Forum Team. Feel free to point out issues with this summary by replying to the comment, and contact us if you have feedback.
Very well put, thanks.
I feel that starting with “epistemic status” has [some!] similar aspects to p(doom). It’s a lot of fun for us but beginning an argument in real life with “Epistemic Status” loses in a split second.
Yeah, I don’t do it on any non-LW/EAF post.