Clarifying existential risks and existential catastrophes
This post was written for Convergence Analysis. In it, I summarise and analyse existing ideas more than proposing new ones. Some readers may already be familiar with much of this.
Existential risks are considered by many to be among the most pressing issues of our time (see e.g. The Precipice). But what, precisely, do we mean by “existential risks”?
To clarify this, this post will:
- Quote prominent definitions of existential risk and existential catastrophe
- Highlight three distinctions which are arguably obvious, but are also often overlooked:[1]
  - Existential risk vs existential catastrophe
  - Existential catastrophe vs extinction
  - Existential vs global catastrophic risks
- Discuss two nuances of the concept of an existential catastrophe, regarding:
  - How much of our potential must be destroyed?
  - Does the catastrophe have to be a specific “event”? Does it have to look “catastrophic”?
Bostrom and Ord’s definitions
I believe that the term existential risk was introduced in this context by Nick Bostrom in a 2002 paper, where he defined such a risk as:
One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.
Bostrom later (2012) updated this to the definition you’ve probably heard most often:
An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development
More recently, in The Precipice (2020), Toby Ord gives the following definitions:
An existential catastrophe is the destruction of humanity’s longterm potential.
An existential risk is a risk that threatens the destruction of humanity’s longterm potential.[2]
Three important distinctions
Existential risk ≠ existential catastrophe
An existential risk is the risk of an existential catastrophe occurring. An existential risk is not itself the destruction of humanity’s potential. Unfortunately, the term existential risk often seems to be used as if it can refer to either the risk or the catastrophe, while the term existential catastrophe seems to be used relatively rarely.
For example, Millett and Snyder-Beattie write “For the purposes of this model, we assume that for any global pandemic arising from this kind of research, each has only a one in ten thousand chance of causing an existential risk” (emphasis added). As Beard et al. note (in their appendix):
It is not precisely clear whether the authors mean that one in ten thousand pandemics are predicted to cause extinction, or whether one in ten pandemics will have a risk of extinction. The latter reading is implausible because surely there is at least a risk, however small, that any global pandemic would cause extinction.
Existential catastrophe ≠ extinction
People often seem to say existential risk when they’re actually referring specifically to extinction risk (e.g., in this post, this podcast, and this post). Extinction of humanity is certainly one type of existential catastrophe,[3] and perhaps the most likely one. But it’s not the only one. Here’s Toby Ord’s breakdown of different types of existential catastrophe:
(For those interested, I’ve previously collected links to works on collapse and on dystopia.)
In line with Cotton-Barratt and Ord, I recommend that, if you are specifically talking about extinction or extinction risk, you use those terms, rather than the terms existential catastrophe or existential risk. This should avoid unnecessary jargon and confusion.
Existential risk ≠ global catastrophic risk
Bostrom & Ćirković use the term global catastrophic risk “to refer, loosely, to a risk that might have the potential to inflict serious damage to human well-being on a global scale.” A variety of other definitions can be found here, all of which refer to a wider set of risks than the term existential risk does. Unfortunately, people sometimes seem to:
- use the term existential risk to refer to events that don’t actually meet the extremely high bar of destroying the vast majority of humanity’s potential
- use the terms existential risk and global catastrophic risk as if they’re interchangeable (e.g., here and here)
To avoid confusion and concept creep, this should be avoided.
How much of our potential must be destroyed?
Ord (2020) writes that “An existential risk is a risk that threatens the destruction of humanity’s longterm potential.” He also states explicitly that:
This includes cases where the destruction is complete (such as extinction) and where it is nearly complete, such as a permanent collapse of civilisation in which the possibility for some very minor types of flourishing remain, or where there remains some remote chance of recovery. I leave the thresholds vague, but it should be understood that in any existential catastrophe the greater part of our potential is gone and very little remains.
It seems like a good idea to have the term capture such “nearly complete” cases.
That said, at least to me, it seems that “destruction of humanity’s longterm potential” could be read as meaning the complete destruction. So I’d personally be inclined to tweak Ord’s definitions to:
An existential catastrophe is the destruction of the vast majority of humanity’s long-term potential.
An existential risk is a risk that threatens the destruction of the vast majority of humanity’s long-term potential.[4]
But what if some of humanity’s long-term potential is destroyed, but not the vast majority of it? Given Ord and Bostrom’s definitions, I think that the risk of that should not be called an existential risk, and that its occurrence should not be called an existential catastrophe. Instead, I’d put such possibilities alongside existential catastrophes in the broader category of things that could cause “persistent trajectory changes”. More specifically, I’d put them in a category I’ll term “non-existential trajectory changes” in an upcoming post. (Note that “non-existential” does not mean “not important”.)
Does an existential catastrophe have to be a specific “event”? Does it have to look “catastrophic”?
It seems to me that discussions of existential risk or catastrophe typically focus on relatively discrete “events”, and typically those that would look “catastrophic”. For example:
- A “treacherous turn” from a misaligned AI
- A global pandemic from a bioengineered pathogen
- A nuclear war and ensuing nuclear winter
These events seem to clearly match the standard, layperson concept of a “catastrophe”, as they’d occur over a fairly short time period, have fairly clear start and end points, and involve “destruction” that we’d notice as it happens. The full descent into extinction, unrecoverable collapse, or unrecoverable dystopia might take years in some such scenarios, but it still would’ve been sparked by a clear-cut “catastrophe”.
But the term “existential catastrophe” can also apply to “slower moving” catastrophes. For example, when discussing climate change, Ord (2020) writes:
Unlike many of the other risks I address, the central concern here isn’t that we would meet our end this century, but that it may be possible for our actions now to all but lock in such a disaster for the future. If so, this could still be the time of the existential catastrophe—the time when humanity’s potential is destroyed.
And when giving his estimates of the chance of existential catastrophe from various outcomes during the next 100 years, he writes “when the catastrophe has delayed effects, like climate change, I’m talking about the point of no return coming within 100 years”.
Additionally, I think the term “existential catastrophe” should be able to apply to scenarios where there’s no obvious “catastrophe” (in the standard sense) at all. For example, Ord writes:
If our potential greatly exceeds the current state of civilisation, then something that simply locks in the current state would count as an existential catastrophe. An example would be an irrevocable relinquishment of further technological progress.
It may seem strange to call something a catastrophe due to merely being far short of optimal. [...] But consider, say, a choice by parents not to educate their child. There is no immediate suffering, yet catastrophic longterm outcomes for the child may have been locked in.
And his “plausible examples” of a “desired dystopia”, one type of existential catastrophe, include:
worlds that forever fail to recognise some key form of harm or injustice (and thus perpetuate it blindly), worlds that lock in a single fundamentalist religion, and worlds where we deliberately replace ourselves with something that we didn’t realise was much less valuable (such as machines incapable of feeling).[5]
Conclusion
To summarise:
- Existential risks are distinct from existential catastrophes, extinction risks, and global catastrophic risks.
- An existential catastrophe involves the destruction of the vast majority of humanity’s potential—not necessarily all of humanity’s potential, but more than just some of it.
- Existential catastrophes could be “slow-moving” or not apparently “catastrophic”; at least in theory, our potential could be destroyed slowly, or without this being noticed.
Arguably, reducing existential risks should be a top priority of our time. I think one way to improve our existential risk reduction efforts is to clarify and sharpen our thinking, discussion, and research, and one way to do that is to clarify and sharpen our key concepts. I hope this post has helped on that front.
In upcoming posts, I’ll discuss two further complexities with the concept of existential catastrophe:
- The idea that an existential catastrophe is really the destruction of the potential of humanity or its “descendants”; it’s not necessarily solely about human wellbeing, nor just Homo sapiens’ potential.
- How an existential catastrophe is (I believe) distinct from a scenario in which humanity maintains its potential but never uses it well, and the implications of that alternative possibility.
My thanks to David Kristoffersson for useful comments and feedback, and to Justin Shovelain for related discussions.
This is one of a series of posts I plan to write that summarise, comment on, or take inspiration from parts of The Precipice. You can find a list of all such posts here.
[1] In general, when I imply one meaning of a term is the “correct” or “actual” meaning, I’m really focused on the “useful” meaning, “standard” meaning, or meaning that’s “consistent with explicit definitions”. ↩︎

[2] I don’t believe Bostrom makes explicit what he means by “potential” in his definitions. Ord writes “I’m making a deliberate choice not to define the precise way in which the set of possible futures determines our potential”, and then discusses that point. I’ll discuss the matter of “potential” more in an upcoming post.

Another approach would be to define existential catastrophes in terms of expected value rather than “potential”. That approach is discussed by Cotton-Barratt and Ord (2015). ↩︎

[3] This is assuming we don’t classify as “extinction” scenarios such as:
- humanity being “replaced” by a “descendant” which we’d be happy to be replaced by (e.g., whole brain emulations or a slightly different species that we evolve into)
- “humanity (or its descendants) [going] extinct after fulfilling our longterm potential” (Ord) ↩︎

[4] Unimportantly, I’ve also added a hyphen in “long-term” in these definitions. See footnote 2 here. ↩︎

[5] Seemingly relevantly, Bostrom’s classification of types of existential risk (by which I think he really means “types of existential catastrophe”) includes “plateauing — progress flattens out at a level perhaps somewhat higher than the present level but far below technological maturity”, as well as “unconsummated realization”. Both types seem to me like they could occur in ways such that the catastrophe is very slow or not really recognised by anyone as a catastrophe.

And in Paul Christiano’s description of what “failure” might look like in the context of AI alignment, he writes: “As this world goes off the rails, there may not be any discrete point where consensus recognizes that things have gone off the rails.” Christiano doesn’t use the term “existential catastrophe” in that post, but it seems to me that the scenario he describes would count as one. ↩︎