This post was written as part of the Red Team Challenge by Training For Good. We spent seven weeks writing a critique of Chapter 6 of The Precipice because it is a key text in the Effective Altruism community and we believe that it is important to ensure that our ways of comparing and prioritizing existential risks (henceforth referred to as x-risks) rest on a solid foundation.
Key takeaways
The author provides several useful frameworks for how to think about, compare, and prioritize existential risks: (1) Importance x Tractability x Neglectedness (ITN),[1] (2) Anatomy of extinction risks, (3) Correlation between different x-risks, (4) Existential risk and security factors, and (5) Soon, sudden, sharp.
While the Importance x Tractability x Neglectedness (ITN) framework is provided as a key way to prioritize, the author mainly focuses on the Importance aspect, in particular disregarding the Tractability parameter, which we believe could have a significant impact on the final prioritization. There are some plausible reasons for not trying to estimate Tractability, but we would have preferred the author to state his reasoning explicitly.
The author does not provide a way to combine all of the different frameworks into a single prioritization, leaving them difficult to apply consistently.
Priorities research—the question of how to allocate effort between mitigating x-risks directly vs. investigating the risk landscape to improve future resource allocation—seems important to the chapter's goal, yet is almost entirely missing from the book.
The methods of x-risk probability calculation are highly uncertain, and the author’s use of these uncertain estimates as a basis for prioritizing between different x-risks and for allocating large amounts of resources to specific targeted interventions is not fully justified.
In light of the large uncertainty of existential risk prioritization, we explore how broad resilience interventions might side-step some of the issues with estimating the impact of targeted risk reduction, and suggest that these warrant more attention. With the goal of preventing existential catastrophe, focusing on maximizing core resiliency of our species could be a promising alternative.
The scope of our critique
The Precipice focuses on minimizing existential risks (x-risks) to humanity, defined as “permanent destruction of humanity’s potential” (p. 6). These risks include extinction, but also other ways humanity’s long-term potential could be permanently destroyed, e.g. permanent collapse of civilization. While there are arguments that a pure focus on x-risks at the cost of everything else may be misguided[2], we accept the premise of the book for the purposes of our critique. Instead, we focus on critiquing whether Chapter 6, The Risk Landscape, has achieved its stated goal to “contemplate the entire landscape of existential risk, seeing how the risks compare, how they combine, what they have in common, and which risks should be our highest priorities” (p. 165).
While we focus this critique on points we believe could be even stronger, we also investigated several claims that hold up very well. In particular, we believe that the author (1) provides useful models for thinking about existential risk, (2) presents a very solid mathematical treatment of how multiple risks combine into an overall risk, clearly highlighting some counterintuitive results, (3) highlights not only direct risks, but also existential risk and security factors, which either increase or lower the chance of existential catastrophe without being a direct existential threat themselves, and (4) differentiates well between targeted vs. broad risk interventions.
Tractability is important but neglected in The Precipice
The importance, tractability and neglectedness (ITN) framework is a tool for estimating the impact of allocating marginal resources towards trying to solve a given problem. The framework breaks the cost-effectiveness of allocating these marginal resources down into three factors:

$$\text{cost-effectiveness} = \text{importance} \times \text{tractability} \times \text{neglectedness}$$

$$\frac{\Delta\,\text{value}}{\Delta\,\text{resources}} = \frac{\Delta\,\text{value}}{\Delta\%\,\text{problem solved}} \times \frac{\Delta\%\,\text{problem solved}}{\Delta\%\,\text{resources}} \times \frac{\Delta\%\,\text{resources}}{\Delta\,\text{resources}}$$
The framework is commonly used in EA to prioritize cause areas. If an area is more important, tractable, and neglected than others, it means that it’s more cost-effective to allocate resources to it and thus it should be prioritized over others. The author uses this framework in chapter 6 of The Precipice to prioritize among different x-risks. First, he considers what the ideal global portfolio would look like (i.e. if we were in full control of the neglectedness parameter by being able to redistribute resources), and then points out that “[a]n individual or group should allocate its resources to help bring the world’s portfolio into line with the ideal allocation”. We believe that this argument for first focusing on estimating the ideal global portfolio and postponing an estimation of the neglectedness of different x-risks is solid.
The author is thus left with the task of estimating tractability and importance. We are well aware that doing this for x-risk mitigation is not easy because, due to their very nature, no x-risk can ever be realized and observed, since otherwise we’d already be extinct or our civilization would be in dire straits. While we acknowledge this inherent difficulty and praise the author’s attempt to take a stab at the problem, we think his failure to address tractability in a meaningful way is poorly justified.
To see what we mean, we first note that, in order to estimate the relative importance of different x-risks, a good starting point[3] is to estimate their probabilities. This is because all x-risks have a similar downside (human extinction or permanent destruction of humanity’s potential)[4]. Thus, the relative expected value of mitigating one vs. another will be approximated by the ratio of their probabilities, e.g. if P(human extinction due to rogue AI) = 1⁄6 and P(human extinction due to asteroid impact) = 1⁄100,000, then reducing AI x-risk to zero is roughly 16,700 times (≈ 100,000/6) more valuable than reducing x-risk from asteroid impacts to zero. A good chunk of the book is dedicated to estimating and comparing these probabilities[5]. However, the author devotes hardly any effort to estimating the tractability of mitigating each x-risk, even though he admits this is a key parameter for prioritization[6]. Instead, he just falls back to an ignorance prior and states as a corollary the following Principle of Proportionality: “When a set of risks have equal tractability, or when we have no idea which is more tractable, the ideal global portfolio allocates resources to each risk in proportion to its contribution to total risk” (p. 182). The author does not justify this principle and simply states in an endnote that he “owe[s] this point to Owen Cotton-Barratt” (p. 386). Researching the topic, we came across what we believe to be the primary source for this claim. The bulk of Owen Cotton-Barratt’s article is concerned with different rules for allocating resources between problems when their tractability can be assumed to be distributed across several orders of magnitude. The optimality of the allocation rule that the author labels Principle of Proportionality is merely stated as a conjecture in the original source, without any formal proof or justification.
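To make the arithmetic concrete, here is a minimal Python sketch of our own. It uses the hypothetical probabilities from the example above (not the author's actual estimates from the book) and shows what the Principle of Proportionality would imply for these two risks:

```python
# Illustrative sketch only: hypothetical probabilities, not the author's estimates.
risks = {
    "rogue AI": 1 / 6,
    "asteroid impact": 1 / 100_000,
}

# Relative importance approximated by the ratio of probabilities,
# since all x-risks share a similar downside.
ratio = risks["rogue AI"] / risks["asteroid impact"]
print(f"Mitigating AI risk is ~{ratio:,.0f}x as valuable as mitigating asteroid risk")

# Principle of Proportionality: with equal (or unknown) tractability, allocate
# resources in proportion to each risk's contribution to total risk. For rare,
# roughly independent risks, that contribution is approximately the probability itself.
total = sum(risks.values())
allocation = {name: p / total for name, p in risks.items()}
print(allocation)
```

As the comment notes, using raw probabilities as contributions to total risk is only an approximation; the book's own discussion of correlated risks (pp. 173-175) shows where it breaks down.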
On top of that, there might be other good reasons for not spending a lot of effort estimating the tractability of different x-risks, e.g.
The author may be using the ITN framework heuristically rather than as a formal algorithm. This would mean that, instead of positing a joint probability distribution over all three parameters, he could simply attempt to estimate 1-2 of them and use that as rough guidance for prioritization.
The author may believe that estimating the probability of different x-risks is significantly easier than estimating the tractability of mitigating them, so he preferred to invest his scarce time doing the former. This could be because some inherent features of those probabilities make them low hanging fruit compared to the corresponding tractabilities, or because the author thinks he’s better equipped to estimate them due to personal fit.
The author may expect that researching tractability would yield similar or imprecise estimates across x-risks, so falling back to simple, uninformative priors is the best strategy.
However, if the author had these or other reasons for allocating his research efforts in the way he did, we think the argument should have been made explicitly in the book[7].
Moreover, we think there are certain x-risks that, even a priori, seem to have wildly different tractabilities. For instance, the risk from unaligned artificial intelligence is poorly understood because we lack a precise mechanistic account of how the problem could unfold (leading to very different views in the community about things like timelines and takeoff speeds), while biorisk is pretty well understood scientifically and most of the bottlenecks come down to concrete engineering solutions and regulatory / governance / policy problems. The greater uncertainty around how to align advanced artificial intelligence systems should perhaps make us put significantly more probability mass towards the low-tractability end of the spectrum, thus reducing the expected tractability of the problem.
To summarize, we think that the author’s failure to even attempt to estimate the relative tractability of the different x-risks in his portfolio may have been reasonable given the difficulty of the task. However, we believe this choice about the allocation of his research efforts should’ve been justified in the chapter given the centrality of this parameter to the ITN framework and the fact that there are a priori reasons to believe some x-risks are significantly more tractable than others.
It is unclear how to integrate different frameworks for estimating the impact of a given risk
This section focuses on the way in which the author of The Precipice brings in additional considerations and heuristics for assessing the importance of different x-risks. While we do not reject any of these heuristics per se, we argue that the author does not offer a clear approach for using them together, thus weakening his attempt to rigorously prioritize between individual x-risks[8].
From our reading of the chapter, we got the impression that all of the following are suggested as frameworks for assessing the importance of a potential x-risk:
Anatomy of extinction risks: Based on work by the Future of Humanity Institute, the author suggests that we should consider the probabilities that an event occurs (origin), that it scales globally, and that it leads to complete extinction (endgame) when assessing its extinction risk.
Correlation between different x-risks: The author here proposes the heuristic that larger risks are disproportionately more important than smaller ones, and that this effect becomes more severe the more correlated the risks are with each other (for an explanation of why that is, see pp. 173-175 and Appendix D; we give a toy numerical illustration after this list).
Existential risk and security factors: By introducing these concepts, the author concedes that it is insufficient to look only at how different events or developments might directly lead to an existential catastrophe. He gives a few examples which he considers likely and less-likely candidates for being important existential risk or security factors, but does not elaborate on how to identify and assess the importance of these factors.
Soon, sudden, sharp: The author also suggests that a look at how soon, sudden, and sharp different x-risks are can help in assessing the importance of working on each.
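The "disproportionately more important" point can be seen even in a toy two-risk example of our own (hypothetical numbers, assuming statistical independence; the book argues that positive correlation strengthens the effect further):

```python
# Toy example of ours, assuming two statistically independent risks.
def total_risk(probabilities):
    """Total risk under independence: 1 minus the product of survival probabilities."""
    survival = 1.0
    for p in probabilities:
        survival *= 1 - p
    return 1 - survival

p_large, p_small = 0.5, 0.1
baseline = total_risk([p_large, p_small])        # ~0.55
gain_large = baseline - total_risk([p_small])    # ~0.45: eliminate the large risk
gain_small = baseline - total_risk([p_large])    # ~0.05: eliminate the small risk

# The large risk is 5x as probable, yet eliminating it reduces total risk ~9x as much.
print(round(baseline, 2), round(gain_large, 2), round(gain_small, 2))
```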
We have not found major flaws in any of these frameworks; they make sense to us and seem reasonable heuristics for thinking about the topic, and we appreciate that the author sought to broaden perspectives by going beyond the existing ITN framework. However, we are somewhat confused about how these approaches can be integrated into an overall approach to assessing the importance of different x-risks (from what we can tell, the author himself does not rely on a combination of all these frameworks to come up with his x-risk ranking, or, at least, he does not spell out in the chapter how the different frameworks feed into his risk assessment). Specifically, it is not clear how to employ these heuristics in a non-arbitrary and non-self-serving way, i.e. how to use them for truly robust estimates rather than as a means for rationalizing whichever claim one happens to consider most likely a priori[9].
Consider, for instance, a discussion on cause prioritization between two people: one thinks that work on AI should be prioritized most highly among x-risks because it has the largest raw probability and is sharp (i.e., we can’t expect warning shots). The other thinks it should be nuclear war because it scores higher on the soonness scale and can be considered a serious x-risk factor (i.e., a nuclear exchange may not cause existential catastrophe directly but make humanity more vulnerable to other x-risks). How, we wonder, can such a dispute be resolved in a non-arbitrary way? How can the multiple frameworks on offer be combined to make robust prioritization decisions? How can actors with “bad intentions” (e.g., nuclear industry lobby groups that may have an interest in downplaying the risk from nuclear weapons) or with “bad epistemics” (all of us to some degree, since nobody is immune from cognitive biases) be prevented from putting different weights on different frameworks depending on how each framework supports their case/preconceived intuitions (e.g., in the example above, it matters greatly how important you think the soonness of a risk is, and how intuitively plausible you consider different scenarios for how humanity responds to a limited nuclear strike)?
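To make the worry concrete, here is a minimal sketch of our own (entirely hypothetical scores and weights, purely for illustration) showing how a simple linear aggregation of the frameworks can flip the ranking depending on the weights chosen:

```python
# Hypothetical framework scores in [0, 1] for two risks (ours, purely illustrative).
scores = {
    "AI":      {"raw probability": 0.9, "soonness": 0.4, "sharpness": 0.8, "risk factor": 0.3},
    "nuclear": {"raw probability": 0.3, "soonness": 0.9, "sharpness": 0.5, "risk factor": 0.9},
}

def rank(weights):
    """Combine framework scores with a weighted sum and return the top-ranked risk."""
    combined = {
        risk: sum(weights[f] * s for f, s in framework_scores.items())
        for risk, framework_scores in scores.items()
    }
    return max(combined, key=combined.get), combined

# Two defensible-sounding weightings (each summing to 1) give opposite answers:
print(rank({"raw probability": 0.5, "soonness": 0.1, "sharpness": 0.3, "risk factor": 0.1}))  # AI wins
print(rank({"raw probability": 0.2, "soonness": 0.4, "sharpness": 0.1, "risk factor": 0.3}))  # nuclear wins
```

Any aggregation rule of this kind needs independently justified weights; without them, the choice of weights can smuggle in exactly the preconceptions the frameworks were meant to discipline.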
Meta-prioritization: How should we allocate resources towards and within prioritization research?
Just as the author did not justify his relative overinvestment of research effort in estimating importance rather than tractability, it is also unclear from reading the book how we should split resources between mitigating x-risks directly vs. investigating the risk landscape better[10]. It’s not apparent to us that one line of action is obviously better than the other, but, given the relative neglect of the latter in The Precipice, we present here some hypothetical “risk landscape mapping” projects that could be high impact, as a form of intuition pump:
Some risks may have been overblown, e.g. a few years elapsed between when the classic Bostrom-Yudkowsky view on superintelligence was fleshed out and when Drexler’s Comprehensive AI Services (CAIS) was put forward as an alternative. Even if, after learning about CAIS, we think the risk from AGI is real and pressing, the fact that there’s disagreement in the community about whether CAIS is possible / desirable should result in an all-things-considered view that roughly says “the community failed to consider some plausible arguments against a very catastrophic view of AGI x-risk for a long time, so this may happen again.”
Some risks may have been underestimated, e.g. about a decade passed between the founding of AI as a field in the 1950s and the first academic paper on superintelligence, and five more decades passed until Nick Bostrom’s Superintelligence was published. This suggests other emerging technologies could pose existential risks we haven’t spotted yet because we haven’t done the necessary big-picture thinking / brainstorming needed to identify them.
We think explicit value-of-information calculations comparing direct action with global prioritization research would’ve been appropriate here. For example, the author presumably believes that researching and writing the book was more cost-effective than working on direct mitigation. How did he come to this conclusion? Does it all boil down to personal fit, or are there other factors at play?
Over-reliance on unreliable expert estimates
On page 153 of The Precipice, the author provides the following definition and operationalization of the importance value in the ITN equation: “The importance of a problem is the value of solving it. In the case of an existential risk, we can usually treat this as the amount it contributes to total [existential] risk.” He goes on to provide a summary of his estimates of the magnitudes of different existential risks in Table 6.1 on p. 141. Our understanding is that the author arrived at these probabilities by making plausibility judgments informed by his intuitions, his general knowledge, and his consultation of expert opinion on each risk. A potential weakness of this approach that we want to draw attention to is the fallibility of human intuitions, including the intuitions of experts, which seems quite relevant to assessing the reliability of forecasts[11].
We contend that Daniel Kahneman and Amos Tversky’s experiments cast doubt on the reliability of expert opinion by showing that specialists are far from immune to cognitive biases and thinking fallacies (for an overview, see Kahneman 2013). The effects of those biases are well documented in Philip Tetlock’s work on “expert judgment”, which has found that domain-level experts frequently underperform simple rules and models when predicting political and economic events and developments. We realize that some experts in Tetlock’s studies (namely, those he categorizes as “foxes”) show remarkable forecasting skills, which can be acquired through training and experience. However, nothing in The Precipice indicates that the author relied specifically on input from “foxy” (let alone superforecasting) experts, nor that he himself is a “foxy” forecaster, and we argue that this reduces the credibility of his probability estimates[12].
In addition, we think that x-risk estimates face challenges that are different in a significant way from those faced by forecasting more generally. Since x-risks are about the mid- to long-term future and since they are unprecedented, there is only indirect empirical data to underwrite forecasts about their occurrence, and there are few if any feedback mechanisms to monitor the reliability of such forecasts. We think that this last point deserves special emphasis: there is no evidence of forecasting being useful more than ~5 years in advance[13], so unprecedented risks that may unfold in time frames of decades (or longer) cannot necessarily be treated the same way as risks that are either (i) short-term or (ii) have been realized in the past many times such that a clear base rate is readily available.
This skepticism of expert judgments raises questions about the value of the x-risk probabilities in The Precipice. Given the paucity of empirical data that can be used to check forecasts and estimates of x-risks, we think that even very modest claims about the magnitude of these probabilities are vulnerable to attack and contestation. In investigating and writing this post, we have not come to a definitive conclusion on whether putting some numbers on these probabilities is better than having no estimates at all, or whether putting down unreliable numbers instills a false sense of confidence and may encourage suboptimal decisions (the four of us had different intuitions regarding that question). We are therefore also unsure whether these (potentially unreliable) probabilities should play any role in prioritizing between different ways to prevent existential catastrophe. What we do agree on, however, is that the probability estimates are highly uncertain, and that there may be value in considering alternative approaches for deciding how to allocate existential risk reduction efforts. We outline one such option in the last section.
Broad interventions side-step the challenges with the prioritization frameworks
As discussed, the main problem with estimating unprecedented risks accurately is the lack of relevant data and hard evidence. In addition, this approach cannot account for the risks we are currently unaware of. We therefore put forward the possibility of maximizing our resiliency as an alternative to investing in targeted risk reduction.
Both the risks discussed in The Precipice (natural catastrophes, nuclear war, climate change, other environmental damage, pandemics and unaligned AI) and the existential threats we are currently unaware of are categorizable as ‘Black Swan’ events[14]. The phrase “black swan” derives from a Latin expression[15]: “rara avis in terris nigroque simillima cygno” (translation: “a rare bird in the lands and very much like a black swan”). When the phrase was coined, the black swan was presumed not to exist; its existence was later confirmed by explorers in Western Australia[16]. The term was later popularized in the renowned works of the mathematical statistician Nassim Nicholas Taleb. A ‘Black Swan’ event is an unprecedented event with a major impact that is often completely unforeseen. Black Swan theory is underrepresented in the discussion of x-risks[17] but, in combination with modern risk models based on this theory, offers a promising approach to human continuity.
The problem with unprecedented events, such as x-risks, is the ineffectiveness of probability calculations. Black Swan events are not foreseeable through the usual statistics of correlation, regression, standard deviation or return periods. Expert opinions are also of minimal use; we refer to our arguments in the section ‘Over-reliance on unreliable expert estimates’. This inability to estimate the likelihood of occurrence for Black Swan events precludes the application of traditional risk management. Therefore, the development of strategies to manage their consequences is of paramount importance[18]. Black Swan risk models forego ineffective risk calculations and focus on contingency. Not surprisingly, the state-of-the-art risk models are found in the financial markets[19]. Most organizations care more about losing money than about distant existential threats.
By basing strategy solely on potentially unreliable calculations, Effective Altruism has not yet learned the lessons of the financial sector. Had banks worldwide not accumulated large exposures to high-impact events on the basis of their return calculations, they would’ve made a smaller profit for a while[20]. However, they also wouldn’t have bankrupted themselves in 2008 to the point of needing bailouts of trillions of dollars globally[21]. Unfortunately, when it concerns existential threats, Effective Altruism does not have the luxury of learning from experience.
These lessons have cost us trillions of dollars globally and pushed national debts to unprecedented heights. However, they have also provided us with valuable insights into extreme risk management. The most effective plans for Black Swan contingency are ‘resilience’ and ‘diversification’[22], and both methods are translatable to human continuity.
The first method is resilience. Resilience in this context means the ability to withstand adverse events and recover from them. In a business context, this means making your core systems more resilient, and the same applies to human continuity. There are two categories of core systems regarding human continuity. Firstly, systems which are crucial for human survival: if disaster strikes, which systems will prevent human extinction? These are the systems that will provide basic human needs for a (small) population[23]: water, food, medicine, shelter, etc. We refer to ‘ALLFED’ for an example of this support. Secondly, systems which are crucial for human recovery: if disaster strikes, which systems are necessary to help humankind recover once the peak of the impact is over? These are the systems which serve the needs of human expansion: energy, preservation of knowledge, building materials and even factors contributing to mental fortitude[24].
The second method is diversification. In the financial sector this means spreading your assets over multiple markets and sectors. This counters the uncertainty of market forces: if disaster strikes, the impact will be limited to part of your assets. This contingency principle is also applicable to human continuity. Since nearly all existential threats are of a global scale (pandemics, global warming, nuclear accidents[25], etc.), this would mean spreading or expanding the human population over a larger area, e.g. planetary colonization and island refuges. The author endorses this method of extinction prevention, or as he describes it, “Security among the stars” (p. 194). Even though this would be a highly effective mitigation mechanism, we do not advocate that Effective Altruism groups focus on this endeavor. First, planetary colonization to the point where colonies could survive independently is a long-term effort, whereas disaster may strike sooner. Second, this endeavor already receives a lot of investment from both (inter)national governmental bodies and extremely wealthy individuals. We therefore feel that EA should focus on resilience.
We do note a certain disadvantage to this risk model. This approach to preventing x-risks is more robust and arguably more effective than the specific risk reduction advocated by the author[26], as it outperforms other models in the face of future unpredictability; this is by design, since it was built on the premise that anything can happen. Nonetheless, it is uncomfortable to advocate for this model because of its brutal nature. Like the author, this model prioritizes human survival and safeguarding humanity’s potential above all. However, it goes a step further by focusing on maximizing core resiliency instead of risk reduction[27]. In essence, in maximizing humanity’s survivability against all threats, it treats parts of the human population as acceptable losses. This is more easily accepted in the financial sector, where the losses are only monetary. When it comes to human life, the thought of focusing on specific risk reduction is a much more comfortable one, though for the reasons mentioned in this article, it may be misguided. However, Effective Altruism has never cowered from asking the hard questions: this community has redirected vast sums of money from people who would have benefited from it towards people who have benefited more from it.
Fortunately, it is possible to use this approach as a complement to the ‘specific risk reduction’ approach. This can be done by investing in human resilience up to an acceptable level, which is yet to be determined, and afterwards focusing on the reduction of specific risks based on our best attempts at calculating them, such as the Bayesian approaches discussed in The Precipice. This spending should be adaptive to precursors of existential events (e.g. specific developments in AI, mutations or new virus strains, new measurements of global warming, etc.), and the calculations should be adjusted or, if necessary, disregarded in the light of new evidence. The right balance between these three approaches (resilience, risk-based, adaptive) will optimize human survivability and harm reduction.
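A deliberately crude sketch of how such a combined policy might be operationalized follows (all numbers, risk names and multipliers below are placeholders of our own invention, including the resilience floor, which is exactly the "acceptable level" that remains to be determined):

```python
# Hypothetical budget-splitting sketch; every number here is a placeholder.
BUDGET = 100.0           # arbitrary units of resources
RESILIENCE_FLOOR = 30.0  # fund core resilience (food, shelter, knowledge, ...) first

risk_estimates = {"AI": 0.10, "pandemics": 0.03, "nuclear war": 0.001}
# Adaptive component: precursor signals (e.g. a worrying new virus strain)
# scale a risk's estimate up or down before allocating the remainder.
precursor_multipliers = {"AI": 1.0, "pandemics": 1.5, "nuclear war": 1.0}

resilience_spend = min(RESILIENCE_FLOOR, BUDGET)
remaining = BUDGET - resilience_spend

adjusted = {k: p * precursor_multipliers[k] for k, p in risk_estimates.items()}
total = sum(adjusted.values())
targeted_spend = {k: remaining * v / total for k, v in adjusted.items()}

print(resilience_spend, targeted_spend)
```

The point of the sketch is only the ordering of the three components (resilience floor first, then risk-proportional spending, adjusted by precursor evidence), not any of the particular values.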
Conclusion
Overall, The Precipice is an excellent book which details the importance of longtermism alongside several practical approaches towards the goal of existential risk reduction.
While we focus this critique on points we believe could be even stronger, we also investigated several claims from the chapter that hold up very well. In particular, we believe that the author (1) provides useful models for thinking about existential risk, (2) presents a very solid mathematical treatment of how multiple risks combine into an overall risk, clearly highlighting some counterintuitive results, (3) highlights not only direct risks, but also existential risk and security factors, which either increase or lower the chance of existential catastrophe without being a direct existential threat themselves, and (4) differentiates well between targeted vs. broad risk interventions.
The key ways we think this work could be even stronger, whether in future editions or in further work that builds upon the book, are:
Providing estimates of Tractability in the Importance x Tractability x Neglectedness (ITN) framework, or explaining more clearly why these are being avoided.
Suggesting ways to combine all of the different risk prioritization frameworks into a single prioritization.
Contemplating priorities research—the question of how to allocate effort between mitigating x-risks directly vs. investigating the risk landscape to improve future resource allocation.
Highlighting the uncertainty of certain risk estimates (e.g. the estimates of future risk from AI are much more uncertain than the estimates of natural risk from asteroids), and how this uncertainty should impact our resource allocation.
Exploring broad resilience interventions in more depth, as these could side-step some of the issues with estimating the impact of targeted risk reduction. With the goal of preventing existential catastrophe, focusing on maximizing core resiliency of our species could be a promising alternative.
Acknowledgements
We want to thank the Training for Good team for organizing the Red Team Challenge, and for providing guidance and support throughout! We also received some very helpful feedback from other participants but have not yet heard back from them regarding whether or not they are comfortable being mentioned here. We will come back and add their names if and when they give us permission to do so; in the meantime, we simply offer an anonymous Thank you! to all those who commented on our earlier draft.
[Throughout our initial discussions and reading in the course of this Red-Teaming Challenge, we identified a number of open questions, things we were confused about or that seemed unclear, and potential weaknesses in chapter 6 of The Precipice. These are collected in a separate GoogleDoc, available and open for comments here]
Several such arguments (as well as counter-points to many of them) can be found under this Forum tag. In addition, we think that discussions surrounding the concepts of moral uncertainty and worldview diversification are relevant for thinking about the ethical convictions underlying The Precipice’s focus on existential risk.
The probability of each x-risk, considered in isolation, is only a rough proxy for its importance. In The Precipice, the author suggests that a more accurate importance estimation should also consider the correlation between individual x-risks, since the nature and level of correlation can change the amount that x-risks with different probabilities contribute to total existential risk (high correlation would suggest that high-probability risks contribute disproportionately large amounts to total risk; see pp. 173-175 of The Precipice).
As far as we can tell, the book is silent about the risk of futures that are much worse than extinction or permanent destruction of humanity’s potential, i.e. futures that contain a huge amount of negative value. These have usually been referred to in EA as suffering risks or s-risks. Criticizing this omission is outside the scope of this review, but we note that some of the x-risks considered in the chapter also have the potential of becoming s-risks, e.g. rogue AIs might find it useful to further their goals by exploiting an astronomical number of sentient digital beings.
In p. 143: “These probabilities provide a useful summary of the risk landscape, but they are not the whole story, nor even the whole bottom line. Even completely objective, precise and accurate estimates would merely measure how large the different risks are, saying nothing about how tractable they are, nor how neglected. The raw probabilities are thus insufficient for determining which risks should get the most attention, or what kind of attention they should receive.”
A timid attempt at this is made in footnote 29, where the author argues that importance and tractability might be correlated. This follows from the heuristic that it should take about the same amount of resources to change the odds of something by a fixed multiplicative factor. Because of the nonlinear transformation between odds and probabilities, this means that risks in the “middling” range (around 50%) would be in a sweet spot for cheap interventions. Analyzing whether this heuristic is appropriate is beyond the scope of this review; we merely want to point out that discussions like this one should have been featured more prominently, rather than relegated to footnotes, to make the case for excluding tractability estimates clearer.
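For the curious reader, a quick calculation of our own (not from the book) unpacks why the heuristic favors "middling" risks: halving the odds of a risk produces the largest absolute drop in probability when the probability is near 50%.

```python
# Our own illustration of the odds heuristic discussed above.
def probability_after_halving_odds(p):
    odds = p / (1 - p)
    new_odds = odds / 2  # same multiplicative effort on the odds, per the heuristic
    return new_odds / (1 + new_odds)

for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    drop = p - probability_after_halving_odds(p)
    print(f"p = {p:.2f}: absolute risk reduction = {drop:.3f}")
# The reduction peaks at ~0.167 for p = 0.5 and is far smaller near 0 or 1,
# which is why 'middling' risks would be the cheapest to reduce in absolute terms.
```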
While fleshing out this section of our critique, we also considered the possibility that having multiple methodologies/heuristics for prioritizing between x-risks, without clear guidelines for how much weight to give to each of them, might be a feature rather than a bug: Maybe this fosters discussion and further investigation, more so than if the author had made the more straightforward case for only one prioritization framework? As evident from this write-up, we are not fully convinced that the provision of multiple frameworks primarily has positive effects of that sort, believing instead that it creates confusion and makes rigorous prioritization more difficult. However, we haven’t spent a huge amount of time considering this possibility, and thus welcome comments that make the case for it in more detail!
One option for integrating the frameworks rigorously might be the method of “Model Averaging” (borrowed from the field of machine learning), which we have not looked into in-depth for this critique. If this, or some other approach, is what the author had in mind as a means for combining his suggested frameworks, we think he should have made an explicit reference to it.
The only passage we could find where this is addressed, although without any attempt at precise quantification, is in a box titled “Early Action” (p. 185): “In short, early action is higher leverage, but more easily wasted. It has more power, but less accuracy. If we do act far in advance of a threat, we should do so in ways that take advantage of this leverage, while being robust to near-sightedness. This often means a focus on knowledge and capacity building, over direct work.”
Our impressions from the effective altruism community and common epistemic practices therein lead us to think that the author likely has consulted experts with a higher-than-average forecasting track record, and that Toby Ord himself may fall in the category of a “foxy expert”. However, we believe that the book’s case would be stronger if the origin of its expert advice were documented in a bit more detail, so that it has to rely less on the reader’s knowledge and impressions of epistemic norms within effective altruism (or the reader’s familiarity with the author’s character and approach to reasoning).
For a longer discussion of the challenges related to knowing the feasibility and quality of long-range forecasts, see this report by Open Philanthropy.
Puhvel, Jaan (Summer 1984). “The Origin of Etruscan tusna (“Swan”)”. The American Journal of Philology. Johns Hopkins University Press. 105 (2): 209–212.
Using the GPT-3 language model we consulted a database of 175 million academic papers and found 1 direct result: Jebari, K. (2015). Existential Risks: Exploring a Robust Risk Reduction Strategy. Science and Engineering Ethics, 21, 541-554.
Taleb, N. N., Goldstein, D. G., & Spitznagel, M. W. (2009). The six mistakes executives make in risk management. Harvard Business Review, 87(10), 78-81.
Aven, T. (2015). Implications of black swans to the foundations and practice of risk assessment and management. Reliability Engineering & System Safety, 134, 83-91.
Cotton‐Barratt, O., Daniel, M., & Sandberg, A. (2020). Defence in depth against human extinction: prevention, response, resilience, and why they all matter. Global Policy, 11(3), 271-282.
Boyd, M., & Wilson, N. (2021). Optimizing Island Refuges against global Catastrophic and Existential Biological Threats: Priorities and Preparations. Risk Analysis, 41.
A Critique of The Precipice: Chapter 6 - The Risk Landscape [Red Team Challenge]
This post was written as part of the Red Team Challenge by Training For Good. We spent seven weeks writing a critique of Chapter 6 of The Precipice because it is a key text in the Effective Altruism community and we believe that it is important to ensure that our ways of comparing and prioritizing existential risks (henceforth referred to as x-risks) rest on a solid foundation.
Key takeaways
The author provides several useful frameworks for how to think about, compare, and prioritize existential risks - (1) Importance x Tractability x Neglectedness (ITN)[1] (2) Anatomy of extinction risks, (3) Correlation between different x-risks, (4) Existential risk and security factors, and (5) Soon, sudden, sharp.
While the Importance x Tractability x Neglectedness (ITN) framework is provided as a key way to prioritize, the author mainly focuses on the Importance aspect, in particular disregarding the Tractability parameter, which we believe could make a significant impact on the final prioritization. There are some plausible reasons for not trying to estimate Tractability, but we would prefer the author to explicitly mention his reasoning.
The author does not provide a way to combine all of the different frameworks into a single prioritization, leaving this difficult to apply consistently.
Priorities research—the question of how to allocate effort between mitigating x-risks directly vs. investigating the risk landscape to improve future resource allocation—seems important to the chapter, yet is missing from the book entirely.
The methods of x-risk probability calculation are highly uncertain, and the author’s use of these uncertain estimates as a basis for prioritizing between different x-risks and for allocating large amounts of resources to specific targeted interventions is not fully justified.
In light of the large uncertainty of existential risk prioritization, we explore the idea of how broad resilience interventions might side-step some of the issues with estimating the impact of targeted risk reduction, and suggest that these warrant more attention. With the goal of preventing existential catastrophe, focusing on maximizing core resiliency of our species could be a promising alternative.
The scope of our critique
The Precipice focuses on minimizing existential risks (x-risks) to humanity, defined as “permanent destruction of humanity’s potential” (p. 6). These risks include extinction, but also other ways humanity’s long-term potential could be permanently destroyed, e.g. permanent collapse of civilization. While there are arguments that a pure focus on x-risks at the cost of everything else may be misguided[2], we accept the premise of the book for the purposes of our critique. Instead, we focus on critiquing whether Chapter 6, The Risk Landscape, has achieved its stated goal to “contemplate the entire landscape of existential risk, seeing how the risks compare, how they combine, what they have in common, and which risks should be our highest priorities” (p. 165).
While we focus this critique on points we believe could be even stronger, we also investigated several claims that hold up very well. In particular, we believe that the author (1) provides useful models for thinking about existential risk, (2) presents a very solid mathematical treatment of how multiple risks combine into an overall risk, clearly highlighting some counterintuitive results, (3) highlights not only direct risks, but also existential risk and security factors, which either increase or lower the chance of existential catastrophe without being a direct existential threat themselves, and (4) differentiates well between targeted vs. broad risk interventions.
Tractability is important but neglected in The Precipice
The importance, tractability and neglectedness (ITN) framework is a tool for estimating the impact of allocating marginal resources towards trying to solve a given problem. The framework breaks the cost-effectiveness of allocating these marginal resources down into three factors:
costeffectiveness=importance∗tractability∗neglectedness
ΔvalueΔresources=ΔvalueΔ%problemsolved∗Δ%problemsolvedΔ%resources∗Δ%resourcesΔresources
The framework is commonly used in EA to prioritize cause areas. If an area is more important, tractable, and neglected than others, it means that it’s more cost-effective to allocate resources to it and thus it should be prioritized over others. The author uses this framework in chapter 6 of The Precipice to prioritize among different x-risks. First, he considers what the ideal global portfolio would look like (i.e. if we were in full control of the neglectedness parameter by being able to redistribute resources), and then points out that “[a]n individual or group should allocate its resources to help bring the world’s portfolio into line with the ideal allocation”. We believe that this argument for first focusing on estimating the ideal global portfolio and postponing an estimation of the neglectedness of different x-risks is solid.
The author is thus left with the task of estimating tractability and importance. We are well aware that doing this for x-risk mitigation is not easy because, due to their very nature, no x-risk can ever be realized and observed, since otherwise we’d already be extinct or our civilization would be in dire straits. While we acknowledge this inherent difficulty and praise the author’s attempt to take a stab at the problem, we think his failure to address tractability in a meaningful way is poorly justified.
To see what we mean, we first note that, in order to estimate the relative importance of different x-risks, a good starting point[3] is to estimate their probabilities. This is because all x-risks have a similar downside (human extinction or permanent destruction of humanity’s potential)[4]. Thus, the relative expected value of mitigating one vs. another will be approximated by the ratio of their probabilities, e.g. if P(human extinction due to rogue AI) = 1⁄6 and P(human extinction due to asteroid impact) = 1⁄100,000, then reducing AI x-risk to zero is at least 17,000 times more valuable than reducing x-risk from asteroid impacts to zero. A good chunk of the book is dedicated to estimating and comparing these probabilities[5]. However, the author devotes hardly any effort to estimating the tractability of mitigating each x-risk, even though he admits this is a key parameter for prioritization[6]. Instead, he just falls back to an ignorance prior and states as a corollary the following Principle of Proportionality: “When a set of risks have equal tractability, or when we have no idea which is more tractable, the ideal global portfolio allocates resources to each risk in proportion to its contribution to total risk” (p. 182). The author does not justify this principle and simply states in an endnote that he “owe[s] this point to Owen Cotton-Barratt” (p. 386). Researching the topic, we came across what we believe to be the primary source for this claim. The bulk of Owen Cotton-Barratt’s article is concerned with other, different rules for allocating resources between problems when their tractability can be assumed to be distributed across several orders of magnitude. The optimality of the allocation rule that the author labels Principle of Proportionality is merely stated as a conjecture in the original source, without any formal proof or justification.
On top of that, there might be other good reasons for not spending a lot of effort estimating the tractability of different x-risks, e.g.
The author may be using the ITN framework heuristically rather than as a formal algorithm. This would mean that, instead of positing a joint probability distribution over all three parameters, he could simply attempt to estimate 1-2 of them and use that as rough guidance for prioritization.
The author may believe that estimating the probability of different x-risks is significantly easier than estimating the tractability of mitigating them, so he preferred to invest his scarce time doing the former. This could be because some inherent features of those probabilities make them low hanging fruit compared to the corresponding tractabilities, or because the author thinks he’s better equipped to estimate them due to personal fit.
The author may expect that researching tractability would yield similar or imprecise estimates across x-risks, so falling back to simple, uninformative priors is the best strategy.
However, if the author had these or other reasons for allocating his research efforts in the way he did, we think the argument should have been made explicitly in the book[7].
Moreover, we think there are certain x-risks that, even a priori, seem to have wildly different tractabilities. For instance, the risk from unaligned artificial intelligence is poorly understood due to the fact that we lack a precise mechanistic explanation of how the problem could unfold (leading to very different views in the community about things like timelines and takeoff speeds), while biorisk is pretty well understood scientifically and most of the bottlenecks come down to concrete engineering solutions and regulatory / governance / policy problems. The greater uncertainty around how to align advanced artificial intelligence systems should perhaps make us put significantly more probability mass towards the low-tractability end of the spectrum, thus reducing the expected tractability of the problem.
To summarize, we think that the author’s failure to even attempt to estimate the relative tractability of the different x-risks in his portfolio may have been reasonable given the difficulty of the task. However, we believe this choice about the allocation of his research efforts should’ve been justified in the chapter given the centrality of this parameter to the ITN framework and the fact that there are a priori reasons to believe some x-risks are significantly more tractable than others.
It is unclear how to integrate different frameworks for estimating the impact of a given risk
This section focuses on the way in which the author of The Precipice brings in additional considerations and heuristics for assessing the importance of different x-risks. While we do not reject any of these heuristics per se, we argue that the author does not offer a clear approach for using them together, thus weakening his attempt to rigorously prioritize between individual x-risks[8].
From our reading of the chapter, we got the impression that all of the following are suggested as frameworks for assessing the importance of a potential x-risk:
Anatomy of extinction risks: Based on work by the Future of Humanity Institute, the author suggests that we should consider the probabilities that an event occurs (origin), that it scales globally, and that it leads to complete extinction (endgame) when assessing its extinction risk.
Correlation between different x-risks: The author here proposes the heuristic that larger risks are disproportionately more important than smaller ones, and this becomes more severe the more correlated risks are amongst each other (for an explanation of why that is, see pp. 173-175 and Appendix D).
Existential risk and security factors: By introducing these concepts, the author concedes that it is insufficient to look only at how different events or developments might directly lead to an existential catastrophe. He gives a few examples which he considers likely and less-likely candidates for being important existential risk or security factors, but does not elaborate on how to identify and assess the importance of these factors.
Soon, sudden, sharp: The author also suggests that a look at how soon, sudden, and sharp different x-risks are can help in assessing the importance of working on each.
Our critique has not found major flaws with any of these frameworks; they make sense to us and seem reasonable heuristics for thinking about the topic, and we appreciate that the author sought to broaden perspectives by going beyond the existing ITN framework. However, we are somewhat confused about how these approaches can be integrated in an overall approach to assessing the importance of different x-risks (from what we can tell, the author himself does not rely on a combination of all these frameworks to come up with his x-risk ranking, or, at least, he does not spell out in the chapter how the different frameworks feed into his risk assessment). Specifically, it is not clear how to employ these heuristics in a non-arbitrary and non-self-serving way, i.e. how to use them for truly robust estimates rather than as a means for rationalizing whichever claim one happens to consider most likely a priori[9].
Consider, for instance, a discussion on cause prioritization between two people: one thinks that work on AI should be prioritized most highly among x-risks because it has the largest raw probability and is sharp (i.e., we can’t expect warning shots). The other thinks it should be nuclear war because it scores higher on the soonness scale and can be considered a serious x-risk factor (i.e., a nuclear exchange may not cause existential catastrophe directly but make humanity more vulnerable to other x-risks). How, we wonder, can such a dispute be resolved in a non-arbitrary way? How can the multiple frameworks on offer be combined to make robust prioritization decisions? How can actors with “bad intentions″ (e.g., nuclear industry lobby groups that may have an interest in downplaying the risk from nuclear weapons) or with “bad epistemics” (all of us to some degree, since nobody is immune from cognitive biases) be prevented from putting different weights on different frameworks depending on how each framework supports their case/preconceived intuitions (e.g., in the example above, it matters greatly how important you think the soonness of a risk is, and how intuitively plausible you consider different scenarios for how humanity responds to a limited nuclear strike)?
Meta-prioritization: How should we allocate resources towards and within prioritization research?
Just like the author failed to justify his relative overinvestment of research effort in estimating importance vs. tractability, it is also unclear from reading the book how we should split resources between mitigating x-risks directly vs. investigating the risk landscape better[10]. It’s not apparent to us that one line of action would be obviously better than the other, but, given the relative neglect of the latter one in The Precipice, we present here some hypothetical “risk landscape mapping” projects that could be high impact as a form of intuition pump:
Some risks may have been overblown, e.g. a few years elapsed between when the classic Bostrom-Yudkowsky view on superintelligence was fleshed out and when Drexler’s Comprehensive AI Services (CAIS) was put forward as an alternative. Even if, after learning about CAIS, we think the risk from AGI is real and pressing, the fact that there’s disagreement in the community about whether CAIS is possible / desirable should result in an all-things-considered view that roughly says “the community failed to consider some plausible arguments against a very catastrophic view of AGI x-risk for a long time, so this may happen again.”
Some risks may have been underestimated, e.g. it took about a decade since the founding of AI as a field in the 50s until the first academic paper on superintelligence came out, and five more decades until Nick Bostrom’s Superintelligence was published. This suggests other emerging technologies could pose existential risks we haven’t spotted yet because we haven’t done the necessary big picture thinking / brainstorming needed to identify them.
We think explicit value of information calculations for direct action compared to global prioritization research would’ve been appropriate here. For example, the author presumably believes that researching and writing the book was more cost-effective than working on direct mitigation. How did he come to this conclusion? Does it all boil down to personal fit or are there other factors at play?
Over-reliance on unreliable expert estimates
On page 153 of The Precipice, the author provides the following definition and operationalization of the importance value in the ITN equation: “The importance of a problem is the value of solving it. In the case of an existential risk, we can usually treat this as the amount it contributes to total [existential] risk.”. He goes on to provide a summary of his estimates of the magnitudes of different existential risks in Table 6.1 on p. 141. Our understanding is that the author arrived at these probabilities by making plausibility judgments informed by his intuitions, his general knowledge, and his consultation of expert opinion on each risk. A potential weakness of this approach that we want to draw attention to is the fallibility of human intuitions, including the intuitions of experts, which seems quite relevant to assessing the reliability of forecasts[11].
We contend that Daniel Kahneman and Amos Tversky’s experiments cast doubt on the reliability of expert opinion by showing that specialists are far from immune to cognitive biases and thinking fallacies (for an overview, see Kahneman 2013). The effects of those biases are well-documented in Philipp Tetlock’s work on “expert judgment”, which has found that domain-level experts frequently underperform simple rules and models in the prediction of political and economic events and developments. We realize that some experts in Tetlock’s studies (namely, those he categorizes as “foxes”) show remarkable forecasting skills, which can be acquired through training and experience. However, nothing in The Precipice indicates that the author relied specifically on input from “foxy” (let alone superforecasting) experts, nor that he himself is a “foxy” forecaster, and we argue that this reduces the credibility of his probability estimates[12].
In addition, we think that x-risk estimates face challenges that are different in a significant way from those faced by forecasting more generally. Since x-risks are about the mid- to long-term future and since they are unprecedented, there is only indirect empirical data to underwrite forecasts about their occurrence, and there are few if any feedback mechanisms to monitor the reliability of such forecasts. We think that this last point deserves special emphasis: there is no evidence of forecasting being useful more than ~5 years in advance[13], so unprecedented risks that may unfold in time frames of decades (or longer) cannot necessarily be treated the same way as risks that are either (i) short-term or (ii) have been realized in the past many times such that a clear base rate is readily available.
This skepticism of expert judgments raises questions about the value of the x-risk probabilities in The Precipice. Given the paucity of empirical data that can be used to check forecasts and estimates of x-risks, we think that even very modest claims about the magnitude of probabilities are vulnerable to attack and contestation. In investigating and writing this post, we have not come to a definitive conclusion on whether putting some numbers on these probabilities is better than having no estimates for them at all, or if putting down unreliable numbers instills a false sense of confidence and may encourage suboptimal decisions (between us four authors, we found different intuitions regarding that question). Following from this, we are unsure in our judgment on whether these (potentially unreliable) probabilities should play any role in prioritizing between different ways to prevent existential catastrophe. However, what we do agree about is that the probability estimates are highly uncertain, and that there may be value in considering alternative approaches for deciding how to allocate existential risk reduction efforts. We outline one such option in the last section.
Broad interventions side-step the challenges with the prioritization frameworks
As discussed, the main problem with the accuracy of unprecedented risk estimation is the lack of relevant data and hard evidence. In addition, this approach does not take into account the risks we are currently unaware of. We put forth the possibility of maximizing our resiliency as opposed to investing in risk reduction.
Both the risks discussed in The Precipe (natural catastrophes, nuclear war, climate change, other environmental damage, pandemics and unaligned AI) and the existential threats we are currently unaware of, are categorizable as ‘Black Swan’ events[14]. The phrase “black swan” derives from a Latin expression[15]: “rara avis in terris nigroque simillima cygno” (translation: “a rare bird in the lands and very much like a black swan”). When the phrase was coined, the black swan was presumed not to exist. Later the existence of black swans was confirmed by explorers in Western Australia[16]. The term was later used in the renowned works of mathematical statistician, Nassim Nicholas Taleb. A ‘Black Swan’ event is an unprecedented event with a major impact that is often completely unforeseen. The Black Swan theory is underrepresented in the discussion of x-risks [17]but, in combination with ultramodern risk models based on this theory, offers a promising approach to human continuity.
The problem with unprecedented events such as x-risks is the ineffectiveness of probability calculations. Black Swan events are not foreseeable by the usual statistics of correlation, regression, standard deviation, or return periods. Expert opinions are also of limited use; we refer to our arguments in the section ‘Over-reliance on unreliable expert estimates’. This inability to estimate the likelihood of Black Swan events precludes the application of traditional risk management, so the development of strategies to manage their consequences is of paramount importance[18]. Black Swan risk models forego ineffective probability calculations and focus on contingency. Not surprisingly, the state-of-the-art risk models of this kind are found in the financial sector[19]: most organizations care more about losing money than about distant existential threats.
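To make the limitation concrete, here is a minimal, illustrative simulation (entirely our own construction, not from the book or the cited papers): when losses follow a heavy-tailed distribution, the worst event in a limited historical record is a poor guide to how bad future events can get, so buffers sized to past experience keep being breached.

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 "years" of observed losses drawn from a heavy-tailed (Pareto-like) process.
history = (rng.pareto(a=1.5, size=50) + 1) * 10

# Naive risk management: size buffers to the worst loss seen so far.
worst_observed = history.max()

# Simulate the next 1,000 "years": a non-trivial share of them exceed that
# buffer, and the exceedances can be far larger than anything in the record.
future = (rng.pareto(a=1.5, size=1000) + 1) * 10
print(f"worst observed loss:       {worst_observed:.0f}")
print(f"future years exceeding it: {(future > worst_observed).mean():.1%}")
print(f"largest future loss:       {future.max():.0f}")
```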
In basing strategy solely on potentially unreliable calculations, Effective Altruism has not yet absorbed the lessons of the financial sector. Had banks worldwide not accumulated large exposures to high-impact events on the strength of their return calculations, they would have made smaller profits for a while[20]. However, they also would not have bankrupted themselves in 2008 to the point of needing bailouts worth trillions of dollars globally[21]. Unfortunately, when it comes to existential threats, Effective Altruism does not have the luxury of learning from experience.
Those lessons cost the world trillions of dollars and pushed national debts to unprecedented heights, but they also provided valuable insights into extreme risk management. The most effective approaches to Black Swan contingency are ‘resilience’ and ‘diversification’[22], and both translate to human continuity.
The first method is resilience: the ability to withstand adverse events and recover from them. In a business context this means making your core systems more resilient, and the same applies to human continuity, where core systems fall into two categories. First, systems crucial for human survival: if disaster strikes, which systems will prevent human extinction? These are the systems that provide basic human needs for a (small) population[23]: water, food, medicine, shelter, and so on. We refer to ALLFED for an example of work in this area. Second, systems crucial for human recovery: if disaster strikes, which systems are necessary to help humankind recover once the peak of the impact has passed? These are the systems that serve the needs of human expansion: energy, preservation of knowledge, building materials, and even factors contributing to mental fortitude[24].
The second method is diversification. In the financial sector this means spreading your assets over multiple markets and sectors to counter the uncertainty of market forces: if disaster strikes, the impact is limited to part of your assets. The same contingency principle applies to human continuity. Since nearly all existential threats are global in scale (pandemics, global warming, nuclear accidents[25], etc.), this would mean spreading or expanding the human population over a larger area, e.g. planetary colonization and island refuges. The author endorses this method of extinction prevention, describing it as ‘security among the stars’ (p. 194). Even though this would be a highly effective mitigation mechanism, we do not advocate that Effective Altruism groups focus on this endeavor. First, planetary colonization to the point where colonies could survive independently is a long-term effort, whereas disaster may strike sooner. Second, this endeavor already receives substantial investment from both (inter)national governmental bodies and extremely wealthy individuals. We therefore feel that EA should focus on resilience.
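As a toy illustration of why diversification helps (our own numbers, and assuming the settlements’ fates are independent, which globally correlated risks only partially allow), the probability that every settlement is lost falls quickly with the number of self-sufficient settlements:

```python
# Probability that every settlement is lost, assuming each one is destroyed
# by a given catastrophe independently with the same probability.
# Independence is the best case that diversification can approach;
# globally correlated risks (e.g. unaligned AI) erode this benefit.
def p_total_loss(p_single_loss: float, n_settlements: int) -> float:
    return p_single_loss ** n_settlements

for n in (1, 2, 4, 8):
    print(f"{n} settlement(s): {p_total_loss(0.9, n):.2%} chance all are lost")
# 1 -> 90.00%, 2 -> 81.00%, 4 -> 65.61%, 8 -> 43.05%
```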
We do note a certain disadvantage to this risk model. The approach is more robust and arguably more effective than the specific risk reduction advocated by the author[26], because it is designed to perform well precisely when the future is unpredictable: it was built on the premise that anything can happen. Nonetheless, it is uncomfortable to advocate for because of its brutal nature. Like the author, this model prioritizes human survival and safeguarding humanity’s potential above all, but it goes a step further by focusing on maximizing core resiliency instead of risk reduction[27]. In essence, in maximizing humanity’s survivability against all threats, it treats parts of the human population as acceptable losses. This is more easily accepted in the financial sector, where the losses are only monetary. When it comes to human life, the thought of focusing on specific risk reduction is a much more comfortable one, though for the reasons mentioned in this article, it may be misguided. However, Effective Altruism has never shied away from asking the hard questions: this community has redirected vast sums of money from people who would have benefited from it towards people who benefit more from it.
Fortunately, it is possible to use this approach as a complement to the specific risk reduction approach: invest in human resilience up to an acceptable level (which is yet to be determined), and then focus on reducing specific risks based on our best attempts at estimating them, such as the Bayesian approaches discussed in The Precipice. This spending should be adaptive to precursors of existential events (specific developments in AI, new mutations or virus strains, new measurements of global warming, and so on), and the underlying estimates should be adjusted or, if necessary, discarded in light of new evidence. The right balance between these three approaches (resilience, risk-based, adaptive) will optimize human survivability and harm reduction.
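To make the idea concrete, here is a deliberately simple sketch (entirely our own construction; the budget, the resilience floor, and the risk figures are illustrative, not the book’s) of such a three-part allocation rule: fund resilience first up to a chosen floor, then split the remainder across specific risks in proportion to current estimates, with extra weight given to any risk whose precursors are flashing.

```python
# Illustrative only: a tiered allocation of a risk-reduction budget.
def allocate(budget, resilience_floor, risk_estimates, precursor_alerts):
    # Step 1: resilience gets funded first, up to the agreed floor.
    resilience = min(budget, resilience_floor)
    remaining = budget - resilience

    # Step 2: weight each specific risk by its current probability estimate,
    # doubled if precursor warning signs have been observed (adaptive part).
    weights = {
        risk: p * (2.0 if precursor_alerts.get(risk, False) else 1.0)
        for risk, p in risk_estimates.items()
    }
    total = sum(weights.values()) or 1.0

    # Step 3: split what is left proportionally to those weights.
    targeted = {risk: remaining * w / total for risk, w in weights.items()}
    return {"resilience": resilience, **targeted}

print(allocate(
    budget=100,
    resilience_floor=40,
    risk_estimates={"unaligned AI": 0.10, "engineered pandemic": 0.03, "nuclear war": 0.001},
    precursor_alerts={"engineered pandemic": True},
))
```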
Conclusion
Overall, The Precipice is an excellent book which details the importance of longtermism alongside several practical approaches towards the goal of existential risk reduction.
While we focus this critique on points we believe could be even stronger, we also investigated several claims from the chapter that hold up very well. In particular, we believe that the author (1) provides useful models for thinking about existential risk, (2) presents a very solid mathematical treatment of how multiple risks combine into an overall risk, clearly highlighting some counterintuitive results, (3) highlights not only direct risks, but also existential risk and security factors, which either increase or lower the chance of existential catastrophe without being a direct existential threat themselves, and (4) differentiates well between targeted vs. broad risk interventions.
The key ways we think this work could be even stronger, whether in future editions or in further work that builds upon the book, are:
Providing estimates of Tractability in the Importance x Tractability x Neglectedness (ITN) framework, or explaining more clearly why these are being avoided.
Suggesting ways to combine all of the different risk prioritization frameworks into a single prioritization.
Contemplating priorities research—the question of how to allocate effort between mitigating x-risks directly vs. investigating the risk landscape to improve future resource allocation.
Highlighting the uncertainty of certain risk estimates (e.g. the estimates of future risk from AI are much more uncertain than the estimates of natural risk from asteroids), and how this uncertainty should impact our resource allocation.
Exploring broad resilience interventions in more depth, as these could side-step some of the issues with estimating the impact of targeted risk reduction. With the goal of preventing existential catastrophe, focusing on maximizing core resiliency of our species could be a promising alternative.
Acknowledgements
We want to thank the Training for Good team for organizing the Red Team Challenge, and for providing guidance and support throughout! We also received some very helpful feedback from other participants but have not yet heard back from them regarding whether or not they are comfortable being mentioned here. We will come back and add their names if and when they give us permission to do so; in the meantime, we simply offer an anonymous Thank you! to all those who commented on our earlier draft.
Appendix: List of open questions
[Throughout our initial discussions and reading in the course of this Red-Teaming Challenge, we identified a number of open questions, things we were confused about or that seemed unclear, and potential weaknesses in chapter 6 of The Precipice. These are collected in a separate Google Doc, available and open for comments here]
Commonly used for cause prioritization in the EA community, e.g. by 80,000 Hours.
Several such arguments (as well as counter-points to many of them) can be found under this Forum tag. In addition, we think that discussions surrounding the concepts of moral uncertainty and worldview diversification are relevant for thinking about the ethical convictions underlying The Precipice’s focus on existential risk.
The probability of each x-risk, considered in isolation, is only a rough proxy for its importance. In The Precipice, the author suggests that a more accurate importance estimation should also consider the correlation between individual x-risks, since the nature and level of correlation can change the amount that x-risks with different probabilities contribute to total existential risk (high correlation would suggest that high-probability risks contribute disproportionately large amounts to total risk; see pp. 173-175 of The Precipice).
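As a stylized illustration of how correlation changes the picture (our own toy numbers, not the book’s), compare two risks of 10% and 1% under independence versus full correlation:

```python
# Two hypothetical risks; figures chosen only to illustrate the point.
p_large, p_small = 0.10, 0.01

# If the risks are independent, total existential risk is the chance
# that at least one of the two catastrophes occurs.
total_independent = 1 - (1 - p_large) * (1 - p_small)   # 0.109

# If they are fully correlated (the small risk only strikes worlds that the
# large risk would have destroyed anyway), the small risk adds nothing:
total_nested = max(p_large, p_small)                     # 0.100

print(total_independent, total_nested)
```

Under the correlated (nested) scenario, the high-probability risk accounts for essentially all of the total risk and eliminating the small risk removes almost nothing, which is the sense in which high correlation makes high-probability risks contribute disproportionately.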
As far as we can tell, the book is silent about the risk of futures that are much worse than extinction or permanent destruction of humanity’s potential, i.e. futures that contain a huge amount of negative value. These have usually been referred to in EA as suffering risks or s-risks. Criticizing this omission is outside the scope of this review, but we note that some of the x-risks considered in the chapter also have the potential of becoming s-risks, e.g. rogue AIs might find it useful to further their goals by exploiting an astronomical number of sentient digital beings.
Most of Part 2 (chapters 3-5) of the book as well as chapter 6 are devoted to explaining and comparing these estimates.
On p. 143: “These probabilities provide a useful summary of the risk landscape, but they are not the whole story, nor even the whole bottom line. Even completely objective, precise and accurate estimates would merely measure how large the different risks are, saying nothing about how tractable they are, nor how neglected. The raw probabilities are thus insufficient for determining which risks should get the most attention, or what kind of attention they should receive.”
A timid attempt at this is made in footnote 29, where the author argues that importance and tractability might be correlated. This follows from the heuristic that it should take about the same amount of resources to change the odds of something by a fixed multiplicative factor. Because of the nonlinear transformation between odds and probabilities, this means that risks in the “middling” range (around 50%) would be in a sweet spot for cheap interventions. Analyzing whether this heuristic is appropriate is beyond the scope of this review; we merely want to point out that discussions like this one should have been featured more prominently, rather than relegated to footnotes, to make the case for excluding tractability estimates clearer.
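A quick worked example of that nonlinearity (our own figures): suppose a fixed amount of effort halves the odds of a risk, whatever they are. The absolute reduction in probability this buys is largest for risks near 50%:

```python
def halve_odds(p: float) -> float:
    """Return the probability corresponding to half the original odds."""
    odds = p / (1 - p)
    new_odds = odds / 2
    return new_odds / (1 + new_odds)

for p in (0.01, 0.10, 0.50, 0.90):
    print(f"p = {p:.2f}: absolute risk reduction = {p - halve_odds(p):.3f}")
# 0.01 -> 0.005, 0.10 -> 0.047, 0.50 -> 0.167, 0.90 -> 0.082
```

The same multiplicative change in odds removes roughly 17 percentage points of risk at p = 0.5 but only about 0.5 percentage points at p = 0.01, which is the sweet-spot effect the footnote describes.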
While fleshing out this section of our critique, we also considered the possibility that having multiple methodologies/heuristics for prioritizing between x-risks, without clear guidelines for how much weight to give to each of them, might be a feature rather than a bug: Maybe this fosters discussion and further investigation, more so than if the author had made the more straightforward case for only one prioritization framework? As evident from this write-up, we are not fully convinced that the provision of multiple frameworks primarily has positive effects of that sort, believing instead that it creates confusion and makes rigorous prioritization more difficult. However, we haven’t spent a huge amount of time considering this possibility, and thus welcome comments that make the case for it in more detail!
One option for integrating the frameworks rigorously might be the method of “Model Averaging” (borrowed from the field of machine learning), which we have not looked into in-depth for this critique. If this, or some other approach, is what the author had in mind as a means for combining his suggested frameworks, we think he should have made an explicit reference to it.
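For concreteness, one very simple form of model averaging would be to score each risk under every framework and take a weighted average of the scores; the sketch below is our own construction with made-up scores and weights, not anything proposed in the book.

```python
# Scores per framework (each row sums to 1) and framework weights are illustrative.
frameworks = {
    "ITN":           {"AI": 0.6, "pandemics": 0.3, "nuclear": 0.1},
    "soon_sudden":   {"AI": 0.5, "pandemics": 0.4, "nuclear": 0.1},
    "risk_factors":  {"AI": 0.4, "pandemics": 0.3, "nuclear": 0.3},
}
weights = {"ITN": 0.5, "soon_sudden": 0.25, "risk_factors": 0.25}

risks = {"AI", "pandemics", "nuclear"}
combined = {
    r: sum(weights[f] * scores[r] for f, scores in frameworks.items())
    for r in risks
}
print(sorted(combined.items(), key=lambda kv: -kv[1]))
```

The difficult part, of course, is choosing the weights for each framework, which is precisely the guidance we found missing.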
The only passage we could find where this is addressed, although without any attempt at precise quantification, is in a box titled “Early Action” (p. 185): “In short, early action is higher leverage, but more easily wasted. It has more power, but less accuracy. If we do act far in advance of a threat, we should do so in ways that take advantage of this leverage, while being robust to near-sightedness. This often means a focus on knowledge and capacity building, over direct work.”
A case similar to ours is made by Cremer and Whittlestone in this paper about AGI timelines.
Our impressions from the effective altruism community and common epistemic practices therein lead us to think that the author likely has consulted experts with a higher-than-average forecasting track record, and that Toby Ord himself may fall in the category of a “foxy expert”. However, we believe that the book’s case would be stronger if the origin of its expert advice were documented in a bit more detail, so that it has to rely less on the reader’s knowledge and impressions of epistemic norms within effective altruism (or the reader’s familiarity with the author’s character and approach to reasoning).
For a longer discussion of the challenges related to knowing the feasibility and quality of long-range forecasts, see this report by Open Philanthropy.
Aven, T. (2013). On the meaning of a black swan in a risk context. Safety science, 57, 44-51.
Puhvel, J. (1984). The origin of Etruscan tusna (“swan”). The American Journal of Philology, 105(2), 209–212.
“Black Swan Unique to Western Australia”, Parliament, AU: Curriculum, archived from the original on 13 September 2009.
Using the GPT-3 language model, we consulted a database of 175 million academic papers and found one direct result: Jebari, K. (2015). Existential Risks: Exploring a Robust Risk Reduction Strategy. Science and Engineering Ethics, 21, 541-554.
Nafday, A. M. (2009). Strategies for managing the consequences of black swan events. Leadership and Management in Engineering, 9(4), 191-197.
Taylor, J. B., & Williams, J. C. (2009). A black swan in the money market. American Economic Journal: Macroeconomics, 1(1), 58-83.
Taleb, N. N., Goldstein, D. G., & Spitznagel, M. W. (2009). The six mistakes executives make in risk management. Harvard Business Review, 87(10), 78-81.
Taylor, J. B., & Williams, J. C. (2009). A black swan in the money market. American Economic Journal: Macroeconomics, 1(1), 58-83.
Aven, T. (2015). Implications of black swans to the foundations and practice of risk assessment and management. Reliability Engineering & System Safety, 134, 83-91.
Cotton‐Barratt, O., Daniel, M., & Sandberg, A. (2020). Defence in depth against human extinction: prevention, response, resilience, and why they all matter. Global Policy, 11(3), 271-282.
Pareeda, P.K., & Pareeda, P.K. (2008). Towards Rebuilding a Post-Disaster Society: A Case Study of Supercyclone-Affected Coastal Orissa.
All nations capable of nuclear deployment have agreed to article IV of the ‘Outer Space Treaty 1967’. “Treaty on Principles Governing the Activities of States in the Exploration and Use of Outer Space, including the Moon and Other Celestial Bodies”. United Nations Office for Disarmament Affairs. Retrieved 05/06/2022.
Wilson, N. (2018). “New Zealand needs a method to agree on a value framework and how to quantify future lives at risk.”
Boyd, M., & Wilson, N. (2021). Optimizing Island Refuges against global Catastrophic and Existential Biological Threats: Priorities and Preparations. Risk Analysis, 41.