A longtermist critique of “The expected value of extinction risk reduction is positive”
Reducing the probability of human extinction is a highly popular cause area among longtermist EAs. Unfortunately, this sometimes seems to go as far as conflating longtermism with this specific cause, which can contribute to the neglect of other causes. Here, I will evaluate Brauner and Grosse-Holz’s argument for the positive expected value (EV) of extinction risk reduction from a longtermist perspective. I argue that, once counterarguments to Brauner and Grosse-Holz’s ethical premises and to their predictions about the nature of future civilizations are considered, the EV of extinction risk reduction is not robustly positive, and that other longtermist interventions, such as s-risk reduction and trajectory changes, are therefore more promising. I abbreviate “extinction risk reduction” as “ERR,” and the thesis that ERR is positive in expectation as “+ERR.”
Brauner and Grosse-Holz support their conclusion with arguments that I would summarize as follows: If humans do not go extinct, our descendants (“posthumans”) will tend to have values aligned with the considered preferences of current humans in expectation, in particular values that promote the welfare of sentient beings, as well as the technological capacity to optimize the future for those values. Our moral views upon reflection, and most welfarist views, would thus consider a future populated by posthumans better than one in which we go extinct. Even assuming a suffering-focused value system, the expected value of extinction risk reduction is increased by (1) the possibility of space colonization by beings other than posthumans, which would be significantly worse than posthuman colonization, and (2) posthumans’ reduction of existing disvalue in the cosmos. Also, in practice, interventions to reduce extinction risk tend to decrease other forms of global catastrophic risk and promote broadly positive social norms, such as global coordination, increasing their expected value.
In response, first, I provide a defense of types of welfarist moral views—in particular, suffering-focused axiologies—for which a future with posthuman space civilization is much less valuable, even if posthumans’ preferences are aligned with those of most current humans. Although +ERR appears likely conditional on value-monist utilitarian views that do not put much more moral weight on suffering than happiness, views on which sufficiently intense suffering cannot be morally compensated by any amount of positive value are defensible alternatives that many longtermists may endorse upon reflection. To the extent that (a) these suffering-focused views are reasonable, yet currently underweighted by the majority of longtermists, and (b) significant moral uncertainty is warranted given the theoretical challenges of population ethics, the longtermist community ought to update against +ERR.
Second, I argue that the future is much less valuable than Brauner and Grosse-Holz argue according to several tenable longtermist views, due to (a) the scale of the extreme suffering expected to result from the pursuit of posthumans’ values and (b) the potential difficulty of efficiently creating large amounts of moral value, given the complexity of value thesis. Forms of suffering-focused ethics (SFE) are examples of such views for which this evaluation is especially likely. But it is also plausible for common non-suffering-focused views on which creating value (but not disvalue) requires a conjunction of many complex factors, because the disvalue of futures without this conjunction may outweigh the value of the rare ideal futures. Scenarios in which posthumans relieve more suffering of animals, extraterrestrials, and artificially sentient beings than they cause appear insufficiently likely for +ERR to hold on these views. That said, the degree of plausibility of such scenarios is a crucial consideration that merits further research.
Those who find the case for +ERR unconvincing should focus on promising cause areas such as reducing near- and long-term suffering. By contrast, there are several strong arguments against actively seeking to increase extinction risk through disruptive, let alone violent, means. First, taking such actions risks severely tarnishing the cause of compassionately reducing suffering, and is thus counterproductive even from a perspective that considers extinction good in principle. Second, others have made compelling arguments, based on this reason and many others, that altruists should be cooperative with those with other value systems. Third, deliberately increasing extinction risk may also increase non-extinction global catastrophic risks. Finally, moral uncertainty and common sense heuristics weigh against these extreme actions.
However, this critique does suggest that working to reduce extinction risk may be significantly less consistent (and cooperative) with the reflected ethics of longtermist EAs, including with views other than SFE, than it appears. Further, Brauner and Grosse-Holz’s thesis is only that ERR is positive in expectation, which does not imply that ERR ought to be longtermists’ highest priority if there are alternatives with higher EV (even on a non-suffering-focused view). I suspect that longtermists of various moral persuasions can coordinate on interventions that are more promising from a variety of ethical perspectives than increasing or decreasing extinction risk, including interventions for positive trajectory changes and the reduction of s-risks and non-extinction existential risks.
I also want to commend Brauner and Grosse-Holz’s even-handed presentation of the evidence in several sections, particularly in considering the disvalue of animal lives in the future (Part 1.1) and the possibilities of s-risks (Part 1.3), much as I disagree with their conclusions.
Bracketed parts of section headings refer to the part of Brauner and Grosse-Holz to which I am responding.
1. Axiological considerations [“Moral assumptions”]
I essentially endorse Brauner and Grosse-Holz’s two starting moral assumptions, that “it morally matters what happens in the billions of years to come” and “we should aim to satisfy our reflected moral preferences.” The former is extremely plausible, and while the plausibility of the latter depends on how exactly “reflection” is operationalized, it can be taken as a reasonable premise. Like them, I also proceed from a broadly welfarist consequentialist position.
However, in this critique I do not share their theory of value, or “axiology,” in population ethics. They state: “Additional beings with positive (negative) welfare coming into existence will count as morally good (bad).” Although this claim is compatible with a wide variety of axiologies, Brauner and Grosse-Holz add that we “could conclude that human welfare is [net] positive” from the premise that most people “experience more positive than negative emotions.” Their implicit axiology is thus a total symmetric view: creating lives with more positive than negative emotions is morally good, because positive and negative emotions are morally weighed symmetrically.
It is worth strongly emphasizing that, as I argue in section 1.1, rejecting total symmetric population axiology does not require endorsement of person-affecting views or averagism. These latter two views face well-documented (though not necessarily conclusive) objections that many longtermists find unacceptable, including violations of at least one of the transitivity of “better than” (Greaves, 2017), independence of irrelevant alternatives, or separability (Ponthière, 2003). Further, although the views I will defend in what follows are not symmetric in this sense, I am not defending the Asymmetry as critiqued in Thomas (2019), which faces objections similar to those facing person-affecting views.
An alternative is a total suffering-focused axiology, which considers the disvalue of some or all forms of suffering or preference frustration more morally important than other aspects of hedonic experience/preferences. Besides the well-known example of negative(-leaning) utilitarianism, this includes some forms of prioritarianism, antifrustrationism, and pluralist views on which several values matter but extreme suffering has special importance.
1.1. Non-person-affecting objections to total symmetric axiology
Among other implications, the premise that the moral value of a population equals the total happiness (and/or satisfied preferences) minus suffering (and/or frustrated preferences) leads to the Very Repugnant Conclusion (VRC) (Arrhenius, 2003):
Let W1 be a world filled with very happy people leading meaningful lives [with no suffering whatsoever]. Then, according to total utilitarianism [plus the assumption that sufficiently great happiness can outweigh any intensity of suffering], there is a world W2 which is better than W1, where there is a population of [purely] suffering people much larger than the total population of W1, and everyone else has lives barely worth living—but the population is very huge.
Put differently, no matter how amazing, flawless, and large a given set of lives can be, and no matter how horrific, joyless, and large another set can be, total symmetric axiology (and others; see Budolfson and Spears, 2018) implies that the former is worse than the latter augmented with enough barely good lives. The great counterintuitiveness of this conclusion can be systematized as follows. In addition to inconsistency with one of the propositions in the Third Impossibility Theorem of Arrhenius (2000), the VRC violates this highly plausible principle:
Perfection Dominance Principle. Any world A in which no sentient beings experience disvalue, and all sentient beings experience arbitrarily great value, is no worse than any world B containing arbitrarily many sentient beings experiencing only arbitrarily great disvalue (possibly among other beings).
The well-known Repugnant Conclusion (RC) does not by itself violate the Perfection Dominance Principle, since none of the lives in the muzak-and-potatoes world consist purely of suffering by assumption. Hence, arguing that the RC is not unacceptable does not resolve the problem posed by the VRC. The RC does, in some formulations, violate a Strong Perfection Dominance Principle, defined as the Perfection Dominance Principle but excluding “only” from the last clause. To see this, let the “muzak-and-potatoes” lives each contain arbitrarily great suffering, along with slightly more happiness than this suffering. This Strong version of the principle is less prima facie plausible than the non-Strong version, though I will argue in the next section that the Strong version is still very plausible.
Some weak suffering-focused axiologies also imply the VRC and violate the Perfection Dominance Principle. Suppose, for instance, we claim that a given intensity of suffering is X times as disvaluable as the same intensity of positive experience. Then, if each “life barely worth living” in world W2 consists of no suffering and a minuscule amount of positive experience, multiplying the number of such lives by X (relative to the number of lives necessary for the VRC under a symmetric view) produces the VRC. We thus have a reason to adopt stronger suffering-focused views, discussed in the next section.
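To make the arithmetic concrete, here is a minimal sketch, assuming a simple additive model in which a world's value is its total positive experience minus X times its total suffering. The function names and the magnitudes (v1, suffering, eps, x) are hypothetical illustrations and are not taken from Arrhenius or from Brauner and Grosse-Holz.

```python
from fractions import Fraction
from math import floor


def world_value(happiness, suffering, x=1):
    """Additive value of a world: total happiness minus x times total suffering."""
    return happiness - x * suffering


def min_lives_for_vrc(v1, suffering, eps, x=1):
    """Smallest number m of barely-good lives (each worth eps, with no suffering)
    such that W2 = m * eps - x * suffering is strictly better than W1 = v1."""
    return floor((v1 + x * suffering) / eps) + 1


# Hypothetical magnitudes, chosen only for illustration.
v1 = 1_000                 # value of the flawless world W1
suffering = 5_000          # total suffering of the purely suffering lives in W2
eps = Fraction(1, 100)     # value of one life barely worth living

m_symmetric = min_lives_for_vrc(v1, suffering, eps, x=1)
x = 10                     # suffering weighted ten times as heavily as happiness
m_weighted = min_lives_for_vrc(v1, suffering, eps, x=x)

# Multiplying the symmetric population by x is enough whenever x >= 1.
w2_value = world_value(x * m_symmetric * eps, suffering, x=x)
print(m_symmetric, m_weighted, w2_value > v1)
```

On this toy model, any finite weight X only raises the number of barely good lives required; it never removes the conclusion, which is the point made in the paragraph above.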
1.1.1. Counterarguments and some responses
A personal identity reductionist, motivated by the non-identity problem for instance, may note that the distribution of (dis)value among “beings” is not necessarily axiologically relevant. This does not imply, however, that the Perfection Dominance Principle is false. Another possibility is that, given that the alternative is a world with no disvalue, a world with arbitrarily great disvalue is unacceptable regardless of whether that disvalue is obtained within one being (in the folk sense). That is, when faced with the intuitive force of both the Perfection Dominance Principle and personal identity reductionism, we may want to accept the Strong Perfection Dominance Principle. We need not choose only between person-affecting views and total symmetric axiology, or any other axiology that entails the VRC.
A potential objection to this perspective is that if we want to accept something similar to the Perfection Dominance Principle, reductionism actually seems to commit us to:
Stronger Perfection Dominance Principle. Any world A in which absolutely no disvalue is experienced, and arbitrarily great value is experienced, is no worse than any world B containing arbitrarily great disvalue [not necessarily experienced by any single being].
In particular, the “arbitrarily great disvalue” in question could be distributed thinly across a sufficiently large volume of experience moments, each of which suffers only a pinprick of pain. It is considerably less clear that such a world would always be worse than world A.
However, first, every strictly aggregationist axiology has some counterintuitive conclusions in which the dominant contribution to the moral (dis)value of a world is a very thinly spread collection of (dis)valuable experiences. For example, we may ask whether the conclusion in the preceding paragraph is so untenable if all the positive experiences in the world with arbitrarily great disvalue are also distributed among many beings, who briefly flicker into and out of a marginally happy existence (Vinding, 2020, 8.12). Indeed, a form of the VRC in which the arbitrarily great disvalue is contained within individual persons, yet the supposedly overwhelming value is distributed among flickers of the slightest pleasure, is a strong counterexample to symmetry and weak asymmetry. See also Yudkowsky’s “Torture vs. Dust Specks” and the original RC.
Second, the purported pinprick conclusion does not necessarily follow from the Perfection Dominance Principle plus identity reductionism, if one is not committed to strict aggregationism. It is plausible that negative experiences of a sufficiently great intensity constitute disvalue that cannot be morally outweighed by positive experiences or mild negative experiences. An example of a view that entails this is that disvalue obtains as a lexical function of the experience intensity (Klocksiem, 2016). This view is consistent with both of the desired principles, without entailing the pinprick conclusion.
By contrast, even if we modify total symmetric axiology with a clause that only sufficiently intense positive experiences can morally outweigh intense negatives, this still entails a form of the VRC. The lives “barely worth living” in question would specifically need to be lives with lexically intense value (and slightly less intense disvalue), eluding the conclusion of muzak-and-potatoes but not the violation of the Perfection Dominance Principle.
Note that value lexicality is not just a “patch” applied to aggregationist suffering-focused views—rather, lexicality is a plausible resolution of decision theoretic problems independent of axiology. For example, the structure of the typical “continuity” argument against lexicality (Temkin, 1996) is formally equivalent to reasoning that entails fanaticism (see “God at Your Deathbed” in Beckstead and Thomas, 2020).
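The lexical threshold view discussed above can be illustrated with a small sketch. It assumes a deliberately crude model that is my own simplification, not the formalization of any cited author: each experience is a signed intensity, and suffering at or beyond a hypothetical threshold is ranked in a separate tier that always takes priority.

```python
# Hypothetical intensity at or beyond which suffering counts as lexically bad.
LEXICAL_THRESHOLD = 100


def lexical_score(experiences):
    """Score a world as (minus extreme suffering, sum of everything else).

    `experiences` holds signed intensities: positive for value, negative for
    disvalue. Python compares tuples left to right, so the second entry only
    matters between worlds with equal amounts of extreme suffering.
    """
    extreme = sum(-e for e in experiences if e <= -LEXICAL_THRESHOLD)
    ordinary = sum(e for e in experiences if e > -LEXICAL_THRESHOLD)
    return (-extreme, ordinary)


perfect = [50] * 10                       # no disvalue at all
pinpricks = [-1] * 10_000 + [50] * 1_000  # many mild pains, much happiness
torture = [-100] + [50] * 1_000_000       # one extreme pain, vast happiness

# Mild pains aggregate in the ordinary tier and can be outweighed there.
print(lexical_score(pinpricks) > lexical_score(perfect))
# No amount of ordinary value lifts a world containing extreme suffering
# above a world that contains none.
print(lexical_score(torture) < lexical_score(perfect))
```

In this sketch, thinly spread pinpricks never enter the lexical tier however numerous they are, so the model respects the dominance principle for extreme suffering without committing to the pinprick conclusion.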
Importantly, the VRC, and its consequent inconsistency with the Perfection Dominance Principle, follow from other symmetric axiologies besides classical hedonism. Consider an axiology that maintains that any magnitude of suffering can be morally outweighed by a sufficiently great magnitude of preference satisfaction, virtue, novelty, beauty, knowledge, honor, justice, purity, etc., or some combination thereof. It is not apparent that substituting any of these values for happiness in the VRC makes it any more palatable, or cuts against the Perfection Dominance Principle.
We also need not deny the premise that these experiences and other entities are “good,” perhaps even in some self-evident sense (Moen, 2016; Sinhababu, 2010). But even if they are, it does not follow that their goodness is commensurable with and can cancel out bad experiences (Wolf, 1997). The notion that such goods and negative experiences are on a single value-disvalue axis, despite its popularity among many longtermists, is a philosophically controversial claim.
Finally, suffering-focused axiologies have commonly been criticized as reifying the bias of scope neglect—i.e., sufficiently large value does not seem to outweigh large disvalue, the argument goes, because our minds cannot appropriately conceive of the scale of the former. Although this provides some reason to be skeptical of suffering-focused intuitions, symmetric intuitions may also be explained in terms of several biases (Vinding, 2020, ch. 7):
There is evolutionary pressure for biological organisms to consider their lives worth living, and consider extreme suffering to be worth it for supposedly greater goods, because a species that unanimously endorsed a strong asymmetry would be much less likely to survive and reproduce. Likewise, cultures that discourage reproduction would tend not to survive in large numbers.
In many (if not most) ordinary tradeoffs in which people purportedly accept suffering in exchange for positive values, the positive values in question are instrumentally valuable for preventing, distracting from, and relieving disvalue. Our intuitions about the outweighing of disvalue with value may therefore systematically conflate intrinsic with instrumental value.
Social desirability bias and wishful thinking: Strongly suffering-focused views have implications that many people would be uncomfortable endorsing due to the risk of social sanction, or of appearing ungrateful or like a “downer.” It is unpleasant to contemplate the possibility that one’s parents committed a harm against oneself by procreating, that habitat reduction makes the world better via effects on wild animals (even for species with supposedly “net positive” lives), or that voluntary extinction may be morally responsible, for example.
Due to a desire for an interesting narrative arc to our lives (or the future of humanity), we may underrate the value of states that lack both suffering and intense happiness. No matter how much it is emphasized that simple, suffering-free contentment is contentment—total freedom from frustration and need—our intuitions resist this as a good or optimal state of affairs partly because it seems boring from the outside.
There is plausibly a status quo bias at play in responses to the prospect of extinction. Suppose the current world had no humans. It is not clear that we would consider it unethical for some supernatural being to not create humans in our current state, including all the suffering humans endure and impose on other sentient beings, even if the only alternative were to do nothing.
1.2. +ERR under suffering-focused, totalist, non-person-affecting axiology
Tell me straight out, I call on you—answer me: imagine that you yourself are building the edifice of human destiny with the object of making people happy in the finale, of giving them peace and rest at last, but for that you must inevitably and unavoidably torture just one tiny creature, that same child who was beating her chest with her little fist, and raise your edifice on the foundation of her unrequited tears—would you agree to be the architect on such conditions? Tell me the truth.
—Ivan Karamazov, in Fyodor Dostoyevsky’s The Brothers Karamazov
On most suffering-focused axiologies, even if the expected total intensity of positive experiences in the far future is orders of magnitude greater than negative, and even if “the large majority of human other-regarding preferences seem positive” (Part 1.2 of Brauner and Grosse-Holz), it does not follow that the far future is net morally valuable. It appears plausible that in expectation, astronomically many beings in the far future will experience lives dominated by misery—though many more will have closer to neutral or hedonically positive lives. That is, the average future may resemble a milder, yet still gravely concerning, version of world W2 of the Very Repugnant Conclusion. According to a reasonable cluster of moral views, a future containing such horrors is simply not a price worth paying for a great utopia elsewhere.
This does not necessarily mean that +ERR is false on suffering-focused axiology, as human space colonization could be the least of several evils. I will consider this in sections 2.3 and 2.4.
Importantly, even if the reader is unconvinced that strongly suffering-focused views are the most plausible set of axiologies, it is at least not clear that the VRC and other disadvantages of alternative views are overwhelmingly more acceptable than the problems with suffering-focused axiologies. Each of the five jointly inconsistent principles in the Third Impossibility Theorem of Arrhenius (2000) is, in isolation, very hard to deny. This suggests that great moral uncertainty is warranted between symmetric views that violate Weak Quality Addition, and suffering-focused views that violate Non-Extreme Priority.
2. Empirical considerations
2.1. Expected value of space colonization
I will only briefly discuss the content of Part 1.1 of Brauner and Grosse-Holz’s essay, “Extrapolating from today’s world.” This is because, as they acknowledge, it is unlikely that extrapolating from the present is the best way to forecast the EV of the far future, and we can be somewhat confident that technological development will greatly reduce the current causes of the worst disvalue on Earth, such as factory farming. Likewise, humans do appear to be getting happier on average over time based on life satisfaction surveys and relatively low suicide rates (compared with a naive prediction of at least 50% if lives were on average not worth living). However, as Brauner and Grosse-Holz discuss in footnotes, there are strong evolutionary pressures toward survival regardless of the balance of welfare, as well as pervasive taboos, religious sanctions, and stigmas against attempting suicide, which is a frightening and unreliable method of death for most people. People who do not consider their own lives egoistically worth living may nonetheless continue living for altruistic purposes, including avoiding causing grief to their loved ones. Brauner and Grosse-Holz also note that life satisfaction surveys are only “moderate evidence” due to potential optimism bias, though negativity biases could be comparably strong.
Even from suffering-focused perspectives, the dominant considerations for assessing +ERR are aspects of welfare that we should expect to persist for astronomical timescales, rather than current problems in human and animal welfare. Still, the near-term considerations above are weakly relevant for empirically grounded forecasts of the far future based on recent trends.
2.1.1. Animals [1.1]
Brauner and Grosse-Holz’s following claim is plausible: “In the future, factory farming is likely to be abolished or modified to improve animal welfare as our moral circle expands to animals.” However, it seems less credible that the phenomenon of factory farming will turn out to have been an unparalleled anomaly. Had someone in the year 1921 forecasted that, due to moral progress, the (near-term) future would not contain a significant increase in human-caused suffering offsetting such progress, they would have been profoundly wrong. Insufficient concern for artificially sentient beings appears likely to recapitulate this error, at least conditional on their creation. This is because such beings are even “stranger” to human empathetic capacities than animals are, although it may simply be too early to confidently extrapolate from our current attitudes.
Wild animals provide, in my estimation, one of the strongest considerations in favor of +ERR on a suffering-focused moral view. If (post)humans agreed not to spread wildlife into space, and used biotechnological measures to improve wild animal welfare on Earth, the future would perhaps contain far less wild animal suffering than if humans went extinct. (For the remainder of this section, I will focus on this argument and assume a suffering-focused view.)
However, it seems unlikely that reducing wild animal suffering is something that humans will do by default once it is convenient. Contrast with ending factory farming: there is a plausible case that making veganism the “lazy” option will succeed. Short of naive extrapolation of the trend of wild habitat reduction by humans, which most likely will be met with strong resistance by environmentalist and conservationist factions, as well as governments aiming to mitigate climate change via biological carbon sequestration, this project would require a significant coordinated effort. Considering the large pushback against the prospect of intervening in nature even (perhaps especially) by many ethical vegans, it seems complacent to expect moral circle expansion to inevitably drive posthumans to implement the Hedonistic Imperative.
Further, as large as the scale of wild animal suffering is on Earth, the scale involved in scenarios where wildlife (or, more likely, simulations thereof) is spread through space colonization and terraforming is so much larger that human survival is in expectation worse for wild animals on suffering-focused views. Conservatively suppose only a 0.0001% chance of wildlife-spreading and simulation scenarios on the scale of a million populated Earth-sized planets, or the equivalent energy devoted to simulations. Then there would be more expected suffering in biological plus simulated wildlife given human survival than given extinction.
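The expected-value comparison above can be written out as a rough sketch. It assumes, purely for illustration, that suffering is measured in "Earth-equivalents" of present-day wild animal suffering, that extinction leaves exactly one Earth-equivalent, and that time horizons are ignored; the function name and the residual-share parameter are hypothetical and not part of the original argument.

```python
from fractions import Fraction


def expected_suffering_given_survival(p_spread, planets_if_spread, earth_share_otherwise):
    """Expected wild-animal suffering given human survival, in units of
    present-day Earth's wild-animal suffering (an "Earth-equivalent")."""
    return p_spread * planets_if_spread + (1 - p_spread) * earth_share_otherwise


p_spread = Fraction(1, 1_000_000)   # 0.0001%, the probability quoted above
planets = 1_000_000                 # Earth-sized planets (or simulation equivalent)
extinction_baseline = 1             # assumption: Earth's wildlife continues unchanged

# Contribution of the spreading scenarios alone: exactly one Earth-equivalent.
spread_term = p_spread * planets

# Hypothetical: half of Earth's wild-animal suffering persists in the other futures.
with_residual = expected_suffering_given_survival(p_spread, planets, Fraction(1, 2))
# Hypothetical: posthumans eliminate all of it in the other futures.
without_residual = expected_suffering_given_survival(p_spread, planets, 0)

print(spread_term, with_residual > extinction_baseline, without_residual > extinction_baseline)
```

Under these simplifying assumptions the quoted figures put the spreading scenarios alone at one Earth-equivalent in expectation, so the inequality holds as long as some wild animal suffering persists in the remaining survival futures, and it holds by a wide margin if the true probability or scale is larger than these conservative figures.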
Finally, even if the scenario of human extinction followed by no intervention in the wild were worse than the average human-controlled future, it is not clear that this is a true dichotomy. If it is possible to create non-sentient (or at least minimally sentient or non-suffering) AI systems that reliably reduce wild animal suffering, humans need not stay around ourselves for this purpose.
2.1.2. Posthumans [1.2]
Brauner and Grosse-Holz claim that, because future agents will “have better conditions for an idealized reflection process than we have,” we would upon reflection endorse the other-regarding preferences of future agents (on average). They conclude that, since these agents will have the technology to shape the future according to these preferences that we would endorse, we should be generally optimistic about posthumans’ future.
The authors rightly note that s-risks can plausibly be facilitated by future posthuman preferences that drift in directions orthogonal to ours. Though more contestable, the following claims from this section also seem plausible:
- Future agents will probably be more intelligent and rational
- Empirical advances will help inform moral intuitions (e.g. experience machines might allow agents to get a better idea of other beings’ experiences)
- Philosophy will advance further
- Future agents will have more time and resources to deliberate
It is important to disentangle several related, yet distinct, claims here. A weak claim is that we should be epistemically modest: we should expect future people to be better informed about ethics and therefore, with their technological resources, to tend to make the future better than we would. That this implies the future will be net ethically good is a much stronger claim, which does not appear clearly entailed by the bullet points above, though they support it somewhat. As discussed in section 1.1.1, there are several strong biases against endorsing suffering-focused axiologies, and towards endorsing the spread of life, that we should expect to be prevalent in any civilization that survives into the future. Thus, even if future preferences are parallel to, and “idealizations” of, those held by most current humans, this future can still plausibly be net bad.
In particular, if one has high credence in at least moderately suffering-focused axiologies, as argued for in section 1, then the expected moral value of information from reducing extinction risk does not appear high enough to offset the ex ante moral costs. If space colonization is facilitated by ERR, and upon reflection future people endorsed some version of these axiological asymmetries, they would have a severely reduced capacity to halt the spread of astronomical suffering (as acknowledged both by Brauner and Grosse-Holz in Part 1.3, and by the Global Priorities Institute’s research agenda). Additionally, all else equal ERR increases the probability of the lock-in of negative futures controlled by malevolent, or indifferent, powerful actors. Though Brauner and Grosse-Holz themselves do not argue that option value is a significant point in favor of +ERR, I’ve included these remarks for completeness.
On the expected welfare of powerful beings, the authors argue that “compromise and cooperation seem to usually benefit all involved parties, indicating that we can expect future agents to develop good tools for coordination and use them a lot.” Although this is a generally sensible extrapolation, there are serious contrary considerations regarding the prospects for cooperation between artificially intelligent agents, as outlined in the Center on Long-Term Risk’s research agenda. Game theory provides several explanations for why “rational” actors can systematically forgo massive gains from trade.
Brauner and Grosse-Holz argue that the expected welfare of powerless future beings, such as AI tools, may be positive because the dominant contribution to their welfare will be deliberate optimization, and powerful beings will tend to care more about optimizing positive than negative welfare. This argument does not apply under an axiological asymmetry. In the next section, we will see that even without an axiological asymmetry, the difficulty of creating value relative to disvalue also calls +ERR into question. Given the sparsity of evidence about the benevolence of future beings, we are left to rely mostly on priors, and the net implications of these priors are unclear. For instance, in contrast to the reasonable prior that moral circle expansion will continue, we may also expect that powerless beings will be treated poorly if this provides even a mild convenience to the powerful, based on the precedents of factory farming and human oppression.
Finally, we should be generally wary of optimistic biases that can arise in “far mode thinking,” that is, expecting that the trajectory of the future will be fairly predictable and aligned with our plans. As noted by Sotala and Gloor (2017):
To recap, various evolutionary and game-theoretical forces may push civilization in directions that are effectively random, random changes are likely to [be] bad for the things that humans value …. Putting these considerations together suggests (though does not guarantee) that freewheeling development could eventually come to produce massive amounts of suffering.
In particular, the values that tend to dominate the far future will be those that most effectively spread and survive memetic selection, not necessarily those that promote well-being of sentient creatures (Bostrom, 2009). Extrapolating from humanity’s trend of moral circle expansion and increasing impartiality, associated with increasing wealth reducing pressure towards selfishness, would provide some reason for optimism, although our increased capacity to harm other sentient beings with our technology makes the sign of this trend unclear. Also, if a singleton emerges, the Darwinian trend will no longer apply afterwards, and then the relevant question is whether we should expect singletons to be benevolent.
Appendix 2 of Brauner and Grosse-Holz argues that selection will not be for ruthlessness, but for longtermist patience, and this seems to apply to singletons as well. Would preferences selected for caring about the long term in general display strong altruistic caring in particular? A paperclip maximizer, for instance, would care about long-term maximization of paperclips, and a superintelligent sadist may care about long-term suffering. On the other hand, we might predict future longtermists to be altruistic based on the present correlation of long-term concern with altruistic motives, both in EA longtermism and mainstream climate activism. Egoistic preferences in the strictest sense would tend to be more short-termist than altruistic preferences, since an individual can only live for so long. However, megalomaniacal dictators with access to life extension or mind-uploading technologies may be concerning counterexamples.
2.2. Disvalue is not complex
Many longtermists believe that value cannot be reduced to a small number of simple properties, such as maximizing happiness. In a slogan, “value is complex,” and it is conjunctive in the sense that a world with only a subset of the conditions of value is extremely suboptimal. Such longtermists expect that after careful—perhaps benign superintelligence-assisted—reflection, they would value a messy, narrow space of possible futures.
This view appears to be in some tension with a commonly cited argument for why the future will be good in expectation: that posthumans will technologically pursue what we (reflectively) value, not disvalue. As Christiano writes:
I think that the goodness of a world is mostly driven by the amount of explicit optimization that is going on to try and make the world good …. This seems to be true even if relatively little optimization is going on. Fortunately, I also think that the future will be characterized by much higher influence for altruistic values.
The problem is that if value is complex, then, all else equal, it is harder to realize than are simpler goals (like happiness), perhaps requiring many orders of magnitude more resources. Thus, someone who cares only about complex value, yet still opposes simple disvalue, would consider most non-ideal futures—which contain significant amounts of simple disvalue (section 2.1) yet little complex value—net negative. In expectation, the future could still be positive if the rare ideal futures contain enough value, but this consideration suggests that the EV of ERR is lower than we might otherwise suspect, conditional on the complexity of value thesis.
This is especially plausible considering that many who subscribe to complexity of value reject Nozick’s experience machine. That is, even if their experiences were optimized for whichever of their values could be satisfied by experiences alone, they would consider this worse than current real life, with all its hedonic shortcomings. While in principle non-solipsistic simulations involving other beings could create complex value, the complexity would render such simulations much more expensive than happiness optimization.
Disvalue does not seem to have this “problem.” Hurka (2010) writes:
Or imagine first a world containing only intense mindless pleasure like that of the deltas and epsilons in Brave New World. This may be a good world, and better than if there were nothing, but it is surely not very good. Now imagine a world containing only intense physical pain. This is a very bad world, and vastly worse than nothing.
That is, it is plausible that suffering is simple. Though layers of complexity can intensify suffering, e.g., the isolation of dying alone while in physical pain seems especially bad, such complexity is not necessary for the experience to be seriously disvaluable. Indeed, the stipulation that there could be non-suffering forms of disvalue seems to only increase the expected disvalue of the future, since it does not appear plausible that these other forms are necessary, in conjunction with suffering, for serious disvalue to obtain. Rather, they add to the number of ways the future could be bad. (See also the discussion of the Anna Karenina principle in Sotala and Gloor, 2017; Vinding, 2020, 1.3; and this article.) For those who hold that value is complex, this pushes the ratio of expected value to expected disvalue of the far future considerably down towards (or below) one.
2.3. Space colonization by nonhumans [2.1]
Brauner and Grosse-Holz argue that, even if space colonization by posthumans is worse than a future in which no colonization occurs at all, the former is not as bad as the realistic alternative trajectory: colonization of our future light cone by other beings. Since their argument is that this constitutes an especially strong update toward +ERR for those who hold that posthuman space colonization is negative, in this section (as in section 2.1.1) I will assume a suffering-focused view.
The probability of space colonization by nonhumans given human extinction is very much an open research question, and, like Brauner and Grosse-Holz, I cannot claim to offer more than a guess. A low figure would be warranted by the arguments for the rare earth answer to the Fermi paradox, while models such as Hanson’s and Finnveden’s predict a high probability. In light of extreme empirical uncertainty here, the more serious point of contention is how bad we should expect such colonization to be compared with that by posthumans.
Arguments on this point will very likely not be robust; on any side of the debate, we are left with speculation, as our data consists of only one sample from the distribution of potentially space-colonizing species (i.e., ourselves). On the side of optimism about humans relative to aliens, our species has historically displayed a capacity to extend moral consideration from tribes to other humans more broadly, and partly to other animals. Pessimistic lines of evidence include the exponential growth of factory farming, genocides of the 19th and 20th centuries, and humans’ unique degree of proactive aggression among primates (Wrangham, 2019). Our great uncertainty arguably warrants focusing on increasing the quality of future lives conditional on their existence, rather than influencing the probability of extinction in either direction.
It does seem plausible that, by evolutionary forces, biological nonhumans would care about the proliferation of sentient life about as much as humans do, with all the risks of great suffering that entails. To the extent that impartial altruism is a byproduct of cooperative tendencies that were naturally selected (rather than “spandrels”), and of rational reflection, these beings plausibly would care about as much as humans do about reducing suffering. If, as suggested by work such as that of Henrich (2020), impartial values are largely culturally contingent, this argument does not provide a substantial update against +ERR if our prior view was that impartiality is an inevitable consequence of philosophical progress. On the other hand, these cultures that tend to produce impartial values may themselves arise from convergent economic factors. Brauner and Grosse-Holz’s mathematical model also acknowledges the following piece of weak evidence against +ERR in this respect: intelligent beings with values orthogonal to most humans’ (or most philosophically deliberative humans’) would tend not only to create less value in the future, but also less disvalue. Given the arguments in section 2.2 for the simplicity of disvalue, however, this difference may not be large.
Let x be the expected factor by which nonhuman space colonization is worse than human colonization, and p be the probability that nonhumans colonize space conditional on human extinction. Ultimately, to believe +ERR despite believing that human space colonization would be morally bad, one needs to believe that x > 1/p. Indeed, this is slightly conservative. If humans go extinct, then with the exception of AI, any Earth-originating colonizers would presumably require millions of years to evolve to our current level of technological sophistication, and alien colonizers may require billions of years. Consequently, they cannot colonize as large a region of our future light cone as future humans could, decreasing the magnitude of disvalue nonhumans could create in our future light cone compared to humans. See “Artificial Intelligence and Its Implications for Future Suffering,” particularly sections 17 and 18, for discussion of unaligned AI successors; overall it appears plausible (though very tentatively so) that unaligned AI space colonizers would cause slightly less suffering than humans would. It is not apparent that the current state of evidence and priors robustly support even the conservative belief that x > 1/p, although note that Brauner and Grosse-Holz do not claim as much.
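The threshold x > 1/p can be reconstructed with a short expected-value comparison. What follows is only a sketch, and it relies on assumptions that are not stated in the text: D > 0 denotes the expected disvalue of human space colonization on the suffering-focused view assumed in this section, surviving humans are taken to colonize with certainty, x is read as badness per unit of colonized region, and f is a hypothetical factor in (0, 1] for the fraction of our future light cone that late-arriving nonhumans could still reach.

```latex
% Requires amsmath. D, f are illustrative symbols, not taken from the original essay.
\begin{align*}
\mathbb{E}[\text{disvalue} \mid \text{humans survive}] &= D \\
\mathbb{E}[\text{disvalue} \mid \text{humans go extinct}] &= p \cdot x \cdot D \\
\text{ERR positive} \iff p\,x\,D > D &\iff x > \frac{1}{p} \\[4pt]
% Accounting for the smaller reachable region (f \le 1):
\text{ERR positive} \iff p\,x\,f\,D > D &\iff x > \frac{1}{p f} \ge \frac{1}{p}
\end{align*}
```

For illustration only, p = 0.2 would require x > 5, and adding f = 0.5 would raise the requirement to x > 10. This is the sense in which x > 1/p is the conservative (weaker) condition.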
2.4. Cosmic rescues [2.2]
Wild animals aside, might posthumans relieve more suffering than they cause and bear by assisting extraterrestrial sentience (i.e., cosmic rescue missions)? On one hand, even if they have little moral concern for these beings, it would be relatively inexpensive for an advanced civilization to reduce extraterrestrial suffering. Just as humans have plausibly reduced wild animal suffering significantly as a side effect of urbanization, posthumans may have the same effect by harvesting planetary resources. On the other, a similar counterargument to that at the end of section 2.1.2 applies here. Just because our descendants could ideally avoid creating astronomical suffering and work on reducing suffering throughout the cosmos, it does not follow that this will hold in expectation. The systematic forces in human values that tend to increase total intense suffering, discussed in that section, give reason to suppose the opposite.
To assess the expected disvalue posthumans would relieve versus create, we can consider the authors’ Fermi estimate. I will grant all the factors of this estimate, except the 40 : 1 ratio of parallel/altruistic : antiparallel/anti-altruistic efforts made by posthumans:
Assume that in expectation, future agents will spend 40 times as many resources pursuing other-regarding preferences parallel to our reflected preferences (“altruistic”) than on pursuing other-regarding preferences anti-parallel to our reflected preferences (“anti-altruistic”).
Unless “altruistic” is construed broadly enough to include attempts to help according to values that actually increase extreme suffering, this ratio seems unduly optimistic in light of strong background uncertainty and the arguments in prior sections. Agents that tend to proliferate in the future will probably have systematic pro-existence drives and preferences that increase suffering as a byproduct of increasing life in general. If posthumans’ conception of helping extraterrestrials includes dramatically increasing their or their own population, such efforts to help could well increase extreme disvalue in expectation.
One may argue that I am only considering the current moral views of those persuaded by section 1.1 here rather than “reflected preferences,” but I see no reason to think rational reflection would tend toward endorsing intuitions or views that favor existence and the spreading of life. A ratio of 10 : 1 at the highest seems warranted. Granting the other steps of the Fermi estimate, this ratio reverses the authors’ conclusion that the expected reduction of disvalue from human survival (via cosmic rescue missions) outweighs the expected creation of disvalue. (Cosmic rescues are still an update in favor of +ERR, however.)
2.5. Non-extinction global catastrophic risks (GCRs) [3.1]
While it is true that “many potential causes of human extinction, like a large scale epidemic, nuclear war, or runaway climate change, are far more likely to lead to a global catastrophe than to complete extinction,” it is important to avoid a motte-and-bailey argument here. Many longtermists cite the quoted fact as a reason to mostly prioritize other classes of risks (i.e., those which are more likely to lead to extinction specifically). It would therefore not be entirely accurate to claim that those who are not convinced that extinction itself would be an astronomically large moral catastrophe, but who would not want the future to contain non-extinction GCRs, should support extinction-focused work and funds in general. Those who most efficiently target ERR in their altruistic efforts will probably not reduce non-extinction GCRs most efficiently, even if there is a strong correlation between the tendencies of an intervention to reduce extinction risk and non-extinction GCRs.
As a case in point, the authors acknowledge that this argument does not apply to AI alignment work, which appears to be the leading ERR-focused cause area precisely because the mechanism for full extinction via unaligned AI is clear. However, my counterargument to Part 3.1 of Brauner and Grosse-Holz does not provide a strong update against working on alternatives such as climate change, nuclear security, or preventing non-human-engineered pandemics.
The risks of destabilized technological progress are indeed concerning. S-risks could be facilitated by power struggles in such futures, where GCRs decrease global coordination without significantly slowing down technology. It is also plausible that social destabilization after a GCR could push the survivors’ values in inhumane directions. If these changes persist into a future where the survivors’ descendants colonize space, s-risks could occur via astronomical amounts of suffering subroutines.
On the other hand, given how future catastrophes (such as lock-in of dystopian social systems) are enabled by elaborate technological development as a whole, including space colonization, reducing non-extinction GCRs has downsides on suffering-focused axiologies. We should expect such GCRs to slightly decrease the probability of futures containing this level of technological development, stable or otherwise. And if one expects futures with human space colonization to be morally catastrophic, for reasons discussed in prior sections, it is not obvious that the downsides of a post-GCR future with less social progress outweigh the benefits of reducing the probability of colonization. Still, because of the risks of regression to the mean, my overall judgment is that reducing the non-extinction GCRs most likely to curtail global cooperation is a potentially promising intervention from suffering-focused perspectives.
2.6. Promoting coordination and peace [3.2]
The authors give a plausible argument that attempts to reduce extinction risk generally require increasing coordination and other prosocial attitudes, which are robustly good. Strictly speaking, this supports the +ERR thesis. All else equal, the EV of interventions that tend to reduce extinction risk looks better under this consideration.
However, for cause prioritization, a more relevant question is whether the coordination consideration is evidence of a greater EV increase for ERR than for alternative longtermist projects. For example, considering that the most plausible interventions to reduce s-risks involve promoting coordination and cooperation, the answer is not a clear yes. Note also that if ERR is causally downstream of increases in coordination, those who primarily want more coordination should just prioritize promoting it directly, unless they think the benefits of direct interventions on ERR with respect to their other goals are greater.
3. Moral uncertainty
Although Brauner and Grosse-Holz do not dedicate a particular section to the implications of moral uncertainty, they reference it throughout the piece and remark in the conclusion that moral uncertainty favors +ERR. Insofar as (a) a person was previously using a “My Favorite Theory” approach (“we should act in accordance with the moral theory that we are most confident in”), and (b) the person’s “favorite theory” is a strongly suffering-focused axiology, it is true that accounting for moral uncertainty will update the person somewhat toward +ERR. However, most EAs’ highest-credence ethical theory does not appear to be strongly suffering-focused. For people with symmetric views, the update from moral uncertainty would work in the other direction, against +ERR.
What about non-consequentialist views? This depends on which courses of action we’re comparing, exactly.
It does seem that most deontological philosophies would strongly object to intentionally increasing extinction risk. However, that is not what this critique is defending. I am only arguing that it is plausible that not reducing extinction risk is morally defensible, from a longtermist perspective, and that alternative interventions, especially reducing s-risks, are preferable. Deontological ethics would not condemn this any more than, e.g., refraining from donating as much of one’s wealth as feasible to the most effective charity from a consequentialist perspective. Additionally, according to some deontological views it is unacceptable to impose great suffering on one being to achieve “positive consequences befalling persons whose lives would be acceptable at all events” (Ohlsson, 1979). Insofar as we can interpret ERR as an act that foreseeably brings more suffering beings into existence in the future, ERR seems to violate such a principle, although this is not necessarily a straightforward implication.
Ord has argued that there is an intergenerational partnership argument for preventing extinction: Failing to do so would be to “destroy this legacy which we’ve been handed down, and to have it go nowhere from here,” on such a view. While this is a prima facie reason to work on ERR if one has credence in these types of intergenerational duties, and if one interprets past generations as motivated more by high-minded ideals than by tribalism, it is not clear that every single implicit contract between generations is valid. There may be extenuating circumstances, or the contract may be too burdensome. Consideration of extreme risks to the well-being of future generations is a plausible defeater of the responsibility to fulfill the wishes of some subset of past generations, even generously granting that most people in the past did indeed wish for the continuation of humanity more than the relief of great suffering. Moreover, this argument relies on the contestable premise that past generations have indeed passed on a good legacy, one worth preserving despite the harms it entails and risks giving rise to. We may have a greater responsibility to honor the victims of atrocities committed by our ancestors, by breaking the chain of such atrocities.
The virtue ethical argument also appears underdetermined, such that it is unclear whether on balance virtue ethics pushes in favor of or against +ERR, without more systematic analysis. On one hand, as claimed by Ord, preventing extinction could be viewed as compassionate, fair, loving (in a quasi-parental sense), wise, prudent, and patient. But given that the future also entails risks of tremendous misery—and is guaranteed to involve unbearable suffering for some individuals—we could just as well interpret this decision as callous, unfair (failing to prioritize the worst off), foolhardy (taking the aforementioned risks for the sake of our science fiction dreams), and naive. Due to my unfamiliarity with the nuances of virtue ethics, I cannot say what procedure virtue ethics would recommend for deriving a decision from all these conflicting considerations. But it does not suffice to simply list some virtues or vices associated with ERR.
Lastly, the above analysis focused only on the magnitudes of positive or negative value of ERR conditional on each given view. But just as in the case of deciding under uncertainty among consequentialist views, these magnitudes are important only insofar as the respective ethical theories are plausible. It is not apparent that the best approach to moral uncertainty is a majority vote among all people’s moral views. Thus, even if it is “common sense” that extinction is extremely bad, maximizing expected choiceworthiness could still feasibly lead us to reject +ERR, especially if we give significant weight to suffering-focused consequentialist axiologies.
In summary, we have seen that there are important concerns with the case for positive expected value of extinction risk reduction, challenging its robustness on both normative and empirical grounds.
In section 1, I argued that there is a highly plausible population ethical principle that is inconsistent with the symmetric axiology on which Brauner and Grosse-Holz’s case is, at least in part, premised. While some may find certain implications of suffering-focused views implausible as well, such views are serious alternatives to symmetric consequentialism that merit consideration and substantial credence. Unfortunately, these views have largely been neglected in population ethics, at least in EA and plausibly in academia as well, while far more attention has been devoted to person-affecting views.
If we reject axiological symmetry, we cannot conclude that hypothesized greater amounts of happiness than suffering in the far (human-controlled) future would imply that such a future is morally positive. The probability of extreme misery experienced by many beings in a human-colonized future is sufficiently large that no strongly suffering-focused view can consider this future positive.
In section 2, I considered empirical factors that bear on whether those with suffering-focused views should still favor ERR, and whether those with only mildly or non-suffering-focused views should not favor ERR. Although consideration of farmed and wild animal welfare is likely to increase over time, psychological and sociological obstacles cast serious doubt on the proposition that posthumans will both coordinate to eliminate extreme suffering from nature, and refrain from spreading biological or simulated animal suffering on astronomical scales.
We should expect the most evolutionarily and memetically successful value systems in the far future to spread life and hence increase, rather than reduce, total suffering. Even aside from suffering-focused premises, the difficulty of optimizing for complex values and happiness, compared with the ease of creating suffering due to insufficient concern, suggests that the future is much less good in expectation than concluded by Brauner and Grosse-Holz (provided that one assigns substantial credence to the view that value is complex and disvalue is simple). While in principle ERR could be positive on a suffering-focused view if other civilizations developed after human extinction, the values of such civilizations would need to be sufficiently worse than humans’ to outweigh a significant reduction in disvalue. We are currently too radically uncertain about the nature of nonhuman civilizations for this argument to be a large update toward +ERR for suffering-focused views. Intervening to reduce some global catastrophes (including but not limited to s-risks), which if averted would not significantly increase the probability of space colonization, appears more robustly good than working on ERR in hopes of flow-through effects on GCRs.
Finally, we have seen that moral uncertainty, both within consequentialism and between consequentialism and non-consequentialism, provides less support for +ERR than has been asserted in longtermist discourse, and indeed plausibly speaks against it relative to the starting point of most EA longtermists.
This essay has benefited significantly from comments and suggestions by Michael Aird, Tobias Baumann, Simon Knutsson, Brian Tomasik, Stefan Torges, Magnus Vinding, and Jonas Vollmer. This does not imply their endorsement of all of my claims.
Appendix: Moral Uncertainty and Totalism
The following considerations are not a response to Brauner and Grosse-Holz’s essay, but they are relevant to our overall assessment of the case for +ERR.
Based on Greaves and Ord (2017), I anticipate the following argument: “If you adopt the ‘Maximize Expected Choiceworthiness’ (MEC) approach to moral uncertainty, totalism dominates the moral calculation even if your credence in it is very low, for the large population sizes anticipated in the far future.”
Taking “totalism” to refer generally to axiologies that evaluate the sum of value minus disvalue over sentient beings, this is true. However, totalism in this sense is entirely consistent with suffering-focused views on which positive “value” either simply does not exist (Schopenhauer, 1851) or is lexically less important than the disvalue obtained by sufficiently extreme negative experiences (Mayerfeld, 1999, p. 178; Vinding, 2020, ch. 4). Alternatively, some weaker suffering-focused views hold that additional happy lives have diminishing marginal value, while the disvalue of additional bad lives is linear (Hurka, 1983; Parfit, 1984, ch. 17). Such views may imply that it is important to create happy lives on Earth, if happy lives dominate on that scale, while the overwhelming priority in the context of astronomical stakes might be to avoid creating a vast number of miserable lives.
It would thus be an equivocation to use the totalist objection above as a defense of symmetric totalism, which is how this term is typically (and in my opinion, regrettably) used. (To be clear, Greaves and Ord (2017) are not there attempting to specifically defend symmetric totalism against suffering-focused totalism; I am just anticipating a possible way their argument could be misunderstood and misused.)
Consider the choice between worlds W1 and W2 in the Very Repugnant Conclusion discussed in section 1.1. Is there less “at stake” here to the proponent of a suffering-focused axiology than to a proponent of symmetry? It appears not—world W2 would be nothing short of extremely horrific to the former, and there does not seem to be any reason to think the extent to which suffering-focused views regard it as horrific, compared with W1, is less than that to which symmetric views regard it as fantastic. Arguments of the form given in section 4 of Greaves and Ord (2017) do not push against suffering-focused axiology, because the disvalue of W2 grows without bound as W2 grows, which is precisely the behavior of the “Total View” from which they derive its dominance in moral uncertainty calculations:
For sufficiently large such populations, much as for the Extinction Risk scenario, the Total View favors settlement over non-settlement, and does so by an amount that increases without bound as n [the number of space settlements] increases. In contrast, while various other views favor non-settlement over settlement, they do so by at most an amount that remains bounded as n goes to infinity [my note: this is not true for suffering-focused totalist views]. Therefore, in the limit n → ∞, the Total View overpowers the rival theories that we are considering.
Indeed, Chapter 8 of MacAskill, Bykvist, and Ord (2020) briefly notes that, all else equal, MEC recommends acting as if endorsing at least a slight asymmetry:
According to some plausible moral views, the alleviation of suffering is more important, morally, than the promotion of happiness. According to other plausible moral views (such as classical utilitarianism), the alleviation of suffering is equally as important, morally, as the promotion of happiness. But there is no reasonable moral view on which the alleviation of suffering is less important than the promotion of happiness.
This consideration appears especially strong with regard to future people. The Asymmetry in population ethics—“while we ought to avoid creating additional bad lives, there is no requirement to create additional good ones” (Thomas, 2019)—is a rather widespread intuition, even if totalist suffering-focused axiologies are relatively uncommon. In particular, there is evidently much wider agreement that some lives can be so bad that we have strong moral reasons to avoid creating them, than that some lives are so good that we have strong moral reasons to create them (Vinding, 2020, 1.5).
The precise implications depend, still, on one’s credences in suffering-focused versus symmetric views. Most consequentialists of any sort would not endorse the conclusion that MEC, in combination with nonzero credence in absolute deontological restrictions, implies dominance of such restrictions in all our moral deliberations. Analogously, we cannot conclude from the above arguments that reducing extreme suffering is the top priority regardless of how small one’s nonzero credence in strongly suffering-focused axiologies is. However, it is at least not apparent that moral uncertainty considerations favor +ERR on the grounds of the astronomical stakes of symmetric consequentialist views.
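The dependence on credences can be made concrete with a toy calculation. The sketch below is not taken from Greaves and Ord (2017) or from this essay: the three choiceworthiness functions, their magnitudes, and the credences are all hypothetical placeholders, and the calculation assumes that choiceworthiness is comparable across theories on a common scale, which is itself contested. It only illustrates the structural point that a view whose verdict grows without bound in n eventually outweighs bounded views, and that a suffering-focused totalist view is unbounded in exactly the same way.

```python
from typing import Callable, Dict

# Choiceworthiness of "settle n sites" relative to "do not settle", per view.
# All functional forms and magnitudes are hypothetical, for illustration only.

def symmetric_total(n: float) -> float:
    """Favors settlement by an amount that grows without bound in n."""
    return 1.0 * n

def suffering_focused_total(n: float) -> float:
    """Opposes settlement by an amount that also grows without bound in n."""
    return -1.0 * n

def bounded_view(n: float) -> float:
    """Opposes settlement, but never by more than 10 units however large n is."""
    return -10.0 * n / (n + 1.0)

VIEWS: Dict[str, Callable[[float], float]] = {
    "symmetric_total": symmetric_total,
    "suffering_focused_total": suffering_focused_total,
    "bounded_view": bounded_view,
}

def expected_choiceworthiness(n: float, credences: Dict[str, float]) -> float:
    """Credence-weighted sum of each view's verdict on settling n sites (MEC)."""
    return sum(weight * VIEWS[name](n) for name, weight in credences.items())

# Hypothetical credences: only bounded rivals to the symmetric total view.
WITHOUT_SF = {"symmetric_total": 0.1, "bounded_view": 0.9}
# Hypothetical credences: some weight on a suffering-focused totalist view.
WITH_SF = {
    "symmetric_total": 0.1,
    "suffering_focused_total": 0.2,
    "bounded_view": 0.7,
}

if __name__ == "__main__":
    for n in (1, 1_000, 1_000_000):
        print(
            n,
            expected_choiceworthiness(n, WITHOUT_SF),
            expected_choiceworthiness(n, WITH_SF),
        )
```

With only bounded rivals, the symmetric total view dominates once n is large, even at low credence. Once a suffering-focused totalist view is included, the sign for large n is fixed by the credence-weighted per-settlement magnitudes of the two unbounded views rather than by n, so the astronomical-stakes argument alone does not settle the question.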
Arrhenius, G. Future Generations: A Challenge for Moral Theory (PhD thesis). Uppsala University. 2000.
Arrhenius, G. “The Very Repugnant Conclusion.” In Krister Segerberg & Ryszard Sliwinski (eds.), Logic, Law, Morality: Thirteen Essays in Practical Philosophy in Honour of Lennart Åqvist, 2003.
Beckstead, N. On the overwhelming importance of shaping the far future (PhD thesis). Rutgers University. 2013.
Beckstead, N., Thomas, T. “A paradox for tiny probabilities and enormous values.” Working paper, 2020.
Bostrom, N. “Astronomical Waste: The Opportunity Cost of Delayed Technological Development.” Utilitas, vol. 15, 2003.
Bostrom, N. “The future of human evolution.” Bedeutung, 2009.
Budolfson, M., Spears, D. “Why the Repugnant Conclusion is Inescapable.” Working paper, 2018.
Greaves, H. “Population axiology.” Philosophy Compass, vol. 12, 2017.
Greaves, H., Ord, T. “Moral Uncertainty About Population Axiology.” Journal of Ethics and Social Philosophy, vol. 12 no. 2, 2017.
Hanson, R. The Age of Em: Work, Love and Life when Robots Rule the Earth. Oxford: Oxford University Press. 2016.
Henrich, J. The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous. Farrar, Straus and Giroux. 2020.
Hurka, T. “Value and Population Size.” Ethics, vol. 93, 1983.
Hurka, T. “Asymmetries in Value.” Noûs, vol. 44, 2010.
Kagan, S. “The Costs of Transitivity: Thoughts on Larry Temkin’s Rethinking the Good.” Journal of Moral Philosophy, vol. 12, 2015.
Klocksiem, J. “How to accept the transitivity of better than.” Philosophical Studies, vol. 173, 2016.
MacAskill, W., Bykvist, K., Ord, T. Moral Uncertainty. 2020.
MacAskill, W., Mogensen, A. “The paralysis argument.” Working paper, 2019.
Mayerfeld, J. Suffering and Moral Responsibility. Oxford: Oxford University Press. 1999.
Moen, O. M. “An Argument for Hedonism.” The Journal of Value Inquiry, vol. 50, 2016.
Ohlsson, R. The Moral Import of Evil: On Counterbalancing Death, Suffering, and Degradation (PhD thesis). Stockholm University. 1979.
Ord, T. The Precipice: Existential Risk and the Future of Humanity. New York: Hachette Books. 2020.
Parfit, D. Reasons and Persons. Oxford: Clarendon Press. 1984.
Ponthière, G. “Utilitarian population ethics: a survey.” CREPP Working Papers 0303, 2003.
Ryder, R. D. “Painism Versus Utilitarianism.” Think, vol. 8, 2009.
Schopenhauer, A. Studies in Pessimism. 1851.
Sinhababu, N. “The Epistemic Argument for Hedonism.” Working paper, 2010.
Sotala, K., Gloor, L. “Superintelligence as a Cause or Cure for Risks of Astronomical Suffering.” Informatica, vol. 41, 2017.
Temkin, L. “A Continuum Argument for Intransitivity.” Philosophy and Public Affairs, vol. 25, 1996.
Thomas, T. “The asymmetry, uncertainty, and the long term.” Working paper, 2019.
Vinding, M. Suffering-Focused Ethics: Defense and Implications. 2020.
Wolf, C. “Person-Affecting Utilitarianism and Population Policy.” In Heller, J. & Fotion, N. (eds.), Contingent Future Persons. Dordrecht: Kluwer Academic Publishers. 1997.
Wrangham, R. The Goodness Paradox. New York: Vintage Books. 2019.
For example, see this post in EA Hangout, as well as this quote from Cotra on the 80,000 Hours Podcast: “I think I would characterise the longtermist camp as the camp that wants to go all the way with buying into the total view — which says that creating new people is good — and then take that to its logical conclusion, which says that bigger worlds are better, bigger worlds full of people living happy lives are better — and then take that to its logical conclusion, which basically says that because the potential for really huge populations is so much greater in the future — particularly with the opportunity for space colonisation — we should focus almost all of our energies on preserving the option of having that large future. So, we should be focusing on reducing existential risks.” Although existential risks are a broader category than extinction risks, the claim that we should prioritize “preserving the option of having that large future” is particularly extinction-focused (among the set of existential risks). See this comment that makes a similar critique, and Cotra’s clarification that she does not endorse equating longtermism with reducing extinction risks. ↩︎
One of the headings of Brauner and Grosse-Holz’s conclusion makes a stronger claim than the title of their post suggests: “The expected value of efforts to reduce the risk of human extinction (from non-AI causes) seems robustly positive.” [emphasis mine] ↩︎
That is, the view that there is one source of intrinsic value and disvalue. This is in contrast to the view that value is complex, discussed in the next paragraph. ↩︎
Practically, several interventions aimed at ERR may have indirect effects of increasing the quality of future lives as well. However, I find it unlikely that these interventions are the most efficient methods of achieving such positive effects; Sections 2.5 and 2.6 discuss this further. ↩︎
One could maintain that intrinsically positive welfare does not exist, for example. ↩︎
It is not straightforward how to quantify the sum of positive and negative emotions. For prioritization, quantifying amounts of a given valuable or disvaluable object in the usual consequentialist fashion is reasonable, but this does not imply a clear procedure for aggregating different objects. ↩︎
Transitivity has received criticism independently of defenses of person-affecting views or averagism (Temkin, 1996; Kagan, 2015), though it seems extremely difficult to reject, and I assume transitivity in this piece. ↩︎
The Asymmetry is a claim about the permissibility of not creating “good” lives, which is not committed to a particular axiology of what constitutes an overall good life. ↩︎
This section focuses on population ethics, because this seems to be an area where discussion of SFE is relatively neglected, but for other defenses of SFE, see Vinding (2020), Essays on Reducing Suffering, Simon Knutsson’s writings, and this FAQ. ↩︎
Interpersonal tradeoffs of some people’s misery for others’ pleasure, added to baseline contentment, are other less abstract examples. Budolfson and Spears (2018) present a “very repugnant” modification of the Utility Monster. Ryder (2009) briefly discusses the problem of the supposedly overriding pleasure of gang rapists over the suffering of the victim. ↩︎
The following principle, titled Weak Quality Addition, is one of five propositions that are each very intuitive in isolation, yet cannot be consistently satisfied by any one population axiology: “For any population X, there is at least one perfectly equal population with very high welfare such that its addition to X is at least as good as an addition of any population with very low positive welfare to X, other things being equal” (Arrhenius, 2000). This implies (i.e., is stronger than) the rejection of the VRC. Let X be the set of the lives with arbitrarily great disvalue. Then, given the presumably uncontroversial premise that adding X to a given population makes that population worse, we have at least one perfectly equal population with very high welfare that is no worse than the combination of X and any set of lives (no matter how large) with very low positive welfare each. For those familiar with Arrhenius’ five propositions, my argument in this section is equivalent to choosing Non-Extreme Priority as the principle we should reject in the face of this impossibility result. ↩︎
To be clear, “arbitrarily” does not mean infinite value or disvalue. The Perfection Dominance Principle relies only on the existence of some finite amount of value X, and some other finite amount of disvalue Y. “Perfection” here is meant in the sense that world A has no disvalue and as much value as one would like to stipulate, not in that no more value could be added. ↩︎
This is not the formulation provided by Parfit originally, in which he explicitly says the beings in this world “never suffer.” Many suffering-focused axiologies would accept the RC under this formulation—see e.g. Wolf (1997)—which is arguably a plausible conclusion rather than a “repugnant” bullet to bite. However, in many common formulations of the RC, the distinguishing feature of these beings is that their lives are just barely worth living according to axiologies other than strongly suffering-focused ones, hence they may contain a lot of suffering as long as they also contain slightly more happiness. ↩︎
I include scare quotes to highlight that reductionism calls into question the metaphysical, and ethical, significance of beings in this sense in the first place. ↩︎
Put differently, the extent to which it seems unacceptable to impose intense suffering upon one person to create happiness for others (Parfit, 1984), i.e., interpersonal tradeoffs, may imply a similar asymmetry for intrapersonal tradeoffs. ↩︎
I call an axiology “aggregationist” if it holds that for any (dis)valuable objects of absolute intensity X and x, with x << X, there always exists some N such that the (dis)value of N x-intensity objects exceeds that of one X-intensity object. ↩︎
An earlier iteration of this argument, albeit with a different conclusion, is in Temkin (1996). ↩︎
Another critique of lexicality is that it is extremely counterintuitive to prefer a mediocre world to one that is almost utopian but with an arbitrarily small probability of a lexically bad experience. As remarked in Beckstead and Thomas (2020), however, small-probability swamping is a problem for non-lexical axiologies and decision theories as well. For example, between world W1’ with no suffering and arbitrarily many arbitrarily positive lives, and world W2’ with arbitrarily many lives of pure agony plus an arbitrarily small probability of some brief super-positive experience, there is an intensity of the latter such that total symmetric axiology prefers world W2’. This is arguably even worse than the VRC, although less practically relevant. Infinite ethics poses similar problems for any totalist axiology in combination with orthodox expected value decision-making. ↩︎
Note, however, that “scope neglect” with respect to certain non-hedonic values seems entirely defensible. For example, suppose that we could create 10<sup>100</sup> beautiful planets in a corner of the universe that will never contain sentient observers of such beauty, at the cost of torturing someone for an hour. It does not seem implausible to hold that despite the astronomical scale of beauty, the trade is not worth it. ↩︎
The implicit Bayesian argument is that, even in a world in which (a)symmetric intuitions were those we would endorse upon careful philosophical reflection, these biases suggest that the fact that people hold opposite intuitions strongly is not very surprising. Thus the intuitions provide only weak evidence in favor of the purported conclusions. Other biases in favor of suffering-focused views include: (1) There is an empirical asymmetry of the depth of suffering versus happiness experienced by humans thus far, which means that we may be discounting the moral value of the heights of happiness that could be achieved with posthuman technology. (2) Some apparently suffering-focused responses to thought experiments may be motivated by an act vs. omission distinction. For example, having a child that one expects to have a life with more misery than joy is an act, while refraining from having a child expected to have a life with more joy than misery is a (mere) omission, and so we may underrate the badness of the latter compared with the former. ↩︎
While this essay is not advocating for voluntary human extinction, it is notable that a recent video discussing this option very sympathetically has been well-received by tens of thousands of viewers, even though the creator’s channel is not targeted at pessimists. This is at least weak evidence that the idea that voluntary extinction could be net positive is not beyond the pale among the general public, and not only among strong negative utilitarians. ↩︎
As moving as this passage is, it underestimates the horror present in futures with far more than just one miserable being. ↩︎
See the epigraph of Wolf (1997) for a similar sentiment. ↩︎
Unfortunately, though relatively low, this rate is not absolutely negligible, as about 800,000 people die by suicide annually, rendering it the 15th leading cause of death globally for people of all ages. For people aged 15 to 49, it is the sixth leading cause of death. ↩︎
The counterargument provided by Brauner and Grosse-Holz is: “One could claim that this just shows that people are afraid of dying or don’t commit suicide for other reasons, but people that suffer from depression have lifetime suicide rates of 2-15%, 10-25 times higher than general population. This at least indicates that suicide rates increase if quality of life decreases.” But rejecting their original argument does not require believing that suicide rates are insensitive to quality of life. The hypothesis is that biological and social pressures, independent of experienced welfare, can considerably raise the threshold of disvalue above which a person’s egoistic preference to cease existing outweighs these other factors in their decision-making. This is entirely consistent with higher suicide rates among people who experience more disvalue, or who feel more hopeless. ↩︎
These may also be confounded by sampling, for example, not proportionately reporting the experiences of those in prisons or psychiatric institutions. The main source for Our World in Data’s life satisfaction scores, the Gallup World Poll, notes in its methodology section: “The target population is the entire civilian, non-institutionalized, aged 15 and older population.” To the extent that we expect people without telephone access to be among the worst off in developed nations, there is also potential confounding based on this note: “Gallup uses telephone surveys in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology.” By contrast, it seems less likely that the upper extremes of happiness would be excluded by such biases. ↩︎
Due to habitat reduction, it is plausible that net suffering still decreased, though this is highly uncertain. ↩︎
Interestingly, there is already some evidence of a reversal of this trend, though perhaps not enough to confidently reject the conventional wisdom among wild animal welfare researchers. Rich countries have seen “reforestation” in recent decades. See also this discussion of the effect of anthropogenic CO2 emissions. ↩︎
Compare with the 10<sup>13</sup> stars in the Virgo Supercluster considered in Bostrom (2003). While in context Bostrom is considering the prospect of harvesting energy from these stars, rather than settling planets orbiting them, this gives a reasonable sense of how conservative the figure of a million is (for an event with probability 0.0001%), on the scale of space colonization scenarios. ↩︎
From non-suffering-focused perspectives, however, this option is far from optimal—conditional on our descendants coordinating this well on AI development, a future populated by them would probably be very good on such views. It is also technically possible that ERR is a practical prerequisite to reaching the level of aligned AI development necessary to implement this option. ↩︎
An antirealist case can be made for this perspective, despite the phrasing here. One can hold that we are generally confused about our preferences over futures, especially considering the difficulties of population ethics and decision theoretic puzzles, and so further philosophical work and empirical information could help us make better moral decisions even if there are no moral facts. ↩︎
“It seems likely that future agents would probably surpass our current level of empirical understanding, rationality, and coordination, and in a considerable fraction of possible futures they might also do better on values and non-selfishness. However, we should note that to actually not colonize space, they would have to surpass a certain threshold in all of these fields, which may be quite high. Thus, a little bit of progress doesn’t help—option value is only created in deferring the decision to future agents if they surpass this threshold.” ↩︎
“It may be argued that [the long reflection] would be warranted before deciding whether to undertake an irreversible decision of immense importance, such as whether to attempt spreading to the stars.” [emphasis mine] ↩︎
Further, if technological progress results in an explosion in the population of artificially sentient beings, this may outpace the increase in wealth, and the future could contain a large class of relatively poor worker “ems” (Hanson, 2016). ↩︎
Potential counterexamples to this correlation would include the propagation of traditions and legacies for their own sake. The desire to have a lasting imprint on the world, not necessarily altruistic, seems fairly common. ↩︎
I base this claim on anecdotal discussions and on reading various related LessWrong posts. ↩︎
In context, Hurka is providing an intuitive argument for views promoted by G.E. Moore and Jamie Mayerfeld, not necessarily endorsing this argument himself. ↩︎
Arguably, Hurka’s phrasing of “mindless” for pleasure but not pain slightly stacks the deck against pleasure in this thought experiment. Still, the intuition that mindless pain can be intensely bad seems widespread. ↩︎
Plausibly, even happiness is subject to a milder form of this principle (Vinding, 2020, 1.3). Under current human psychology, one needs many factors in place to be happy, although this seems likely to be less of a problem for uploaded minds/ems. ↩︎
In general, our uncertainty about the ease of value optimization also suggests a less extreme ratio in either direction. Admittedly, this argument also makes non-ERR longtermist interventions less promising. ↩︎
The strength of this argument depends on the details of the particular complexity of value view in question. If one holds that simple value and simple disvalue “cancel out” in the sense that classical utilitarianism holds is the case for all happiness and suffering, even if simple value is highly suboptimal, then the argument is less persuasive. It still seems to temper the force of the original argument that optimized value in the future will dominate. ↩︎
What if the reader is not sympathetic to the complexity of value thesis, and instead favors hedonistic utilitarianism? (This is not to say these are the only two options.) While this particular argument would be less compelling, though not entirely deflated as footnote 46 argues, there is a prima facie argument that strong axiological asymmetries should seem especially plausible to those sympathetic to a hedonistic view. This is because the hedonistic utilitarian already holds that most people are systematically mistaken about the intrinsic value of non-hedonic goods. The fact that people report sincerely valuing things other than happiness and the absence of suffering, even when it is argued to them that such values could just be a conflation of intrinsic with instrumental value, often gives little pause to hedonistic utilitarians. But this is precisely the position a strongly suffering-focused utilitarian is in, relative to symmetric hedonists. That is, although this consideration is not decisive, a symmetric hedonist should not be convinced that suffering-focused views are untenable due to their immediate intuition or perception that happiness is valuable independent of relief of suffering. They would need to offer an argument for why happiness is indeed intrinsically valuable, despite the presence of similar debunking explanations for this inference as for non-hedonic goods. Note that the “complexity” of weakly suffering-focused views—which hold that happiness has moral value but is required in larger quantities to outweigh the disvalue of suffering—is a disadvantage of such views for those who are hedonistic/monist utilitarians partly out of a metaethical preference for simplicity. ↩︎
Theirs is 30%. ↩︎
Brauner and Grosse-Holz include a similar caveat in their conclusion of Part 1.1. ↩︎
That is, aggression that is deliberate and planned, rather than a reaction to provocation. A significant part of Wrangham’s thesis is that our relatively high proactive aggression was facilitated by the invention of language, which allowed our ancestors to conspire to execute antisocial and reactively aggressive individuals. If aliens tend to convergently evolve along similar lines, proactive aggression would not constitute strong evidence against the quality of human civilizations, but by the same token we should also expect aliens to be low in reactive aggression; hence, the latter would not be strong evidence in humans’ favor, either. ↩︎
It does provide a moderate update against +ERR relative to the prior view that impartial values are a sheer accident, or a position of complete ignorance of the sources of impartiality. I would not weigh this point too heavily, regardless, because Henrich’s thesis may also update us in favor of +ERR insofar as it predicts that the long-term future will have relatively little chaotic value drift away from impartial altruism. See also these notes on Henrich (2020). ↩︎
We can roughly operationalize this as “badness per unit of spacetime influenced by the given civilization.” ↩︎
Brauner and Grosse-Holz acknowledge that this particular estimate is not intended to be robust, but it is a useful model to formalize their argument in favor of +ERR vis-a-vis cosmic rescues. ↩︎
In which case their second Fermi factor should be adjusted. ↩︎
At least the subset of AI alignment that focuses on avoiding extinction. One could prioritize aspects of AI alignment (or safety more broadly) that would differentially increase the quality of future lives, rather than the probability of such lives existing. ↩︎
Plausibly, however, the only two attractor states in the future could be extinction and technological mastery (cf. the technological completion conjecture). I also say only “slightly” due to arguments that technological civilizations are very likely to recover. ↩︎
There are important distinctions between promoting cooperation among humans versus among AIs, the latter of which receives significant attention in CLR’s research agenda. However, it does seem that implementation of safeguards against s-risks in AI designs will require some coordination by AI developers, as well as governments. ↩︎
Specifically, they summarize two arguments: (1) if we expect posthumans to engage in extensive moral reflection, they would tend to make better decisions than we would given our current moral uncertainty, and (2) we are likely currently unaware of “unknown unknowns” about moral (dis)value that exists throughout the universe. ↩︎
This, again, is not a central concern in Brauner and Grosse-Holz’s essay, so the following critiques should not be taken as indictments of their reasoning. Moral uncertainty with respect to non-consequentialist views bears on the robustness of the arguments on either side, however, so it is important to include these points. ↩︎
Though see MacAskill and Mogensen (2019) for why some deontological views should consider basically any action wrong. ↩︎
See also a somewhat more elaborate formulation of this argument in Ord (2020, pp. 49-52). ↩︎
I do not think that accounting for relative plausibility would clearly push towards or against +ERR; this is just a general caveat. ↩︎
As opposed to a majority vote within one’s “moral parliament.” ↩︎
For instance, see Budolfson and Spears (2018), which claims that the VRC is inescapable for “all leading welfarist axiologies,” but excludes suffering-focused axiologies from this list (even though intuitions in favor of such axiologies are arguably more widespread than, e.g., Ng’s Theory X’). ↩︎
Note that this evaluation does not require the belief that no suffering can be compensated by sufficiently great flourishing. The conclusion follows for views on which only very intense physical or mental pains cannot be compensated, either in principle or by practically foreseeable degrees of happiness, as well as views on which added happiness has diminishing returns (see the Appendix). ↩︎
These value systems would increase the total amount of many goods as well. ↩︎
See also MacAskill on the 80,000 Hours podcast: “Because there’s just so much potential value at stake, even if you don’t find population ethical views like the total view or other views that think that it’s good to bring happy people into existence and bad to bring very unhappy people into existence. Even if you don’t find them that plausible, because the stakes are so high you should really focus on that.” I do not claim to know that MacAskill would endorse the argument as I have stated it, but this quote is premised on the same potential conflation of totalism with symmetric totalism. ↩︎
This view is admittedly subject to the objection of violating separability. See also Beckstead (2013, ch. 5). ↩︎
Karnofsky has also expressed sympathy for such a view: “So one crazy analogy to how my morality might turn out to work, and the big point here is I don’t know how my morality works, is we have a painting and the painting is very beautiful. There is some crap on the painting. Would I like the crap cleaned up? Yes, very much. That’s like the suffering that’s in the world today. Then there is making more of the painting: that’s just a strange function. My utility with the size of the painting, it’s just like a strange and complicated function. It may go up in any kind of reasonable term that I can actually foresee, but flatten out, at some point. So to see the world as like a painting and my utility of it is that, I think that is somewhat of an analogy to how my morality may work, that it’s not like there is this linear multiplier and the multiplier is one thing or another thing. It’s: starting to talk about billions of future generations is just like going so far outside of where my morality has ever been stress-tested. I don’t know how it would respond. I actually suspect that it would flatten out the same way as with the painting.” ↩︎
Indeed, perhaps the opposite, if the suffering-focused view includes disvalue lexicality. It is not clear what the “right” way to make intertheoretic comparisons is when one view features lexicality and another does not, though. ↩︎