On longtermism, Bayesianism, and the doomsday argument

TL;DR: I show that a certain very strong class of longtermist argument that projects utilitarian calculations over extremely large numbers (10^many) of affectable humans contains some mathematical/​Bayesian errors. I also outline a couple other gripes I have with the current longtermist mindset. My overall conclusion is that this type of argument — including my critique — is very unsteady and vulnerable to minor misapplications or misunderstandings of logic, which are very likely to be present. So I advocate for a more common-sense and epistemically modest application of longtermism and morality.

I’ve been asked to write down this critique, which I explained at some point to a friend in EA. I will write it down to the best of my ability. I apologize for any instances of reinventing the wheel or misreading social phenomena in EA: this post is very much written by an EA outsider. Also see comment #8 in Scott’s “Highlights from the What We Owe the Future” post (search for Hari Seldon), which gives a short version of the argument I’m making here. Both Scott’s post and the book “What We Owe the Future” came out while this post was in my drafts folder, so I’d like to flag that there might be a large discussion of this topic since that book’s publication (or, indeed, before it) which I’m not aware of.

The naive longtermist hypothesis (and two reasons I don’t like it)

I love the idea of thinking seriously about other people, including future people, and insofar as longtermism creates a way to indicate such concern and coordinate around it, I am very pro longtermism. But I dislike a kind of fundamentalism that I see associated with the philosophy. Specifically, there is a dogma that I dislike that unfortunately seems to have become inseparable from the good parts of longtermism in certain parts of EA.

The dogmatic position I have observed is contingent on something like the following argument, which I will call:

The naive longtermist hypothesis:

Future people have just as much moral value as present people, and because the number of future people is essentially unlimited in expectation, the overwhelming bulk of moral weight lies on future people.

In the next section, I will give a more in-depth explanation of the naive longtermist hypothesis as I understand it and a steel-manned version of what the argument is and how it is logically deduced from “first principles” by those who believe it.

But first let me briefly explain why this argument, as it is currently used by those who hold it strongly, bothers me. One of the reasons has to do with how people talk about it/​use it in the EA community, and the other has to do with the actual content. In brief, here are the two reasons. Most of the post will focus on the second.

  1. The first reason I feel uncomfortable with how this argument is sometimes framed is that it lends itself easily to a motte-and-bailey fallacy which results in discussions around it becoming tribal, ideological, and divisive. The motte that I see is “future people matter” — ostensibly a simple argument that follows from standard moral assumptions. The bailey, however, is far more expansive: “nearly all the expected value of your actions lies in how you think it will affect the long-term future, and if you can’t make a case for how what you’re working on does that, it can be rounded down to 0.” The result is that some people will say “future people matter,” but mean the extreme version. People who are uneasy with these more extreme consequences are dismissed as irrational or immoral, as if they disagree with the basic statement (the motte). The extreme version, in this way, becomes increasingly morally incontrovertible.

    In this way, the use of this argument is starting to remind me, at least in some contexts, of similarly ambiguous dogmas in religion and politics, which eventually turn into purity tests or into semantic cudgels to police certain kinds of nuanced doubt (and this is if you’re lucky; if you’re less lucky, you get holy war).

  2. The second reason I dislike the argument is that I think it is false, or at least follows from what I believe are bad assumptions in a Bayesian framework.

    The naive longtermist hypothesis is, I think, one of those statements that is only a little bit false. Most of the specific and actionable consequences of this statement are correct. Should you work on that term paper instead of playing Starcraft to help future you at the expense of present you? Yeah, probably. Is it worthwhile to work on mitigating an existential risk even if you believe it will only affect your great-grandchildren? Yes, it almost certainly is. Not to mention that I subscribe to the view that many existential risks that longtermism cares about, like AI risk and populist uprisings, are actually relevant to many, if not most people alive today (see for example).

    It’s one of those things, like Newtonian mechanics and the wrongness of theft, that are technically broken, but only break in extreme and unusual circumstances that are irrelevant for most practical purposes. But I sometimes see EAs applying this moral principle in precisely such extreme, weird hypothetical scenarios, and then prescribing weird courses of action in the present — or at least giving the impression that they would bite this bullet. Thus I think it is relevant that this maxim is mostly true but can’t be exactly and universally true.

The critique of the argument

I am going to start by writing down some assumptions that I think people who subscribe to the extreme (“bailey”) version of this moral tenet believe, and show how they lead to a contradiction (or at least a very improbable conclusion) using Bayes’ theorem. The core cause of the contradiction is a utilitarian version of the Doomsday Argument (also known as the “Carter catastrophe” argument), which I explain below.

I will later outline how some of the assumptions can be weakened to avoid this specific critique.

Assumption 1. A year of life of a being is valuable insofar as it is sentient (or intelligent, human-like, etc. - I will use “sentient” as a catch-all term for your favorite morally relevant category).

Assumption 2. Future people are sentient.

Assumption 3. The morality of an action is determined by its expected effect on the well-being of sentient beings in our universe (whatever that means — physical nuance is not part of the present critique).

Assumption 4. There is a non-negligible probability that there will be a continuous civilization of >>10^50 sentient being life-years. (E.g. if humanity finds a stable advanced mode of existence and expands to the known universe, even very conservative estimates for the duration of the universe before heat death give easily over 10^50 life-years, and many orders of magnitude more if we assume simulated humans or a universe significantly larger than what we observe. For this argument to be valid, you don’t need to assume there is a high probability of humanity surviving, just a non-negligible probability.)

Note that continuity here matters, as longtermists care about affectable beings in the future (those whose lives we can improve), and if there’s a significant discontinuity (total loss of contact or history, etc.), then affectability seems to disappear.
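For a rough sense of scale behind the 10^50 figure in Assumption 4 (the two factors below are arbitrary illustrative placeholders of my own; only their product matters):

$$10^{16}\ \text{people alive at a time} \;\times\; 10^{34}\ \text{years of continuous civilization} \;=\; 10^{50}\ \text{life-years},$$

where 10^34 years is still far short of typical heat-death timescales, and simulated minds or a much larger universe would add many orders of magnitude on top, as the parenthetical in Assumption 4 notes.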

Assumption 5. You (the reader) are sentient right now.

Assumption 6. There is a non-negligible (>1/10^30) probability that some of our assumptions are wrong. (For example, Gödel’s second incompleteness theorem implies that we cannot prove, from within our own system of reasoning, that it is consistent, and I think most logicians who have thought about this would assign a >1/10^20 probability that even very basic logical deductions are way off.)

I think most people who accept the naive longtermist hypothesis will agree with versions of all of these assumptions.

But if so, we get a statistical contradiction (or at least an extreme improbability) equivalent to the Doomsday Argument, as I outline below and summarize here. In brief: we live in a civilization with on the order of 10^11 people so far, so <10^13 sentient (and thus morally relevant) life-years. But a randomly chosen sentient being will live somewhere close to the middle of the >>10^50 sentient life-years of total civilizational experience, and will know that >>10^50 life-years of experience came before them. Thus, the probability of you being within the first 10^13 life-years of sentient experience is much less than 10^-37: a near-impossibility, which implies that almost surely one of the above assumptions is wrong. This argument is described in greater detail below, or you can just skip to the section outlining how we could relax some of the assumptions to avoid a contradiction.
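Spelling out the arithmetic behind that bound: if, per the assumptions, your moment of experience is roughly a uniform draw from the $T \gg 10^{50}$ total life-years of sentient experience, then

$$P\big(\text{your draw lands in the first } 10^{13}\ \text{life-years}\big) \;\le\; \frac{10^{13}}{T} \;\ll\; \frac{10^{13}}{10^{50}} \;=\; 10^{-37}.$$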

How the Doomsday Argument applies; an outline

I’ll briefly outline the Doomsday argument, which might be familiar to many of this forum’s readers. (For a basic introduction/​discussion, you can also see this video or read the Wikipedia page, and skip the rest of this explanation.)

Suppose you have two boxes of numbered marbles (one marble is labeled “#1,” one is “#2,” etc.). One box (box A) has 100 marbles, and the other (box B) has a million marbles. Your friend picks one of the two boxes at random, without telling you which box she chose, draws a marble from it at random, and tells you the number on the marble. You find out that the number is 57.

Your prior probability on your friend picking either box was 50-50, but after hearing the number on the marble, Bayes’ theorem tells you that you should now be almost certain (with posterior odds of 10,000:1, i.e., a probability of about 99.99%) that your friend picked box A.

This is intuitive for many people without doing the math: indeed, since the number is below 100, you know that box A is at least a possibility (if you had gotten marble #101, you’d know your friend picked box B). But if your friend had in fact chosen box B (a priori a 50% probability), the probability of drawing a number <=100 would have been only 1/10,000 — extremely improbable.
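For readers who like to see the update written out explicitly, here is a minimal Python sketch of that calculation (nothing in it goes beyond the toy example above):

```python
# Posterior for the two-box marble example via Bayes' theorem.
prior = {"A": 0.5, "B": 0.5}           # friend picks a box at random
box_size = {"A": 100, "B": 1_000_000}  # marbles are numbered 1..size
drawn = 57                             # the number we were told

# Likelihood of drawing this exact marble number from each box
# (zero if the box doesn't contain that number at all).
likelihood = {
    box: (1 / n if drawn <= n else 0.0) for box, n in box_size.items()
}

unnormalized = {box: prior[box] * likelihood[box] for box in prior}
total = sum(unnormalized.values())
posterior = {box: p / total for box, p in unnormalized.items()}

print(posterior)  # box A: ~99.99%, box B: ~0.01%
```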

The doomsday argument follows the same logic, and applies it to the fate of humanity. You replace the boxes A and B by possible hypotheses about the future of humanity. Box A can be the hypothesis that humanity is going to end in a doomsday catastrophe after just a few generations (with, say, a trillion total people). Box B can be the possibility of an extremely long-lasting galactic civilization. There can be many boxes (or hypotheses), and you don’t need to assume that they are all equally likely to be picked a priori.

The “number your friend drew from the box” is your place in the birth order of all humans. If your a priori hypotheses included civilizations with enormous numbers of humans (>10^50, say), but the number you drew is comparatively tiny (say, 116515849091 — a realistic figure close to the number of humans who have existed so far), then all the boxes with enormously more marbles than this number get updated to have vanishingly small posterior probability of having been picked. Indeed, so small that it is far more likely that one of your arguments (or logic itself) is incorrect than that you drew a number so (comparatively) small. You conclude that the only possibility is that the box your number was drawn from must also be comparatively small, and the world must end in a doomsday scenario. (Or you conclude that the logic is broken.)

Note that to make the argument more airtight, you should replace “humans” with “intelligent beings” (or “morally relevant beings”, or some comparatively universal class in the utilitarian calculus), and you should replace the “birth order” of a human by the “order in the continuum of human experience”, i.e., the total time that all humans lived up until now. However (other than a few nitpicks about infinite lifespans and the like), thinking “birth order” instead of “position in the continuum of conscious experience” will give the same general ideas.

My take is that the Doomsday Argument is a trippy argument which shouldn’t be taken too seriously (it is itself based on some serious teleological assumptions!), but it follows from the assumptions outlined above and from many naive forms of longtermist utilitarian calculus. In other words, I think the fact that the Doomsday Argument applies (and seems to contradict the naive longtermist hypothesis) points to an inherent and serious contradiction in the “naive” application of longtermism that I see.

Some interesting solutions: modifying the assumptions.

There are some standard weak (but commonly seen) objections to the Doomsday Argument which often come up when it is mentioned. See the appendix for some (in my opinion, weak) objections to the argument (as it applies here), and my responses.

In this section, I want to focus on other solutions to the contradiction above, which I think are more interesting.

Indeed, I think there are (at least) two interesting ways to weaken the chain of assumptions above to avoid the contradiction while still maintaining a similar general framework. The first is to throw out assumption 1 (that moral value tracks sentience), and posit morally relevant beings that are not part of the sentient distribution you are drawn from. The second is to throw out assumption 4 (that there is a non-negligible probability of >>10^50 life-years of continuous civilization) and to lean into the Doomsday Argument.

As it turns out, I think both of these solutions leave the naive longtermist hypothesis incorrect (or at least deeply broken), though the way the second solution breaks it is, I think, really interesting, and it gives an alternative (though almost certainly equally naive) quantification of the value of future humans to set beside the naive longtermist one.

When writing this post I also thought of a third, crazier solution, which is not directly useful but which I include as a footnote.

Interesting solution 1. The set of morally relevant beings is different from the set of sentient beings.

I.e., you can assume that your current experience is for some reason drawn from a different distribution than most morally relevant experience. This is a bit of a mind-bender, but for example, if you believe in a Jewish God, you believe in a being that is fundamentally different from humans but cares about humans. Alternatively, you might believe that animals are morally relevant but not included in the distribution of sentient experience (though this particular objection runs into the same doomsday issues, with “hours of conscious human experience” replaced by “hours of conscious human and animal experience”, possibly weighted in some way). Generally, such arguments would bend our understanding of logic and sentience, but you can also imagine that this is the case for some more pragmatic reason that has to do with the simulation hypothesis: e.g., most of the future is populated by AI minds that started out as 21st-century humans and really love to relive simulations of their youth.

If you subscribe to such a simulation theory, then you probably don’t need to do weird and extreme things to ensure the future existence of humanity, because what you are inhabiting is just a memory/​simulation.

Interesting solution 2. Leaning into the doomsday argument.

In other words, you may believe that, because our experience is selected from the entirety of human experience, and the entirety of human experience so far comprises under 10^13 life-years, there is some fundamental reason why we are near the middle of all human existence. Here “near the middle” shouldn’t be taken too literally: for example, there is roughly a 10% chance that humanity will continue existing for 10x as long as it has so far, and 10% probabilities happen.
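One standard way to make the “near the middle” intuition quantitative (this is the usual self-sampling calculation, sometimes presented as Gott’s “delta t” argument; the paragraph above gestures at it rather than committing to it): if the time $t$ elapsed so far is a uniformly random fraction of the total span $T$ of human existence, then

$$P\big(T \ge k \cdot t\big) \;=\; P\!\left(\frac{t}{T} \le \frac{1}{k}\right) \;=\; \frac{1}{k},$$

so, for $k = 10$, there is about a 10% chance that the total span of human existence is at least ten times what has elapsed so far, which is roughly the “10%” figure above.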

This is a depressing worldview, but interestingly it brings us close to what I consider the “intellectually honest” longtermist view, which is that we can realistically try to make the future better for the next few generations, but should not force ourselves to extrapolate too far into the future. If you really commit to this argument and keep all the other assumptions from the previous section, then the naive longtermist hypothesis would become something like the following:

The doomsday-adjusted longtermist hypothesis

The value of future humans is inversely proportional to how far away from us they are (as measured by the ratio of the total years of human experience accumulated by their time to the total accumulated by ours).

I don’t think this is quite the correct solution (for more on my take, see below, and I might elaborate in a later post). But it is certainly an interesting attempt to quantify the problems inherent in treating “all sentient experience” as a distribution you can perform utilitarian calculus on.
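To make the doomsday-adjusted weighting concrete, here is a minimal Python sketch (the function name and the choice to measure “distance” by cumulative life-years of experience are my own illustrative reading of the hypothesis, not something it pins down):

```python
def doomsday_adjusted_weight(cumulative_life_years_then: float,
                             cumulative_life_years_now: float = 1e13) -> float:
    """Moral weight of a person alive when civilization has accumulated
    `cumulative_life_years_then` total life-years of experience, relative
    to a person alive now (roughly 1e13 life-years accumulated so far).

    Per the doomsday-adjusted hypothesis, the weight is inversely
    proportional to how far along the continuum of experience they live.
    """
    return cumulative_life_years_now / cumulative_life_years_then

# Someone living when 1e14 life-years have accumulated (10x further along
# than us) counts for ~0.1 of a present person under this weighting.
print(doomsday_adjusted_weight(1e14))  # 0.1
```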

There is also a third solution which was the strongest counterargument to my rebuttal of naive longtermism that I could think of, but which is too weird to include here.[1]

My overall take

What, then, do I believe? Well, I think my point of view is more suitable for a separate post, but briefly, I believe the following:

Our model is incomplete.

The doomsday argument, which I argue causes the contradiction in naive longtermism, also posits a somewhat unphysical assumption that the universe has it in for us—that there is some physical reason, beyond human agency, that humanity cannot exist stably for much longer. I don’t buy it. Physics (which I kind of study) has gone through multiple periods where people thought that our universe was fully understood except for a few minor issues. Some physicists posited weird or extreme solutions to force those issues into a framework that conformed with contemporary understanding (there are many examples of this much more extreme than the aether theory). Other physicists more modestly admitted that the current theory was incomplete and probably failed to give accurate predictions in extreme scenarios or beyond a certain level of precision. So far, the more intellectually modest physicists have been correct in almost every single case.

In a similar way, our understanding of sentience and sentient experience, of possible futures for humanity and the universe, and of consistent theories of morality is incomplete, often overwhelmingly so.

So I’d like to advocate for a similar kind of modesty in your approach to longtermism.

If you are working on solving a problem that threatens your grandchildren and your solution might incidentally lead to ultra-long-term human flourishing, then by all means carry on! You are a longtermist in the sense that I approve of.

But if you are someone who, for some reason, has realized that the only way to ensure the possibility of near-infinite human flourishing is to kill 90% of currently existing humans “for the greater good” or something even remotely similarly crazy by normal morality, I’d rather you didn’t call yourself a longtermist (and reconsidered your chosen course of action and philosophy).

Appendix. Some weak objections to the doomsday argument

As promised, I will explain some objections that I’ve seen to the doomsday argument (and related ideas) that I think come from a misunderstanding of the argument, and my explanations for why I think they are incorrect.

Weak objection 1. But there’s nothing special about birth order (or the order in the span of human experience)! You could choose some other arbitrary measure, like a hash of all your atom coordinates being a given string, which would identify you uniquely, and then say: “Aha! I am the only person whose atoms hash to [string of 1000 random-looking characters], so Bayes’ theorem says there shouldn’t be much more than 1 person alive in the universe.”

The way to formalize questions of the form “how surprising is it to measure something to be X” is via the “minimum description length”: roughly, a measure of how easy it would be to explain X to a friendly alien with rudimentary knowledge of our universe. If you believe that “morally relevant sentient experience” is an absolute quantity that the alien can be made to understand easily, then it is hard to argue that “total morally relevant sentient experience up until time t” would not be one of the first measures it came up with (there is some fuzziness around relativity and quantum mechanics, which is essentially irrelevant in the very small and very non-quantum corner of the universe that humans have inhabited so far). Thus “total time of human experience” (and, indeed, the closely related “birth order”) has far lower minimum description length than a hash of all your atoms.


Weak objection 2. The analogy with the two boxes of numbered marbles in the probabilistic example is inaccurate because you can’t quantify all the different options/the options aren’t equally likely/etc.

As soon as you have a few distinct classes of guesses for the future of the universe and you assign them non-infinitesimal probabilities (for example, say you figure out that there is a >0.00000001% chance that our future lies in a clearly definable class of “continuous human flourishing” civilizations, because there is a clear though unlikely path to getting everything right and avoiding AI extinction, and you can estimate the likelihood of every step in that sequence to give a lower bound on the probability), the magic of Bayes’ theorem forces you to update the probability of that class to almost zero. It essentially doesn’t matter what your priors are; the number we drew makes their relative sizes essentially irrelevant for the large-scale picture.
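Here is a quick numerical illustration of that prior-insensitivity (the three hypotheses, their population sizes, and the priors below are arbitrary placeholders of mine, chosen only to show that the conclusion barely depends on them):

```python
# Posterior over "how many sentient beings will ever exist," given an
# observed birth rank of ~1.17e11 and the assumption that your rank is a
# uniform draw from each hypothesis's total population.
birth_rank = 1.17e11

hypotheses = {                 # total beings ever, under each hypothesis
    "doom_soon": 1e12,
    "long_but_bounded": 1e15,
    "galactic": 1e50,
}

for prior_galactic in (0.5, 0.999999):   # try wildly different priors
    prior = {
        "doom_soon": (1 - prior_galactic) / 2,
        "long_but_bounded": (1 - prior_galactic) / 2,
        "galactic": prior_galactic,
    }
    # Likelihood of this exact rank is 1/N (the rank fits inside every
    # hypothesis here, so none is ruled out outright).
    likelihood = {h: (1 / n if birth_rank <= n else 0.0)
                  for h, n in hypotheses.items()}
    unnormalized = {h: prior[h] * likelihood[h] for h in hypotheses}
    total = sum(unnormalized.values())
    posterior = {h: u / total for h, u in unnormalized.items()}
    print(prior_galactic, posterior)

# Even with a 99.9999% prior on "galactic," its posterior is ~2e-32:
# the size of the Bayesian update swamps any reasonable prior.
```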

Weak objection 3. What about aliens/​sentient AI/​etc?

If you believe the central longtermist idea that a super-long continuous civilization (human, alien, AI, etc.) is possible with non-negligible, quantifiable probability, then a randomly chosen intelligence would still expect to find itself at position >>10^50 in the order of sentient beings, and the same principle makes your early existence extremely unusual. Note that there is a loophole here if there is a continuous sequence of civilizations, each of which doesn’t know about the previous ones; this turns the objection into one of the “interesting solutions” above (although you’d still need to make an argument for why these civilizations should expect to be able to affect the future ones in predictably good ways).

Thank you to Lizka for feedback and encouragement on this post.

  1. ^

    Ok, I’ll include it as a footnote, since I couldn’t stop myself. But I warn you that it is very weird, and if you’re not into this kind of stuff, you might want to skip it.

    The solution is to weaken, in an interesting way, Assumption 5: that you (the reader), as well as I (the author), are sentient right now.

    There are probably really weird (and mind-bending) ways to explain away this assumption directly, but I think a strong counterargument is to go around it, as follows. Suppose that you (the reader) are sentient, and a valid random sample from the distribution of sentient experience. But suppose at the same time that this distribution is a simulation: some mega-intelligence in a “real” universe is simulating members of randomly generated early universes, for some reason of pragmatism or curiosity. Then on the one hand, your sentient experience is far more morally relevant than that of later beings (since so many more people like you are being simulated). But on the other hand, you can argue that your actions are highly correlated with the long-term future of the “real” universe doing the simulating. So if you do things that strongly de-prioritize the ultra-long-term future, then by the logic of Newcomb’s paradox, it is likely that inhabitants of the “real” universe will do the same. In order to improve the outcome of the universe that is simulating you, which you have some kind of existential link with, you are thus incentivized to take actions with better ultra-long-term outcomes on the margin.

    If I were to quantify the moral calculus that comes out of such a thought experiment, I would guess that it assigns to future people some function intermediate between the doomsday-adjusted longtermist hypothesis above and the naive longtermist assumption that all speculative future people matter equally. But here we are getting into increasingly speculative and bizarre epistemological territory, and I will stop here.