Thanks for writing this. I think it’s very valuable to be having this discussion. Longtermism is a novel, strange, and highly demanding idea, so it merits a great deal of scrutiny. That said, I agree with the thesis and don’t currently find your objections against longtermism persuasive (although in one case I think they suggest a specific set of approaches to longtermism).
I’ll start with the expected value argument, specifically the note that probabilities here are uncertain and therefore random variables, whereas in traditional EU they’re constant. To me a charitable version of Greaves and MacAskill’s argument is that, taking the expectation over the probabilities times the outcomes, you have a large future in expectation. (What you need for the randomness of probabilities to sink longtermism is for the probabilities to correlate inversely and strongly with the size of the future.) I don’t think they’d claim the probabilities are certain.
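To make the parenthetical about correlation concrete, here is a minimal sketch with made-up numbers (the scenarios, credences, and sizes below are purely illustrative assumptions, not estimates from Greaves and MacAskill). It uses the identity E[pV] = E[p]E[V] + Cov(p, V): uncertainty about p alone leaves the expectation unchanged, and only a strongly negative covariance between p and the size of the future V drags it down.

```python
# Hypothetical scenarios: (credence in scenario, probability p of the future, size V of the future).
independent = [
    (0.25, 0.1, 1.0), (0.25, 0.1, 100.0),
    (0.25, 0.9, 1.0), (0.25, 0.9, 100.0),
]
inversely_correlated = [
    (0.5, 0.1, 100.0),  # the big future is the unlikely one
    (0.5, 0.9, 1.0),    # the small future is the likely one
]

def mean(scenarios, f):
    """Credence-weighted average of f(p, V) over the scenarios."""
    return sum(w * f(p, v) for w, p, v in scenarios)

def expected_value(scenarios):
    """E[p * V]: expectation taken over the uncertain probabilities too."""
    return mean(scenarios, lambda p, v: p * v)

def covariance(scenarios):
    """Cov(p, V) = E[pV] - E[p]E[V]."""
    e_p = mean(scenarios, lambda p, v: p)
    e_v = mean(scenarios, lambda p, v: v)
    return expected_value(scenarios) - e_p * e_v

print(expected_value(independent), covariance(independent))
print(expected_value(inversely_correlated), covariance(inversely_correlated))
```

Both sets have the same E[p] and E[V]; only the second, where p and V move in opposite directions, ends up with a much smaller expected future.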
Maybe the claim you want to make, then, is that we should treat random probabilities differently from certain probabilities, i.e. you should not “take expectations” over probabilities in the way I’ve described. The problem with this is that (a) alternatives to taking expectations over probabilities have been explored in the literature, and they have a lot of undesirable features; and (b) alternatives to taking expectations over probabilities do not necessarily reject longtermism. I’ll discuss (b) first, since it involves providing an example for (a).
(b) In economics at least, Gilboa and Schmeidler (1989) propose what’s probably the best-known alternative to EU when the probabilities are uncertain, which involves maximizing expected utility for the prior according to which utility is the lowest, sort of a meta-level risk aversion. They prove that this is the optimal decision rule under some remarkably weak assumptions. If you take this approach, it’s far from clear you’ll reject longtermism: more likely, you end up with a sort of longtermism focused on averting long-term suffering, i.e. focused on maximizing expected value according to the most pessimistic probabilities. There are a number of other approaches, but they tend to have similar flavors. So alternatives to EU may agree on longtermism and just disagree on the flavor of it.
(a) Moving away from EU leads to a lot of problems. As I’m sure you know given your technical background, EU derives from a really nice set of axioms (the Savage axioms). Things go awry when you leave it. Al-Najjar and Weinstein (2009) offer a persuasive discussion of this (H/T Phil Trammell). For example, non-EU models imply information aversion. Now, a certain sort of information aversion might make sense in the context of longtermism. In line with your Popper quote, it might make sense to avoid information about the feasibility of highly-specific future scenarios. But that’s not really the sort of information non-EU models imply aversion to. Instead, they imply aversion to information that would shift you toward an option you currently dislike only because it is ambiguous.
So I don’t think we can leave behind EU for another approach to evaluating outcomes. The problems, to me, seem to lie elsewhere. I think there are problems with the way we’re arriving at probabilities (inventing subjective ones that invite biases and failing to adequately stick to base rates, for example). I also think there might be a point to be made about having priors on unlikely conclusions so that, for example, the conclusion of strong longtermism is so strange that we should be disinclined to buy into it based on the uncertainty about probabilities feeding into the claim. But the approach itself seems right to me. I honestly spent some time looking for alternative approaches because of these last two concerns I mentioned and came away thinking that EU is the best we’ve got.
I’d note, finally, that I take the utopianism point well and would like to see more discussion of this. Utopian movements have a sordid history, and Popper is spot-on. Longtermism doesn’t have to be utopian, though. Avoiding really bad outcomes, or striving for a middling outcome, is not utopian. This seems to me to dovetail with my proposal in the last paragraph to improve our probability estimates. Sticking carefully to base rates and things we have some idea about seems to be a good way to avoid utopianism and its pitfalls. So I’d suggest a form of longtermism that is humble about what we know and strives to get the least-bad empirical data possible, but I still think longtermism comes out on top.
However, I think a weakness in Gustafsson’s sequential dominance argument is that it assumes the probabilities exist and are constant throughout, but by assumption they aren’t defined in the Ellsberg paradox. In particular, looking at the figure for case 1, the argument assumes p is the same when you start at the first random node as it is looking forward when you’re at one of the two choice nodes, 1 or 2. In some sense, this is true, since the colours of the balls don’t change in between, but you don’t have a subjective estimate of p by assumption and “unknown probability” is a contradiction in terms for a Bayesian. (These are notes I took when I read the paper a while ago, so I hope they make sense! :P.)
Another weakness is that I think these kinds of sequential lotteries are usually only relevant in choices where an agent is working against you or trying to get something from you (e.g. money for their charity!), which also happen to be the cases where ambiguity aversion is most useful. You can’t set up such a sequential lottery for something like the degree of insect consciousness, P vs NP, or whether the sun will rise tomorrow.
On the expected value argument, are you referring to this?
“The answer I think lies in an oft-overlooked fact about expected values: that while probabilities are random variables, expectations are not. Therefore there are no uncertainties associated with predictions made in expectation. Adding the magic words “in expectation” allows longtermists to make predictions about the future confidently and with absolute certainty.”
Based on the link to the wiki page for random variables, I think Vaden didn’t mean that the probabilities themselves follow some distributions, but was rather just identifying probability distributions with the random variables they represent, i.e., given any probability distribution, there’s a random variable distributed according to it.
However, I do think his point leads us to want to entertain multiple probability distributions.
If you did have probabilities over your outcome probabilities or aggregate utilities, I’d think you could just take iterated expectations. If U is the aggregate utility, U∼p and p∼q, then you’d just take the expected value of p with respect to q first, and calculate:
$$\mathbb{E}_{U \sim \mathbb{E}_q[p]}[U]$$
If the dependence is more complicated (you talk about correlations), you might use (something similar to) the law of total expectation.
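As a small numerical check of the iterated-expectations idea above, here is a sketch with invented numbers (the two candidate distributions and their q-weights are assumptions for illustration only). It computes the expected utility two ways: under the averaged distribution E_q[p], and by averaging the per-candidate expectations as in the law of total expectation.

```python
# Hypothetical second-order setup: q puts credence on two candidate distributions p over utility U.
candidates = [
    (0.7, {0.0: 0.5, 10.0: 0.5}),  # (q-weight, p given as {utility: probability})
    (0.3, {0.0: 0.9, 10.0: 0.1}),
]

def expectation(dist):
    """Expected utility under a single distribution."""
    return sum(u * pr for u, pr in dist.items())

def mean_distribution(candidates):
    """E_q[p]: the q-weighted average of the candidate distributions."""
    mixed = {}
    for weight, dist in candidates:
        for u, pr in dist.items():
            mixed[u] = mixed.get(u, 0.0) + weight * pr
    return mixed

# Route 1: expectation of U under the averaged distribution E_q[p].
ev_mixture = expectation(mean_distribution(candidates))
# Route 2: average the per-candidate expectations (law of total expectation).
ev_iterated = sum(weight * expectation(dist) for weight, dist in candidates)

print(ev_mixture, ev_iterated)
```

The two routes agree because expectation is linear in the distribution, which is why a joint distribution over the probabilities collapses back into ordinary expected utility.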
And you’d use Gilboa and Schmeidler’s maxmin expected value approach if you don’t even have a joint probability distribution over all of the probabilities.
A more recent alternative to maxmin is the maximality rule, which is to rule out any choices whose expected utilities are weakly dominated by the expected utilities of another specific choice.
Mogensen comes out against this rule in the end for being too permissive, though. However, I’m not convinced that’s true, since that depends on your particular probabilities. I think you can get further with hedging.
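For a feel of how the two rules above differ, here is a toy sketch. The acts, payoffs, and set of priors are all hypothetical, and the dominance test (at least as good under every prior, strictly better under one) is my reading of “weakly dominated”, not a definition taken from Mogensen or from Gilboa and Schmeidler.

```python
# Hypothetical acts with utilities in two states, and a set of priors over those states.
acts = {
    "safe": (1.0, 1.0),
    "risky": (3.0, 0.0),
    "bad": (0.5, 0.0),
}
priors = [(0.2, 0.8), (0.5, 0.5), (0.8, 0.2)]

def expected_utilities(payoffs):
    """Expected utility of an act under each prior in the set."""
    return [sum(p * u for p, u in zip(prior, payoffs)) for prior in priors]

def maxmin_choice():
    """Maxmin: pick the act whose worst-case expected utility is highest."""
    return max(acts, key=lambda a: min(expected_utilities(acts[a])))

def maximal_acts():
    """Maximality: keep every act that no other single act weakly dominates."""
    def dominated(a, b):
        ea, eb = expected_utilities(acts[a]), expected_utilities(acts[b])
        return all(x <= y for x, y in zip(ea, eb)) and any(x < y for x, y in zip(ea, eb))
    return sorted(a for a in acts if not any(dominated(a, b) for b in acts if b != a))

print(maxmin_choice())
print(maximal_acts())
```

In this toy case maxmin singles out one act, while maximality only rules out the act that another act dominates and stays silent between the remaining two, which is the permissiveness worry in miniature.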
Yeah, that’s the part I’m referring to. I take his comment that expectations are not random variables to be criticizing taking expectations over expected utility with respect to uncertain probabilities.
I think the critical review of ambiguity aversion I linked to is sufficiently general that any alternatives to taking expectations with respect to uncertain probabilities will have seriously undesirable features.
I have two doubts about the Al-Najjar and Weinstein paper—I’d be curious to hear your (or others’) thoughts on these.
First, I’m having trouble seeing where the information aversion comes in. A simpler example than the one used in the paper seems to be enough to communicate what I’m confused about: let’s say an urn has 100 balls that are each red or yellow, and you don’t know their distribution. Someone averse to ambiguity would (I think) be willing to pay up to $1 for a bet that pays off $1 if a randomly selected ball is red or yellow. But if they’re offered that bet as two separate decisions (first betting on a ball being red, and then betting on the same ball being yellow), then they’d be willing to pay less than $0.50 for each bet. So it looks like preference inconsistency comes from the choice being spread out over time, rather than from information (which would mean there’s no incentive to avoid information). What am I missing here?
(Maybe the following is how the authors were thinking about this? If you (as a hypothetical ambiguity-averse person) know that you’ll get a chance to take both bets separately, then you’ll take them both as long as you’re not immediately informed of the outcome of the first bet, because you evaluate acts, not by their own uncertainty, but by the uncertainty of your sequence of acts as a whole (considering all acts whose outcomes you remain unaware of). This seems like an odd interpretation, so I don’t think this is it.)
[edit: I now think the previous paragraph’s interpretation was correct, because otherwise agents would have no way to make ambiguity averse choices that are spread out over time and consistent, in situations like the ones presented in the paper. The ‘oddness’ of the interpretation seems to reflect the oddness of ambiguity aversion: rather than only paying attention to what might happen differently if you choose one action or another, ambiguity aversion involves paying attention to possible outcomes that will not be affected by your action, since they might influence the uncertainty of your action.]
Second, assuming that ambiguity aversion does lead to information aversion, what do you think of the response that “this phenomenon simply reflects a [rational] trade-off between the intrinsic value of information, which is positive even in the presence of ambiguity, and the value of commitment”?
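The urn example in the first point can be worked through under a maxmin rule. This is only a sketch under strong assumptions: a risk-neutral agent, and a prior set containing every possible composition of the urn (full ignorance). A narrower prior set would put each separate bet somewhere between $0 and $0.50 instead.

```python
# Hypothetical prior set: every possible number of red balls out of 100.
priors = [r / 100 for r in range(101)]  # probability that the drawn ball is red

def maxmin_value(payoff_if_red, payoff_if_yellow):
    """Worst-case expected payoff over the prior set (risk-neutral maxmin)."""
    return min(p * payoff_if_red + (1 - p) * payoff_if_yellow for p in priors)

combined = maxmin_value(1.0, 1.0)     # pays $1 whether the ball is red or yellow
red_only = maxmin_value(1.0, 0.0)     # pays $1 only if the ball is red
yellow_only = maxmin_value(0.0, 1.0)  # pays $1 only if the ball is yellow

print(combined, red_only, yellow_only)
```

Valued one at a time, the two bets are worth less in total than the combined bet, which is the inconsistency the question is about.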
On the first point, I think your intuition does capture the information aversion here, but I still think information aversion is an accurate description. Offered a bet that pays $X if I pick a color and then see if a random ball matches that color, you’ll pay more than for a bet that pays $X if a random ball is red. The only difference between these situations is that you have more information in the latter: you know the color to match is red. That makes you less willing to pay. And there’s no obvious reason why this information aversion would be something like a useful heuristic.
I don’t quite get the second point. Commitment doesn’t seem very relevant here since it’s really just a difference in what you would pay for each situation. If one comes first, I don’t see any reason why it would make sense to commit, so I don’t think that strengthens the case for ambiguity aversion in any way. But I think I might be confused here.
“Offered a bet that pays $X if I pick a color and then see if a random ball matches that color, you’ll pay more”
I’m not sure I follow. If I were to take this bet, it seems that the prior according to which my utility would be lowest is: you’ll pick a color to match that gives me a 0% chance of winning. So if I’m ambiguity averse in this way, wouldn’t I think this bet is worthless?
(The second point you bring up would make sense to me if this first point did, although then I’d also be confused about the paper’s emphasis on commitment.)
Sorry—you’re right that this doesn’t work. To clarify, I was thinking that the method of picking the color should be fixed ex-ante (e.g. “I pick red as the color with 50% probability”), but that doesn’t do the trick because you need to pool the colors for ambiguity to arise.
The issue is that the problem the paper identifies does not come up in your example. If I’m offered the two bets simultaneously, then an ambiguity averse decision maker, like an EU decision maker, will take both bets. If I’m offered the bets sequentially without knowing I’ll be offered both when I’m offered the first one, then neither an ambiguity-averse nor a risk-averse EU decision-maker will take them. The reason is that the first one offers the EU decision-maker a 50% chance of winning, so given risk-aversion its value is less than 50% of $1. So your example doesn’t distinguish a risk-averse EU decision-maker from an ambiguity-averse one.
So I think unfortunately we need to go with the more complicated examples in the paper. They are obviously very theoretical. I think it could be a valuable project for someone to translate these into more practical settings to show how these problems can come up in a real-world sense.
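The claim above that a risk-averse EU decision-maker values a 50% chance of $1 at less than $0.50 can be checked with a certainty-equivalent calculation. The square-root utility function here is an arbitrary illustrative choice of concave utility, not something taken from the paper.

```python
import math

def certainty_equivalent(prob_win, prize, utility=math.sqrt, inverse=lambda u: u ** 2):
    """Sure amount whose utility equals the bet's expected utility."""
    expected_utility = prob_win * utility(prize) + (1 - prob_win) * utility(0.0)
    return inverse(expected_utility)

# Concave (square-root) utility: a risk-averse EU decision-maker.
risk_averse = certainty_equivalent(0.5, 1.0)
# Linear utility: a risk-neutral EU decision-maker, for comparison.
risk_neutral = certainty_equivalent(0.5, 1.0, utility=lambda x: x, inverse=lambda u: u)

print(risk_averse, risk_neutral)
```

So a price just under $0.50 is refused by the risk-averse EU agent as well as by the ambiguity-averse one, which is why the simple urn example cannot tell them apart.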
This might also be of interest:
The Sequential Dominance Argument for the Independence Axiom of Expected Utility Theory by Johan E. Gustafsson, which argues for the Independence Axiom with stochastic dominance, a minimal rationality requirement, and also against the Allais paradox and Ellsberg paradox (ambiguity aversion).
See my discussion with Owen Cotton-Barratt.
https://academic.oup.com/pq/article-abstract/71/1/141/5828678
https://globalprioritiesinstitute.org/andreas-mogensen-maximal-cluelessness/
https://forum.effectivealtruism.org/posts/WSytm4XG9DrxCYEwg/andreas-mogensen-s-maximal-cluelessness