Utilitarianism with and without expected utility

Aaron Gertler 🔸Jul 24, 2020, 6:40 AM

27 points

The paper this post is based on: McCarthy, D., Mikkola, K., Thomas, T., Utilitarianism with and without expected utility. Journal of Mathematical Economics 87 (2020): 77-113. (Online Article | PDF)

A common criticism of utilitarianism is that it makes strong assumptions about the nature of welfare and about welfare comparisons. It assumes that welfare should be understood as something like pleasure or desire satisfaction, and that we can measure welfare precisely. But to many people, that assumption is implausible. They think that welfare is made up of different goods that are difficult to compare, and certainly difficult to measure.

Moreover, many people think that social welfare comparisons should reflect the social consensus about individual welfare comparisons. But the social consensus about individual welfare comparisons arguably falls a long way short of full agreement, and rules out pleasure or desire-based accounts. This is one of the objections that defenders of contractualism make against utilitarianism. For example, Rawls (1982) gives an especially strong statement of a version of this criticism. His disagreement with utilitarianism on this point is one of the central features of his contractualist theory.

Our paper explains how utilitarianism can respond. This post will also connect this topic with uncertainty.

1. Harsanyi’s utilitarianism

Criticisms of utilitarianism, including Rawls’ famous separateness of persons criticism, are too often directed at old-fashioned versions of the view. But the worry about its assumptions about welfare applies to the sophisticated version presented in the 1955 utilitarian theorem of John Harsanyi. I understand this to be the version of Harsanyi’s result that assumes interpersonal comparisons.

Harsanyi’s utilitarian theorem assumes a constant population, and is stated for circumstances involving risk. In other words, it assumes that uncertainty is modelled by standard point-valued probabilities that are objective or universally agreed. It rests on some simple and natural premises, and shows that these assumptions imply that individual welfare comparisons uniquely determine social welfare comparisons.

In one way of presenting Harsanyi’s result, welfare comparisons, both intrapersonal and interpersonal, are encoded in a single preorder that we call the individual preorder. (A preorder is a binary relation that is reflexive and transitive.) In those terms, the premises of Harsanyi’s utilitarian theorem are as follows: (a) the individual and social preorders satisfy the expected utility axioms; (b) the strong Pareto principle; and (c) an impartiality axiom.

Strong Pareto says that (i) if two social lotteries are equally good for every member of the population, they are equally good; and (ii) if one social lottery is at least as good for every member of the population, and better for some members of the population, then it is better than the other. Impartiality essentially expresses indifference to permutations of individuals. But here I want to focus on expected utility.

But first, let’s quickly address what utilitarianism has to assume about the nature of welfare. Harsanyi himself proposed to identify welfare with the satisfaction of ideally rational preferences. He has a long and complicated argument to back that up which is not very convincing, and has been amply criticized by Broome (1993) and others.

However, for the purposes of the utilitarian theorem, this view is inessential: any account of the content of welfare (such as more objective accounts, like Rawls’ own account, that take into account such things as resources or capabilities) is compatible with the theorem. What I want to focus on is what Harsanyi assumes about the structure of welfare comparisons, or more precisely, the structural assumptions he makes about the individual preorder. They are that it satisfies the expected utility axioms.

2. Expected utility and welfare comparisons

Expected utility theory is the best-known and most developed theory of rational decision making under conditions of risk. But this is where the old-style criticisms of the assumptions classical utilitarians make about welfare comparisons apply equally to Harsanyi. Expected utility theory contains three basic axioms: completeness, continuity, and independence. Assuming that the individual preorder satisfies these axioms means the following.

1. Completeness implies that all goods, and even lotteries over goods, are comparable.
2. Continuity implies that no goods are infinitely more valuable than others.
3. Independence implies a precise way of making welfare comparisons under risk.

The first of these means that the standard objection to classical utilitarianism applies directly to Harsanyi’s version. The second and third provide further ways in which Harsanyi’s utilitarianism does not allow for much flexibility about welfare comparisons.
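To make the first point concrete, here is a minimal Python sketch of a welfare ordering that is a preorder (reflexive and transitive) but not complete. The two-good welfare states and the componentwise ranking are illustrative assumptions of mine, not something taken from the paper.

```python
from itertools import product

# Hypothetical welfare states: levels of two goods, e.g. (health, friendship).
states = list(product(range(3), repeat=2))

def at_least_as_good(x, y):
    """Componentwise dominance: x is at least as good as y on every good."""
    return all(a >= b for a, b in zip(x, y))

def comparable(x, y):
    """Two states are comparable if the relation holds in at least one direction."""
    return at_least_as_good(x, y) or at_least_as_good(y, x)

reflexive = all(at_least_as_good(x, x) for x in states)
transitive = all(
    at_least_as_good(x, z)
    for x, y, z in product(states, repeat=3)
    if at_least_as_good(x, y) and at_least_as_good(y, z)
)
complete = all(comparable(x, y) for x, y in product(states, repeat=2))

print(reflexive, transitive, complete)
```

Under this ranking, a state with more health but less friendship is simply incomparable to one with less health and more friendship, which is exactly the kind of structure that completeness rules out.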

3. Information and uncertainty

The following is another feature of Harsanyi’s framework.

4. Framing the problem in terms of risk means that the theorem only applies when probabilities are objective or agreed.

This last point is not directly about welfare. But it is connected.

Amartya Sen taught us to think about different ethical theories in terms of how much information they need in order to function; see e.g. Sen 1985. Both classical utilitarianism and Harsanyi’s utilitarianism require a lot of precise information about welfare comparisons. In applying only to risk, Harsanyi’s utilitarianism also requires a lot of information about uncertainty. Yet in real-world situations, we often face uncertainty without having any idea what the probabilities are.

In Sen’s terms, we can summarize 1 to 4 as follows. Harsanyi’s utilitarianism demands a very high, and perhaps unattainable, amount of information. It is not a theory which we can use in most of the situations where we need ethics.

4. A utilitarian solution

The main result of our paper manages to avoid all of these difficulties. We present three simple and plausible axioms. These are much weaker than Harsanyi’s axioms, but they still preserve the utilitarian flavor of his approach. Moreover, our axioms still imply that individual welfare comparisons uniquely determine social welfare comparisons.

In addition, we do not assume any of the expected utility axioms. In fact, they are all allowed to fail. This means that our approach is much more flexible than Harsanyi’s in the welfare comparisons it can handle.

In particular,

1′. Our result can accept that some (or all!) goods are incomparable.
2′. It allows some goods to be infinitely more valuable than others.
3′. It has room for a wide range of welfare comparisons involving risk.

For simplicity, most of our paper follows Harsanyi in working in the framework of risk. But we also explain how to extend our main result. In particular,

4′. Our approach allows for a very wide range of ways of representing uncertainty.

For example, it can allow for so-called imprecise probabilities in all kinds of variations; see Bradley (2019) for an excellent introduction to that topic. In Sen’s terms again, our version of utilitarianism needs very little information.

In my view, this means that the standard criticism of utilitarianism we started with is off-target. But I would hesitate to put this by saying that utilitarianism is right and contractualism is wrong. We could also say that the result allows utilitarianism to incorporate some contractualist ideas.

Our paper is very long, and contains much else besides, including an extension of the result sketched above to variable populations, and proofs that our assumptions lead to very general versions of the standard additive form of classical and Harsanyi-style utilitarianism. But explaining how utilitarianism can cope with all kinds of limitations on welfare comparisons and many forms of uncertainty is at least how I see its central philosophical point.


MichaelStJules Jul 25, 2020, 5:56 PM
4 points

It’s worth pointing out that their theorems 2.2 and 3.5 are compatible with Rawls’ difference principle/leximin/maximin (infinite risk-aversion), so their results generalize both Harsanyi’s and Rawls’ approaches, rather than defend utilitarianism against Rawls. They don’t require continuity or cardinal welfare for these theorems, and as far as I know, continuity is not actually an axiom justifiable with Dutch books or money pumps, so I’m not sure what reason we have to believe it other than pure intuition, which is especially suspect in extreme tradeoffs (e.g. involving torture) and because of time-inconsistency in our preferences.
Continuity would of course also fail under utilitarianism with stochastic separability and infinite stakes, i.e. Pascalian problems, although I suppose one defence might be that the physical differences in outcomes are also infinite in these cases, so we might only have continuity starting from finite physical differences and extend it from there.
I don’t think continuity deserves to be called a rationality axiom, and without it and cardinal welfare, the case for utilitarianism as normally conceived falls apart.
MichaelStJules Jul 25, 2020, 5:47 AM
3 points

Teruji Thomas, one of the authors, wrote a paper for GPI with a similar theorem, called the Supervenience Theorem. There’s an EA Forum post about it here. There’s an EA Forum post on Harsanyi’s original utilitarian theorem here, too.
I pulled out the definitions and put them together to be able to state the first theorem more compactly, introducing the notation as it’s needed and skipping some unnecessary notation and jargon.

The first theorem is for the constant population case, with a finite set of individuals $I$ . Welfare states come from some set $W$ , and “a distribution is an assignment of welfare states to individuals”, or an element of the set of vectors $W^{I}$ indexed by individuals in $I$ . Then,
A ‘lottery’ is a probability measure (or probability distribution or random variable) over distributions. A ‘prospect’ is a probability measure (or probability distribution or random variable) over welfare states. Each lottery determines a prospect for each individual. The ‘social preorder’ expresses a view about how good lotteries are from an impartial perspective, while the ‘individual preorder’ expresses a view about how good prospects are for individuals, allowing interpersonal comparisons. The central question for us is how the social preorder should depend upon the individual preorder.
That there’s only one individual preorder that’s used for everyone allows interpersonal comparisons. Welfare states can be arbitrary otherwise, even allowing incomparability between welfare states and between prospects. A preorder is just a ranking that allows incomparability; it’s a transitive and reflexive relation $≿$ (at least as good as), and we write $X \sim Y$ if both $X ≿ Y$ and $Y ≿ X$ .
- Reflexivity: $X ≿ X$ for all $X$ .
- Transitivity: if $X ≿ Y$ and $Y ≿ Z$ , then $X ≿ Z$ .
For a given lottery $L$ and individual $i \in I$ , let $P_{i} (L)$ denote the prospect that $i$ faces in $L$ .
Anteriority: Given lotteries $L$ and $L^{'}$ , if for each individual $i \in I$ , $P_{i} (L)$ and $P_{i} (L^{'})$ are identically distributed (equal up to shuffling the outcomes randomly) $^{1}$ , then according to the social preorder, $L \sim L^{'}$ .
In other words, “the social preorder only depends on which prospect each individual faces”, and not how their actual outcomes may be statistically dependent upon one another, ruling out concern for “ex post equality”, according to which it would be better if prospects are correlated than anticorrelated or independent. For example, if A and B have equal chances of being happy or miserable, Anteriority implies it doesn’t matter if they’d be happy or miserable together with equal chances (correlated), or if one would be happy if and only if the other would be miserable (anticorrelated).
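The happy/miserable example can be checked mechanically. Below is a small Python sketch, assuming a lottery is represented as a dictionary from welfare distributions (tuples indexed by individual) to probabilities; the representation and the helper name `prospect` are my own illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction

def prospect(lottery, i):
    """Prospect individual i faces in the lottery: welfare state -> probability."""
    out = defaultdict(Fraction)
    for distribution, prob in lottery.items():
        out[distribution[i]] += prob
    return dict(out)

half = Fraction(1, 2)
# Index 0 is A and index 1 is B (labels are illustrative).
correlated = {("happy", "happy"): half, ("miserable", "miserable"): half}
anticorrelated = {("happy", "miserable"): half, ("miserable", "happy"): half}

# Anteriority only looks at each individual's prospect, one individual at a time.
same_prospects = all(
    prospect(correlated, i) == prospect(anticorrelated, i) for i in range(2)
)
print(same_prospects)
```

The two lotteries are different objects, but each individual faces the same prospect in both, so Anteriority requires the social preorder to rank them as equally good.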

Let $L (P)$ denote the lottery in which everyone faces prospect $P$ , so that $P_{i} (L (P)) = P$ for each individual $i \in I$ , and “it is certain that all individuals will have the same welfare” as each other.
Reduction to Prospects: If $P ≿ P^{'}$ according to the individual preorder, then $L (P) ≿ L (P^{'})$ according to the social preorder.
Or, “for lotteries that guarantee perfect equality, social welfare matches individual welfare.” That is, perfect equality in welfare between everyone, but not necessarily any guarantee at what welfare level, as there may still be uncertainty involved. Again, if for an individual, some prospect $P$ is at least as good as prospect $P^{'}$ , then the lottery with everyone facing $P$ , $L (P)$ , is at least as good as the one with everyone facing $P^{'}$ , $L (P^{'})$ .

For a permutation (bijection) $σ : I \to I$ of identities and a lottery $L$ , we write the permuted lottery as $σ L$ . This is just swapping people’s identities. The permutation is applied uniformly so that if $σ (i) = j$ , then in $σ L$ , individual $i$ faces the prospect that $j$ faces in $L$ .
Anonymity: Given a permutation $σ$ of identities, and a lottery $L$ , the social preorder is indifferent between $L$ and the permuted lottery $σ L$ :
$L \sim σ L$

One important operation on lotteries is “probabilistic mixture”. Given two lotteries $L$ and $L^{'}$ , and a probability $p$ , $0 < p < 1$ , we can define a compound lottery $p L + (1 - p) L^{'}$ as follows. Let $X$ be a binary random variable that’s $1$ with probability $p$ and $0$ with probability $1 - p$ (like a biased coin), independent of the randomness in $L$ and $L^{'}$ . Conditional on $X = 1$ , the compound lottery is identical to $L$ , not just identically distributed: $L$ resolves to a given welfare distribution if and only if the compound lottery does, too. Likewise, conditional on $X = 0$ , it’s identical to $L^{'}$ . Hence, $p L + (1 - p) L^{'} = X L + (1 - X) L^{'}$ and
$\mathrm{Prob} [p L + (1 - p) L^{'} = L \mid X = 1] = 1 \text{ and } \mathrm{Prob} [p L + (1 - p) L^{'} = L^{'} \mid X = 0] = 1$
We can also do this with more than two lotteries and use summation notation, $\sum$ , for it.
Anonymity is then strengthened:
Two-Stage Anonymity: Given two lotteries $L$ and $L^{'}$ , $p \in [0, 1] \cap \mathbb{Q}$ (a rational number $p$ between $0$ and $1$ , inclusive), and a permutation of identities $σ$ , then according to the social preorder, we have the equivalence:
$p (σ L) + (1 - p) L^{'} \sim p L + (1 - p) L^{'}$
So, you can permute individuals conditionally on the binary random variable that mixes the two lotteries while maintaining equivalence. This rules out concern for “ex ante equality”, according to which it would be better if people had fairer chances or equal opportunities. So, if I can benefit one of two people the same with the same initial welfare, it doesn’t matter if I just choose one, or flip a coin to choose, giving each a fair chance.
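Here is a rough Python sketch of the coin-flip example, again assuming lotteries are dictionaries from welfare distributions to probabilities. The helpers `mix` and `permute` and the benefit levels `1` and `0` are hypothetical names and values chosen for illustration.

```python
from fractions import Fraction

def mix(p, lottery_a, lottery_b):
    """Probabilistic mixture p * lottery_a + (1 - p) * lottery_b."""
    out = {}
    for lottery, weight in ((lottery_a, p), (lottery_b, 1 - p)):
        for distribution, prob in lottery.items():
            out[distribution] = out.get(distribution, 0) + weight * prob
    # Drop outcomes that end up with zero probability.
    return {d: q for d, q in out.items() if q != 0}

def permute(sigma, lottery):
    """Permuted lottery: individual i gets what sigma[i] gets in the original."""
    return {
        tuple(d[sigma[i]] for i in range(len(d))): q for d, q in lottery.items()
    }

half = Fraction(1, 2)
swap = [1, 0]

# Benefit individual 0 for certain (welfare 1 for them, 0 for the other).
choose_first = {(1, 0): Fraction(1)}

# Flip a fair coin: on one side the identities are swapped, on the other not.
coin_flip = mix(half, permute(swap, choose_first), choose_first)
# The same mixture without the swap collapses back to just choosing individual 0.
just_choose = mix(half, choose_first, choose_first)

print(coin_flip)
print(just_choose)
```

Two-Stage Anonymity with $L = L^{'}$ and $p = 1/2$ says the social preorder is indifferent between `coin_flip` and `just_choose`, even though only the coin flip gives both people a chance at the benefit.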

Let $\# I$ denote the number of individuals in $I$ . For a given lottery $L$ , $\sum_{i \in I} \frac{1}{\# I} P_{i} (L)$ is the prospect given by Harsanyi’s veil of ignorance, where with equal probability, “you” will be one of the individuals $i \in I$ , and then face their prospect $P_{i} (L)$ .
And now we can state their first theorem:
Theorem 2.2: Given an arbitrary preorder on the set of prospects (lotteries for single individuals), if the social preorder satisfies Anteriority, Reduction to Prospects and Two-Stage Anonymity, then $L ≿ L^{'}$ according to the social preorder if and only if
$\sum_{i \in I} \frac{1}{\# I} P_{i} (L) ≿ \sum_{i \in I} \frac{1}{\# I} P_{i} (L^{'})$
according to the individual preorder.
So the social preorder is just the one obtained by imagining yourself in the place of each individual with equal probability and applying the individual preorder, as in Harsanyi’s veil of ignorance.
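As a sanity check on how this works, here is a Python sketch that builds the veil-of-ignorance prospect and ranks lotteries with it. The individual preorder used here, first-order stochastic dominance over numeric welfare levels, is purely my assumption for illustration (the theorem allows any preorder on prospects), and all function names are hypothetical.

```python
from fractions import Fraction

def veil_prospect(lottery, n):
    """Equal-weight mixture of the n individuals' prospects in the lottery."""
    out = {}
    for distribution, prob in lottery.items():
        for welfare in distribution:
            out[welfare] = out.get(welfare, 0) + prob / n
    return out

def dominates(p, q):
    """Illustrative individual preorder: p first-order stochastically dominates q.

    This relation is reflexive and transitive but not complete.
    """
    levels = sorted(set(p) | set(q))

    def cdf(prospect, t):
        return sum(pr for w, pr in prospect.items() if w <= t)

    return all(cdf(p, t) <= cdf(q, t) for t in levels)

def socially_at_least_as_good(l1, l2, n):
    """Rank lotteries by applying the individual preorder behind the veil."""
    return dominates(veil_prospect(l1, n), veil_prospect(l2, n))

one = Fraction(1)
equal = {(1, 1): one}    # both individuals at welfare 1
unequal = {(2, 0): one}  # one at 2, the other at 0
better = {(2, 1): one}   # one at 2, the other at 1

print(socially_at_least_as_good(better, equal, 2))
print(socially_at_least_as_good(equal, unequal, 2))
print(socially_at_least_as_good(unequal, equal, 2))
```

With this particular individual preorder, `better` is ranked at least as good as `equal`, while `equal` and `unequal` come out incomparable in both directions: the social preorder inherits whatever incomparability the individual preorder has.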

1. The statement uses equality notation instead of identical distributions, but equality for each individual forces the lotteries to be literally the same, $L = L^{'}$ , not just equivalent, and the definition is trivially satisfied.
- MichaelStJules Jul 25, 2020, 6:47 AM
  2 points
  
  For the variable population case, they
  - add an extra welfare state $Ω$ to represent nonexistence, without saying how it compares to other welfare states at all (e.g. totalism or person-affecting views). Prospects can include nonexistence, so you (may be able to) compare prospects with different probabilities of nonexistence.
  - replace the finite constant population $I$ with an infinite set $I^{\infty}$ of all possible individuals and assign welfare $Ω$ (nonexistence) to individuals who don’t exist in a given welfare distribution.
  - generalize the Anteriority, Reduction to Prospects and Two-Stage Anonymity conditions. Only Reduction to Prospects looks different, since rather than defining lotteries for everyone in $I^{\infty}$ as a whole, you require it to hold for every finite non-empty subset of $I^{\infty}$ .
  - define Omega Independence.
  - generalize Theorem 2.2 for Theorem 3.5.
  For a given welfare state $w$ , let $1_{w}$ denote the prospect with definite welfare state $w$ , with probability 1. In particular, $1_{Ω}$ denotes definite nonexistence.
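  Omega Independence: For any two prospects $P$ and $P^{'}$ , and any rational probability $p \in (0, 1] \cap \mathbb{Q}$ (with $p = 0$ both mixtures would just be $1_{Ω}$ , so the condition would be trivial),
  $P ≿ P^{'} \text{ if and only if } p P + (1 - p) 1_{Ω} ≿ p P^{'} + (1 - p) 1_{Ω}$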
  In other words, mixing with the same chance of nonexistence makes no difference to the ranking of two prospects.
  Then Theorem 3.5 is basically the same as Theorem 2.2, with the corresponding definitions, but the social preorder only exists at all if Omega Independence is satisfied and the veil of ignorance comparisons are applied only to pairs of lotteries from a common finite subset of $I^{\infty}$ (which may have any individuals assigned nonexistence $Ω$ , and any two finite sets can be expanded to their union, so prospects over finite sets of individuals can always be compared):
  
  Theorem 3.5: Given an arbitrary individual preorder, there is at most one social preorder satisfying Anteriority, Reduction to Prospects, and Two-Stage Anonymity. When it exists, it is given by $L ≿ L^{'}$ if and only if
  $\sum_{i \in I} \frac{1}{\# I} P_{i} (L) ≿ \sum_{i \in I} \frac{1}{\# I} P_{i} (L^{'})$
  according to the individual preorder for any finite non-empty $I \subset I^{\infty}$ such that $L$ and $L^{'}$ are lotteries in $I$ . The social preorder exists if and only if the individual preorder satisfies Omega Independence.
  
  Personally, I like the procreation asymmetry, so I might say that $Ω$ is strictly better than some states (hence defined as negative), but either incomparable to or at least as good as all other states, so never worse than any other state.
MichaelStJules Jul 25, 2020, 6:02 AM
2 points

Abstract:
We give two social aggregation theorems under conditions of risk, one for constant population cases, the other an extension to variable populations. Intra and interpersonal welfare comparisons are encoded in a single ‘individual preorder’. The theorems give axioms that uniquely determine a social preorder in terms of this individual preorder. The social preorders described by these theorems have features that may be considered characteristic of Harsanyi-style utilitarianism, such as indifference to ex ante and ex post equality. However, the theorems are also consistent with the rejection of all of the expected utility axioms, completeness, continuity, and independence, at both the individual and social levels. In that sense, expected utility is inessential to Harsanyi-style utilitarianism. In fact, the variable population theorem imposes only a mild constraint on the individual preorder, while the constant population theorem imposes no constraint at all. We then derive further results under the assumption of our basic axioms. First, the individual preorder satisfies the main expected utility axiom of strong independence if and only if the social preorder has a vector-valued expected total utility representation, covering Harsanyi’s utilitarian theorem as a special case. Second, stronger utilitarian-friendly assumptions, like Pareto or strong separability, are essentially equivalent to strong independence. Third, if the individual preorder satisfies a ‘local expected utility’ condition popular in non-expected utility theory, then the social preorder has a ‘local expected total utility’ representation. Fourth, a wide range of non-expected utility theories nevertheless lead to social preorders of outcomes that have been seen as canonically egalitarian, such as rank-dependent social preorders. Although our aggregation theorems are stated under conditions of risk, they are valid in more general frameworks for representing uncertainty or ambiguity.