The Moral Two Envelopes Problem and the Moral Weights Project

This post is an overview of the Moral Two Envelopes Problem and a take on whether it applies to Rethink Priorities’ Moral Weights Project. It is a product of discussions between myself, Michael St. Jules, Hayley Clatterbuck, Marcus Davis, Bob Fischer, and Arvo Muñoz Morán, though it need not reflect anyone’s views but my own. Thanks to Brian Tomasik, Carl Shulman, and Michael St. Jules for comments. Michael St. Jules discusses many of these issues in depth in a forum post earlier this year.

Introduction

When deciding how to prioritize interventions aimed at improving the welfare of different species of animals, it is critical to have some sense of their relative capacities for wellbeing. Animals’ welfare capacities are, for the most part, deeply uncertain. Our uncertainty stems both from the gaps in our understanding of the cognitive faculties and the behaviors of different species, which constitute the external evidence of consciousness and sentience, and from our limited grasp of the bearing of that evidence on what we should think.

Rethink Priorities’ Moral Weights Project produced estimates of the relative moral significance of some intensively farmed species that reflected our uncertainties. It assumed that moral significance depends on capacities for welfare. In order to respect uncertainty and differences of opinion about methodology, it adopted a Monte Carlo approach. Potential sources of evidence were combined with different theories of their evidential significance to produce a range of predictions. Those predictions were aggregated to create overall estimates of relative welfare capacity and hence moral significance.
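
To make the approach concrete, here is a highly simplified sketch of the Monte Carlo idea. The theories, proxy values, and weights below are made up for illustration; they are not the project’s actual models or numbers.

```python
import random

# A highly simplified sketch (hypothetical theories, proxy values, and
# probabilities -- not the project's actual models or numbers) of the Monte
# Carlo idea: sample a theory of how proxies bear on welfare capacity, apply
# it to proxy data, and aggregate the resulting estimates.
theories = [
    lambda proxies: 1.0,                          # capacity equal to humans regardless of proxies
    lambda proxies: proxies["neuron_fraction"],   # capacity scales with relative neuron count
    lambda proxies: proxies["behavior_score"],    # capacity tracks behavioral evidence
]
proxies = {"neuron_fraction": 0.02, "behavior_score": 0.6}  # made-up values for one species

estimates = [random.choice(theories)(proxies) for _ in range(10_000)]
print(sum(estimates) / len(estimates))  # aggregated welfare-capacity estimate relative to humans
```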

The project addresses a complex issue and our methodology is open to criticism. Most notably, in anticipation of work like the Moral Weights Project, Brian Tomasik explored complications to naive aggregation of expected value across different theories[1]. He directed his criticisms at attempts to aggregate welfare estimates in light of different theories of the significance of brain size, but similar criticisms could be developed for the other proxies that the Moral Weights Project considered. If the issues Tomasik identified did apply to the project’s methodology, then the conclusions of the report would be compromised.

It is my view that while there are important lessons to be drawn from Tomasik’s work and Tomasik is right that certain forms of aggregation would be inappropriate, the concerns he posed do not apply to the project’s methodology. This document explains why.

Moral Two Envelopes Problem

The Two Envelopes Problem is a venerable issue in decision theory that has generated a lot of scholarly discussion. At its heart, it poses a challenge in understanding how to apply expected value reasoning to making decisions. The challenge depends on the Two Envelopes Case, a thought experiment in which some amounts of money are placed into two envelopes. A subject chooses between them and keeps the money inside their chosen envelope. They know that one envelope contains exactly twice as much money as the other but they don’t know which contains more. Before they’ve seen how much is inside their chosen envelope, they are allowed to switch to the other. Expected value calculations appear to support switching.

The argument for switching is as follows. There is some exact but unknown amount of money inside the chosen envelope. Let’s call that amount ‘CEM’. The subject knows that the unchosen envelope is equally likely to contain ½ CEM and 2 CEM. The expected value of switching to the other envelope is therefore ½ * ½ CEM + ½ * 2 CEM = 1.25 CEM. The expected value of not switching is 1 CEM.

Given the symmetry of the situation, there is obviously no reason to switch. The Two Envelopes Problem is to explain where the expected value reasoning goes wrong.
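
A quick simulation makes the symmetry vivid. The sketch below uses arbitrary amounts ($10 and $20); on average, switching and staying come out the same.

```python
import random

# A quick simulation of the Two Envelopes Case with arbitrary amounts ($10 and
# $20): on average, switching is no better than staying.
def average_payoff(switch: bool, n_trials: int = 100_000) -> float:
    total = 0.0
    for _ in range(n_trials):
        amounts = [10.0, 20.0]
        random.shuffle(amounts)
        chosen, other = amounts
        total += other if switch else chosen
    return total / n_trials

print(average_payoff(switch=False))  # ~15
print(average_payoff(switch=True))   # ~15 -- no advantage to switching
```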

The problem Tomasik posed bears a resemblance. Tomasik illustrates it with his Two Elephants Case: Suppose we’re uncertain between two theories, one of which values individuals of every species equally and one of which weights them linearly by neuron count. Consider two prospects, one that benefits one human to some degree and one that benefits two elephants to the same degree. Assume that elephants have a quarter the neuron count of humans. Then on one of the theories, helping the elephants is twice as good as helping the human. On the other, it is half as good.

As with the standard Two Envelopes Case, we can construct an expected value argument for helping either. Suppose we’ve elected to help the human. Call the amount of value we produce HEV (Human Expected Value). The expected value of switching to help the elephants is 1.25 HEV. Therefore, we should switch. But we could just as easily make the same argument in reverse. Call the amount of value produced by helping the elephants 1 EEV. Then helping the human has an expected value of 1.25 EEV.
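
The symmetry of the naive argument can be laid out explicitly; the sketch below simply restates the arithmetic above, using the HEV and EEV labels.

```python
# The naive expected value argument, run in both directions with equal credence
# in the two theories (HEV and EEV are the labels used above).
p = 0.5

# In HEV: the two elephants are worth 2 HEV on one theory, 0.5 HEV on the other.
ev_elephants_in_hev = p * 2.0 + p * 0.5   # = 1.25 HEV
# In EEV: the human is worth 0.5 EEV on one theory, 2 EEV on the other.
ev_human_in_eev = p * 0.5 + p * 2.0       # = 1.25 EEV

print(ev_elephants_in_hev, ev_human_in_eev)  # 1.25 1.25 -- each option "beats" the other
```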

Tomasik argues that applying expected value reasoning to evaluating prospects in this way is inappropriate.

In contrast with the Two Envelopes Problem, he sees little challenge in explaining what makes that reasoning inappropriate. For Tomasik, at least some of the fundamental issues at play in the Two Envelopes Case and the Two Elephants Case are different. He claims that the Two Envelopes Problem is solvable, but the Moral Two Envelopes Problem is not. The difference is that the Moral Two Envelopes Problem involves expected value calculations that aggregate the value of each species according to different normative theories.

According to Tomasik, distinct utility functions are incomparable. The numbers that each theory assigns only make sense within that theory; they specify the tradeoffs sanctioned by those theories. In order to aggregate them, we would need to find some common unit in which they both make sense. But there is no common unit.

Three Related Issues

There is a straightforward explanation for why the Moral Two Envelopes Problem is not a problem for the Moral Weights Project: the Moral Weights Project does not attempt to aggregate across normative theories. Instead, it assumes a single normative theory and aggregates across assumptions about how some behavioral and neurological proxies relate to the cognitive traits that matter to that theory. I’ll explore this idea more below. But before that, Tomasik’s observation of the similarities with the Two Envelopes Problem raises other issues that should be addressed; I will describe three challenges that warrant attention.

The Pinning Problem

One line of response to the traditional Two Envelopes Problem asserts that the problematic reasoning rests on a subtle equivocation.[2] The traditional thought experiment assumes that any amount of money may be in the two envelopes, but the equivocation is easier to see if we assume specific amounts. Suppose that one envelope contains $10 and the other $20. The expected value of switching from the chosen envelope to the other is then 1.25 CEM. However, ‘CEM’ is tied by definition to the specific amount of money in one envelope and we are uncertain which of the two values that specific amount is. If ‘CEM’ refers to the higher value, then that value is $20 and the cost of switching is $10. If ‘CEM’ refers to the lower value, then it is $10 and the gain in switching is $10. It is only because ‘CEM’ refers to different values in the scenarios the subject is uncertain between that switching can have an expected value of 1.25 CEM, staying can have an expected value of 1 CEM, and switching can still not be worthwhile.[3]
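
The equivocation can be made explicit with the same assumed amounts: the ‘1.25 CEM’ figure survives only because ‘CEM’ names different dollar values in the two scenarios, while the expected value in dollars is the same whether we switch or stay.

```python
# The equivocation with the assumed amounts ($10 and $20). 'CEM' names whatever
# is in the chosen envelope, so it refers to different dollar values in the two
# scenarios we average over.
scenarios = [
    {"cem": 20.0, "other": 10.0},  # we happened to choose the larger envelope
    {"cem": 10.0, "other": 20.0},  # we happened to choose the smaller envelope
]

ev_switch_in_cem = sum(0.5 * s["other"] / s["cem"] for s in scenarios)  # 1.25 "CEM"
ev_switch_dollars = sum(0.5 * s["other"] for s in scenarios)            # $15
ev_stay_dollars = sum(0.5 * s["cem"] for s in scenarios)                # $15

print(ev_switch_in_cem, ev_switch_dollars, ev_stay_dollars)  # 1.25 15.0 15.0
```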

There is an issue here, which I’ll call the ‘Pinning Problem’. Sometimes we point to an uncertain value and give it a name, then use units of that value for expected value calculations. That can be ok. But it can also be problematic when our evidence bears on what that unit must actually be in different epistemically possible situations. ‘CEM’ might refer to $10 or it might refer to $20. Part of the Pinning Problem concerns knowing when it is acceptable to use fixed terms to represent potentially varying units of value in expected value calculations – it isn’t always wrong. Part of it concerns knowing when our terms could represent different levels of value in the first place.

This could be a problem for a project like the Moral Weights Project if welfare capacities were specified in a unit whose significance we were uncertain about and whose value would differ across the epistemically possible scenarios, considered as actual. This problem isn’t tied to normativity, so it presents a different issue from the one that Tomasik focused on.

The Ratio Incorporation Problem

Suppose that an oracle (who states only truths) asserts that the relative value of a $1000 donation to The Tuberculosis Initiative (TI) and of the same amount to Delead The World (DW) is 10:1. We had previously thought it was 5:1. There are multiple ways of adjusting our beliefs. We might take this as evidence that TI’s work was more effective than we initially thought and raise that assessment while holding our regard for DW fixed. Alternatively, we might lower our assessment of DW’s work while holding TI’s fixed. Finally, we might become both more optimistic about TI and more pessimistic about DW. The oracle’s revelation does not tell us which way to go.

Sometimes we are uncertain about the ratio of values between different prospects. The proper way to reason about our prospects given such uncertainty may depend on how various possible ratios would be incorporated if we were to decide that they were true.

Suppose that the oracle informs us that the ratio of value in donations to TI and to DW is either 10:1 or 1:2 and we are equally uncertain between them. This might seem to suggest that we should give to TI rather than DW. But really it depends on how we would choose to incorporate these ratios into our absolute estimates of effectiveness. Suppose that we would incorporate information about the 10 to 1 ratio, if confirmed, by lowering our estimate of DW, but incorporate the 1 to 2 ratio by upping our estimate of DW (in both cases holding TI fixed). In that case, the expected value of DW would actually be higher, even though the more favorable ratio is on TI’s side. On the other hand, if we instead held DW fixed, the expected value of TI would be higher.
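
With some made-up prior estimates (say, 5 units of value per $1000 to TI and 1 to DW), the dependence on how ratios are incorporated is easy to compute; the numbers below are purely illustrative.

```python
# Illustrative numbers only: prior estimates of 5 (TI) and 1 (DW) units of
# value per $1000, and equal credence that the true TI:DW ratio is 10:1 or 1:2.
ratios = [(10.0, 1.0), (1.0, 2.0)]
p = 0.5

# Strategy A: hold TI fixed at 5 and adjust DW to fit each ratio.
ti_fixed = 5.0
ev_dw = sum(p * ti_fixed * dw / ti for ti, dw in ratios)  # 0.25 + 5.0 = 5.25
print(ev_dw, "vs", ti_fixed)   # DW comes out ahead of TI

# Strategy B: hold DW fixed at 1 and adjust TI instead.
dw_fixed = 1.0
ev_ti = sum(p * dw_fixed * ti / dw for ti, dw in ratios)  # 5.0 + 0.25 = 5.25
print(ev_ti, "vs", dw_fixed)   # now TI comes out ahead of DW
```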

The Ratio Incorporation Problem means that we need to know more than just the ratios according to different theories; we also need to know how to adjust our prior expectations about value in light of those ratios. Or, alternatively, we need to know how to gauge the significance of some ratios against others.

The Metric Locality Problem

Finally, different metrics may specify different relative values such that there is no way to compare values across metrics.

This problem contains several subproblems. One subproblem is making sense of the idea of the same amount of value within different normative theories. We might think the theories are fundamentally about different things, so no translation between the two metrics is possible. Consider: how many euros are equivalent to 80 degrees Fahrenheit? The question doesn’t make sense. We might think the same thing is true about the utility measures of different normative theories. Kantian deontologists, for example, just think about value in such a different way than utilitarians that trying to fit their valuations on one scale looks a bit like trying to put money and temperature on one scale.

Another subproblem concerns how to actually compare across metrics, assuming it is intelligible that one value in each could be equivalent. Compare: is Nigel Richards better at Scrabble than Magnus Carlsen is at chess? It doesn’t sound like an incoherent question in quite the way that asking about the value of a currency in temperature is, but it is also not obvious that there is an answer. The numbers representing the values of options in a utility calculus traditionally reflect an ordering of prospects and the relative sizes of differences between values, but not absolute values. Specific values can be transformed by positive linear functions (such as multiplying each assignment by two) without altering the ordering or the relative sizes of differences.

In order to fix a scale that we might use to compare the values assigned by two theories, we just need to know how to translate the value of two assignments from one to the other. From the translations of two specific values, we can derive a general understanding of how normative distance translates between the two scales. And with a relative measure of distance and any shared value, we can calibrate how far from that shared value any other value is. To translate between Celsius and Fahrenheit, you only need to know that the metrics are the same at −40°, and that the difference between 0° and 100° in Celsius is the same as the difference between 32° and 212° in Fahrenheit.
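
As a sketch, here is that recipe applied to the temperature example: one shared point plus a ratio of distances fixes the whole translation.

```python
# Recovering the Celsius-to-Fahrenheit translation from the two facts above:
# the scales agree at -40 degrees, and a span of 100 Celsius degrees (0 to 100)
# corresponds to a span of 180 Fahrenheit degrees (32 to 212).
shared_point = -40.0
scale = (212.0 - 32.0) / (100.0 - 0.0)  # 1.8 Fahrenheit degrees per Celsius degree

def celsius_to_fahrenheit(c: float) -> float:
    # Move away from the shared point at the translated rate.
    return shared_point + scale * (c - shared_point)

print(celsius_to_fahrenheit(0.0), celsius_to_fahrenheit(100.0))  # 32.0 212.0
```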

It is plausible that the value represented with zero may be granted special significance that is assumed to be the same across measures. This reduces our problem to finding one other translation between the metrics. Without another translation to fix a scale, we can’t assume that numbers mean the same thing in different metrics. (We assume too that the proxies do not imply non-linear differences in value. This is a non-trivial assumption[4], but consistent with our rather minimalistic interpretation of the proxies.)

If different metrics are incomparable and the numbers actually assigned are only accurate up to linear transformations, then there is no way to aggregate the results from different metrics.

The Metric Locality Problem may be thought of as a reason for skepticism about the possibility of resolving the Pinning Problem or the Ratio Incorporation Problem in normative cases. In order for it to be possible to know how to identify one unit of value across normative theories, it would need to be coherent to identify value across theories. In order for it to be possible to incorporate different ratios relative to one another, there would have to be a way of fitting them on the same scale. The Metric Locality Problem says there is not.

Avoiding These Issues

These are real challenges. Each could be a problem for a project like the Moral Weights Project. However, we think the Moral Weights Project follows a methodology that avoids them. Several factors explain why the project does not face the Moral Two Envelopes Problem.

Pinning to Humans

The Moral Weights Project aims to assign numerical values to represent the moral significance of species. The methodology involves assigning numerical values according to different theories and then aggregating those values. In each of the theories, it is assumed that humans always have a moral weight of 1 and so any variance implied by those theories is applied to other species.

Assuming a constant value for the welfare of humans allows us to fix a scale across different measures. The key to this is that we also assume assignments of 0 to have a fixed meaning in the metric for each theory. In each metric, 0 is the value of not existing as a welfare subject. The significance of non-existence is uncontroversial. Plausibly, it is the default against which all other changes are assessed. So the meaning of this number may be assumed identical between theories.

We assume 1 to reflect the amount of value provided by the welfare capacity of human beings. Since all other numerical assignments are interpretable relative to these values, and since the numbers for these two absolute amounts of value are the same in each metric we consider, all other numerical assignments can be interpreted as equally significant.

The Pinning Problem is solved by representing everything in human units. We do not have to worry about the meanings of our terms shifting in different contexts, because the value of human units is introspectively fixed. (More on this below.) The Ratio Incorporation Problem is solved by the constraint to always hold the value of humans fixed. Humans’ capacity for welfare is assumed to be the same no matter which approach to inferring welfare from physiological and behavioral traits is correct, so any information about the ratio of human and non-human welfare is incorporated by adjusting the value assigned to non-human welfare.

The justification for assuming a consistent meaning to the assignments to humans is that we, as humans, have special[5] introspective access to our capacity for pleasure and pain. We know how badly pain hurts. We know how good pleasure feels. Our interest in behavioral and neurological proxies relates to what they tell us about the extent to which other animals feel the same as we do.

Our grasp on our own welfare levels is independent of theory. Prick your finger: that’s how bad a pricked finger feels. That is how bad it feels no matter how it is that the number of neurons you have in your cortex relates to your capacity to suffer[6]. This is the best access we can have to our own capacities for suffering. If you’re suffering from a migraine, learning about the true ratio of suffering in humans and chickens shouldn’t make you feel any better or worse about your present situation[7].

Consider again the Two Elephants Case. Under one theory, two elephants are worth twice as much as one human: 2 HEV. Under another theory, two elephants are worth half as much as one human: ½ HEV. Symmetrically, the two theories have humans coming out worth half and twice as much as the two elephants respectively. Suppose that we pin the value of humans to be identical across the theories: 1 HEV refers to the same amount no matter which theory is true. Then, although humans are worth half as much as the elephants on one theory and twice as much on the other, it is the value of the two elephants that shifts between the theories, so calculating the expected value in elephantine units is inappropriate.
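
A small worked version of this, with the same numbers as above: pinning 1 HEV to a constant amount makes the expected value calculation in human units meaningful, while the elephant-unit calculation equivocates because ‘1 EEV’ corresponds to different amounts of HEV under the two theories.

```python
# Pinning 1 HEV to a constant amount across theories makes the calculation in
# human units meaningful, while the elephant-unit calculation equivocates.
theories = [
    {"p": 0.5, "elephants_in_hev": 2.0},   # the two elephants are worth 2 HEV
    {"p": 0.5, "elephants_in_hev": 0.5},   # the two elephants are worth 0.5 HEV
]

ev_elephants = sum(t["p"] * t["elephants_in_hev"] for t in theories)
print(ev_elephants)  # 1.25 HEV, against 1 HEV for helping the human

# '1 EEV' does not name a constant amount: it corresponds to a different number
# of HEV under each theory, so averaging in elephant units is not licensed.
for t in theories:
    print("1 EEV =", t["elephants_in_hev"], "HEV under this theory")
```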

There are two caveats to this.

First, we aren’t assuming that humans have the same value according to every normative theory. We are only assuming that humans have a certain level of welfare, as determined by their valenced experiences, no matter what the proxies say. It is only because we assume that valenced experiences determine moral significance that we can infer that humans have the same level of moral significance.

Second, we are assuming that we have direct access to the range of our own capacity for suffering (at least to an extent that is independent of the question of which proxies are correct). The direct access we have to our phenomenal states is somewhat mysterious and open to doubt: we might struggle to explain how we reliably know about the extent of our own welfare states. Nevertheless, we think there is sufficient consensus that we have such access for it to be an acceptable assumption for this project.

Assuming a Normative Theory

According to Tomasik’s assessment, the Moral Two Envelopes Problem is a problem specifically for aggregations of value across the utility functions that result from different normative theories. So, for instance, if we are equally unsure between a deontological theory that assigns a value of −5 to murder and a utilitarian theory that assigns a value of −1 to murder, we can’t average them to get an expected value of −3. The problem is supposed to be that the numbers in these functions are incomparable. In contrast, disagreement about factual matters within a theory is supposed to be unproblematic.

The theories to which we assign varying degrees of probability within the Moral Weights Project are (for the most part[8]) not normative theories. Instead, our assessments assume hedonism: valenced experiences are what matters, and so calibrating moral weights involves assessing species for their capacity for pleasure and pain. The theories over which we are uncertain concern the relationships between physiological and behavioral proxies and valenced experiences. One theory suggests that brain size is a proxy for suffering. Another theory suggests that aversive behavior is a proxy for suffering. Our uncertainty is not about what matters directly, but about how what we know about the world relates to what matters.

The question of how proxies relate to valenced experiences is a factual question, not a question of what to value[9]. There are possible questions of what to value that might be confused for the question of the relevance of proxies. For instance, it might be that we assign higher moral weights to humans because we think more complex cognitive states matter more or because we think that the amount of neural mass involved in a feeling matters beyond how it feels. However, these are not the kinds of theories over which we try to aggregate. For the proxies in which cognitive complexity is taken as indicative of a wider welfare range, the assumption is that cognitive complexity correlates (perhaps as the result of evolutionary pressures) with the determinants of that welfare range, not that cognitive complexity constitutes the thing of value.

The fact that we treat proxies as proxies for something further that cannot be directly studied may call some of the methodological choices into question. In particular, some may be skeptical that our proxies provide the same evidence across the phylogenetic tree: similarities in behavior are stronger evidence of similarities in underlying mental faculties in creatures who share our neuroanatomy and evolutionary heritage than in creatures who do not. For instance, play behavior may be taken to provide more evidence for welfare capacity in chickens than in fruit flies. The nuances here are more difficult to formalize and study in a rigorous and consistent manner. In the interest of making progress, the Moral Weights Project adopted an approach that smooths over such complexities. That said, readers should be cautious about naively accepting the results of the project for very distant species, and I would not endorse straightforwardly extending the methodology to non-biological systems.

Assuming a single normative theory is also potentially problematic. It is especially problematic if that theory, like hedonism, is widely regarded as false. However, the Moral Weights Project team thought hedonism’s tractability and its proximity to the truth were enough to justify its use. While hedonism is not widely accepted, the numerical values produced by the project are still informative and speak to an important component of moral value.

It might be objected that there aren’t any plausible correlates that underlie welfare capacities that are independent of the sorts of proxies we chose.[10] On that view, our uncertainty about how welfare capacities relate to proxies would not be addressed by any facts we don’t know about the true underlying nature of welfare. The question of how to think about chicken suffering or shrimp suffering is then not what it really feels like to be a chicken or to be a shrimp, but instead how we want to categorize their states. This amounts to a rejection of realism about the notion of welfare ranges. Doubts about the validity of the concept of welfare ranges would reduce the value of the project, but that shouldn’t come as a surprise. It would suggest issues with the foundations and aims of the project rather than its methodology.

Calibrating Through Shared Assessments

Finally, I believe that it is possible to aggregate across the metrics for different normative theories so long as those metrics are properly calibrated. Proper calibration is possible for normative theories that are not too different from one another.

Tomasik considers and rejects one calibration scheme according to which all theories are put on a single scale according to the best possible outcomes in those theories. I agree with Tomasik that this approach would be problematic. Many normative theories place no upper limit on the amount of value, and there is no reason we can see to think normative theories must all assume the same overall stakes.

I think that a more promising strategy involves using some shared assessments across normative theories to create a common currency.

First, consider a particularly easy case: suppose we have hedonism and hedonism+. Hedonism+ shares hedonism’s verdicts on all things except that it also attributes some value to personal autonomy, and is therefore willing to pay costs in experience for greater levels of autonomy. Let hedonism and hedonism+ share not just their assessments of the value of pleasures and pains, but the reasons for those assessments: the explanations they provide for their value and our epistemic access to that value are identical. Given this, it is reasonable to equate the value each assigns to pleasure and pain in the two theories. The value of autonomy in hedonism+ can then be inferred from the tradeoffs that hedonism+ warrants with pleasure and pain.

We can generalize from this special case. Common reasoning about sources of value may let us calibrate across normative theories. Insofar as two theories place an amount of value on the same prospect for the exact same reasons, we can assume it is the same amount of value.[11]

This strategy might be applied to other flavors of hedonism that value different kinds of experiences. Consider a flavor of hedonism that attributes greater wellbeing to more complex varieties of pleasure and pain. Human capacities for moral value on this theory might be greater than on theories that treat complex pleasure and pain the same as simple pleasures and pain. But if each flavor of hedonism agrees about the value of simple pleasure and pain, and complex hedonism sees additional reasons to value complex pleasure or pain, then we can calibrate between the theories using the shared assessments of simple pleasures and pains.[12]
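
As a toy illustration (with made-up numbers and functions, not anything from the project), two flavors of hedonism that agree on the value of simple pleasures can be placed on one scale through that shared assessment, even though they disagree about complex pleasures.

```python
# Toy illustration (made-up numbers, not anything from the project): two flavors
# of hedonism that agree on the value of a unit of simple pleasure, where the
# 'complex' flavor additionally weights complex pleasures more heavily.
def simple_hedonism(simple_units: float, complex_units: float) -> float:
    return simple_units + complex_units           # all pleasures valued alike

def complex_hedonism(simple_units: float, complex_units: float, weight: float = 3.0) -> float:
    return simple_units + weight * complex_units  # extra value for complex pleasures

# Because both theories assign the same value to a unit of simple pleasure, their
# outputs land on a common scale and can be combined in an expected value.
prospect = (2.0, 1.0)  # 2 units of simple pleasure, 1 unit of complex pleasure
print(0.5 * simple_hedonism(*prospect) + 0.5 * complex_hedonism(*prospect))  # 4.0
```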

Conclusion

Although the Moral Two Envelopes Problem presents real difficulties, the Moral Weights Project methodology largely avoids them by assuming a single normative theory and pinning its value units to humans.

  1. ^
  2. ^

    I don’t claim that this solves the Two Envelopes Problem, or even that the problem is solvable in full generality. I think it is solvable in every finite case, where our uncertainties don’t permit just any amount of money in the two envelopes, for the reasons expressed here. The Moral Two Envelopes Problem doesn’t seem to rely on problematic infinities, so it is appropriate to focus on just the finite case.

  3. ^

    Whether this expected value is action-relevant depends in part on our preferences. Suppose that Spencer and Trish are both allowed to choose an envelope (the same or different) and get the amount of money inside. Trish chooses one envelope. Spencer doesn’t care about exactly how much money he gets, he only cares about how much money he gets relative to Trish. He’d rather get $2 if Trish gets $1 than $500 if Trish gets $1000. What he really cares about is expected value in Trish-based units (He’d sell his kidney for $2 if Trish gets $1, but wouldn’t sell it for $500 if Trish gets $1000). Spencer should pick the envelope Trish does not pick, not because it has higher expected value in absolute monetary terms, but because it has a higher expected value in the Trish-based units.

  4. ^

    Thanks to Michael St. Jules for stressing this point. If we took the proxies as corresponding to theories about what makes mental states valuable, this could be a significant issue. Instead, we see the proxies not as identifying physical bases of normatively-significant features but just as possible sources of evidence regarding normatively-significant features that are assumed to be the same no matter which proxies actually provide the best evidence.

  5. ^

    The introspective access provides two advantages. First, we can examine our internal mental life in a way that would be very difficult to replicate from a third-person perspective with other animals. We get some insight into what we’re paying attention to, what properties are represented, what aspects of our experiences we find motivating, etc. In principle, we might be able to do this from a third-person perspective with any other animal, but it would take a lot of neuroscientific work. Second, introspection of valenced experiences provides a target for fixing a unit that we can in part recall through memory and imagination. We could define a unit of suffering by indexing to any noxious event in an individual of any species, but the significance of that unit would remain mysterious to us insofar as we couldn’t imagine what it is like to experience it and couldn’t use our internal cognitive machinery to assess tradeoffs with other things we care about.

  6. ^

    This doesn’t mean that the true theory of welfare has no bearing on how good or bad our lives are. It may be that our lives would have been worse than they are if there were (counterfactually) a linear relationship between brain size and suffering. This is a strange counterfactual to entertain because I take it that it is not possible for the proxies to be different than they are, at least conditional on certain plausible assumptions: the true nature of consciousness and the forces of evolution in our ancestors’ environment require the proxies to be roughly whatever they are. However, we don’t need to worry about this. For our purposes, it makes sense to just aggregate across the value of different theories considered as actual.

  7. ^

    Michael St. Jules makes a similar argument.

  8. ^

    We did include one set of proxies – our ‘higher/lower pleasures’ model – for which we gave an explicitly normative rationale. Removing this model wouldn’t significantly change the results. Furthermore, the proxies relate to markers for intelligence that would also fit naturally with non-normative rationales.

  9. ^

    Carl Shulman and Brian Tomasik both suggested a view according to which the relevant facts underdetermine the factual question. On my understanding, they think that suffering is a normatively loaded concept, and so the question about which states count as suffering is itself normative. Given that the physical facts don’t force an answer, the precise delineation of suffering vs non-suffering is normative. This view seems like it makes more sense for the kind of uncertainty we have over insects than the kind of uncertainty we have over chickens; we can be reasonably confident that insects don’t share the robust natural kind of cognitive state that underlies our consciousness. In any case, another plausible response to factual underdetermination is to reflect that indeterminacy in welfare ranges. Such complexities were beyond the scope of the project as planned, which aimed to apply a concrete (albeit rough) methodology to generate precise moral weights.

  10. ^

    Thanks to Carl Shulman for making this point.

  11. ^

    Michael St. Jules discusses similar ideas. See also Constructivism about Intertheoretic Comparisons.

  12. ^

    This may suggest that there is much more value at stake according to expansive normative views. If we adopt a meta-normative principle in which we should favor expected choice-worthiness, this would give us reason to be maximalists. It isn’t obvious to me that that conclusion is wrong, but if we find it disagreeable we can also reject the maximization of expected choice-worthiness.