Apples and oranges? Some initial thoughts on comparing diverse benefits

An important part of effective altruism is comparing the value of different altruistic endeavors. Many altruistic endeavors bring about different kinds of good things, for instance protection of children from incapacitating diseases, and extra years of quality education. To find the best causes, we need some way of evaluating these next to one another. How much extra education in the developing world is worth the same as an extra year of healthy life?

Answering such questions is notoriously tricky, and GWWC faces an even harder problem of answering them in such a way that other people are happy to use our judgements, and thus our recommendations. One can’t just opt out of answering them, for the same reason one shouldn’t choose one’s partner by their social security number just because it’s hard to weigh up a good sense of humor against kind eyes.

So how can we make such evaluations? GWWC research has been looking into it a bit, and this blog post will tell you some of what we think. We’ll look at various methods from economics and the social sciences, and discuss the advantages and disadvantages of different approaches.

Any attempt to compare the value of two things has two basic parts. The first is to parcel out the main ways each item might bring benefits. For instance, if a person in Malawi doesn’t contract malaria next year, this will translate to various good things: less suffering and more joy for that person, less stress for their family, less congestion at the local hospital, a few thousand dollars of productive work, and slightly different expectations among people nearby about how well their lives are likely to go.

This can go on for more steps—a few thousand dollars of productive work might be factored out in terms of increased prosperity for the family, and more prosperity spread among the wider world. Prosperity for the family might in turn be factored out as better nutrition for them, more education for their children, some investments that will bring them further prosperity, etc.

Another thing that often happens at this stage is that items are made more abstract. For instance one particular girl’s recovery from schistosomiasis might be equated to some number of ‘quality adjusted life years’ or ‘QALYs’. This makes evaluation more tractable. Now we can evaluate many similar things at once, a little bit inaccurately, instead of doing thousands of different evaluations. For instance if we convert many different health interventions into QALYs saved, then we can compare a QALY to a year of school, and we automatically have a comparison of lots of different health interventions.
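To make the abstraction step concrete, here is a minimal Python sketch. Every number in it is invented purely for illustration, and the names (`QALYS_PER_UNIT`, `SCHOOL_YEARS_PER_QALY`, `school_year_equivalents`) are hypothetical rather than figures or tools GWWC uses. It only shows how a single QALY-to-school-year comparison, once made, carries over to every health intervention that has already been expressed in QALYs.

```python
# Hypothetical QALYs gained per person treated. Illustrative numbers only.
QALYS_PER_UNIT = {
    "schistosomiasis treatment": 0.02,
    "bed net for one year": 0.01,
    "cataract surgery": 0.5,
}

# One assumed exchange rate: how many years of school are judged to be
# worth the same as one QALY. Also invented for illustration.
SCHOOL_YEARS_PER_QALY = 4.0


def school_year_equivalents(qalys_per_unit, school_years_per_qaly):
    """Express each health intervention in school-year equivalents."""
    return {
        name: qalys * school_years_per_qaly
        for name, qalys in qalys_per_unit.items()
    }


equivalents = school_year_equivalents(QALYS_PER_UNIT, SCHOOL_YEARS_PER_QALY)
for name, value in equivalents.items():
    print(f"{name}: about {value:.2f} school-years")
```

Changing the one exchange rate updates every comparison at once, which is the tractability gain described above. The cost is that any inaccuracy in the QALY estimates is inherited by all of the comparisons.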

The second step is to actually compare the value of the items you end up with. This might involve, for instance, seeing how much money a person is willing to pay to avoid some suffering. You could do this by observing how much money and effort they invest in avoiding the flu during flu season. Or you might ask them directly, ‘if you could pay $1000 to avoid this surgery, would you take that offer?’ Or instead of looking at the money they will sacrifice, you might just ask how they feel. For instance, you might repeatedly ask a blind person over some period how satisfied they are with their life, and do the same with a person who is not blind.

Comparing value can be done immediately without the previous step of converting things, or after so many iterations of the previous step as to turn health and education into unrecognizable buckets of value. Many ways of comparing the value of A against B have been developed, especially in the social sciences. I have collected some of them into a menu, which describes their upsides and downsides, and suggests when we might expect them to be most appropriate.

Note that the methods in the menu are generally comparing benefits to identifiable groups of people. We may care about people for whom it’s hard to make these measurements, such as future generations, but we’ll need different or further methodology to know how to compare these effects to the direct benefits we can measure. Nonetheless it’s valuable to understand how we should compare diverse benefits today, and this may be a key component in a more general analysis.

Also, we won’t usually be able to choose freely from the list of methods. GWWC will probably not have the resources to go and find out how much an additional parent improves the life outcomes for a person in rural China. The hope is rather that this list might help guide choices of comparison when a few different methods happen to be available. I’ll give you a few examples of evaluation methods on the list, then in the rest of this post I’ll describe some of the key choices it offers and mention some of the factors that might suggest one choice over another.

Appetizers

To give you a concrete idea of what we are talking about, here are some examples of how we might use different methods from the list to figure out how good something is for a person:

Willingness to pay: if a person is willing to pay $100 for a textbook, you can infer that they prefer having the textbook to having $100, and therefore to other things they could buy for $100.

Instantaneous reports of subjective wellbeing: a person’s phone pings them throughout the day and asks something like, ‘zero to ten, how good do you feel right now?’ Using this method, we can build a hedonic profile for different individuals and (hopefully) discover the correlates and causes of happiness, as well as their relative importance.

The standard gamble: A person is asked what probability of dying they would be willing to accept for an intervention that would improve their health in some way.

Averting behavior method: Use the total costs people pay to avoid extra sunburn as a lower bound for how much a healthy ozone layer is worth to them.
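The last two examples involve a little arithmetic, which can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of how any particular study works: the function names and all the figures are hypothetical, and the standard gamble calculation assumes the intervention on offer restores full health, with utility measured on a scale from 0 (dead) to 1 (full health).

```python
def standard_gamble_utility(max_acceptable_death_risk):
    """Implied utility of the current health state, from 0 (dead) to 1 (full health).

    Assumption: the person is indifferent between staying as they are and a
    treatment that restores full health with probability 1 - p and kills
    them with probability p. Then utility(current) = (1 - p) * 1 + p * 0.
    """
    p = max_acceptable_death_risk
    if not 0.0 <= p <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    return 1.0 - p


def averting_lower_bound(averting_costs):
    """Total spent on avoiding a harm; the person's valuation is at least this."""
    return sum(averting_costs.values())


# Hypothetical respondent who would accept at most a 10% risk of death.
utility = standard_gamble_utility(0.10)

# Hypothetical yearly costs (in dollars) of avoiding extra sunburn.
sunburn_costs = {"sunscreen": 40.0, "hat": 15.0, "time spent indoors": 120.0}
lower_bound = averting_lower_bound(sunburn_costs)

print(f"Implied utility of current health state: {utility:.2f}")
print(f"A healthy ozone layer is worth at least ${lower_bound:.2f} per year")
```

The more risk a person will accept for the cure, the worse they judge their current state to be. The averting behavior figure is only a lower bound because people may value the avoided harm at more than they actually spend on avoiding it.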

What to elicit: preferences, happiness, or another proxy of good?

Whether you ultimately care to optimize for good feelings, people getting what they want, or something else is a controversial moral question. I won’t address this question here except to note that different answers can lead us to prefer different measurement techniques. For instance, it is relatively easy to observe a person’s preference by looking at which things they choose. Their happiness in different situations is harder to observe, so you might do better just asking them directly if that’s what you care about. On the other hand, there are many reasons to suspect people’s answers to ‘how happy are you’ across time and between people don’t exactly correspond to the true landscape of happiness, so even if you cared about happiness you might take a person’s preferences seriously, if you thought they liked to be happy.

When the situation you want to evaluate will have ramifications for other people than the one(s) directly affected, the preferences and feelings of the people involved will be a poor proxy for the overall good or bad done. For instance, suppose you are considering educating a young woman in a developing country. This should help her, but it is thought to also have large effects on her children and others in her community. If you are hoping to eventually lift her region out of poverty through this type of action, you are relying mostly on good from such indirect effects. In this case, discovering how much the woman would like to be educated is not very useful. Even if she wouldn’t like to be educated at all it may be a good intervention. Similarly, if a person does not understand or care very much about their own future, their views on the value of improving it are not very useful. Asking a child how much they would like to be dewormed is not very useful.

One might deal with this in the earlier step, by converting ‘one child dewormed’ into a small amount of health, education, and physical development. A child might know better how much they dislike being sick, or adults may be able to shed light on the long-term costs of malnourishment. Or you might just abstract a child’s illness to generic illness, and ask how much better-informed people wish to avoid such illness.

How to elicit: ask or observe?

In many situations what people think or say they value differs from what they choose given the opportunity. There are many possible reasons for this, from hypocrisy to difficulty envisaging the situation of interest accurately and without a biased focus. To the extent one believes a person’s behavior better indicates their ‘real’ preferences, observing choices will be more accurate than asking. However a person’s speech may better reflect their preferences on reflection than their behavior, which usually comprises a combination of considered preferences and unendorsed urges. If you wish to respect only the former, this is some reason to ask instead of observing.

Asking also has the benefit of making it relatively easy to isolate the issue of interest; natural situations rarely allow one to pin a preference to a specific factor, as so many factors change. For instance, if you see that parents choose to send their children to school at substantial cost, you might like to infer that they expect the education to benefit their children substantially. But it could also be that they expect their children to look poor if they don’t go to school, and be treated worse. Then if the school didn’t exist their children would not be worse off—they are just worse off if the school does exist, and they don’t go to it.

Who to elicit it from?

The best people to evaluate A and B would seem to be people who have experienced both, and are currently either experiencing both or neither. This way they will hopefully be familiar with both items, and not too biased by the details of one being more salient at the moment.

Often it will be hard to find people in exactly that situation though. You will often have a trade off between people who are familiar with one item and not the other (so know more but might be biased) and people who are not familiar with either (so know little, but probably less biased).

This is all only an issue if you are interested in evaluating A and B for the person directly involved. If you want to know about the overall social benefits of fewer people being blind for instance, compared to fewer people being crippled, you may be better off asking someone who is neither blind nor crippled.

Comparison or separate evaluation?

Suppose you want to know whether it is better to get a five year old to go to school for an extra year, or to avoid them getting malaria for one year, and you intend to figure it out by asking their parents. One way you could do it is to ask each parent ‘do you think your child will be better off if she goes to school this year and gets malaria, or if she doesn’t go to school or get malaria this year?’. Another is to ask some parents how valuable they think going to school is, and other parents how valuable they think avoiding malaria is, and then compare those values.

If you ask a person to compare two things side by side, you will often get different answers than if you ask them to evaluate each one on its own. One reason is that when they can see things side by side, they compare on the characteristics that seem salient. When they can only see one thing, they tend to ignore a characteristic, even one that seems important, if they can’t get any grasp on how good the given number for it is. In our earlier example, suppose that for both the school and the health interventions, the parents are told how much extra income their child is likely to have in the future as a result of the project, and, on a zero-to-ten scale, how much their child will like it at the time. Suppose education will bring about much more future income than the health intervention on offer, but makes the child ²⁄₁₀ happy instead of ⁸⁄₁₀ happy. Then when the parents look at both interventions together, they might weigh up the present costs and future gains and choose the education intervention. When they look at the projects separately though, they might have fairly similar mental images of two different largish gains in future income, whereas happiness looks quite different on the given scale. So implicitly they focus more on the difference in happiness, and the comparison could come out the other way.
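The reversal described above can be reproduced with a toy model in Python. This is my own illustrative construction rather than an established model from the literature, and every score, weight, and name in it is an assumption. Attributes are put on a 0 to 1 scale, and the one modeling choice that matters is that an income figure seen in isolation is perceived as close to a generic ‘largish gain’.

```python
# Toy model only: all scores and weights are invented for illustration.
OPTIONS = {
    "education": {"income": 0.9, "happiness": 0.2},
    "health": {"income": 0.5, "happiness": 0.8},
}
IMPORTANCE = {"income": 0.7, "happiness": 0.3}

# Assumption: judged alone, an income figure is hard to assess, so it is
# pulled toward one generic impression of a "largish gain".
GENERIC_INCOME_IMPRESSION = 0.7
INCOME_EVALUABILITY = 0.2  # 1.0 = judged in full, 0.0 = all gains look alike


def joint_score(option):
    """Side-by-side evaluation: both attributes are weighed at face value."""
    return sum(IMPORTANCE[key] * option[key] for key in IMPORTANCE)


def separate_score(option):
    """One-at-a-time evaluation: income differences are mostly washed out."""
    perceived_income = GENERIC_INCOME_IMPRESSION + INCOME_EVALUABILITY * (
        option["income"] - GENERIC_INCOME_IMPRESSION
    )
    return (
        IMPORTANCE["income"] * perceived_income
        + IMPORTANCE["happiness"] * option["happiness"]
    )


joint_winner = max(OPTIONS, key=lambda name: joint_score(OPTIONS[name]))
separate_winner = max(OPTIONS, key=lambda name: separate_score(OPTIONS[name]))

print("Compared side by side, parents favor:", joint_winner)
print("Evaluated one at a time, parents favor:", separate_winner)
```

With these made-up numbers, education wins the side-by-side comparison and the health intervention wins when each is judged alone, even though the parents’ underlying weights never change.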

***

These are some of the considerations that make some evaluation methods more appropriate than others at different times. The social science literature has a lot more to say. Hopefully our menu nonetheless summarizes enough important points to strengthen our research comparing interventions, and can be built upon in the future as we carry out such comparisons.

Crossposted from the Giving What We Can blog