Impact is very complicated

Epistemic status: gestural food for thought.

This is a post to aggregate a bunch of boring or obvious observations that are easy to overlook. I don’t expect anything to be new. But I do think difficulties gauging impact, in aggregate, form a sort of missing mood that we should be paying more attention to. At the end of the post, I’ll touch on why.

Let us count the ways

Here are some factors that can make assessing impact complex:

  • Some interventions are backed by scientific studies.

    • Scientific studies vary in quality in many ways. They can be larger or smaller, more or less numerous, clearly biased or apparently unbiased, clearly significant or only marginally significant, randomized or non-randomized, observational or experimental, etc.

    • Even for good studies, the same interventions may become better or worse over time as conditions in the world change, or may be very particular to certain places. Cash transfers, for example, might just work way better in certain parts of the world than others, and it may be hard to predict how or why in advance.

  • Many interventions require many simultaneous layers of involvement.

    • Suppose I give to the Against Malaria Foundation. I can say “I estimate I am saving a life per $5,000 I spend.” But I read Peter Singer when I was 13, then Scott Alexander when I was 19, and I likely wouldn’t have ended up donating much without these. I also couldn’t give to AMF if it didn’t exist, so I owe a debt to Rob Mather. And perhaps whoever told Scott Alexander about AMF. All these steps are necessary to actually “save a life”, so we run the risk of massively overcounting if we give every person in the chain “full credit”.

    • But there’s no objectively rigorous way to decide who gets how much of the credit! Pure counterfactual reasoning doesn’t settle it either: if every link is necessary, then removing any single person breaks the chain, which makes each of us counterfactually responsible for the entire outcome. But we can’t all get all the credit! (A small sketch of this overcounting problem follows this list item.)

    • Plus many interventions, like in the AMF example, mostly just reduce probabilities across large numbers of people anyway. What does it even mean for “my money” to “save a life”? Once the money all goes into a pool, whose money actually funds which nets? And which nets prevent the cases of malaria that would have been fatal? There’s no way to answer these questions even in principle.
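
To make the overcounting worry concrete, here’s a minimal sketch in Python with made-up names and numbers (it’s an illustration, not a real attribution scheme): if every link in the chain is counterfactually necessary, naive counterfactual credit hands each person the full impact, and the “total” credit sums to several times what actually happened.

```python
# Toy model: a donation chain where every link is counterfactually necessary.
# All names and numbers are purely illustrative.

LIVES_SAVED = 1.0  # the actual outcome of the donation

chain = ["Peter Singer essay", "Scott Alexander post", "Rob Mather / AMF", "the donor"]

# Naive counterfactual credit: remove any one link and the donation never happens,
# so each link is "responsible" for the whole outcome.
naive_credit = {link: LIVES_SAVED for link in chain}
print(sum(naive_credit.values()))  # 4.0 "lives saved" -- four times the real impact

# Splitting the credit evenly (or via something like Shapley values) at least sums
# to the right total, but the choice of split is itself a judgment call.
even_split = {link: LIVES_SAVED / len(chain) for link in chain}
print(sum(even_split.values()))  # 1.0
```
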

  • Some (perhaps all) interventions depend on philosophical questions that are difficult or impossible to resolve.

    • How should we weigh insect suffering? All we can do is guess—learning more facts about insects doesn’t really get us over Nagel’s “What is it like to be a bat?” hurdle. Empirical information, analogies, and intuition pumps all can help, but there are fundamental judgment calls at play.

    • How to assess well-being, and how to weigh well-being against survival, is another example that’s hard to boil down to numbers: there are lots of ways to do it (QALYs, seeing how much people would pay to avert various harms, natural experiments), but none is perfect and all involve their own judgment calls.

  • Some interventions require certain thresholds to be met, or they don’t actually accomplish anything.

    • Donating to a political campaign that credibly promises to do something good might help bring about that good thing. But if the campaign fails, that donation accomplished basically nothing.

    • Existential risk mitigation efforts (as such) only do any good if they work. If the world ends anyway, that effort didn’t actually accomplish anything.

    • Plenty of interventions can also backfire.

  • Some interventions aim to increase or decrease probabilities. There are a lot of ways to mess this up.

    • Historically, my least favorite arguments in intro EA messaging were things like: “even if the chance of [EVENT] is quite low, say 1%, then [ACTION] is extremely valuable.” Anchor points like 1% may be dramatically too high (see the quick sensitivity check after this list item).

    • In some sense almost every intervention “aims to increase or decrease probabilities”; e.g., AMF is trying to decrease the probability of preventable malaria deaths. But x-risk or lobbying efforts are trying to move one big probability. That probability is generally unknown, and it’s very difficult to figure out how it moved even after the fact (much less in advance).

    • Thinking in terms of probability is important, but just making up numbers introduces lots of room for bias. Remember the replication crisis? Even strong institutional processes that are (at least in theory) focused on truth-seeking can produce lots of bunk for a long time before anyone notices.
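
To see why the anchor point matters so much, here’s a quick sensitivity check (a sketch with entirely made-up numbers, not anyone’s real estimate): expected value scales linearly with the assumed probability, so quoting 1% rather than 0.01% quietly inflates the conclusion by a factor of a hundred.

```python
# Expected-value sensitivity to the assumed probability (all numbers hypothetical).
VALUE_IF_EVENT_AVERTED = 1_000_000  # arbitrary units of "good done" if [ACTION] succeeds

for p in (0.01, 0.001, 0.0001):  # candidate anchor points for the chance of [EVENT]
    expected_value = p * VALUE_IF_EVENT_AVERTED
    print(f"assumed probability {p:.2%}: expected value = {expected_value:,.0f}")

# assumed probability 1.00%: expected value = 10,000
# assumed probability 0.10%: expected value = 1,000
# assumed probability 0.01%: expected value = 100
# The "extremely valuable" conclusion rides entirely on a number that was made up.
```
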

  • People often make decisions about what to prioritize based on cultural and peer group effects.

    • This worsens a lot when money and social belonging are at stake. It’s hard to think clearly if you’re trying to land a cool job or break into a friend group, especially for people between the ages of, say, 16 and 25.

    • My psychological state when thinking about this sort of thing feels very different now that I’m a little older and no longer rely on EA as a central peer community or source of job opportunities. If I didn’t have a stable career outside of EA, I’d expect my cognition to be really, really warped by the differential probabilities of “finding an in” depending on which beliefs I held.

  • Outside vs. inside view debates are really hard to settle.

    • If most observers outside a community think something seems obviously false, and most people inside the community think it seems obviously true, it’s difficult for either side to make progress.

    • Inside view logic often has lots of jargon, canon, and sneakily shared assumptions. Outsiders can’t point these out in a satisfying way because to them it just seems like random nonsense.

    • But outside view logic treats everything in the inside view as random nonsense, even though some of it often contains good insights.

  • Different reference classes also complicate outside view concerns.

    • “Most experts think X” claims are especially slippery. Who’s an expert? Who’s highly engaged? My “alarmingly self-reinforcing memeplex” might be your “emerging field of impressive scholarship”.

    • You can see endless telescoping debates of this sort even in very tight niches within quite modest epistemic communities. Every academic field has its civil wars, with no clear resolution in sight.

    • If it’s hard to settle questions even in fields with tons of high-quality studies, imagine how hard it is to get a definitive answer to really big questions like “what are the odds humanity goes extinct this century?” Worth asking and deeply investigating, but it’s hard not to have an industrial salt shaker ready for whatever you find.

  • Lots more!

    • I’ll end the list here because I’m running out of steam, but it isn’t exhaustive.

Putting it together

I’ll attempt an incomplete summary to gesture more directly at what I’m talking about.

Suppose you have a seemingly safe intervention—you’re providing insecticide-treated bed nets to families in tropical areas. There have been lots of studies on efficacy, and lots of high-quality analysis of those studies. You’ve read some of this and come away with the belief that you can help save a life for a few thousand dollars.

But how sure are you? The research or analysis you read could be off. Your mom read somewhere that the nets are sometimes used for fishing, which you don’t think is a big issue. But you’re not totally sure. The country where your nets will be deployed is different from the countries where the studies were done, and it’s been many years, so cultural shifts may have changed the likelihood that the nets will be used correctly.

And what happens once you save a life? What’s the new life expectancy of that person? How high is the variance?
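
One way to hold all of those doubts in your head at once is to push them through a tiny Monte Carlo simulation instead of a single point estimate. The sketch below is only that, a sketch: the distributions, the $5,000 figure, and the usage and transfer factors are assumptions invented for illustration, not AMF’s or GiveWell’s actual numbers.

```python
import random

# Purely illustrative Monte Carlo: propagate several sources of doubt through a
# "lives saved per donation" estimate. All distributions and parameters are made up.

N = 100_000
DONATION = 5_000  # dollars

samples = []
for _ in range(N):
    base_cost = random.lognormvariate(8.5, 0.4)   # cost per life saved in the study setting (~$5,000)
    usage = random.uniform(0.6, 1.0)              # fraction of nets used correctly here and now
    transfer = random.uniform(0.5, 1.1)           # how well study results transfer to this country
    effective_cost = base_cost / (usage * transfer)
    samples.append(DONATION / effective_cost)     # lives saved by this particular donation

samples.sort()
mean = sum(samples) / N
print(f"mean lives saved: {mean:.2f}")
print(f"10th-90th percentile: {samples[int(0.1 * N)]:.2f} to {samples[int(0.9 * N)]:.2f}")
```

Even under these fairly generous made-up assumptions, the spread between the 10th and 90th percentile is wide, which is the point: a single cost-per-life number hides a lot of variance.
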

It becomes clear that there’s a lot of value in nailing down your intervention the best you can: having lots of independent reasons to think something will work. In this case, we’ve got:

  1. It’s common sense that not being bitten by mosquitoes is nice, all else equal.

  2. The global public health community has clearly accomplished lots of good for many decades, so their recommendation is worth a lot.

  3. Lots of smart people recommend this intervention.

  4. There are strong counterarguments to all the relevant objections, and these objections are mostly shaped like “what about this edge case” rather than taking issue with the central premise.

Even if one of these fails, there are still the others. You’re very likely to be doing some good, both probabilistically and in a more fuzzy, hard-to-pin-down sense.

So what?

To some degree, I think it’s worth remembering, for its own sake, that epistemology is hard. The world is really complicated, and it’s easy to cache too many foundational beliefs and treat long chains of inference as obvious axioms.

But also, I think this is a good argument for cause diversity and against confident pronouncements on differences in expected impact.

Wait a minute, you might say, isn’t comparing interventions a huge part of effective altruism? And yes! It’s really important to notice that some actions are predictably way more impactful than others. But I think this bottoms out in two ways:

  1. Commonly shared intuition

  2. Apples to apples comparisons

Commonly shared intuition

Some claims like “saving a child’s life is more valuable than making a child have a deeper appreciation of the arts” pass a clear sniff test for almost all people who hear them. This kind of comparison is important for effective altruism.

Other claims are harder and less obvious, like “saving the lives of ten children across the world is more valuable than funding scholarships for ten children nearby”. Most readers here would automatically agree. Not everyone would. But it’s valuable to present people with lots of intuition pumps like this, to make sure we can think clearly and make good choices.

That being said, intuition is the operative concept. There’s probably no objective account, or at least not one we have clear access to other than via introspection.

Apples to apples comparisons

You don’t need to bottom out in intuition if you’re comparing two interventions with the same target. If you want to improve global health, it’s good to get the best bang for your buck in that arena.

If you want to save lives, you can try to save the most lives. Or you can try to save lives with the highest probability/confidence. Or some combination.

But it’s good to notice that as the difference between interventions grows, comparisons get messy.

Conclusion

It’s very difficult to make pronouncements about relative impact that are both confident and accurate. Even canonical cases are really complicated under the surface.

We should take care to let newcomers find their own paths, and give them the tools to think through the principles of EA using their own assumptions about the world. More diversity is better here. And given how hard it is to get things right, I don’t think there’s any shame—or even necessarily an expected value hit—in choosing to play it safe.

And speaking just from my own perspective, don’t underestimate the epistemic value of having a full social life and career that do not depend on EA. You only really notice the pull of needing peer/professional approval once it’s gone, and you might marvel at just how strong it was.