Evidence, cluelessness, and the long term—Hilary Greaves

Hilary Greaves is a professor of philosophy at the University of Oxford and the Director of the Global Priorities Institute. This talk was delivered at the Effective Altruism Student Summit in October 2020.

This transcript has been lightly edited for clarity.

Introduction

My talk has three parts. In part one, I’ll talk about three of the basic canons of effective altruism, as I think most people understand them: effectiveness, cost-effectiveness, and the value of evidence.

In part two, I’ll talk about the limits of evidence. It’s really important to pay attention to evidence, if you want to know what works. But a problem we face is that evidence can only go so far. In particular, I argue in the second part of my talk that most of the stuff that we ought to care about is necessarily stuff that we basically have no evidence for. This generates the problem that I call ‘cluelessness’.

And in the third part of my talk, I’ll discuss how we might respond to this fact. I don’t know the answer, and this is something I struggle with a lot myself, but I’ll lay out five possible responses and at least tell you what I think about each of them.

Part one: effectiveness, cost-effectiveness, and the importance of evidence

Effectiveness

So firstly, then, effectiveness. It’s a familiar point in discussions of effective altruism and elsewhere that most interventions, even well-intentioned ones, don’t in fact work at all, or in some cases even do more harm than good, on net.

One example (which may be familiar to many of you already) is that of PlayPumps. PlayPumps were supposed to be a novel way of improving access to clean water across rural Africa. The idea is that instead of the village women laboriously pumping the water by hand themselves, you harness the energy and enthusiasm of youth: children play on a roundabout, and the turning of the roundabout is what pumps the water.

This perhaps seemed like a great idea at the time, and millions of dollars were spent rolling out thousands of these pumps across Africa. But we now know that, well-intentioned though it was, this intervention does more harm than good: the PlayPumps are inferior to the original hand pumps that they replaced.

For another example, one might be concerned to increase school attendance in poor rural areas. To do that, one starts thinking about: “Well, what might be the reasons children aren’t going to school in those areas?” And there are lots of things you might think about: maybe because they’re so poor they’re staying home to work for the family instead, in which case perhaps sponsoring a child so they don’t have to do that would help. Maybe they can’t afford the school uniform. Maybe they’re teenage girls and they’re too embarrassed to go to school if they’ve got their period because they don’t have access to adequate sanitary products. There could be lots of things.

But let’s seize on that last one, which seems plausible: maybe their period is what’s keeping many teenage girls away from school. If so, one might very well think that distributing free sanitary products would be a cost-effective way of increasing school attendance. But at least in one study, this too turns out to have zero net effect on the intended outcome, child years spent in school. That’s maybe surprising, but it’s what the evidence seems to be telling us. So many well-intentioned interventions turn out not to work.

Cost-effectiveness

Secondly, though, comes cost-effectiveness: even amongst the interventions that do work, there’s an enormous variation in how well they work.

If you have a fixed-size pot of altruistic resources, as all of us do (nobody has infinite resources), then you face the question of how to do the most good you can per dollar of your resources. And so you need to know about cost-effectiveness: which of the possible interventions that you might fund with your altruistic dollars will do the most good, per dollar?

And even within a given cause area, for example, within the arena of global health, we typically see a cost-effectiveness distribution like the one in this graph.

So this is a graph for global health. Most interventions don’t work very well, if at all; they’re bunched down there on the left-hand side of the graph. But if you choose carefully, you can find things that are many hundreds of times more cost-effective than the median intervention. So if you want to do the most good with your fixed pot of resources, it’s crucial to focus not only on what works at all, but also on what works best.

The importance of evidence

This then leads us naturally onto the third point: the importance of evidence.

The world is a complicated place. It’s very hard to know a priori which interventions are going to cause which outcomes. We don’t know all the factors that are in play, particularly if we’re going in as foreigners to try and intervene in what’s going on in a different country.

And so if you want to know what actually works, you have to pay close attention to the evidence: ideally, perhaps, randomised controlled trials. This is analogous to a revolution that’s taken place in medicine, to the great benefit of the world, over the past 50 years or so. We’ve moved away from a paradigm in which treatments were decided mostly on the basis of the experience and intuition of the individual medical practitioner, and much more towards evidence-based medicine, where treatment decisions are backed up by careful attention to randomised controlled trials.

Much more recently, in the past ten or fifteen years or so, we’ve seen an analogous revolution in the altruistic enterprise spearheaded by such organisations as GiveWell, which pay close attention to randomised controlled trials to establish what works in the field of altruistic endeavour.

This is a great achievement, and nothing in my talk is supposed to move away from that basic observation. Indeed, my own personal journey with effective altruism started when I realised that there were organisations like GiveWell doing this.

(The organisers of this conference asked me to try and find a photo of myself as a student. I’m not sure that digital photography had actually been invented yet when I was a student. So all I have along those lines is some negatives lying up in my loft somewhere. But anyway, here’s a photo of me as a relatively youthful college tutor, perhaps ten or fifteen years ago.)

I was at dinner in my college with one of my students, discussing the usual old-chestnut worries about aid not working: culture of dependency, wastage and so forth. And I mentioned that even though, like the rest of us, I feel my middle-class guilt, I feel that as a rich Westerner I really should be trying to do something with some of my resources to make the world better.

I was so plagued by these worries about ineffectiveness that I basically wasn’t donating more than 10 or 20 pounds a month at that point. And the turning point came when my student turned round to me and said, basically: GiveWell exists; there are people who have paid serious attention to the evidence, thought it all through, and written up their research; you can be pretty confident of what works, actually, if you just read this website. That was where I started feeling, “OK, I now feel sufficiently confident that I’m willing to sacrifice 10 percent of my salary, or whatever it may be”.

And again, that observation is still very important for me. Nothing in this talk is meant to be backing away from that. It’s important to highlight that because it’s going to sound as though I am backing away from that in what follows. What I want to do is share with you some worries that I think we should all face up to.

Part two: the limits of evidence

So here we go. Part two: the limits of evidence.

In what I’ll call a ‘simple cost-effectiveness analysis’, one only measures the immediate intended effect of one’s intervention. So, for example, if one’s talking about water pumps, you might have a cost-effectiveness analysis that tries to calculate how many litres of dirty water consumption are replaced by litres of clean water consumption per dollar spent on the intervention. If we’re talking about distributing insecticide-treated bed nets in malarial regions, then we might be looking at data that tells us how many deaths are averted per dollar spent on bed net distribution. If it’s child years spent in school, well, then the question is by how much we increase child years spent in school, per dollar spent on whatever intervention it might be.

Once you’ve answered that question, then in the usual model, you go about doing two kinds of comparison. You do your intra-cause comparison. That is to say, insofar as our focus is (for example) child years spent in school, which intervention increases that thing the most, per dollar donated?

And of course, since we also want to know whether we should be focusing on child years spent in school or instead on something else like water consumption, we want to do cross-cause comparisons which tell us—on the basis of some admittedly much trickier but reasonable, well thought through theoretical model—how we should trade off additional child years spent in school against improvements in clean water consumption. How many litres increase in clean water consumption is equivalent from the point of view of good done to an increase of, say, one child year spent in school?
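To make that arithmetic concrete, here is a minimal sketch in Python of the two kinds of comparison. Every figure, intervention name, and moral weight below is invented purely for illustration; none of it reflects any real charity’s data or GiveWell’s actual models.

```python
# Illustrative sketch only: all numbers and moral weights here are invented.

# Intra-cause comparison: two hypothetical interventions aimed at the same
# outcome (child years of schooling), ranked by outcome per dollar.
schooling_interventions = {
    "hypothetical_intervention_A": {"cost": 10_000, "child_years": 2_500},
    "hypothetical_intervention_B": {"cost": 10_000, "child_years": 400},
}
for name, d in schooling_interventions.items():
    print(f"{name}: {d['child_years'] / d['cost']:.3f} child-years per dollar")

# Cross-cause comparison: convert different outcomes onto a common
# "units of good" scale via (entirely hypothetical) moral weights.
moral_weight = {
    "child_year_of_school": 1.0,   # define one child-year as the unit of good
    "litre_clean_water": 0.001,    # guess: 1,000 litres ~ one child-year
}
candidates = {
    "schooling_programme": ("child_year_of_school", 0.25),  # outcome per dollar
    "water_programme": ("litre_clean_water", 150.0),
}
for name, (outcome, per_dollar) in candidates.items():
    print(f"{name}: {moral_weight[outcome] * per_dollar:.3f} units of good per dollar")
```

Whatever one thinks of any particular weights, the structural point is that some such conversion factor is needed before cross-cause comparisons can be made at all.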

Knock-on effects and side effects

Let’s suppose we can do all those things (there are questions about how you do it, particularly in the case of cross-cause comparisons, but those are not the focus of my talk). What I want to focus on here is what’s left out by those simple cost-effectiveness analyses. There are two kinds of effects of our interventions that aren’t counted, if we just do the kind of thing that I described on the previous slide.

There are what I’ll call ‘knock-on effects’ (sometimes called ‘flow-through effects’) on the one hand; and then there are side effects. Knock-on effects are effects that are causally downstream of the intended effect. So you have some intervention whose intended effect is, say, an increase in child years spent in school. Increasing child years spent in school itself has further downstream consequences not included in the basic calculation: consequences, for example, for future economic prosperity, and perhaps for the future political setup in the country.

There are also side effects: effects of the intervention that don’t go via the intended effect, but take some other causal route. For example, in the context of things like the provision of healthcare services by Western-funded charities, many people have worried that having rich Westerners come in and fund frontline health services might decrease the tendency of the local population to lobby their own governments for adequate health services. And so this well-intentioned provision of healthcare might have adverse political consequences.

Now, in both of these cases, both in the case of the knock-on effects and in the case of the side effects, we have effects rippling on, in principle, down the centuries, even down the millennia.

So in terms of this picture, if you like, the paddleboard in the foreground represents the intended effect. You can have some effect on that part of the river immediately. That’s the bit that we’re measuring in our simple cost-effectiveness analysis. But in principle, in both the cases of knock-on effects and in the cases of side effects, there are also effects further on into the distant parts of that river, and even over there in the distant mountains that we can only dimly see.

Cluelessness

OK, so there are all these unmeasured effects not included in our simple cost-effectiveness analysis. I want to make three observations about them. Firstly, I claim (and I’ll say more about this in a minute) that the unmeasured effects are almost certainly greater in aggregate than the measured effects. And I don’t just mean that ex post this is likely to be the case; I mean that, according to reasonable credences, the unmeasured effects are likely to dominate the calculation even in terms of expected value, if you’re trying to calculate (even in expected terms) all of the effects of your intervention.

The second observation is that these further-future effects (causally downstream or otherwise) are much harder to estimate than the near-term effects. That’s because, for example, you can’t do a randomised controlled trial to ascertain what the effect of your intervention is going to be in 100 years: you don’t have that long to wait.

The third observation is that even these further-future and relatively unforeseeable effects matter, in principle, from an altruistic point of view, just as much as the near-term effects. The mere fact that they’re remote in time shouldn’t mean that we don’t care about them. If you need convincing on that point, here’s a little thought experiment. Suppose you had in front of you right now a red button, and suppose for the sake of argument you knew (never mind how) that the effect of your pressing this button here and now would be a nuclear explosion going off in two thousand years’ time, killing millions of people. I take it you would have overwhelming moral reason, if you knew that were the case, not to press the red button. What that thought experiment is supposed to show is that the mere fact that these people—the hypothetical victims of your button-pressing—are remote from you in time, and that you have no other personal connection to them, does not diminish the moral significance of the effects.

What do we get when we put those three observations together? Well, what I get is a deep-seated worry about the extent to which it really makes sense to be guided by cost-effectiveness analyses of the kind provided by meta-charities like GiveWell. If what we have is a cost-effectiveness analysis that focuses on a tiny part of the thing we care about, and if we basically know that the real calculation—the one we actually care about—is going to be swamped by this further-future stuff that hasn’t been included in the cost-effectiveness analysis, how confident should we really be that the cost-effectiveness analysis we’ve got is any decent guide at all to how we should be spending our money? That’s the worry that I call ‘cluelessness’. We might feel clueless about how to spend money even after reading GiveWell’s website.

Five possible responses to cluelessness

So there’s the worry. And now let me sketch five possible responses to it. The first one I mention only to set aside. The other four I want to take seriously.

Response one: Make the analysis more sophisticated

So the response I want to set aside is the thought that “maybe all this shows that we need to make the cost-effectiveness analysis a little bit more sophisticated”. If the problem was that our cost-effectiveness analysis of, say, bed net distribution only counted deaths averted, and we also cared about things like effects on economic prosperity in the next generation and political effects and so forth, doesn’t that just show (the thought might run) that we need to make our analysis more complicated, so that it includes those things as well?

Well, that’s certainly an improvement, and very much to their credit this is something that GiveWell has done. If you go to their website, you can download their cost-effectiveness analyses back as far as 2012, and for every year since. In particular, if you look at the analyses for the Against Malaria Foundation (one of the top charities, which distributes insecticide-treated bed nets in malarial regions), you’ll see that the 2012 analysis basically just counts deaths averted in children under five, whereas the 2020 analysis includes a whole host of things beyond that. It includes morbidity effects, that is, the effects of non-fatal cases of malaria. It includes the prevention of stillbirths. It includes the prevention of diseases other than malaria. And it includes reductions in treatment costs: if fewer people are getting sick, there’s less burden on the health service. Those are all things that might increase the cost-effectiveness of bed net distribution relative to the simple cost-effectiveness analysis. The 2020 analysis also includes some things that might decrease it, for example decreases in immunity to malaria resulting from the intervention, and increases in insecticide resistance in the mosquitoes.
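Just to illustrate the shape of this kind of refinement (and emphatically not to reproduce GiveWell’s actual model), here is a toy ledger in which a simple headline effect is adjusted by positive and negative correction terms. All the values are invented:

```python
# A toy ledger in the spirit of, but NOT reproducing, GiveWell's analysis:
# start from the simple headline effect, then add adjustment terms,
# some positive and some negative. All values are arbitrary illustration.

headline = 100.0   # value of under-5 deaths averted (arbitrary units)

adjustments = {
    "morbidity_from_nonfatal_malaria": +12.0,
    "stillbirths_prevented": +6.0,
    "other_diseases_prevented": +4.0,
    "treatment_costs_saved": +3.0,
    "reduced_immunity_to_malaria": -5.0,
    "increased_insecticide_resistance": -4.0,
}

total = headline + sum(adjustments.values())
print(total)  # 116.0: a refinement, but still silent on far-future effects
```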

So that’s definitely progress, and GiveWell is very much to be applauded for having done this. But from the point of view of the thing that I’m worrying about in this talk, it’s not really a solution. It only slightly shifts the boundary between the things that we know about and the things that we’re clueless about. That is, it’s still going to be the case, even after you’ve done the most complicated remotely plausible cost-effectiveness analysis, that you’ve said basically nothing about, say, effects on population size down the generations.

It’s perhaps worth pausing a bit on this point. Why do I still feel, even given the 2020 GiveWell analysis for AMF, that most of the things I care about, even in expected value terms, have been left out of the calculation?

Well, an easy way of seeing this is to consider, in particular, the case of population size. Okay: so I fund some bed nets, and suppose that saves a life in the current generation. I can be pretty sure that, one way or another, saving a life in the current generation is going to have an effect on population size in the next generation. Maybe it increases future population: here’s an additional person who’s going to survive to adulthood and who, statistically speaking, is likely to go on to have children. Maybe it actually decreases future population, because there are well-known correlations between reductions in child mortality and reductions in fertility. But either way, it seems very plausible that once I’ve done my research, the expected effect on future population size will be non-zero.

But now let’s think about how long the future of humanity hopefully is. It’s not going to be just one further future generation, nor just two. Hopefully, if all goes well, there are thousands of future generations. And so it seems extremely unlikely that the mere 60 (or so) life-years I can gain in the life of the person whose premature death my bed net distribution has averted will add up to more, in value terms, than all those effects on population size I have down the millennia.
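To see why the arithmetic comes out this way, here is a deliberately crude back-of-the-envelope sketch. Every number in it is a placeholder assumption; the point is only that a small non-zero expected effect per generation, summed over enough generations, swamps the measured benefit:

```python
# Back-of-the-envelope illustration; every number is an arbitrary assumption.

life_years_measured = 60        # life-years gained by averting one death now

expected_pop_change = 0.1       # assumed |expected| change in population size
                                # per future generation (sign unknown)
life_years_per_person = 70      # assumed average lifespan
num_generations = 1_000         # a modest guess at humanity's future span

unmeasured = expected_pop_change * life_years_per_person * num_generations
print(unmeasured)                        # 7,000 life-years, sign unknown
print(unmeasured / life_years_measured)  # roughly 117x the measured effect
```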

Now, I don’t know whether the further-future population-size effects are good or bad, for two reasons. Firstly, I don’t know whether I’m going to increase or decrease future population. And secondly, even if I knew, let’s say for the sake of argument, that I was going to be increasing future population size, I don’t know whether that’s good or bad. There are very complicated questions here. I don’t know what the effect of increasing population size is on economic growth, or on tendencies towards peace and cooperation versus conflict. And crucially, I don’t know what the effect of increasing population size is on the size of the existential risks faced by humanity (that is, the chance that something goes really catastrophically wrong, either wiping out the human race entirely or destroying most of the value in the future of human civilisation). So what I can be pretty sure about is that once I’ve thought things through, there will be a non-zero expected-value effect in that further future, and that it will dominate the calculation. But at the moment, at least, I feel thoroughly clueless about even the sign, never mind the magnitude, of those further-future effects.

Okay, so the take-home point from this slide is: sure, you can try to make your cost-effectiveness analysis more sophisticated, and that’s a good thing to do—I very much applaud it—but it’s not going to solve the problem I’m worrying about at the moment.

So, that’s the response I want to set aside. Let me tell you about the other four.

Response two: Give up the effective altruist enterprise

Second response: give up the effective altruist enterprise. This, I think, is a very common reaction indeed. Anecdotally, many people refrain from getting engaged in effective altruism in the first place because of worries like the ones I’m talking about in this talk—worries about cluelessness.

The line of thought would run something like this: look, when I was that college tutor, having that conversation with that student, what motivated me to make big personal sacrifices in material terms and start giving away significant fractions of my salary was feeling really confident that I could be doing significant amounts of good per dollar donated. But if cluelessness worries have now undermined that, I no longer have that confidence. Why, then, would I donate 10 percent, 20 percent, 50 percent, or whatever, to something that I feel really, really clueless about, knowing that I could instead (say) be paying off my mortgage?

Okay, so I want to lay this response on the table, because it’s an important one. It’s an understandable one. It’s a common one. And it shouldn’t be just shamed out of the conversation. My own tentative view, and certainly my hope, is that this isn’t the right response. But for the rest of the talk, I’ll set that aside.

Response three: Make bolder estimates

What other responses might there be? The third response is to make bolder estimates. This picks up on the thread left hanging by that first response. The first response was: make the cost-effectiveness analysis a little bit more sophisticated. In this third response—making bolder estimates—the idea is: let’s do the uber-analysis that really includes everything we care about down to the end of time.

So recall: two sections ago, I was worrying about distant-future effects on population size, and about the value of changes to future population size. I said there were lots of difficult questions here. But in principle, one can build a model that takes account of all of those things. One could input into the model one’s best guesses about the sign of the effect on future population size, and about the sign and magnitude of the value of a given change to future population size. Of course, in doing so, one would have to make some extremely bold estimates and take a stand on some controversial questions: questions where there’s relatively little guidance from evidence, and where one feels much more that one’s guessing. But if this is what we’ve got to do in order to make well-thought-through funding decisions, perhaps this is just what we’ve got to do, and we should get on with doing it.
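As a miniature sketch of what such an uber-analysis might look like, consider the toy model below. The starred parameters are exactly the kind of bold, weakly evidenced guesses at issue; note that flipping the guessed sign flips the overall verdict, which is the discomfort I’ll come to in a moment:

```python
# A toy "uber-analysis": the structure is easy to write down, but the
# starred (*) inputs are bold guesses with little evidential backing.

near_term_value = 60.0            # life-years from the averted death (measured)

sign_of_population_effect = +1    # * pure guess: does saving a life raise (+1)
                                  #   or lower (-1) long-run population?
expected_extra_people = 100.0     # * guessed magnitude, summed over generations
value_per_extra_person = 50.0     # * guessed net value (in life-year
                                  #   equivalents) of one extra future person

far_future_value = (sign_of_population_effect
                    * expected_extra_people
                    * value_per_extra_person)

total = near_term_value + far_future_value
print(total)  # +5060.0 with sign +1; -4940.0 with sign -1: the guess decides
```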

Well, I think there are probably some people in the effective altruist community who are comfortable with doing that. But for my own part, I want to confess to some profound discomfort. To bring out why I feel that discomfort, I think it’s helpful to think about both intra-personal (so, inside my own head) issues that I face when I contemplate doing this analysis and also about inter-personal issues.

The intra-personal issue is this: okay, so I try doing this uber-analysis; I come up with my best guess about the sign of the effect on future population and so forth; and I put that into my analysis. Suppose the result is that I conclude funding bed nets is robustly good, because it robustly increases future population size, and that in turn is robustly good.

Suppose that’s my personal uber-analysis. I’m not going to be able to shake the feeling that when I wrote down that particular uber-analysis, I had to make some really arbitrary decisions. It was pretty arbitrary, perhaps, that I came down on the side of increasing population size being good rather than bad. I didn’t really have any idea; I just felt I had to make a guess for the purpose of the analysis. And so here I am, having reached the conclusion that I should be spending, say, 20 percent of my salary on increasing future population size, via bed nets or otherwise. But I know at the back of my mind, if I’m honest with myself, that I could equally well have made the opposite arbitrary choice, chosen the estimate that said increasing future population size is bad, and concluded that I should instead be spending 20 percent of my salary on decreasing future population size. So the cluelessness worry here is: how confident can I feel? How sensible can I feel going all out to increase future population size—perhaps via bed nets or, more plausibly, via some other route—when I know that the thing that led me to that conclusion rather than the opposite one was really arbitrary?

The inter-personal point is closely related. Suppose I choose to go all out on increasing future population size, and you choose to go all out on decreasing it. So here we both are, giving away such-and-such a proportion of our salaries to our chosen, supposedly altruistic, enterprises, but the two of us are just directly working against one another. We’re cancelling one another out. We would have done something much more productive if we had got together, had a conversation, and perhaps decided to fund instead some third thing that at least the two of us could agree upon.

Response four: Ignore things that we can’t even estimate

Fourth response: ignore things that we can’t even estimate. This one, too, I think is at least a very understandable response (psychologically, at any rate), although to me it doesn’t seem the right one. I’ll say a little bit about that here; I’ve said more in print, for example in the paper cited on this slide.

So the idea would be this: okay, let’s consider the most sophisticated plausible cost-effectiveness analysis. So we have some cost-effectiveness analysis, perhaps like the GiveWell 2020 analysis. It’s not the uber-analysis where we’ve gone crazy and started making guesses about things that we really have no clue about. It stops at the point where we’re making some educated guesses, and we can also do a sensitivity analysis to check that our important conclusions are not too sensitive to reasonable variations in the input parameters of this medium-complexity cost-effectiveness model. Then the thought would be: what we should do is base our funding decisions on cost-effectiveness analyses of that type, just because that’s the best we can do. So, if you like, we should look under the lamppost, and ignore the darkness just because we can’t see into it.
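The sensitivity analysis mentioned here can be sketched as follows: re-run the model over a grid of ‘reasonable’ values for each uncertain input, and check whether the headline conclusion survives in every case. The model and the parameter ranges below are invented purely for illustration:

```python
import itertools

# Hypothetical medium-complexity model: cost per death averted for a bed net
# programme, with three uncertain inputs varied over "reasonable" ranges.
def cost_per_death_averted(cost_per_net, nets_per_death_averted, wastage):
    effective_fraction = 1.0 - wastage   # fraction of nets actually used
    return cost_per_net * nets_per_death_averted / effective_fraction

ranges = {
    "cost_per_net": [4.0, 5.0, 6.0],               # dollars
    "nets_per_death_averted": [500, 1_000, 2_000],
    "wastage": [0.1, 0.2, 0.3],
}

results = [cost_per_death_averted(c, n, w)
           for c, n, w in itertools.product(*ranges.values())]

# If even the worst case beats the next-best funding option, the conclusion
# is robust to these (but, crucially, only these) parameter variations.
print(min(results), max(results))
```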

So, again, perhaps like the second response, this is one that I understand. I don’t think it’s right. I do think it’s very tempting, though. And for the purpose of this talk, I just want to lay it out there as an option.

Response five: “Go longtermist”

Finally, the response that’s probably my favourite, and the one that I’m personally most inclined towards: one might be driven by considerations of cluelessness to “go longtermist”, as it were. Let me say a bit more about what I mean by that. As many of you will probably be aware, there’s something of a division in the effective altruist community on the question: in which cause area, in the world as we find it today, do there exist the most cost-effective opportunities to do good? In which cause area can you do the most good per dollar spent? Some people think the answer is global poverty, health and development. Some people think the answer is animal welfare. And a third contingent thinks the answer is what I’ll call ‘longtermism’: trying to beneficially influence the course of the very far future of humanity and, more generally, of the planet and the universe.

Considerations of cluelessness are often taken to be an objection to longtermism because, of course, it’s very hard to know what’s going to beneficially influence the course of the very far future on timescales of centuries and millennia. Again, we still have the point that we can’t do randomised controlled trials on those timescales.

However, my own journey through thinking about cluelessness has tentatively convinced me that that’s precisely the wrong conclusion, and that in fact considerations of cluelessness favour longtermism rather than undermining it.

Why would that be? Well, two things seem to me to emerge from the discussion of interventions like funding bed nets. Firstly, we think the majority of the value of funding things like bed net distribution comes from their further-future effects. Secondly, however, for interventions like that we find ourselves really clueless about not only the magnitude but even the sign of the value of those further-future effects. If we care in principle about all the effects of our actions until the end of time, but we’re clueless about what most of those effects are for things like bed net distribution, this raises the question of whether we might choose our interventions more carefully.

Perhaps we could find some other interventions for which that’s the case to a much lesser extent. If we deliberately try to beneficially influence the course of the very far future, can we find things where we more robustly have at least some clue that what we’re doing is beneficial and of how beneficial it is? I think the answer is yes.

And if we want to know what kinds of interventions might have that property, we just need to look at what people in the effective altruist community in fact do fund when they’re convinced that longtermism is the best cause area. They’re typically things like reducing the chance of premature human extinction—the thought being that if you can reduce the probability of premature human extinction, even by a tiny little bit, then in expected value terms, given the potential size of the future of humanity, that’s going to be enormously valuable. (This has been argued forcefully by Nick Beckstead and by Nick Bostrom. Will MacAskill and I canvass some of the same arguments in our own recent paper.)
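The expected value argument gestured at here is simple arithmetic. With made-up placeholder magnitudes (the real numbers are, of course, contested):

```python
# Illustrative placeholder numbers only; both magnitudes are contested.

expected_future_lives = 1e16   # assumed expected number of future lives,
                               # conditional on humanity surviving long term
risk_reduction = 1e-8          # assumed tiny reduction in the probability
                               # of premature extinction

expected_lives_saved = expected_future_lives * risk_reduction
print(expected_lives_saved)    # 1e8: a hundred million lives in expectation
```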

There are also interventions aimed at improving very long run average future welfare, conditional on the supposition that humanity doesn’t go prematurely extinct, perhaps by improving the content of key, long lasting political institutions.

So these are the kinds of things that you can fund if you’re convinced, whether by the arguments that I’ve set out today or otherwise, that longtermism is the way to go. And, in particular, you might choose to donate to Effective Altruism’s Long-Term Future Fund, which focuses on precisely these kinds of interventions.

Summary

In summary, then: In part one I talked about effectiveness, cost-effectiveness, and the importance of evidence. The point here was that altruism has to be effective. Most well-intentioned things don’t work. Even among the things that do work, some work hundreds of times better than others. And we have to pay attention to evidence if we want to know which are which.

In part two, though, I talked about the limits of this: the limits of evidence, where evidence gives out, and what it can’t tell us about. Here I worried about the fact that evidence, of necessity, only tracks relatively near-term effects; we can only gather evidence on relatively short timescales. And I’ve argued, or at least suggested, that plausibly the bulk of even the expected value of our interventions comes from their effects on the very far future: that is, from the things that are not measured in even the more complicated, plausible cost-effectiveness analyses.

Then in part three I talked about five possible responses to this fact. I said that I think making the cost-effectiveness analyses somewhat more sophisticated only relocates the problem. That left four other responses: give up effective altruism; do the uber-analysis; adopt a parochial form of morality on which you only care about the near-term, predictable effects; or shift away from things like bed net distribution in favour of interventions that are explicitly aimed at improving, as much as we possibly can, the expected course of the very long-run future.

I said that I myself am probably most sympathetic to that last response—the longtermist one—but I think there are very hard questions here. So in my own case, the take-home message is: we need to do a lot more thinking and research about this. And that motivates the enterprise that we call global priorities research: bringing to bear the tools of various academic disciplines—in particular at the moment, in the case of my own institute, economics and philosophy—to think carefully through issues like this, and to try to get to a point where we do feel less clueless.