TL;DR: The explore-exploit tradeoff for causes is impossible if you don’t know how far exploration could take you—how good the best causes may be.
Recently, I found out that the Centre for Exploratory Altruism Research (CEARCH) estimates, with a high confidence level, that advocacy for top sodium reduction policies is around 100x as cost-effective as top GiveWell charities, in terms of DALYs. This made me feel like a sucker for having donated to GiveWell.
You see, when I’m donating, I think of myself as buying a product: good for others. The hundreds of dollars I donated to GiveWell could’ve probably been replaced with a couple of dollars to this more effective cause. That means that I wasted hundreds of dollars that could’ve done much more good.
But let’s use milk as an analogy. GiveWell is the equivalent of buying a liter of milk for $200. If that happened with a product I wanted to buy for myself, I would probably feel scammed. A literal lifetime of donations to GiveWell might be replaced with 6 months of donating to this cause. I’m not saying I got scammed, but thinking about it from the perspective of buying good helps my intuition. Nowadays I don’t donate to GiveWell anyway, but it sucks.
This—being a sucker who pays too much for doing good—is really bad. It’s exactly what we try to avoid in EA. It can decrease our impact by orders of magnitude.
And that’s not even the end. CEARCH also estimated (although with a low level of certainty) that nuclear arsenal limitation could be 5000x as cost-effective as top GiveWell charities. If they’re wrong by an order of magnitude, that’s still 5x better than even the hypertension work. Now donating to GiveWell is like buying a liter of milk for $1000. And who’s to say that there’s nothing 50x more effective than the nuclear cause?
At some point, you might consider pausing your milk purchases for the moment and looking around for the cheapest prices. You probably won’t find the very cheapest, but you might be able to figure out a reasonable estimate of it and buy at a rate close to that.
So this situation made me think: can we put a reasonable limit on the maximum cost-effectiveness of our money and time? Within a 50% CI? 80% CI? 99% CI? And can we have a reasonable estimate for when it will be reached?
For someone extremely risk-averse it’s probably easy, as there are only so many RCTs in our world, and it seems likely that GiveWell is within a close range. But for anyone who’s risk-neutral, I can’t think of a good way. So I’ve come to ask you wonderful people: what do you think?
The end result I’m thinking of is something like:
“The maximum cost-effectiveness we can expect is 500x-50,000x that of GiveWell (95% CI). The year we expect to find a cause within an order of magnitude of that cost-effectiveness is 2035-2050 (50% CI)”.
But of course any input will be interesting.
Things that could potentially limit cost-effectiveness, off the top of my head:
A weak form of the ‘stable market hypothesis’ for doing good.
Hedonic adaptation: people adapt to better circumstances. The amount of good we can induce in anyone’s life is thus limited.
Caps on the number of humans & other beings that are likely to ever live.
Note: I’m emotionally content with my past donations to GiveWell, don’t worry. Also, this is not a diss on GiveWell, they’re doing a great job for their goals.
CEARCH’s work is, as the name implies, exploratory. It searches for new potential programs and cause areas. It recognizes that early-stage CEAs often overstate the effectiveness of a program. Even its deep CEA on hypertension relies on a lot of guesswork in my view.
I personally think its work is often seriously overoptimistic, especially when it comes to predicting the success and ease of lobbying efforts against considerable opposition. But that’s more okay given its exploratory function. If there were a shovel-ready sodium-reduction charity ready to receive and deploy significant funds, I would view that as akin to buying equity in a brand-new startup.
GiveWell does something entirely different with its Top Charities—you’re getting a specific implementation of a program by a carefully vetted charity, with a pretty tight cost-effectiveness analysis based on either the charity’s actual performance or efficacy of similar real-world interventions in a randomized controlled trial.
The extent to which people should prefer tried-and-true interventions versus novel-but-potentially-higher-impact ones is a recurrent question. But most retail global-health donors who have done their research give to the tried-and-true, as does the bulk of the money from big donors in EA global health. That is a pretty good clue that we are not all being “scammed” here!
Finally, to the extent that work with insanely good cost-effectiveness exists, that work often has limited room for more funding. Here, for instance, both of your examples involve lobbying.
Hi Jason,
I agree with the broad thrust of your comment that there’s a tradeoff between guaranteed performance vs potentially higher impact stuff.
That said, I would push back on two points:
(1) With respect to our work being “seriously overoptimistic, especially when it comes to predicting the success and ease of lobbying efforts against considerable opposition”: I think some of our earlier/shallower CEAs definitely do suffer from that, but our deep CEAs are (a) highly comprehensive, factoring in multiple relevant temporal/epistemic discounts; and (b) far more sceptical of advocacy efforts (e.g. we think there is a 6% chance of success for sodium reduction advocacy in a single country over 3 years, i.e. if you did 48 country-years of lobbying you’d get one win). Amongst the various considerations informing this estimate is, of course, the reality of industry pushback.
(2) I disagree, to a limited but fairly significant extent, with the notion that such work has limited room for more funding. Scaling works differently in advocacy than in direct delivery. You pay more to buy increased chances of success (e.g. the difference between funding a small NGO vs hiring a professional lobbying outfit vs hiring ex-government officials to lobby for you). Of course, diminishing marginal returns apply (as opposed to economies of scale in direct delivery, though in the long run there’s always DMR as we work our way down the list of priority countries).
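As a rough check on the arithmetic in point (1), here is a minimal Python sketch of what a 6% chance of success per country over 3 years implies. The constant and function names are made up for illustration, and treating campaigns as independent with an identical success chance is an assumption of this sketch, not something the CEA is stated to assume. The expected figure comes out at about 50 country-years per win, close to the 48 quoted above (16 three-year campaigns).

```python
# Figures quoted in point (1): 6% chance of a policy win
# per country over a 3-year campaign.
P_WIN_PER_CAMPAIGN = 0.06
YEARS_PER_CAMPAIGN = 3

# Expected number of campaigns (and country-years) per win.
expected_campaigns_per_win = 1 / P_WIN_PER_CAMPAIGN
expected_country_years_per_win = expected_campaigns_per_win * YEARS_PER_CAMPAIGN


def prob_at_least_one_win(n_campaigns, p=P_WIN_PER_CAMPAIGN):
    """Chance of one or more wins across n campaigns.

    ASSUMPTION: campaigns are independent and share the same p.
    """
    return 1 - (1 - p) ** n_campaigns


print(f"Expected campaigns per win: {expected_campaigns_per_win:.1f}")
print(f"Expected country-years per win: {expected_country_years_per_win:.0f}")
print(f"P(at least one win in 16 campaigns): {prob_at_least_one_win(16):.2f}")
```

Note that "one expected win" is not the same as a guaranteed win: under the independence assumption, 16 campaigns still leave a sizeable chance of no win at all.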
Thanks for the reply!
If I understand your main arguments correctly, you’re basically saying that high cost-effectiveness options are rare, uncertain, and have a relatively small funding gap that is likely to be closed anyway. Also, new charities are likely to fail and can be less effective. And smart EAs won’t waste their money.
Uncertainty and rarity: Assume that CEARCH is on average 5x too optimistic about their high-confidence reports and 20x too optimistic about their low-confidence stuff (that’s A LOT). Still, out of 17 causes they researched, 4 are still over 10x as cost-effective as top GiveWell charities. Almost a quarter. They were probably lucky; Rethink Priorities and CE and such don’t have such a high rate (it would be an interesting topic for analysis). But still, their budgets are minuscule. RP has spent around $14m in its entire lifetime. CEARCH is composed of only 2 full-time workers, and was founded less than 2 years ago. CE had a total income of £775k in 2022. The cost of operations for this kind of work is tiny compared to the amount we spend on direct work.
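To make the discounting concrete, here is a minimal Python sketch. It uses only the two CEARCH headline figures quoted earlier in this thread (100x for sodium reduction at high confidence, 5000x for nuclear arsenal limitation at low confidence) together with the 5x and 20x optimism discounts assumed above. The other two causes are not named here, so they are left out; the dictionary layout and function name are invented for illustration.

```python
# Headline multiples of top GiveWell charities, as quoted in this thread.
ESTIMATES = {
    "sodium reduction advocacy": {"multiple": 100, "confidence": "high"},
    "nuclear arsenal limitation": {"multiple": 5000, "confidence": "low"},
}

# ASSUMPTION (from the comment above, not from CEARCH): how many times
# too optimistic the estimates are at each confidence level.
OPTIMISM_DISCOUNT = {"high": 5, "low": 20}

THRESHOLD = 10  # "over 10x GiveWell"


def discounted_multiple(multiple, confidence):
    """Divide a headline multiple by the assumed optimism factor."""
    return multiple / OPTIMISM_DISCOUNT[confidence]


discounted = {
    name: discounted_multiple(e["multiple"], e["confidence"])
    for name, e in ESTIMATES.items()
}

for name, value in discounted.items():
    verdict = "still above" if value > THRESHOLD else "below"
    print(f"{name}: {value:.0f}x GiveWell after discount ({verdict} {THRESHOLD}x)")
```

Under these assumptions the two named causes land at 20x and 250x, so both clear the 10x bar even after heavy discounting.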
Small funding gap, likely to be closed anyway: Let’s say that on average, finding such causes requires $5m (seems overblown given the aforementioned info). And assume these causes are on average 20x as effective as top GiveWell charities. And the funding gap is indeed small, only $10m on average. That’s $15m that would’ve done as much good as $200m for GiveWell. So by finding and donating to 2.6 causes/y, we can equal the impact GiveWell had in 2021. Those funding gaps aren’t that likely to be closed: it took more than 10 years after the inception of EA for CEARCH to find those causes. In the stock market, a 2% misprice may be closed within a few hours. In altruism, a 500% misallocation will never be closed without deliberate challenge.
And these causes are pretty easy to find. CEARCH was started in 2022 and has already found 4 causes 10x GiveWell under my aforementioned pessimistic assumptions. CE and RP have found more. There are big funding gaps, because there are many causes like this. There are many big world governments to lobby. We should aim to close the funding gaps as soon as possible, because that would help more people.
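The arithmetic in this paragraph can be sketched in a few lines of Python. All inputs are the assumptions stated above ($5m search cost, $10m funding gap, 20x effectiveness, 2.6 causes per year); the variable names are invented, and the final figure is only what those numbers imply, not an independently sourced GiveWell total.

```python
# Assumptions taken from the paragraph above (all in $ millions).
SEARCH_COST_M = 5      # cost of finding one such cause
FUNDING_GAP_M = 10     # room for funding per cause
MULTIPLE = 20          # effectiveness relative to top GiveWell charities
CAUSES_PER_YEAR = 2.6

# Total outlay per cause: research plus filling the funding gap.
total_spend_m = SEARCH_COST_M + FUNDING_GAP_M

# Only the money that fills the gap does direct good, at 20x GiveWell.
givewell_equivalent_m = FUNDING_GAP_M * MULTIPLE

# Effective multiple once the search cost is counted as overhead.
effective_multiple = givewell_equivalent_m / total_spend_m

# GiveWell-equivalent impact per year implied by 2.6 causes/year.
implied_annual_equivalent_m = CAUSES_PER_YEAR * givewell_equivalent_m
annual_spend_m = CAUSES_PER_YEAR * total_spend_m

print(f"Spend per cause: ${total_spend_m}m")
print(f"GiveWell-equivalent per cause: ${givewell_equivalent_m}m")
print(f"Effective multiple incl. search cost: {effective_multiple:.1f}x")
print(f"Annual spend: ${annual_spend_m:.0f}m for ~${implied_annual_equivalent_m:.0f}m equivalent")
```

One thing the sketch makes visible: counting the search cost as overhead lowers the effective multiple from 20x to roughly 13x, which is still well above 1x.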
New charities likely to fail, and be less effective: CE’s great work shows that might not be true. A substantial number of their charities report significant success. Also, I assume that’s taken into account in exploratory research. It can still diminish the impact by 50% and it won’t matter to the overall scheme.
EAs won’t waste their money on bad donations: If that were true, then all EAs seeking to maximize expected value would roughly agree on where to donate their money. Rather, we see the community being split into 4 main parts (global H&P, animals, existential risk, meta). Some people in EA simply don’t and won’t donate to some of these parts. This shows that at least a part of the community might donate to worse charities.
Imagine you have 2 investments that will return your money only in 10 years:
A safe investment that would return you Y.
A risky start-up that you expect to return 10Y in EV.
What would you choose? I bet the start-up. With altruism there’s no reason to be loss averse, so the logic is even more solid.
I guess my thoughts are that we should spend more on cause prioritization and supporting new charities (akin to CE). But then, when do we know we’ve found a decent cause? The exploration-exploitation trade-off is impossible if you don’t know how far exploration will take you.
EA is the smartest, most open community I know. I’m sure it will explore this.
I think we agree on somewhat more than it seems at first glance. I don’t think the current GiveWell top charities are the pinnacle of cost-effectiveness, support further cause exploration and incubating the most promising ideas into charities, and think it’s quite possible for EA funders to miss important stuff.
The crux is that I don’t think it’s warranted to directly compare cost-effectiveness analyses conducted on a few weeks of desktop research, expert interviews and commissioning of surveys and quantitative modelling to evaluations of specific charities at scale and in action, and I think your original post did that with allusions to scamming, GiveWell charities as $1000 liters of milk, and being a sucker.
Although CEARCH is too young for us to retrospectively compare its analyses to the cost-effectiveness of launched charities, I think something like drug development is a good analogy. Lots of stuff looks great on paper, in situ, or even in animal models, only to fall apart entirely in the multi-phase human clinical trial process on the way to full approval. Comparing how a drug does in limited animal models to how another drug does in Phase III trials is comparing apples to oranges. Moreover, “risk the model/drug will fall apart in later phases” is distinct from “risk after Phase III trials that the model/drug will not work in a specific country/patient.”
To be very clear, this is not a criticism of CEARCH—as I see it, its job is to screen candidate interventions, not to bring them up to the level of mature, shovel-ready interventions. The next step would be either incubation or a deep dive on a specific target charity already doing this work. I would expect to see a ton of false positives, just as I would expect that from the earliest phases of drug development. It’s worth it to find the next ten-figure drug / blockbuster EA intervention.
I think this should make you question your assumptions to some extent. GiveWell has evaluated tons of interventions for a number of years, and made significant grants for a number of them. If CEARCH has come up with 4 causes that are 10X top charities in ~ a year with 2 FTEs, while GiveWell hasn’t come up with anything better than 1x in many years with lots more FTEs, what conclusion do we draw from that? I think it more likely that CEARCH is applying more generous assumptions than that GiveWell is badly screwing up its analysis of intervention after intervention. (And no one else, e.g., Founders Pledge, has been able to come up with clearly better interventions either, at least based on neartermist global health priorities.)
More generous assumptions come with the territory of early-stage CEAs, so I am not suggesting that is problematic given CEARCH’s mission. But I think its analysis supports a conclusion of “we should incubate a charity pursuing this intervention,” not “we should conclude that our GiveWell donations were very poor value and immediately divert tens of millions of dollars into sodium-reduction policy.” In my view, your original post was relatively closer to the latter than your reply comment.
As for CE, it estimates that “starting a high-impact charity has the same impact as donating $200,000 to the most effective NGOs every year.” That doesn’t suggest a belief that a lot of its incubated charities are 10x+ GiveWell and able to absorb significant funding.
GiveWell has shown a willingness to fund policy work out of its All Grants Fund where it thinks the cost-effectiveness is there (cf. $7MM to the Centre for Pesticide Suicide Prevention for general support in January 2021, also for work on alcohol policy). So a general antipathy toward policy/lobbying work doesn’t seem to explain what is going on here. Rather, I think there’s a fundamental, difficult-to-resolve disagreement about the EV of lobbying/policy work. It’s certainly possible that I—and it seems, most EA funders—are simply wrong in our estimation on that point. But I don’t think referring to the criterion standard non-policy interventions as $1000 liters of milk acknowledges that disagreement and the reasons for it.
I think this is predominately about the donor’s values and ethical framework (e.g., the relative value of human vs. animal welfare, the extent to which future lives matter), although there are some strategic elements as well. I’m not aware of any reason to think the people who donate to global health are hostile to lobbying efforts if that is the most effective approach.
I might’ve used overly strong language in my original post, such as the talk about being a sucker. For me it’s useful to think about donations as a product I’m buying, but I probably took it too far. And I don’t think I’ve properly emphasized my main message, which was (as I’ve added later): the explore-exploit tradeoff for causes is really hard if you don’t know how far exploration could take you. Honestly, I’m most interested in your take on that. I initially only used GiveWell and CEARCH to demonstrate that argument and show how I got to it.
The drug analogy is interesting, although I prefer the start-up analogy. Drug development is more binary: some drugs can just flat-out fail in humans, while start-ups are more of a spectrum (the ROI might be smaller than thought etc.). I don’t see a reason to think of CEARCH-recommended programs, or most other exploratory stuff, as binary. Of course lobbying could flat-out fail, but it’s unlikely we’ll have to update our beliefs that this charity would NEVER work, as might happen in drug development. And obviously with start-ups, there’s also a lot of difference between the initial market research and the later stages (as you said).
GiveWell has a lot of flaws for cause exploration. They really focus on charity research, not cause research. By design, they’re really biased towards existing causes and charities. The charities must be interested and cooperate with GiveWell. They look for a track record, so charities operating in high-risk, low-tractability areas such as policy have a harder time. In most cases that makes sense, but sometimes it can miss great opportunities.
Yes, they’ve funded some policy focused charities, but they might’ve funded much more if they were more EV maximizing instead of risk-averse. Seeing the huge leverage such options provide, it’s entirely possible.
Also, they aren’t always efficient—look at GiveDirectly. Their bar for top charities was 10x GiveDirectly for years, yet they kept GiveDirectly as a top charity until last year??? This is not some small, hard to notice inefficiency. It literally is their consistent criterion for their flagship charities. Can you imagine a for-profit company telling their investors “well, we believe these other channels have a ROI of at least 10x, but please also consider investing in this channel with 1x ROI”, for multiple years? I can’t. Let alone putting that less efficient channel as one of the best investments…
That’s exactly what I mean when I say altruism, even EA, can have gross inefficiency in allocations. It’s not special to GiveWell, I’m just exemplifying.
If GiveWell can make such gross mistakes, then probably others can. Another example was their relative lack of research on family planning, which I’ve written about. They’re doing A LOT of great things too. But I must say I am a bit skeptical of their decision making sometimes.
Keep in mind, CEARCH would have to be EXTREMELY optimistic in order for us to say that it hasn’t found a couple of causes 10x GiveWell. We are talking about 40x optimistic. That might be the case, but IMO it’s a strong enough assertion to require proof. Do you have examples of something close to 40x optimism in cost-effectiveness?
I agree that a lot of the difference in EAs’ donations can come from differing perspectives, probably most. But I think even some utilitarian, EV-maximizing, 0-future-discount, animal-equalist EAs donate to different causes (and the same goes for any other set of shared beliefs). It’s definitely not impossible.
As for other examples of 10x GiveWell cost-effectiveness in global health:
CE has estimated another charity yields $5.62 per DALY.
An Israeli non-profit, which produced an estimate of 4.3$ QALYs per dollar, in cooperation with EA Israel. A volunteer said to me he believed they were about 3x too optimistic, but that’s still around 10x GiveWell.
Also here is an example of 4x disagreement between GiveWell and Founders Pledge, and an even bigger disagreement with RP, on a mass media campaign for family planning. Even the best in the business can disagree.
Sorry for this being a bit of a rant.