I think a similar view is found in ‘Why we can’t take expected value estimates literally even when they’re unbiased’, i.e. we should have a pretty low prior that any particular intervention is above (e.g.) 10x cash transfers, but the strength and robustness of top charities’ CEAs are sufficient to clear them over the bar. And most CEAs of specific interventions written up on the forum aren’t compelling enough to move the estimate all that far from that low prior.
I agree it’d be informative to see what ‘naive’ versions of top charity CEAs would look like. As a quick and dirty version, I looked at GiveWell’s stuff on AMF—their 2023 central figure is $5,500 per life saved, with 226 rows in the spreadsheet. If we look at their 2012 CEA (downloadable here), they have 45 rows, with their optimistic value being $1,819 per life saved. Leaving aside inter-temporal confounders, naively a ~3x cut ($5,500 / $1,819 ≈ 3) is reasonable if the random forum CEA is equivalent to 2012 optimistic GiveWell. Though it depends on the quality of the random CEA, I’d guess a 2x-10x cut is a reasonable prior? Plus a stronger cut for really high estimates, e.g. a 500x cost-effectiveness estimate is more likely to be due to an over-generous methodology.
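To make that "stronger cut for higher estimates" idea concrete, here is a minimal sketch of the kind of Bayesian shrinkage that post describes, assuming a log-normal prior centred on ~1x cash transfers and illustrative, made-up uncertainty parameters (these are not GiveWell's actual numbers):

```python
import math

def shrunk_estimate(raw_multiple, prior_median=1.0, prior_sigma=1.0, estimate_sigma=1.5):
    """Shrink a raw cost-effectiveness estimate (in multiples of cash transfers)
    toward a sceptical log-normal prior via a normal-normal update in log-space.
    All parameters here are illustrative guesses, not GiveWell's actual model."""
    mu_prior, mu_est = math.log(prior_median), math.log(raw_multiple)
    w_prior, w_est = 1 / prior_sigma**2, 1 / estimate_sigma**2  # precisions
    mu_post = (w_prior * mu_prior + w_est * mu_est) / (w_prior + w_est)
    return math.exp(mu_post)

for raw in [3, 10, 100, 500]:
    adj = shrunk_estimate(raw)
    print(f"raw {raw:>3}x cash -> adjusted ~{adj:.1f}x cash (a {raw / adj:.0f}x cut)")
```

Under these made-up parameters a raw 3x estimate only gets cut by about 2x, while a raw 500x estimate gets cut by about 70x, which is the intuition that implausibly high headline numbers should be discounted much more heavily.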
That’s an interesting separate point; I certainly agree that our prior should have low mass around 10x cash and above, and that has its own large effect. But I don’t feel like I would make this point contingent on the quality of the CEA; I think even the highest-quality ex-ante CEA can’t avoid these issues. Some CEAs are probably high-quality because there are real decisions attached to them (e.g. Charity Entrepreneurship’s ex-ante CEAs of their prospective charities), and I don’t think I would be convinced by those either.
Neat exercise with 2012 GiveWell. Does the 2023 CEA have a country breakdown? The main intertemporal confounder I would want to guard against is the change in country mix, so I would compare the 2012 figure to the 2023 figure for whichever country AMF was most active in back in 2012, which I don’t know off the top of my head. But 3x seems reasonable to me.
I sympathise with this view, but I think I see it in more continuous terms than ex ante vs. ex post, maybe akin to quality. Even ex post, I think there would still be substantial guesswork and assumptions, and the bottom line still relies on interpretation. But the difference for ex post is how empirically informed and how specific that analysis can be, i.e. an ex post analysis can ground estimates in data for that specific org, with that program, in that community. Ex ante analyses can also differ in quality in how empirically informed and specific they are, and a great ex ante CEA could be more empirically informed and specific than a sub-par ex post CEA. But all this is ~semantics—I think we basically agree.
Not sure about the geography question—from this it looks like the 2012 estimate was based on distribution in Malawi. In 2022 they distributed in DRC, Ghana, Guinea, Malawi, Papua New Guinea, Togo, Uganda, and Zambia, and my guess is that the GiveWell figure is an average across those programs? Read into that what you will.
Ooh, another angle would be to compare Charity Entrepreneurship’s ex ante CEAs with the eventual charities’ own ex post CEAs. But there’d be a strong selection effect, given it depends on eventual charity success/stability, plus the interventions change a lot from research to implementation.
Yes, I agree quality matters a lot, but I think people are universally aware of that—I just wanted to draw attention to the ex-ante/ex-post distinction, which I hadn’t seen raised before.
The CE approach is a good idea, because I actually think interventions changing a lot from research to implementation is a key part of why ex-ante estimates are unreliable. I don’t know if both estimates are available, but it would be great if they are!
One example I know of off the top of my head is LEEP—their CEA for their Malawi campaign found a median of $14/DALY. CE’s original report on lead paint regulation suggested $156/DALY (as a central estimate, I think), so the ex post figure is roughly 11x better than the ex ante one. That direction and magnitude are pretty surprising to me. I expect it would be explicable based on the details of the different approaches/considerations, but I’d need to look into the details. Maybe a motivating story is that LEEP’s Malawi campaign was surprisingly fast and effective compared to the original report’s hopes?
Another is Family Empowerment Media. An ex post Rethink Priorities report mentions FEM used a GiveWell model to estimate a cost-effectiveness of 26.9x cash transfers, and Founders Pledge estimated 22x. The original CE report links to a CEA that estimates $984/DALY averted, which is less cost-effective than GiveWell top charities—though I don’t know the exact comparison to cash transfers, and there are benefits of family planning beyond DALYs.
I suspect a strong selection effect is in play—i.e. I know of these examples and their CEAs are prominent because they were successful—and the ideas survived the gauntlet of further research, selection, founding, piloting, and scaling.
LEEP is a pretty unusual situation in general, I think, and I’m not sure it’s super generalisable. If you get an easy-ish win with lead things, the cost-effectiveness can be insane (see the Bangladesh turmeric situation).
Yeah, that makes sense, and the early research could also have been heavily discounted by pessimism about a charity achieving big wins.
This is one of the reasons I don’t love post-hoc cost-effectiveness assessments of successful individual campaigns and policy changes that don’t take into account the probability that the (now successful) campaign might have failed—which I have seen a number of times on the lead front. For every win there might be 5, 10, or 20 failures (which is fine). If you just zero in on the successes, then the cost-effectiveness numbers look unrealistically rosy.
If the initial assessment for LEEP in Malawi estimated, say, a 20% chance of success, then I think this should be factored into their final calculation; they can then update it if they realise their success rate is higher. Otherwise we end up not costing in the failed campaigns, while the successful ones appear ludicrously cost-effective.
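As a toy illustration of what that adjustment does to the headline number, using the ~$14/DALY figure mentioned above and a purely hypothetical 20% ex-ante chance of success (and assuming, for simplicity, that failed campaigns cost about as much as successful ones but deliver nothing):

```python
# Hypothetical numbers: the ~$14/DALY ex-post figure from the Malawi campaign,
# plus a made-up 20% ex-ante probability of success; not LEEP's actual accounting.
cost_per_daly_if_successful = 14   # USD per DALY, counting only the winning campaign
p_success = 0.20                   # assumed ex-ante chance a comparable campaign succeeds

# If only 1 in 5 comparable campaigns succeeds, the portfolio pays for ~5 campaigns
# per win, so the expected cost per DALY is ~5x the success-only figure.
expected_cost_per_daly = cost_per_daly_if_successful / p_success
print(f"~${expected_cost_per_daly:.0f}/DALY once failed campaigns are costed in")  # ~$70/DALY
```

Even under this crude sketch the campaign still looks very cost-effective, but the gap between the success-only number and the failure-adjusted number is the rosiness being pointed at here.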
Yeah, though to be fair the Malawi CEA was like that because it was LEEP’s literal first campaign. I’d imagine LEEP has CEAs for all their country work that include adjustments for likelihood of success, though I don’t know whether they intend to publish them any time soon.