Thanks for your response, Joel!
Stepping back, CEARCH’s goal is to identify cause areas that have been missed by EA. But to be successful, you need to compare apples with apples. If you’re benchmarking everything to GiveWell Top Charities, readers expect your methodology to be broadly consistent with GiveWell’s and their conservative approach (and for other cause areas, consistent with best-practice EA approaches). Cause areas should stand out for CEARCH because they are actually more cost-effective, not because you’re using a laxer measuring method.
Coming back to the soda tax intervention, CEARCH’s finding that it’s 1000x GiveWell Top Charities raised a red flag for me: it suggested that you must somehow be measuring things differently. LEEP seems comparable since they also work to pass laws that limit a bad thing (lead paint), but they’re at most ~10x GiveWell Top Charities. So where’s the additional 100x coming from? I was skeptical that soda taxes would have greater scale, tractability, or neglectedness, since LEEP already scores insanely high on each of these dimensions.
So I hope CEARCH can ensure its cost-effectiveness estimates are comparable. If you’re picking up giant differences with existing EA interventions, you should be able to explain the main drivers of these differences (and it shouldn’t be because you’re using a different yardstick). Thanks!
Just to clarify, one should definitely expect cost-effectiveness estimates to drop as you put more time into them, and I don’t expect this cause area to be literally 1000x GiveWell. In our past experience, headline cost-effectiveness always drops. This is just the optimizer’s curse: over- (or under-) performance comes partly from the cause area being genuinely better (or worse), but also partly from random error that gets corrected at deeper research stages. To be honest, I’ve come around to the view that publishing shallow reports, which are really just meant for internal prioritization, probably isn’t useful, insofar as it can be misleading.
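The optimizer’s curse described above can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not CEARCH’s actual model: the function name `simulate_optimizers_curse`, the number of causes, and the normal distributions for true values and estimation error are all hypothetical. It generates noisy shallow estimates for a set of causes, picks the top-ranked one, and measures how much its estimate exceeds its true value on average.

```python
import random
import statistics


def simulate_optimizers_curse(n_causes=50, n_trials=2000, noise_sd=1.0, seed=0):
    """Average (estimate - truth) for the cause with the highest estimate.

    All parameters are hypothetical and chosen only for illustration.
    Values are on an arbitrary scale (think: log cost-effectiveness).
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Hypothetical true value of each cause.
        true_vals = [rng.gauss(0.0, 1.0) for _ in range(n_causes)]
        # Shallow estimate = truth + random error.
        estimates = [t + rng.gauss(0.0, noise_sd) for t in true_vals]
        # Select the cause that looks best on the noisy estimates.
        best = max(range(n_causes), key=lambda i: estimates[i])
        gaps.append(estimates[best] - true_vals[best])
    return statistics.mean(gaps)


mean_gap = simulate_optimizers_curse()
print(f"Average overestimate for the top-ranked cause: {mean_gap:.2f}")
```

With noisy estimates the average gap comes out positive: the selected cause looks better than it really is, even though each individual estimate is unbiased. Setting `noise_sd=0.0` removes the gap, which mirrors the point that deeper research (less random error) shrinks headline figures for the causes that were selected.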
As an example of how we discount more aggressively at deeper research stages, consider our intermediate hypertension report, where the estimate dropped fairly sharply from around 300x to 80x GiveWell. The drop was driven by (among other things): (a) taking into account speeding-up effects, (b) downgrading confidence in advocacy success rates, (c) updating for more conservative costing, and (d) applying GiveWell-style epistemological discounts (e.g. taking into account a conservative null hypothesis prior, or discounting for publication bias/endogeneity/selection bias etc.).
As for what our priors should be on whether a cause can really be 100x GiveWell, I would say there’s a reasonable case for this if: (a) one targets NCDs and other diseases that grow with economic growth (instead of being solved by countries getting richer and improving sanitation/nutrition/healthcare systems etc.); and (b) there are good policy interventions available, because it really does matter that: (i) a government has enormous scale/impact; (ii) their spending is (counterfactually) cheap relative to EA money that would have gone to AMF and the like; and (iii) policy tends to be sticky, so the impact lasts in a way that distributing malaria nets or treating depression may not.