Hi Wayne,
You’re right! I’m currently working on the intermediate report for diabetes, and one factor we’re looking at that the shallow report did not cover is the speeding-up effect, which we model using the base rate from past data (i.e. country-years in which passage occurred, divided by total country-years). This definitely cuts into the headline cost-effectiveness estimate.
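To make the adjustment concrete, here's a minimal sketch of the base-rate calculation and the speeding-up discount it implies. Every number below (30 passages, 2,000 country-years, the 100-year naive horizon) is invented purely for illustration, not taken from the actual report:

```python
# Illustrative sketch of the base-rate and speeding-up adjustment.
# All figures are hypothetical.

passages = 30                # country-years in which a soda tax passed (made up)
total_country_years = 2000   # total country-years observed (made up)
base_rate = passages / total_country_years  # probability of passage per country-year

# Under a geometric waiting-time model, the tax would have passed anyway
# after roughly this many years without our intervention.
expected_wait_years = 1 / base_rate

annual_benefit = 1.0  # arbitrary units of benefit per year the tax is in force

# Naive estimate: credit the advocacy with the benefit over a long horizon.
naive_impact = annual_benefit * 100

# Speeding-up estimate: only credit the years gained before counterfactual passage.
speedup_impact = annual_benefit * expected_wait_years

print(base_rate, expected_wait_years, naive_impact, speedup_impact)
```

With these made-up numbers the speeding-up framing credits the benefit over the expected wait rather than the full horizon, which is how the adjustment cuts into headline cost-effectiveness.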
On a related note, one issue, I think, is whether we think of tax policy success as counterfactually mutually exclusive or as additive. (A) On the former view, as you say, the idea is that the tax would have occurred anyway. (B) On the latter view, the idea is that the tax an EA or EA-funded advocacy organization pushes shifts the tax-over-time curve upwards (i.e. the tax rate as a function of time, which presumably slopes upwards as countries get stricter). In short, we have a counterfactual effect because the next round of tax increases doesn’t replace what we’ve achieved so much as add on to it, and our actions ensure that the tax rate at any one point in time is systematically higher than it otherwise would have been.
I think reality is a mix of both viewpoints (A) & (B): success drains the political capital to do more in the short to medium term, but you’re probably also ensuring that the tax rate is systematically higher going forward. In practice, I tend to model using (A), just to be conservative.
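The two counterfactual models above can be sketched as toy tax-rate trajectories. Under (A) the gain erodes as later rounds of increases replace it; under (B) the gain persists as later rounds add on top. All parameters here (a 0.5-point/year secular trend, a 5-point win, a 10-year catch-up period) are invented for demonstration:

```python
# Toy comparison of counterfactual models (A) and (B). All numbers hypothetical.

def baseline(t):
    """Secular trend: the tax rate rises 0.5 points/year without us."""
    return 0.5 * t

def with_advocacy_a(t, boost=5.0, catch_up_years=10):
    """Model (A): our +5-point win erodes as the next rounds replace it."""
    remaining = max(0.0, boost * (1 - t / catch_up_years))
    return baseline(t) + remaining

def with_advocacy_b(t, boost=5.0):
    """Model (B): the +5-point win is permanent; later rounds add on top."""
    return baseline(t) + boost

# Counterfactual benefit = gap between the curves, summed over 30 years.
benefit_a = sum(with_advocacy_a(t) - baseline(t) for t in range(30))
benefit_b = sum(with_advocacy_b(t) - baseline(t) for t in range(30))
print(benefit_a, benefit_b)  # model (A) credits far less than model (B)
```

This makes visible why modelling with (A) is the conservative choice: the credited benefit is a small fraction of what (B) would imply.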
Thanks for your response, Joel!
Stepping back, CEARCH’s goal is to identify cause areas that have been missed by EA. But to be successful, you need to compare apples with apples. If you’re benchmarking everything to GiveWell Top Charities, readers expect your methodology to be broadly consistent with GiveWell’s conservative approach (and, for other cause areas, with best-practice EA approaches). The cause areas that stand out for CEARCH should do so because they are actually more cost-effective, not because you’re using a laxer measuring method.
Coming back to the soda tax intervention, CEARCH’s finding that it’s 1000x GiveWell Top Charities raised a red flag for me, so it seemed that you must somehow be measuring things differently. LEEP seems comparable, since they also work to pass laws that limit a bad thing (lead paint), but they’re at most ~10x GiveWell Top Charities. So where’s the additional 100x coming from? I was skeptical that soda taxes would have greater scale, tractability, or neglectedness, since LEEP already scores insanely high on each of these dimensions.
So I hope CEARCH can ensure cost-effectiveness comparability, and if you’re picking up giant differences with existing EA interventions, you should be able to explain the main drivers of these differences (and it shouldn’t be because you’re using a different yardstick). Thanks!
Just to clarify, one should definitely expect cost-effectiveness estimates to drop as you put more time into them, and I don’t expect this cause area to be literally 1000x GiveWell. From past experience, headline cost-effectiveness always drops, and it’s just the optimizer’s curse: over- (or under-) performance comes partly from the cause area being genuinely better (or worse), but also partly from random error that you fix at deeper research stages. To be honest, I’ve come around to the view that publishing shallow reports—which are really just meant for internal prioritization—probably isn’t useful, insofar as they can be misleading.
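The optimizer's-curse dynamic is easy to demonstrate with a quick simulation: generate true cost-effectiveness values for many causes, add estimation noise, pick the top-ranked cause by its noisy estimate, and see that the winner's estimate overstates its true value on average. All distribution parameters here are arbitrary:

```python
# Simulation of the optimizer's curse: selecting the best-looking cause from
# noisy estimates inflates the winner's estimate on average. Numbers arbitrary.
import random

random.seed(0)
trials = 2000
total_inflation = 0.0
for _ in range(trials):
    true_values = [random.gauss(10, 2) for _ in range(50)]      # true CE of 50 causes
    estimates = [v + random.gauss(0, 5) for v in true_values]   # noisy shallow estimates
    best = max(range(50), key=lambda i: estimates[i])           # cause we'd prioritize
    total_inflation += estimates[best] - true_values[best]

avg_inflation = total_inflation / trials
print(avg_inflation)  # positive: the top-ranked estimate overstates the truth
```

This is why deeper research, which reduces the noise term, predictably pulls the headline figure down.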
As an example of how we discount more aggressively at deeper research stages, consider our intermediate hypertension report—there was a fairly large drop from around 300x to 80x GiveWell, driven by (among other things): (a) taking into account speeding-up effects, (b) downgrading confidence in advocacy success rates, (c) updating for more conservative costing, and (d) applying GiveWell-style epistemological discounts (e.g. taking into account a conservative null-hypothesis prior, or discounting for publication bias/endogeneity/selection bias, etc.).
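The way stacked discounts compound can be sketched as a product of multiplicative factors. The individual factors below are entirely made up; they are chosen only to show how four moderate discounts of the kinds listed above can together take a ~300x headline figure down to roughly 80x:

```python
# Hypothetical illustration of compounding discounts at a deeper research
# stage. Each factor below is invented; only the structure is the point.
headline = 300.0  # multiple of GiveWell Top Charities from the shallow report

discounts = {
    "speeding-up effect": 0.70,
    "lower advocacy success rate": 0.75,
    "more conservative costing": 0.80,
    "epistemological discounts (priors, publication bias, etc.)": 0.65,
}

adjusted = headline
for reason, factor in discounts.items():
    adjusted *= factor

print(round(adjusted))  # ~82x with these made-up factors
```

No single discount is dramatic, but because they multiply, the combined effect is a roughly 4x reduction.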
As for what our priors should be with respect to whether a cause can really be 100x GiveWell—I would say there’s a reasonable case for this if: (a) one targets NCDs and other diseases that grow with economic growth (instead of being solved by countries getting richer and improving sanitation/nutrition/healthcare systems etc.); and (b) there are good policy interventions available, because it really does matter that: (i) a government has enormous scale/impact; (ii) its spending is (counterfactually) cheap relative to EA money that would otherwise have gone to AMF and the like; and (iii) policy tends to be sticky, and so the impact lasts in a way that distributing malaria nets or treating depression may not.