We’ve considered this issue and discussed it internally; we spent some time last year exploring ways in which we might potentially adjust our models for it, but did not come up with any promising solutions (and, as the post notes, an explicit quantitative adjustment factor is not Chris’s recommended solution at this time).
So, we are left in a difficult spot: the optimizer’s curse (and related issues) seems like a real threat, but we do not see high-return ways to address it other than continuing to broadly deepen and question our research. In the case that Chris highlights most, our recommendation of deworming, we have put substantial effort into working along the lines that he recommends, and we continue to do so. Examples of the additional scrutiny that we have given to this recommendation include:
– Embracing model skepticism: We put weight on qualitative factors relevant to specific charities’ operations and specific uses of marginal funding (more). We generally try not to put too much weight on minor differences in cost-effectiveness analyses (more). We place substantial weight on cost-effectiveness analyses while doing what we can to recognize their limitations and bring in other forms of evidence.
– Re-examining our assumptions through vetting: We asked Senior Advisor David Roodman to independently assess the evidence for deworming, and he produced extensive reports with his thoughts: see here and here.
– Having conversations and engaging with a variety of deworming researchers, including skeptics in particular. For example, we’ve engaged with work from skeptical Cochrane researchers (see here and here), epidemiologist Nathan Lo, and Melissa Parker and Tim Allen (who looked at deworming from an anthropological perspective).
– Funding additional research with the goal of potentially falsifying our conclusions: see e.g. grants here and here.
We will continue to take high-return steps to assess whether our recommendations are justified. For example, this year we are deepening our assessment of how we should expect deworming’s effectiveness to vary in contexts with different levels of worm infection. It is also on our list to consider quantitative adjustments for the optimizer’s curse further at some point in the future, but given the challenges we encountered in our work so far, we are unlikely to prioritize it soon.
Finally, we hope to continue to follow discussions on the optimizer’s curse and would be interested in any theoretical progress or practical suggestions. As Chris notes, this seems to be a cross-cutting theoretical issue that applies to cause prioritization researchers outside of GiveWell as well.
FYI, I asked about this on GiveWell’s most recent open thread, and Josh replied: