For what it’s worth, I don’t specifically think our research estimating the cost-effectiveness of corporate chicken welfare work is more reliable/less biased than GiveWell’s research overall. I’d say:

- GiveWell makes prospective marginal cost-effectiveness estimates, and those are more reliable as estimates of prospective marginal cost-effectiveness (what we care about when planning donations and grants) than any of our animal welfare cost-effectiveness estimates, because GiveWell’s are based on stronger evidence about the kinds of interventions used (RCTs and meta-analyses of them, vs observational studies, fact checking and informal investigation) and are more carefully adjusted for future marginal work.
- EAAs (effective animal advocates) have made retrospective average cost-effectiveness estimates for corporate chicken welfare work. It’s hard to compare their reliability (for the times and contexts studied) to that of using GiveWell’s prospective estimates plus M&E for past impact. (The sketch after this list illustrates the two kinds of estimate.)
- GiveWell only pretty indirectly tracks the outcomes that matter morally, and could fail to adequately adjust RCT estimates by making wrong assumptions or missing differences from the contexts of the RCTs. As far as I know, GiveWell isn’t trying to verify the impacts of AMF or Malaria Consortium on the basis of measured malaria cases or deaths, which are the outcomes we actually care about, or at least are close to the measures of welfare we care about.
- For corporate chicken welfare work, we track the outcomes of interest more directly (cage-free eggs/hens, though not welfare itself) and use this information to estimate our impacts. That’s true both in the studies themselves, which reflect the work of the charities we’ve actually been supporting (unlike the RCTs behind most GiveWell recommendations, I’d guess), and in more informal tracking of commitments, progress on them, and the achievements of specific orgs. We make assumptions to estimate causal effects, and those could be off.
- And prospective marginal cost-effectiveness is more decision-relevant than retrospective cost-effectiveness for guiding donations and grants, which further favours GiveWell.
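To make the distinction concrete, here’s a minimal toy sketch (in Python, with entirely hypothetical numbers of my own, not GiveWell’s or any EAA’s actual figures) of the two kinds of estimate. The two outputs are in different units and aren’t comparable; the point is just the structure of each calculation.

```python
# Toy sketch of two kinds of cost-effectiveness estimate. All numbers
# are hypothetical, for illustration only.

# Prospective marginal estimate (GiveWell-style): start from an RCT
# effect size and discount it for internal validity, external validity
# (differences from the RCT contexts), and the expected quality of
# future marginal work.
rct_effect_per_dollar = 0.010   # outcome units per dollar, from RCTs
internal_validity = 0.95        # trust in the RCTs themselves
external_validity = 0.80        # adjustment for new contexts
marginal_quality = 0.85         # future marginal work may be worse
prospective_marginal = (rct_effect_per_dollar
                        * internal_validity
                        * external_validity
                        * marginal_quality)

# Retrospective average estimate (EAA-style for corporate campaigns):
# total outcomes attributed to past work divided by total past
# spending. This says nothing directly about the *next* dollar.
hen_years_attributed = 4e8      # via commitment and progress tracking
total_spending = 1e8            # dollars spent on the campaigns
retrospective_average = hen_years_attributed / total_spending

print(f"prospective marginal: {prospective_marginal:.4f} units/$")
print(f"retrospective average: {retrospective_average:.1f} hen-years/$")
```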
> This seems very ungenerous to the global health space:
>
> - Malaria nets are based on RCTs. Here’s a Cochrane review of 22 RCTs.
> - Against Malaria Foundation does quite intensive monitoring of uptake (not perfect, but you’re implying none).
> - New Incentives is based on an RCT and also monitors many metrics.
> - Malaria Consortium is also based on RCTs and does monitoring.
> - Seva and Fred Hollows track and publish their cataract surgery numbers.
> - Innovations for Poverty Action’s main purpose is to trial interventions and measure them.
Fair, they’re not just monitoring inputs, but also often proper use and implementation (on top of other factors that may affect cost-effectiveness, which they adjust for). However, that could still leave many potential holes along the way to the outcomes that actually matter in themselves.
In the case of AMF, compared to the RCTs, there can be changes to or differences in:

- the mosquitoes (insecticide resistance, other characteristics),
- the malaria organisms (Plasmodium),
- net characteristics/quality,
- net use,
- recipient reports, e.g. their honesty or interpretations,
- immune system responses to malaria and how resilient humans are to malaria or death by malaria in other ways (e.g. due to nutrition, exercise, exposure to other harmful things, other environmental factors, population (epi)genetics), and
- other things GiveWell or I haven’t thought of.
I think GiveWell (or the charities) tracks and/or makes some adjustments for some of these too, of course, and we might expect the others not to matter much at all, although several small unmodeled differences can compound, as the sketch below illustrates.
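As a rough illustration of that compounding worry, here’s a hedged Monte Carlo sketch. The factor names and ranges are assumptions I made up for illustration; each range is a plausible fraction of the RCT effect retained given that difference.

```python
import random

# Monte Carlo sketch: several modest, unmodeled differences from the
# RCT contexts compound multiplicatively. All ranges are hypothetical.
random.seed(0)

# (low, high) = plausible fraction of the RCT effect retained
factors = {
    "insecticide_resistance": (0.70, 1.00),
    "net_quality": (0.85, 1.00),
    "net_use": (0.75, 1.00),
    "recipient_reporting": (0.90, 1.05),
    "host_resilience": (0.90, 1.10),
}

def retained_fraction():
    retained = 1.0
    for low, high in factors.values():
        retained *= random.uniform(low, high)
    return retained

draws = sorted(retained_fraction() for _ in range(100_000))
mean = sum(draws) / len(draws)
print(f"mean fraction of RCT effect retained: {mean:.2f}")
print(f"5th-95th percentile: {draws[5_000]:.2f} to {draws[95_000]:.2f}")
```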
On the other hand, counting cataract surgeries seems pretty close to tracking an outcome that matters in itself, or that is itself close to welfare: improved vision.[1]
> That is how RCTs work. You can’t have a separate RCT for every situation unfortunately.
But we could have ongoing RCTs of GiveWell recommendations to check that the charities are still having important effects, although that may raise ethical issues at this point. Instead, M&E, observational research, fact checking and other investigation could be used to provide more independent evidence for outcomes closer to the ones we actually care about, and for our causal effects on them, as has been done for corporate chicken welfare work.
They could verify the severity of the cataract cases being treated and that the cataracts are actually cured by the surgeries (in a representative or random sample of treated individuals). (Maybe they do all this; I didn’t check.) Cataracts don’t go away on their own, so to estimate the causal effects on cataracts, we only need assumptions about how many people would have otherwise gotten (successful) treatment anyway, and when.
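Here’s a minimal sketch of that counterfactual adjustment, again with hypothetical numbers; the counterfactual treatment rate and the years of benefit are the assumptions doing the work.

```python
# Counterfactual adjustment for cataract surgeries. All numbers are
# hypothetical, for illustration only.
surgeries = 10_000            # surgeries performed by the program
success_rate = 0.95           # verified cured in a sampled follow-up
counterfactual_rate = 0.30    # fraction who'd have been treated anyway
years_per_cure = 15.0         # assumed remaining years of benefit
years_advanced = 4.0          # sight gained sooner for those who'd
                              # eventually have been treated anyway

cures = surgeries * success_rate
# Full credit for cures that would never have happened otherwise;
# for the rest, credit only the years by which treatment was advanced.
person_years = (cures * (1 - counterfactual_rate) * years_per_cure
                + cures * counterfactual_rate * years_advanced)
print(f"{person_years:,.0f} person-years of restored vision (toy numbers)")
```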
Still, the quality-of-life impacts of the cataract surgeries could also differ from those in the studies. There could be differences in social support. There could be differences in vision unrelated to cataracts, too: more nearsighted recipients would benefit less from cataract surgery without correction for nearsightedness, although I don’t expect differences in nearsightedness to matter much.