Hi Sjlver,
I’ve been thinking about this and I think you’re right: I do believe that running this replication trial passes a cost-benefit test, and I should try to explain why.
“How strongly do you really believe these results are wrong? And by how much?”
I think there’s a 50% chance that a perfectly done SMC replication would find mortality effects that are statistically indistinguishable from a null, for two reasons: 1) the documented empirical effects are strange and don’t gel with our underlying theory of malaria; 2) our theory also conflicts with the repeated observation that people living in extreme poverty don’t seem to take malaria as seriously as outsiders do, which is prima facie evidence that we’re misunderstanding something big.
My essay’s thesis is that SMC’s underlying RCT evidence, which is the foundation of GiveWell’s cost-benefit analysis, is weaker than it appears at first glance.
Does the use of meta-analysis somewhat or largely obviate this problem? In my opinion, no, aggregation does not paper over structural issues in the data generation process.
One of the most striking things my co-authors found when meta-analyzing the contact hypothesis literature was the gap in effect size between studies that had a pre-analysis plan (d = 0.016) and those that didn’t (d = 0.451). This obviously isn’t dispositive that there’s “no there there” with intergroup contact; but when subsequent high-quality studies on the subject found much more mixed results (e.g. here and here), at the very least, we can say we had a warning sign.
Can we supplement evidence that SMC reduces malaria cases with other putatively causal[1] evidence that intervening to reduce malaria leads to a sizeable reduction in deaths?
That depends on how seriously we take the argument that most published research findings are false. I myself take this very seriously, and I basically treat all research as provisional until it’s been validated through a seriously well-identified study.
I’m not saying that we don’t know that malaria causes deaths; we definitely know that people die of malaria. But why did the SMC studies find far fewer deaths overall, in both treatment and control groups, than expected?
Cissé et al. (2006) studied a region “where the mortality rate for children under 5 years of age is 40 deaths per 1000 children per year. Malaria accounts for about a quarter of deaths in those aged 1–5 years” (p. 660).
That study was on the very low end for “entomological inoculation rates (infective bites per person per year)” (Meremikwu et al. 2012 p.8), which ranged from 10 (Cisse 2006) to 173 bites (Konate 2011).
So let’s just take Cissé et al.’s estimate as a conservative baseline, and say that if you study 12,589 children in endemic/hyperendemic regions for a year,[2] you should expect about 500 deaths overall, with 125 of them attributable to malaria.
Instead, we get 26 deaths overall.
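The back-of-envelope arithmetic behind those numbers can be written out as a minimal sketch. It assumes a uniform one-year follow-up for all 12,589 children (follow-up actually varied across trials) and applies Cissé et al.’s under-5 mortality rate and malaria share to every child; the function name and parameters are purely illustrative.

```python
def expected_deaths(children, deaths_per_1000_per_year, malaria_share, years=1.0):
    """Return (all-cause deaths, malaria-attributable deaths) expected at baseline rates."""
    all_cause = children * deaths_per_1000_per_year / 1000 * years
    return all_cause, all_cause * malaria_share


# Inputs are taken from the comment above; one year of follow-up is an assumption.
all_cause, malaria = expected_deaths(
    children=12_589, deaths_per_1000_per_year=40, malaria_share=0.25
)
observed = 26

print(f"expected all-cause deaths: {all_cause:.0f}")
print(f"expected malaria deaths:   {malaria:.0f}")
print(f"observed / expected:       {observed / all_cause:.1%}")
```

Under these assumptions, the observed deaths come to roughly 5% of what the baseline rate would predict, which is the gap that needs explaining.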
To my eyes, this looks like a serious disconnect between theory and empirics, and it’s repeated across many settings. Frankly I have no idea what’s going on. Am I misunderstanding something fundamental here or have I made a mistake? What does GiveWell make of this? I take it you work for AMF—what do you make of it?
Back to my experience with the contact hypothesis: I treat “something weird that we don’t understand in the published findings” as a warning bell. So, personally, I think there is at least a 50% chance that a perfectly run SMC study today would find effects on mortality that are indistinguishable from a null.
Let’s say that trial cost $10M to do, and it affected the allocation of hundreds of millions of dollars. I think that passes a cost-benefit test across a wide range of supplementary parameter values.
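To make “a wide range of supplementary parameter values” a bit more concrete, here is a rough sketch of that calculation. It is deliberately simplistic: it assumes the trial is perfectly informative, that funding only moves if the trial finds a null, and that the benefit of moving can be summarized as a fractional gain per dollar reallocated. The $300M figure and the `gain_if_reallocated` parameter are placeholders for illustration, not numbers from GiveWell.

```python
def trial_net_value(trial_cost, funds_at_stake, p_null, gain_if_reallocated):
    """Expected net value of the trial, in dollar-equivalents of funding.

    gain_if_reallocated: hypothetical fractional gain per dollar moved to the
    next-best use if SMC turns out to have no mortality effect.
    """
    expected_benefit = p_null * funds_at_stake * gain_if_reallocated
    return expected_benefit - trial_cost


def break_even_gain(trial_cost, funds_at_stake, p_null):
    """Smallest fractional gain per reallocated dollar at which the trial pays for itself."""
    return trial_cost / (p_null * funds_at_stake)


# $10M trial and 50% chance of a null are from the comment; $300M is a placeholder
# standing in for "hundreds of millions of dollars".
print(f"break-even gain: {break_even_gain(10e6, 300e6, 0.5):.1%}")
print(f"net value at a 20% gain: ${trial_net_value(10e6, 300e6, 0.5, 0.2):,.0f}")
```

The break-even gain is the thing to vary: the lower the probability of a null or the smaller the funds at stake, the larger the gain from reallocation has to be before the trial is worth running.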
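GiveWell cautions us not to take its expected value estimates literally, which is why I don’t take its 5K per life saved estimate as a baseline.
They also say that they only want to recommend charities that meet a benchmark of being 10X as cost-effective as cash.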
In effect, GiveWell is saying that if you give someone living in extreme poverty $10, the things they spend it on will only give them 10% of the utility that they might have gotten if someone else had chosen their bundle of goods for them.
This is actually an extraordinary claim and thus requires extraordinary evidence. It would definitely raise your hackles if someone said it about you: that you’re leaving 90% of potential value on the table because you don’t know what’s best for yourself.
Are we saying that they don’t have access to the correct bundle of goods for distribution/infrastructure reasons?
A few of the SMC studies note that people have bed nets but that they’re in bad shape; others have found that “widely distributed ITNs have been repurposed as fishing nets throughout the world” (Larsen et al. 2021).
At face value, this seems like evidence that people in extreme poverty don’t value malaria treatments nearly as much as GiveWell does. What’s going on there? Is it because they’re ignorant or short-sighted? Or is it because Westerners are convinced of a theory that conflicts with people’s lived experiences—and also with the actual empirical evidence found by SMC studies? I don’t know, and I find these theories about equally plausible.
I know this was all very approximate for a cost-benefit analysis, but IMO, to be more specific we need a stronger basis for our assumptions about effect sizes than we currently have.
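[1] Putative because I’m pretty sure it doesn’t come from human challenge trials, i.e. malaria was not actually the thing randomly assigned. FWIW, I don’t think that such a trial would pass a cost-benefit test.
[2] “The length of follow-up for the included trials varied from six months to two years; with one year being most common” (Meremikwu et al. 2012, p. 8).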
I appreciate the thoughts! I’m going to think about this more thoroughly… but here’s a quick guess about the low death numbers:
These trials involved measuring malaria prevalence in children. Presumably, children with a positive result would then get medication or be referred to a health center. Malaria is a curable disease, so this approach would save lives, in the control group as well as the treatment group. Unfortunately, outside of a trial, it’s quite likely that a child would not receive appropriate treatment for lack of a diagnosis, due to the parents’ lack of knowledge, distance to health facilities, etc.
Anyway, it’s just a quick guess. Might be worth checking if the studies describe what happened to children with positive test results.
Looks like I can confirm this. Relevant passages from Cissé et al. (2006):
The study was designed to measure malaria, not deaths:
“The primary outcome measure was a comparison of the occurrence of clinical malaria between children in the two study groups.”
Children with positive malaria tests received treatment:
“Malaria morbidity was monitored through home visits every week and by detection of study children who presented at one of three health centres in the study area. At each assessment, axillary temperature was measured, and if it was 37.5°C or greater, or if there was a history of fever or vomiting during the previous 24 h, a blood film was prepared. Results of the blood film examination were usually available within 2 h. Antimalarial treatment was given when appropriate according to the national guidelines: chloroquine as first-line treatment, quinine or sulfadoxine-pyrimethamine as second-line treatment in cases of failure of treatment with chloroquine, and injectable quinine for cases with persistent vomiting or severe malaria. Study children received iron supplementation if they presented at a health centre with an illness suggestive of anaemia, pale mucosae, or both.”
I’ll still think more about this… but here we have at least a lead towards a better understanding of the low death numbers in SMC trials.
Thank you for looking into it! Definitely interesting. To recap:
GiveWell’s cost-benefit calculations hinge on the relationship between SMC and mortality.
The key mediator there is cases of malaria.
In the provided studies, the estimated relationship between cases of malaria and deaths is likely to be downwardly biased because of co-delivered interventions (ITN, HMM, and, as you’ve identified, just more attentiveness to malaria in general in treated areas).
As SMC is rolled out, is it rolled out along with more general medical care, or without? With co-interventions, or without? This seems like the key question we don’t have a handle on and that GiveWell’s materials don’t shine much light on.
Let’s say it’s rolled out along with general medical care. In that case, what’s actually doing the work in reducing mortality, SMC or medical care? And which set of costs (SMC, medical care, or the two combined) should factor into the $-per-life-saved calculation?
Let’s say it’s rolled out without that general medical care. In that case, do we really have a good estimate of the expected effect of SMC alone on mortality? That seems like the number GiveWell is basing its top charity title on, and at first glance, it’s really not clear what percentage of the research actually estimates that directly.
So in sum, either SMC is typically going to be rolled out in places/contexts where its effect on mortality is likely to be much lower than broader data about the relationship between malaria and mortality would suggest, which means that our $-per-life-saved metrics might be seriously off-base; or it will be rolled out in places that are very much unlike the settings in which the studies were run, which is a serious external validity problem.
So all in all, a confusing situation. And given the high stakes, I suggest that GiveWell tap a team with expertise in both the subject matter and RCTs to design and run a trial that maps directly onto the target population.
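A toy sketch of why the two rollout scenarios above matter for the $-per-life-saved figure, assuming SMC affects mortality only through averted cases. Every number here is a made-up placeholder chosen to show the mechanics, not an estimate from any study or from GiveWell, and the function name is hypothetical.

```python
def cost_per_life_saved(cases_averted, deaths_per_case, smc_cost, care_cost=0.0):
    """Dollars per death averted when mortality works only through averted cases."""
    deaths_averted = cases_averted * deaths_per_case
    return (smc_cost + care_cost) / deaths_averted


# All numbers below are made-up placeholders, not estimates from any study.
CASES_AVERTED = 1_000
SMC_COST = 100_000.0
CARE_COST = 60_000.0
FATALITY_WITH_CARE = 0.001     # deaths per case when cases get diagnosed and treated
FATALITY_WITHOUT_CARE = 0.005  # deaths per case without that care

scenarios = {
    "with care, SMC costs only": cost_per_life_saved(
        CASES_AVERTED, FATALITY_WITH_CARE, SMC_COST
    ),
    "with care, combined costs": cost_per_life_saved(
        CASES_AVERTED, FATALITY_WITH_CARE, SMC_COST, CARE_COST
    ),
    "without care, SMC costs only": cost_per_life_saved(
        CASES_AVERTED, FATALITY_WITHOUT_CARE, SMC_COST
    ),
}
for name, dollars in scenarios.items():
    print(f"{name}: ${dollars:,.0f} per life saved")
```

The point is only that the same number of averted cases yields very different costs per life saved depending on the deaths-per-case figure and on which costs are counted, and the trials pin down neither for the rollout setting.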
Two postscripts:
1) Just curious, is this the kind of thing y’all discuss day-to-day at AMF? I’m very curious to hear from practitioners on this kind of thing. I am a total outsider who happened to notice that the evidence in the Cochrane review didn’t map very neatly onto GiveWell’s analyses. Would love to know the ‘insider’ perspective a bit more.
2) DataColada just published something about some structural issues with conventional meta-analysis that might be of interest.
Thanks for the thoughts!
I think we are getting closer to the core of your question here: the relationship between cases of malaria (or severe malaria more specifically) and deaths. I think that it would indeed be good to know more about the circumstances under which children die from malaria, and how this is affected by various kinds of medical care.
The question might partially touch upon SMC. Besides preventing malaria cases, it could also have an effect on severity (I’m thinking of Covid vaccines as an analogy). That said, the case for SMC (as I understand it) is that it’s an excellent way to prevent malaria infections. This is what the RCTs measure, and this is where its value comes from.
To answer the question, I believe it would be more helpful to do research into malaria as an illness, rather than doing an SMC trial replication. I continue to think that the evidence base for SMC is good enough. You have doubts since “most published research findings are false”, but “most published research findings” might be the wrong reference class here:
It includes observational studies, surveys, and other less reliable methods; here, we have RCTs.
It includes all published studies, including those with small samples and small effect sizes. Here, we have >7 trials, >12k participants, and the effect (SMC’s reduction of malaria episodes) is >6 standard deviations away from zero.
It includes studies with effects that are multiple causal steps away from the intervention (e.g., deworming improves income) and have many confounding factors. Here, we are measuring the effect of a malaria medication on malaria, with clearly-understood underlying mechanisms.
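To put the “>6 standard deviations” figure in more familiar terms, here is a minimal sketch that converts a z-score into a two-sided p-value. It assumes the pooled estimate is approximately normally distributed under the null; the function name is hypothetical. Note that this concerns the effect on malaria episodes, not on mortality.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


# 1.96 is the conventional 5% threshold; 6.0 is the lower bound quoted above.
for z in (1.96, 6.0):
    print(f"z = {z}: p = {two_sided_p_value(z):.2e}")
```

At z = 6 the p-value is on the order of 1e-9, so chance alone is a very unlikely explanation for the reduction in episodes, whatever one thinks about the other concerns raised in this thread.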
You also ask about the settings in which SMC is rolled out. There is no specific answer here, since SMC is often rolled out for entire countries or regions, aiming to fully cover all eligible children. More than 30 million children received SMC last year. In their cost-effectiveness analysis, GiveWell looks at interventions by country and takes a number of relevant factors into account, such as the “mortality rate from malaria for 3-59 month olds”.
In general, malaria fatality (deaths per case) is trending downwards a bit, due to factors such as better access to medical care, better diagnosis, better education of parents, and certainly many others. It could make sense to make this explicit when doing a cost-effectiveness analysis.
I’d expect GiveWell to be mindful of these things and to have thought of the most relevant factors. I don’t think additional RCTs would lead to large changes here.
Regarding the post-script about AMF: We are fortunate to have a board of trustees and leaders that think a lot about high-level questions and trends, both those closer to AMF’s work (e.g., resistance to insecticides used in nets) and those more peripheral (e.g., the impact of new vaccines). There is also good and regular communication between GiveWell and AMF. As for myself, the day-to-day preoccupations are often much more mundane ;-)
Thanks as always for your careful and helpful read! I was just telling someone yesterday that this exchange is a positive reflection on the EA community and ethos — as a comparison point, it’s been way more constructive and collaborative than any of my experiences with academic peer review.
It sounds like I haven’t changed your mind on the core subject and that’s totally understandable. I speculate that this is something of a (professional) culture difference — the academics I discussed this essay with all started nodding along with the general idea the moment I mentioned “uncertainty about external validity” 😃
And thanks for the insight into AMF, y’all do great work.