How does the meta-analysis avoid the garbage-in-garbage-out problem? Are you simply averaging across studies, or do you weight by study quality (e.g. sample size, being pre-registered, etc.)? Did you consider replicating the individual studies?
Do you worry about effect sizes decreasing as StrongMinds scales up? E.g., they start targeting a different population where therapy has smaller effects.
One quibble: “post-treatment effect” sounds weird, I would just call it a “treatment effect”.
I share this concern. I don’t have much of a baseline for how much meta-analyses overstate effect sizes, but I suspect it is substantial.
One comparison I do know about: as of about 2018, the average effect size of unusually careful studies funded by the EEF (https://educationendowmentfoundation.org.uk/projects-and-evaluation/projects) was 0.08, while the mean of meta-analytic effect sizes overall was allegedly 0.40 (https://visible-learning.org/hattie-ranking-influences-effect-sizes-learning-achievement/), suggesting that meta-analysis in that field on average yields effect sizes about five times higher than is realistic.
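To make the arithmetic explicit, here is the implied inflation as a single ratio (a trivial check using only the two figures cited above; treating it as one overall factor is my own simplification):

```python
# Rough check of the implied inflation between the unusually careful
# EEF-funded evaluations and the mean meta-analytic effect size cited above.
eef_mean = 0.08    # average effect size of EEF-funded studies (as of ~2018)
meta_mean = 0.40   # mean meta-analytic effect size (Hattie's ranking)

print(f"Implied inflation: {meta_mean / eef_mean:.1f}x")  # about 5x
```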
The point is, these concerns cannot be dealt with simply by suggesting that they won’t make enough difference to change the headline result; in fact they could.
If this issue was addressed in the research discussed here, it’s not obvious to me how it was done.
GiveWell rated the evidence of impact for GiveDirectly as “Exceptionally strong”, though it’s not clear exactly what this means with regard to the credibility of studies that estimate the size of the effect of cash transfers on wellbeing (https://www.givewell.org/charities/top-charities#cash). Nevertheless, if a charity were being penalized in such comparisons for doing rigorous research, then I would expect to see assessments like “strong evidence, lower effect size”, which is what we see here.
Hi Michael,
I try to avoid the problem by discounting the average effect of psychotherapy. The point isn’t to try to find the “true effect”. The goal is to adjust for the risk of bias present in psychotherapy’s evidence base relative to the evidence base of cash transfers. We judge the cash transfer evidence to be higher quality. Psychotherapy has smaller sample sizes on average and fewer unpublished studies, both of which are related to larger effect sizes in meta-analyses (MetaPsy, 2020; Vivalt, 2020; Dechartres et al., 2018; Slavin et al., 2016). FWIW I discuss this more in appendix C of the psychotherapy report.
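To illustrate what I mean by discounting, here is a minimal sketch with made-up numbers; the discount value and its structure are hypothetical, not the figures or method from the report:

```python
# Illustrative only: shrink a pooled meta-analytic effect by a discount that
# reflects the relative risk of bias in the evidence base. The 0.5 SD effect
# and 20% discount below are hypothetical placeholders.

def adjust_for_bias(pooled_effect_sd: float, discount: float) -> float:
    """Return the pooled effect (in standard deviations) after a
    multiplicative bias discount."""
    return pooled_effect_sd * (1 - discount)

raw_effect = 0.5                                    # hypothetical pooled effect
adjusted = adjust_for_bias(raw_effect, discount=0.20)
print(f"Bias-adjusted effect: {adjusted:.2f} SD")   # 0.40 SD
```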
I should note that I think the tool I use needs development. Detecting and adjusting for the bias present in a study is a more general problem in social science.
I do worry about the effect sizes decreasing, but the hope is that the cost will drop to a greater degree as StrongMinds scales up.
We say “post-treatment effect” because it makes clear the time point we are discussing. “Treatment effect” could refer either to the post-treatment effect or to the total effect of psychotherapy, where the total effect is the decision-relevant effect.