The point is that these concerns cannot be dealt with simply by asserting that they won’t make enough difference to change the headline result; in fact, they could.
If this issue was addressed in the research discussed here, it isn’t obvious to me how.
GiveWell rated the evidence of impact for GiveDirectly as “Exceptionally strong”, though it’s not clear exactly what this means for the credibility of the studies that estimate the size of the effect of cash transfers on wellbeing (https://www.givewell.org/charities/top-charities#cash). Nevertheless, if a charity were being penalized in such comparisons for doing rigorous research, then I would expect to see assessments like “strong evidence, lower effect size”, which is what we see here.
I share this concern. I don’t have much of a baseline for how much meta-analyses overstate effect sizes, but I suspect the overstatement is substantial.
One comparison I do know about: as of about 2018, the average effect size across the unusually careful studies funded by the EEF (https://educationendowmentfoundation.org.uk/projects-and-evaluation/projects) was 0.08, while the mean of meta-analytic effect sizes overall was allegedly 0.40 (https://visible-learning.org/hattie-ranking-influences-effect-sizes-learning-achievement/), suggesting that meta-analysis in that field yields effect sizes roughly five times larger, on average, than is realistic.
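To spell out the arithmetic behind that ratio, taking both reported averages at face value (this is only a rough comparison, since the two sets of studies don’t cover identical interventions):

\[
\frac{0.40}{0.08} = 5
\]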