I found (I think) the spreadsheet for the included studies here. I did a lazy replication (i.e. excluding duplicate follow-ups from studies, only including the 30 studies where 'raw' means and SDs were extracted, then plugging this into metamar). I copy and paste the (random effects) forest plot and funnel plot below; doubtless you would be able to perform a much more rigorous replication.
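For anyone who wants to poke at this themselves, here is a minimal sketch of the kind of computation involved: Hedges' g from raw means and SDs, pooled with a DerSimonian-Laird random-effects model. All numbers are invented for illustration; they are not the extracted study data.

```python
import numpy as np

# Made-up raw summaries for a handful of studies (NOT the spreadsheet data):
# (treatment mean, SD, n, control mean, SD, n); lower scores = less depression.
studies = [
    (10.2, 4.0, 60, 13.5, 4.2, 58),
    (11.0, 5.1, 35, 12.1, 4.8, 40),
    (9.5, 3.9, 120, 12.8, 4.1, 115),
]

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    # Standardized mean difference with Hedges' small-sample correction.
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m2 - m1) / sp  # a drop in the depression score counts as a positive effect
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j**2 * var_d

effects, variances = map(np.array, zip(*(hedges_g(*s) for s in studies)))

# DerSimonian-Laird random-effects pooling.
w = 1 / variances
mu_fe = (w * effects).sum() / w.sum()
q = (w * (effects - mu_fe) ** 2).sum()
c = w.sum() - (w**2).sum() / w.sum()
tau2 = max(0.0, (q - (len(effects) - 1)) / c)
w_re = 1 / (variances + tau2)
mu_re = (w_re * effects).sum() / w_re.sum()
se_re = np.sqrt(1 / w_re.sum())
print(f"pooled g = {mu_re:.2f} (SE {se_re:.2f}), tau^2 = {tau2:.3f}")
```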
This is why we like to see these plots! Thank you, Gregory, though this should not have been on you to do.
Having results like this underpin a charity recommendation and not showing it all transparently is a bad look for HLI. Hopefully there has been a mistake in your attempted replication and that explains e.g. the funnel plot. I look forward to reading the responses to your questions to Joel.
I'd love to hear which parts of my comment people disagree with. I think the following points, which I tried to make in my comment, are uncontentious:
The plots I requested are indeed informative, and they cast some doubt on the credibility of the original meta-analysis
Basic meta-analysis plots like a forest or funnel plot, which are incredibly common in meta-analyses, should have been provided by the authors rather than made by community members
Relatedly, transparency about the strength and/or quality of evidence underpinning a charity recommendation is good (not checking the strength or quality of evidence is bad, as is not sharing that information if one did check)
The funnel plot looks very asymmetric as well as just weird, and it would be nice if this was due to e.g. data entry mistakes by Gregory as opposed to anything else
I didn't vote, but people may feel "not showing it all transparently is a bad look for HLI" is a little premature and unfriendly without allowing HLI time for a response to fresh analysis.
Thank you for responding, Jason. That makes sense. The analysis in question here was done in Oct 2021, so I do think there was enough time to check a funnel plot for publication bias or odd heterogeneity. I really do think it's a bad look if no one checked for this, and it's a worse look if people checked and didn't report it. This is why I hope the issue is something like data entry.
Your core point is still fair, though: there might be other explanations for this that I'm not considering, so while waiting for clarification from HLI I should be clear that I'm agnostic on motives or anything else. Everyone here is trying.
Hi Ryan,

Our preferred model uses a meta-regression with follow-up time as a moderator, not the typical "average everything" meta-analysis. Because of my experience presenting the cash transfers meta-analysis, I wanted to avoid people fixating on the forest plot and getting confused about the results, since it's not the takeaway result. But in hindsight I think it probably would have been helpful to include the forest plot somewhere.
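For readers unfamiliar with the setup, a rough sketch of what "follow-up time as a moderator" means in practice. This is not HLI's actual model or data: it uses plain weighted least squares as a stand-in for a proper meta-regression (e.g. metafor's rma() in R), ignores the between-study variance tau^2, and the numbers are invented.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical per-follow-up data points: Hedges' g, its variance,
# and the follow-up time in months.
g = np.array([0.9, 0.7, 0.5, 0.4, 0.6, 0.3])
v = np.array([0.04, 0.03, 0.02, 0.02, 0.05, 0.01])
months = np.array([1, 3, 6, 12, 2, 24])

# Inverse-variance weighted regression of effect on follow-up time,
# so the model estimates how the effect decays rather than one average.
X = sm.add_constant(months)
fit = sm.WLS(g, X, weights=1 / v).fit()
print(fit.params)  # [effect at t = 0, change in g per month]
```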
I don't have a good excuse for the publication bias analysis. Instead of making a funnel plot, I embarked on a quest to find a more general system for adjusting for biases between intervention literatures. This was, perhaps unsurprisingly, an incomplete piece of work that failed to achieve many of its aims (see Appendix C), but it did lead to a discount of psychotherapy's effects relative to cash transfers. In hindsight, I see the time spent on that mini-project as a distraction. In the future I think we will focus on using extant methods to adjust for publication bias quantitatively.
Part of the reasoning was that we weren't trying to do a systematic meta-analysis, but a quicker version on a convenience sample of studies. As we said on page 8: "These studies are not exhaustive (footnote: There are at least 24 studies, with an estimated total sample size of 2,310, we did not extract. Additionally, there appear to be several protocols registered to run trials studying the effectiveness and cost of non-specialist-delivered mental health interventions.). We stopped collecting new studies due to time constraints and the perception of diminishing returns."
I wasn't sure if a funnel plot was appropriate when applied to a non-systematically selected sample of studies. As I've said elsewhere, I think we could have made the depth (or shallowness) of our analysis clearer.
so I do think there was enough time to check a funnel plot for publication bias or odd heterogeneity
While it's technically true that there was enough time, it certainly doesn't feel like it! HLI is a very small research organization (from 2020 through 2021 I was pretty much the lone HLI empirical researcher), and we have to constantly balance exploring new cause areas / searching for interventions against updating / improving previous analyses. It feels like I hit publish on this yesterday. I concede that I could have done better, and I plan to do so in the future, but this balancing act is an art. It sometimes takes conversations like this to put items on our agenda.
FWIW, here are some quick plots I cooked up with the cleaner data. Some obvious remarks:
The StrongMinds-relevant studies (Bolton et al., 2003; Bass et al., 2006) appear to be unusually effective (outliers?).
There appears to be more evidence of publication bias than was the case with our cash transfers meta-analysis (see last plot).
I also added a p-curve. What you don't want to see is a larger number of studies at the 0.05 significance level than at the 0.04 level, but that's what you see here.

Here are the cash transfer plots for reference:
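(For concreteness, here is a minimal sketch of how the p-curve above can be tabulated. The effect sizes and standard errors are made up purely to show the mechanics; none of this is the actual study data.)

```python
import numpy as np
from scipy import stats

# Made-up effects and standard errors, deliberately clustered so that
# most p-values land just under 0.05, the worrying pattern noted above.
g = np.array([0.42, 0.38, 0.55, 0.31, 0.49, 0.35, 0.60])
se = np.array([0.20, 0.18, 0.26, 0.15, 0.24, 0.17, 0.29])

# Two-sided p-value for each study, then counts per 0.01-wide bin.
# In a healthy literature most significant p-values pile up near 0.01;
# a spike just below 0.05 suggests selection on significance.
p = 2 * stats.norm.sf(np.abs(g / se))
bins = np.arange(0.0, 0.051, 0.01)
counts, _ = np.histogram(p[p < 0.05], bins=bins)
for lo, hi, n in zip(bins[:-1], bins[1:], counts):
    print(f"p in ({lo:.2f}, {hi:.2f}]: {n} studies")
```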
Thank you for sharing these, Joel. You've got a lot going on in the comments here, so I'm going to make only a few brief specific comments and one larger one. The larger one relates to something you've noted elsewhere in the thread, which is:
"That the quality of this analysis was an attempt to be more rigorous than most shallow EA analyses, but definitely less rigorous than a quality peer-reviewed academic paper. I think this [...] is not something we clearly communicated."
This work forms part of the evidence base behind some strong claims from HLI about where to give money, so I did expect it to be more rigorous. I wondered if I was alone in being surprised here, so I ran a very informal (n = 23!) Twitter poll in the EA group asking what people expected re: the rigor of evidence behind charity recommendations. (I fixed my stupid Our World in Data autocorrect glitch in a follow-up tweet.)
I don't want to lean on this too much, but I do think it suggests that I'm not alone in expecting a higher degree of rigor when it comes to where to put charity dollars. This is perhaps mostly a communication issue, but I also think that as the quality of analysis and evidence becomes less rigorous, claims should be toned down, or at least the uncertainty (in the broad sense) needs to be more strongly expressed.
On the specifics, first, I appreciate you noting the apparent publication bias. That's both important and not great.
Second, I think comparing the cash transfer funnel plot to the other one is informative. The cash transfer one looks "right": it has the correct shape, and it's comforting to see the Egger regression line is basically zero. This is definitely not the case with the StrongMinds MA. The funnel plot looks incredibly weird, which could be heterogeneity that we can model, but it should regardless make everyone skeptical, because doing that kind of modelling well is very hard. It's also rough to see that if we project the Egger regression line back to the origin, the predicted effect when the SE is zero is basically zero. In other words, unwinding publication bias in this way would lead us to guess at a true effect of around nothing. Do I believe that? I'm not sure. There are good reasons to be skeptical of Egger-type regressions, but all of this definitely increases my skepticism of the results. While I'm glad it's public now, I don't feel great that this wasn't part of the very public first cut of the results.
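To spell out the "projection back to the origin" point, here is a minimal sketch of the Egger-style regression, with invented numbers chosen so the pattern resembles the one described (small studies reporting big effects):

```python
import numpy as np
import statsmodels.api as sm

# Illustrative numbers only, not the meta-analysis data.
g = np.array([0.9, 0.8, 0.6, 0.5, 0.3, 0.25, 0.15])
se = np.array([0.40, 0.35, 0.28, 0.22, 0.15, 0.12, 0.08])

# Weighted regression of effect size on standard error. The slope is
# the Egger-style asymmetry term; the intercept is the predicted effect
# at SE = 0, i.e. the projection back to the origin (what the
# publication-bias literature calls the PET estimate).
fit = sm.WLS(g, sm.add_constant(se), weights=1 / se**2).fit()
print(f"asymmetry slope: {fit.params[1]:.2f} (p = {fit.pvalues[1]:.3f})")
print(f"predicted effect at SE = 0: {fit.params[0]:.2f}")
```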
Again, I appreciate you responding. I do think going forward it would be worth taking seriously community expectations about what underlies charity recommendations, and if something is tentative or rough then I hope that it gets clearly communicated as such, both originally and in downstream uses.
Interesting poll, Ryan! I'm not sure how much to take away from it, because I think "epistemic / evidentiary standards" is a pretty fuzzy concept in the minds of most readers. But still, point taken that people probably expect high standards.
It's also rough to see that if we project the Egger regression line back to the origin then the predicted effect when the SE is zero is basically zero.
I'm not sure about that. Here's the output of the Egger test. If I'm interpreting it correctly, the implied effect is smaller, but not zero. I'll try to figure out what the p-curve-suggested correction says.
Edit: I'm also not sure how much to trust the Egger test to tell me what the corrected effect size should be, so this wasn't an endorsement that I think the real effect size should be halved. It seems different ways of making this correction give very different answers. I'll add a further comment with more details.
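As one illustration of how much the answer depends on the correction chosen, here is a sketch (again with invented numbers) of two common regression-based adjustments, PET and PEESE, which often disagree:

```python
import numpy as np
import statsmodels.api as sm

# Invented data: small studies show larger effects.
g = np.array([0.9, 0.8, 0.6, 0.5, 0.3, 0.25, 0.15])
se = np.array([0.40, 0.35, 0.28, 0.22, 0.15, 0.12, 0.08])

w = 1 / se**2
pet = sm.WLS(g, sm.add_constant(se), weights=w).fit()       # effect ~ SE
peese = sm.WLS(g, sm.add_constant(se**2), weights=w).fit()  # effect ~ SE^2
print(f"PET-adjusted effect:   {pet.params[0]:.2f}")
print(f"PEESE-adjusted effect: {peese.params[0]:.2f}")
```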
I do think going forward it would be worth taking seriously community expectations about what underlies charity recommendations, and if something is tentative or rough then I hope that it gets clearly communicated as such, both originally and in downstream uses.

Seems reasonable.

Fair re: Egger. I just eyeballed the figure.