I’m belatedly making an overall comment about this post.
I think this was a valuable contribution to the discussion around charity evaluation. We agree that StrongMinds’ figures about their effect on depression are overly optimistic. We erred by not pointing this out in our previous work and not pushing StrongMinds to cite more sensible figures. We have raised this issue with StrongMinds and asked them to clarify which claims are supported by causal evidence.
There are some other issues that Simon raises, like social desirability bias, that I think are potential concerns. The literature we reviewed in our StrongMinds CEA (page 26) doesn’t suggest it’s a large issue, but I only found one study that directly addresses this in a low-income country (Haushofer et al., 2020), so the evidence appears very limited here (but let me know if I’m wrong). I wouldn’t be surprised if more work changed my mind on the extent of this bias. However, I would be very surprised if this alone changed the conclusion of our analysis. As is typically the case, more research is needed.
Having said that, I have a few issues with the post and see it as more of a conversation starter than the end of the conversation. I respond to a series of quotes from the original post below.
“I’m going to leave aside discussing HLI here. Whilst I think they have some of the deepest analysis of StrongMinds, I am still confused by some of their methodology, it’s not clear to me what their relationship to StrongMinds is.”
If there’s confusion about our methodology, that’s fair, and I’ve tried to be helpful in that regard. Regarding our relationship with StrongMinds, we’re completely independent.
“The key thing to understand about the HLI methodology is that it follows the same structure as the Founders Pledge analysis and so all the problems I mention above regarding data apply just as much to them as FP.”
This is false. As we’ve explained before, our evaluation of StrongMinds is primarily based on a meta-analysis of psychological interventions in LMICs. That is a key distinction between our work and Founders Pledge’s, and it means that many of the problems mentioned apply less to our work.
I also have some issues with the claims this post makes. I’ll focus on Simon’s summary of his argument:
“I think the strongest statement I can make (which I doubt StrongMinds would disagree with) is: ‘StrongMinds have made limited effort to be quantitative in their self-evaluation, haven’t continued monitoring impact after intervention, haven’t done the research they once claimed they would. They have not been vetted sufficiently to be considered a top charity, and only one independent group has done the work to look into them.’”
Below, I respond to each of these claims in turn.
“I think the strongest statement I can make (which I doubt StrongMinds would disagree with) is – ”
I think StrongMinds would, in fact, disagree with this statement, so the parenthetical strikes me as overconfident.
“StrongMinds have made limited effort to be quantitative in their self-evaluation, haven’t continued monitoring impact after intervention…”
If quantitative means “RCTs”, then sure, but until very recently they collected a depression score from every participant before and after treatment (which in 2019 meant n = 28,294; unpublished data shared with me during their evaluation). StrongMinds also followed up 18 months after their initial trial, and in 2019 they followed up with 300 participants six months after they received treatment (again, unpublished data). I take that as at least a sign that they’re trying to quantitatively evaluate their impact, even if they could do much better (which I agree they could).
“[StrongMinds] haven’t done the research they once claimed they would.”
I’m a bit confused by this point. The more accurate claim seems to be, “they haven’t done the research they once claimed they would as quickly as expected.” As Simon pointed out, there’s an RCT by Baird et al. whose results should be released soon. From conversations we’ve had with StrongMinds, they’re also planning to start another RCT in 2023. I also know that they completed a controlled trial in 2020 (possibly randomised, I’m still unsure) with six-month and one-year follow-ups. However, I agree that StrongMinds could and should invest in collecting more causal data. I just don’t think the situation is as bleak as it has been made out to be, given that running an RCT can be an enormous undertaking.
“They have not been vetted sufficiently to be considered a top charity, and only one independent group has done the work to look into them.”
This either means (a) only Founders Pledge has evaluated StrongMinds, which is wrong, or (b) HLI doesn’t count because we are not independent, which would be both wrong and uncharitable.
“Based on Phase I and II surveys, it seems to me that a much more cost-effective intervention would be to go around surveying people. I’m not exactly sure what’s going on with the Phase I / Phase II data, but the best I can tell is in Phase I we had a ~7.5 vs ~5.1 PHQ-9 reduction from “being surveyed” vs “being part of the group” and in Phase II we had ~5.1 vs ~4.5 PHQ-9 reduction from “being surveyed” vs “being part of the group”. For what it’s worth, I don’t believe this is likely the case, I think it’s just a strong sign that the survey mechanism being used is inadequate to determine what is going on.”
I think this could have a pretty simple explanation. StrongMinds used a linear model to estimate: depression reduction = group + sessions. If the true relationship between sessions and depression reduction is non-linear, which is what the graphs provided in the post suggest, a linear fit will produce a non-zero intercept, and that intercept is then easy to misread as the effect of merely “being surveyed”.
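To illustrate the mechanism, here is a minimal sketch with made-up numbers (not StrongMinds’ data or their actual model): the assumed true effect of sessions on PHQ-9 reduction is concave and exactly zero at zero sessions, yet a straight-line fit still produces a clearly positive intercept, which could be mistaken for a “being surveyed” effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: sessions attended (1-12) and PHQ-9 reduction.
# The assumed true effect is concave (diminishing returns) and would be
# exactly zero at zero sessions.
sessions = rng.integers(1, 13, size=500)
true_reduction = 6 * np.sqrt(sessions / 12)
observed = true_reduction + rng.normal(0, 1, size=500)  # add noise

# Fit the straight line: reduction = intercept + slope * sessions
slope, intercept = np.polyfit(sessions, observed, deg=1)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
# The intercept comes out clearly above zero even though the true effect at
# zero sessions is zero, so a naive reading attributes that reduction to
# something other than the sessions themselves.
```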