My analysis of StrongMinds is based on a meta-analysis of 39 RCTs of group psychotherapy in low-income countries. I didn’t rely on StrongMinds’ own evidence alone; I incorporated the broader evidence base from other, similar interventions too. This strikes me, in a Bayesian sense, as the sensible thing to do.
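As a minimal sketch of the kind of Bayesian combination I mean, here is a precision-weighted pooling of a charity-specific estimate with a meta-analytic prior. The standard errors below are purely illustrative placeholders, not figures from my analysis:

```python
# Precision-weighted (normal-normal) combination of a charity-specific
# estimate with a broader meta-analytic prior. All numbers are illustrative.

prior_mean, prior_se = 0.50, 0.10      # hypothetical meta-analytic effect (in SDs)
charity_mean, charity_se = 0.88, 0.25  # hypothetical charity-specific estimate

w_prior = 1 / prior_se**2      # precision of the prior
w_charity = 1 / charity_se**2  # precision of the charity-specific evidence

posterior_mean = (w_prior * prior_mean + w_charity * charity_mean) / (w_prior + w_charity)
posterior_se = (w_prior + w_charity) ** -0.5

print(f"posterior effect ~ {posterior_mean:.2f} SDs (se ~ {posterior_se:.2f})")
# ~0.55 SDs here: the pooled estimate sits between the two sources,
# pulled toward whichever is more precise.
```

The point is just that a noisy charity-specific estimate gets shrunk toward the broader evidence base rather than taken at face value.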
I agree, but as we have already discussed offline, I disagree with some of the steps in your meta-analysis, and think we should be using effect sizes smaller than the ones you have arrived at. I certainly didn’t mean to claim in my post that StrongMinds has no effect, just that its effect is small enough that we are looking at numbers on the order of (or lower than) cash transfers, and therefore it doesn’t meet the bar of “Top Charity”.
I think Simon would define “strong evidence” as recent, high-quality, and charity-specific. If that’s the case, I think that’s too stringent. That standard would imply that GiveWell should not recommend bednets, deworming, or vitamin-A supplementation.
I agree with this, although I think the difference here is that I wouldn’t expect those interventions to be as sensitive to implementation details. (Mostly I think this is a reason to reduce the effect size from the meta-analysis, whereas HLI thinks it’s a reason to increase the effect size.)
As a community, I think that we should put some weight on a recommendation if it fits the two standards I listed above, according to a plausible worldview (i.e., GiveWell’s moral weights or HLI’s subjective wellbeing approach). All that being said, we’re still developing our charity evaluation methodology, and I expect our views to evolve in the future.
I agree with almost all of this. I don’t think we should use HLI’s subjective wellbeing approach until it is better understood by the wider community. I doubt most donors appreciate some of the assumptions the well-being approach makes or the conclusions that it draws.
A couple of quick comments, Simon. First, on this comment, which I disagree with. This is one of the few areas where I think the Effective Altruism community can at times miss something quite important. It isn’t really about the StrongMinds charity question, but instead a general bugbear of mine as someone who implements things ;).
“I think the difference here is that I wouldn’t expect those interventions to be as sensitive to implementation details.”
Any intervention is extremely sensitive to implementation details, whether deworming or nets or psychotherapy. In fact I think that implementation details are often more important than the pre-calculated expected value. If a given intervention is implemented poorly, or in the wrong place or at the wrong time, then it could still have less impact than an intervention that is theoretically 100x worse. As extreme but plausible examples: imagine a vitamin A project which doesn’t actually happen because the money is lost to corruption. Or you give out mosquito nets in the same village where another NGO gave out nets two weeks ago. Or you deworm in a place where 20 previous deworming projects and improved sanitation have already drastically reduced the worm burden.
Maybe some interventions are easier to implement than others, and there might be more variance in the effectiveness of psychotherapy compared with net distribution (although I doubt that; I would guess psychotherapy has less variance than nets), but all are very sensitive to implementation details.
And second, this statement:
“just that its effect is small enough that we are looking at numbers on the order of (or lower than) cash transfers, and therefore it doesn’t meet the bar of ‘Top Charity’.”
I’d be interested in you backing up this comment with a bit of explanation if you have time (all good if not!). I know this isn’t your job and you don’t have the time that Joel has, but what is it that has led you to conclude that the numbers are “on the order of (or lower than) cash transfers”? Is this comment based on intuition, or have you done some maths?
“Any intervention is extremely sensitive to implementation details, whether deworming or nets or psychotherapy.”
Yes, I’m sorry if my comment appeared to dismiss this fact, as I do strongly agree with it.
“Maybe some interventions are easier to implement than others, and there might be more variance in the effectiveness of psychotherapy compared with net distribution (although I doubt that; I would guess psychotherapy has less variance than nets), but all are very sensitive to implementation details.”
This is pretty much my point.
“I’d be interested in you backing up this comment with a bit of explanation if you have time (all good if not!). I know this isn’t your job and you don’t have the time that Joel has, but what is it that has led you to conclude that the numbers are ‘on the order of (or lower than) cash transfers’? Is this comment based on intuition, or have you done some maths?”
I haven’t done a bottom-up analysis; rather, I have made my own adjustments to the HLI numbers, which get me to about that level (rough arithmetic sketched after the list below):
You use 0.88 as the effect size for StrongMinds, whereas I think it’s more appropriate to use something closer to the 0.4/0.5 you use here. (And in fact I actually skew this number even lower than you do.)
You convert SDs of depression scores directly to SDs of wellbeing, which I strongly object to. I don’t have exact numbers for how much I would discount this, but there are two reasons I want to apply a discount:
Non-linearity in severity of depression
Imperfect correlation between the measures (when I spoke to Joel we discussed this, and while I do think your reasoning is reasonable, I still disagree with it)
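As a rough illustration of how these adjustments compound, here is a minimal sketch. The conversion discount is a placeholder guess of mine, not a number from HLI’s model:

```python
# Rough illustrative arithmetic for how the two adjustments compound.
# The discount factor is my own guess, not an HLI number.

hli_effect_sd = 0.88       # effect size HLI uses for StrongMinds (SDs of depression)
my_effect_sd = 0.45        # midpoint of the 0.4/0.5 range I'd use instead
conversion_discount = 0.8  # hypothetical discount for the depression-SD ->
                           # wellbeing-SD step (non-linearity + imperfect correlation)

adjusted = my_effect_sd * conversion_discount
ratio = adjusted / hli_effect_sd

print(f"adjusted effect ~ {adjusted:.2f} SDs, i.e. ~{ratio:.0%} of HLI's input")
# ~0.36 SDs, roughly 41% of 0.88 -- enough to pull the bottom line
# down toward (or below) cash-transfer territory.
```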
I think the fairest way to resolve this would be to bet on the effect size of the Ozler trial. Where would you make me 50/50 odds in $5k?
Just to clarify, I am not part of StrongMinds or HLI; maybe you thought I was Joel replying?
Thanks for the clarifications, appreciate that. Seems like we generally agree on implementation sensitivity.
Thanks for your explanation on the HLI numbers, which unfortunately I only partly understand. A quick (and possibly stupid) question: what does SD stand for? Usually I would expect standard deviation?
No bet from me on the Ozler trial I’m afraid (not a gambling guy ;) ). Personally I think this trial will find a fairly large effect, due partly to the intervention actually working, but the effect will be inflated relative to the real effect due to inflated post-study subjective wellbeing scores. This happens due to “demand bias” and “future hope bias” (discussed in another post), but my certainty about any of this is so low it almost touches the floor...
“what does SD stand for? Usually I would expect standard deviation?”
Yes, that’s exactly right. The HLI methodology consists of pooling together a bunch of different studies’ effect sizes (measured in standard deviations) and then converting those standard deviations into WELLBYs (by multiplying by a number ~2).
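As a toy sketch of that pipeline, assuming made-up study inputs (only the ~2 conversion factor comes from the description above; the per-study numbers and weights are illustrative stand-ins, not HLI’s actual data):

```python
# Toy version of the pipeline described above: pool per-study effect sizes
# (in standard deviations) with inverse-variance weights, then convert the
# pooled figure to WELLBYs with a fixed factor (~2). All inputs are illustrative.

effects = [0.6, 0.4, 0.9, 0.5]  # hypothetical per-study effect sizes (SDs)
ses = [0.15, 0.10, 0.30, 0.20]  # hypothetical standard errors

weights = [1 / se**2 for se in ses]
pooled_sd = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

SD_TO_WELLBY = 2.0  # the "~2" conversion factor mentioned above
wellbys = pooled_sd * SD_TO_WELLBY

print(f"pooled effect ~ {pooled_sd:.2f} SDs ~ {wellbys:.2f} WELLBYs per person")
```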
“No bet from me on the Ozler trial”
Fair enough; I’m open to betting on this with anyone,* fwiw. (*Anyone who hasn’t already seen the results or been involved in the trial, of course.)