I might be being a bit dim here (I don’t have the time this week to do a good job of this), but I think of all the orgs evaluating StrongMinds, SoGive’s moral weights are most likely to find favourably for StrongMinds. Given that, I wonder what you expect you’d rate them at if you altered your moral weights to be more in line with FP and HLI?
SoGive’s Gold Standard Benchmarks are:
£5,000 per life saved
£50 to double someone’s consumption (spending) for one year
£200 to avert one year of severe depression
£5 to avert the suffering of one chicken who is living in very poor conditions
This is a ratio of 4:1 for averting a year of severe depression vs doubling someone’s consumption.
For context, Founders Pledge have a ratio somewhere around 1.3:1. Income doubling : DALY is 0.5 : 1, and severe depression corresponds to a DALY weighting of 0.658 in their CEA. (I understand they are shifting to a WELLBY framework like HLI, but I don’t think it will make much difference.)
HLI is harder to piece together, but roughly speaking they see doubling income as having 1.3 WELLBY and severe depression as having a 1.3 WELLBY effect. A ratio of 1.3:1 (similar to FP).
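The SoGive and Founders Pledge ratios quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, the inputs are the figures quoted in this comment, and the final rescaling factor assumes that a cost-effectiveness estimate for a depression intervention scales linearly with the depression weight when expressed in income-doubling equivalents.

```python
# Figures quoted in the comment above; all names here are illustrative.
SOGIVE_COST_PER_DEPRESSION_YEAR = 200.0  # GBP to avert one year of severe depression
SOGIVE_COST_PER_DOUBLING = 50.0          # GBP to double consumption for one year

FP_DALY_WEIGHT_SEVERE_DEPRESSION = 0.658  # DALY weighting in the FP CEA
FP_DALYS_PER_INCOME_DOUBLING = 0.5        # income doubling : DALY is 0.5 : 1

# Value of averting a depression-year, in units of one income doubling.
sogive_ratio = SOGIVE_COST_PER_DEPRESSION_YEAR / SOGIVE_COST_PER_DOUBLING
fp_ratio = FP_DALY_WEIGHT_SEVERE_DEPRESSION / FP_DALYS_PER_INCOME_DOUBLING

# Assumption: the estimate is linear in the depression weight, so swapping
# SoGive's weight for an FP-like one multiplies it by this factor.
rescale_factor = fp_ratio / sogive_ratio

print(f"SoGive ratio: {sogive_ratio:.2f}")      # 4.00
print(f"FP ratio:     {fp_ratio:.2f}")          # 1.32
print(f"Rescale:      {rescale_factor:.2f}")    # 0.33
```

Under that linearity assumption, moving from SoGive’s weights to FP-like weights would cut a depression-focused estimate to roughly a third of its value relative to cash.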
Thanks for your question Simon, and it was very eagle-eyed of you to notice the difference in moral weights. Good sleuthing! (and more generally, thank you for provoking a very valuable discussion about StrongMinds)
I run SoGive and oversaw the work (then led by Alex Lawsen) to produce our moral weights. I’d be happy to provide further comment on our moral weights, however that might not be the most helpful thing. Here’s my interpretation of (the essence of) your very reasonable question:
“SoGive has a tendency to put a quite high value on tackling depression. Is this enough to explain why SoGive sounds like they might be more positive about StrongMinds than Simon M is?”
I have a simple answer to this: no, it isn’t.
Let me flesh that out. We have (at least) two sources of information:
1. Academic literature
2. Data from StrongMinds (e.g. their own evaluation report on themselves, or their regular reporting)
And we have (at least) two things we might ask about:
(a) How effective is the intervention that StrongMinds does, including the quality of evidence for it?
(b) How effective is the management team at StrongMinds?
I’d say that the main crux is the fact that our assessment of the quality of evidence for the intervention (item (a)) is based mostly on item 1 (the academic literature) and not on item 2 (data from StrongMinds).
This is the driver of the comments made by Ishaan above, not the moral weights.
And just to avoid any misunderstandings, I have not here said that the evidence base from the academic literature is really robust—we haven’t finished our assessment yet. I am saying that (unless our remaining work throws up some surprises) it will warrant a more positive tone than your post, and that it may well demonstrate a strong enough evidence base + good enough cost-effectiveness that it’s in the same ballpark as other charities in the GWWC list.
I don’t understand how that’s possible. If you put 3x the weight on StrongMinds’ cost-effectiveness vis-à-vis other charities, changing this must move the needle on cost-effectiveness more than anything else. It seems possible to me that it could have been “well into the range of gold-standard” and now it’s “just gold-standard” or “silver-standard”. However, if something is silver-standard, I can’t see any way in which your cost-effectiveness being adjusted down by a factor of 3 doesn’t massively shift your rating.
I’d say that the main crux is the fact that our assessment of the quality of evidence for the intervention (item (a)) is based mostly on item 1 (the academic literature) and not on item 2 (data from StrongMinds).
I feel like I’m being misunderstood here. I would be very happy to speak to you (or Ishaan) on the academic literature. I think probably best done in a more private forum so we can tease out our differences on this topic. (I can think of at least one surprise you might not have come across yet).
Ishaan’s work isn’t finished yet, and he has not yet converted his findings into the SoGive framework, or applied the SoGive moral weights to the problem. (Note that we generally try to express our findings in terms of the SoGive framework and other frameworks, such as multiples of cash, so that our results are meaningful to multiple audiences).
Just to reiterate, neither Ishaan nor I have made very strong statements about cost-effectiveness, because our work isn’t finished yet.
I would be very happy to speak to you (or Ishaan) on the academic literature.
That sounds great, I’ll message you directly. Definitely not wishing to misunderstand or misinterpret—thank you for your engagement on this topic :-)
To expand a little on “this seems implausible”: I feel like there is probably a mistake somewhere in the notion that anyone involved thinks that <doubling income as having 1.3 WELLBY and severe depression as having a 1.3 WELLBY effect.>
The mistake might be in your interpretation of HLI’s document (it does look like the 1.3 figure is a small part of some more complicated calculation regarding the economic impacts of AMF and their effect on well being, rather than intended as a headline finding about the cash to well being conversion rate). Or it could be that HLI has an error or has inconsistencies between reports. Or it could be that it’s not valid to apply that 1.3 number to “income doubling” SoGive weights for some reason because it doesn’t actually refer to the WELLBY value of doubling.
I’m not sure exactly where the mistake is, so it’s quite possible that you’re right, or that we are both missing something about how the math behind this works which causes this to work out, but I’m suspicious because it doesn’t really fit together with various other pieces of information that I know. For instance, it doesn’t really square with how HLI reported psychotherapy is 9x GiveDirectly when the cost of treating one person with therapy is around $80, or how they estimated that it took $1000 worth of cash transfers to produce 0.92 SD-years of subjective-well-being improvement (“totally curing just one case of severe depression for a year” should correspond to something more like 2-5 SD-years).
I wish I could give you a clearer “ah, here is where I think the mistake is” or perhaps an “oh, you’re right after all”, but I too am finding the linked analysis a little hard to follow and am a bit short on time (ironically, because I’m trying to publish a different piece of StrongMinds analysis before a deadline). Maybe one of the things we can talk about once we schedule a call is how you calculated this and whether it works? Or maybe HLI will comment and clear things up regarding the 1.3 figure you pulled out and what it really means.
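To make the tension concrete, here is a rough back-of-the-envelope sketch using only the figures mentioned above. It assumes, purely for illustration, that “9x GiveDirectly” means nine times the SD-years of well-being per dollar and that the cash effect scales linearly with the amount transferred; neither assumption is confirmed here, and the variable names are invented.

```python
# Figures mentioned in the paragraph above; names are illustrative only.
CASH_SD_YEARS = 0.92          # SD-years of well-being from the cash transfer
CASH_COST_USD = 1000.0        # cost of that cash transfer
THERAPY_MULTIPLE = 9.0        # "9x GiveDirectly" (assumed to be per dollar)
THERAPY_COST_USD = 80.0       # approximate cost of treating one person
CURE_SD_YEARS_RANGE = (2.0, 5.0)  # "totally curing" a severe case for a year

# Assumption: cash effects are linear in dollars spent.
cash_sd_years_per_usd = CASH_SD_YEARS / CASH_COST_USD

# Implied well-being gain per person treated with therapy.
therapy_sd_years_per_person = (
    THERAPY_MULTIPLE * cash_sd_years_per_usd * THERAPY_COST_USD
)

# Cash needed to match fully curing one severe case for a year.
cash_to_match_cure_usd = tuple(
    sd_years / cash_sd_years_per_usd for sd_years in CURE_SD_YEARS_RANGE
)

print(f"Therapy: {therapy_sd_years_per_person:.2f} SD-years per person treated")
print(
    "Cash to match a full cure: "
    f"${cash_to_match_cure_usd[0]:,.0f} to ${cash_to_match_cure_usd[1]:,.0f}"
)
```

On these assumptions, matching a full cure with cash would take a transfer in the low thousands of dollars, which is part of why a 1:1 equivalence between one income doubling and one averted case of severe depression looks surprising.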
Good stuff. I haven’t spent that much time looking at HLI’s moral weights work but I think the answer is “Something is wrong with how you’ve constructed weights, HLI is in fact weighing mental health harder than SoGive”. I think a complete answer to this question requires me checking up on your calculations carefully, but I haven’t done so yet, so it’s possible that this is right.
If it were true that HLI found anything on the order of roughly doubling someone’s consumption improved well being as much as averting 1 case of depression, that would be very important as it would mean that SoGive moral weights fail some basic sanity checks. It would imply that we should raise our moral weight on cash-doubling to at least match the cost of therapy even under a purely subjective-well-being oriented approach to weighting. (Why not pay 200 to double income, if it’s as good as averting depression and you would pay 200 to avert depression?) This seems implausible.
I haven’t actually been directly researching the comparative moral weights aspect, personally—I’ve been focusing primarily on <what’s the impact of therapy on depression in terms of effect size> rather than on the “what should the moral weights be” question (though I have put some attention to the “how to translate effect sizes into subjective intuitions” question, but that’s not quite the same thing). That said when I have more time I will look more deeply into this and check if our moral weights are failing some sort of sanity check on this order, but, I don’t think that they are.
Regarding the more general question of “where would we stand if we altered our moral weights to be something else”, ask me again in a month or so when all the spreadsheets are finalized; moral weights should be relatively easy to adjust once the analysis is done.
(As Sanjay alludes to in the other thread, I do think all this is a somewhat separate discussion from the GWWC list; my main point with the GWWC list was that StrongMinds is not in the big picture actually super out of place with the others, in terms of how evidence-backed it is relative to the others, especially when you consider the big picture of the background academic literature about the intervention rather than their internal data. But I wanted to address the moral weights issue directly as it does seem like an important and separate point.)
that would be very important as it would mean that SoGive moral weights fail some basic sanity checks
I would recommend my post here. My opinion is—yes—SoGive’s moral weights do fail a basic sanity check.
1 year of averted depression is 4 income doublings.
1 additional year of life (using GW life-expectancies for over 5s) is 1.95 income doublings.
i.e. SoGive would think depression is worse than death. Maybe this isn’t quite a “sanity check” but I doubt many people have that moral view.
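The comparison above can be reconstructed from SoGive’s Gold Standard Benchmarks. In this sketch the 1.95 figure is taken as quoted; the remaining life expectancy is only back-calculated from it (it is not a GiveWell number looked up independently), and the variable names are illustrative.

```python
# SoGive Gold Standard Benchmarks quoted earlier in the thread (GBP).
COST_PER_LIFE_SAVED = 5000.0
COST_PER_DOUBLING = 50.0
COST_PER_DEPRESSION_YEAR = 200.0

# Figure quoted above: income doublings per additional year of life.
DOUBLINGS_PER_LIFE_YEAR = 1.95

doublings_per_life = COST_PER_LIFE_SAVED / COST_PER_DOUBLING
doublings_per_depression_year = COST_PER_DEPRESSION_YEAR / COST_PER_DOUBLING

# Back-calculated, not sourced: life-years per life saved implied by 1.95.
implied_life_years = doublings_per_life / DOUBLINGS_PER_LIFE_YEAR

# How many life-years one averted depression-year is worth on these weights.
depression_vs_life_year = doublings_per_depression_year / DOUBLINGS_PER_LIFE_YEAR

print(f"Doublings per life saved:       {doublings_per_life:.0f}")            # 100
print(f"Doublings per depression-year:  {doublings_per_depression_year:.0f}")  # 4
print(f"Implied life-years per life:    {implied_life_years:.1f}")            # 51.3
print(f"Depression-year vs life-year:   {depression_vs_life_year:.2f}")       # 2.05
```

On these inputs, averting a year of severe depression is valued at roughly twice an additional year of life.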
I do think all this is a somewhat separate discussion from the GWWC list
I think cost-effectiveness is very important for this. StrongMinds isn’t so obviously great that we don’t need to consider the cost.
my main point with the GWWC list was that StrongMinds is not in the big picture actually super out of place with the others, in terms of how evidence-backed it is relative to the others, especially when you consider the big picture of the background academic literature about the intervention rather than their internal data
Yes, this is a great point which I think Jeff has addressed rather nicely in his new post. When I posted this it wasn’t supposed to be a critique of GWWC (I didn’t realise how bad the situation there was at the time) as much as a critique of StrongMinds. Now I see quite how bad it is, I’m honestly at a loss for words.
i.e. SoGive would think depression is worse than death. Maybe this isn’t quite a “sanity check” but I doubt many people have that moral view.
I replied in the moral weights post w.r.t. “worse than death” thing. (I think that’s a fundamentally fair, but fundamentally different point from what I meant re: sanity checks w.r.t not crossing hard lower bounds w.r.t. the empirical effects of cash on well being vs the empirical effect of mental health interventions on well being)