James courteously shared a draft of this piece with me before posting. I really appreciate that, as well as his substantive, constructive feedback.
1. I blundered
The first thing worth acknowledging is that he pointed out a mistake that substantially changes our results. And for that, I’m grateful. It goes to show the value of having skeptical external reviewers.
He pointed out that Kemp et al. (2009) finds a negative effect, while we recorded its effect as positive, meaning we coded the study as having the wrong sign.
What happened is that mental health (MH) outcomes are often scored “higher = bad”, while subjective wellbeing is scored “higher = better”, so we record the direction in our code so that all effects that imply benefits are positive. What went wrong was that we coded Kemp et al. (2009), which used the GHQ-12, as “higher = bad” (which is usually the case) when the opposite was true. Higher equalled good in this case because we had to do an extra calculation to extract the effect [footnote: since there was baseline imbalance in the PHQ-9, we took the difference in pre-post changes], which flipped the sign.
This correction would reduce the spillover effect from 53% to 38% and reduce the cost-effectiveness comparison from 9.5x to 7.5x, a clear downwards correction.
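To make the coding step concrete, here is a minimal sketch of the kind of sign alignment described above. It is not our actual analysis code: the function name `align_effect`, the `higher_is_better` flag, and the raw effect value are all hypothetical and chosen only for illustration.

```python
def align_effect(raw_effect: float, higher_is_better: bool) -> float:
    """Sign an effect so that a positive value always means a benefit."""
    return raw_effect if higher_is_better else -raw_effect


# Hypothetical raw effect, not a figure from any study discussed here.
raw = -0.10

# Direction flag set correctly (higher = better): the effect stays negative.
correct = align_effect(raw, higher_is_better=True)

# Direction flag set wrongly (higher = bad): the harm is recorded as a benefit.
miscoded = align_effect(raw, higher_is_better=False)

print(f"correctly coded: {correct:+.2f}, miscoded: {miscoded:+.2f}")
```

The point of the sketch is only that a single wrong direction flag silently reverses a study's contribution to the pooled estimate, which is why an external check on the recorded signs is valuable.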
This is how the forest plot should look.
2. James’s other critiques
I think James’s other critiques are also pretty reasonable. This updates me towards weighting these studies less. That said, what I should place weight on instead remains quite uncertain for me.
I’ve thought about it a bit, but I’m unsure what to make of the observational evidence. My reading of the observational literature mostly accords with James’s (I think), and it does appear to suggest smaller spillovers than the low-quality RCTs I previously referenced (20% versus the now-corrected 38%). Here’s a little table I made while doing a brief review during my discussion with James.
0.00%
0.00%
25%[1]
5.25%
7.00%
25%
22.50%
30.00%
25%
3.75%
5.00%
25%
10.00%
40.00%
50%[3]
7.88%
10.50%
14.00%
25%
12.00%
16.00%
25%
43.50%
58.00%
25%
40.00%
40.00%
50%
16.50%
33.00%
50%
24.50%
20.34%
31.50%
35.00%
10%
35.00%
35.00%
0%
34.00%
34.00%
0%
60.00%
60.00%
0%
40.13%
63.00%
63.00%
0%
63.00%
57.28%
However, I wonder if there’s something about the more observational literature that makes household spillovers appear smaller, regardless of the source. To investigate this further, I briefly compared household spillovers from unemployment and mental health shocks. This led me to an estimate of around 57% as the household spillover of unemployment, which I think we could use as a prior for other economic shocks. This is a bit lower than the 86% I estimated as the household spillover for cash transfers. Again, not quite sure what to make of this.
3. Other factors that influence my priors / fuzzy tummy feelings about psychotherapy spillovers.
Mother’s mental health seems really important, over and above a father’s mental health. Augustijn (2022) finds a stronger relationship between mother ↔ child MH than between father ↔ child MH (a 1-point change in the mother’s LS predicts a 0.13 change in the child’s LS, compared with 0.06 for fathers). Many of the studies above seem to have larger mother → child effects than father → child effects. This could be relevant as StrongMinds primarily treats women.
Mental health appears important relative to the effect of income.
See the figure from Clark et al. (2018), shown below.
Mcnamee et al. (2021) (the published version of Mendolia et al., 2018) find that having a partner with a long-term emotional or nervous condition that requires treatment has a −0.08 effect on LS, and that log household income has a 0.064 effect. If we interpret 0.69 log-units as a doubling, and assume that most $1000 CTs roughly double household income, then the effect of doubling income on LS is 0.064 * 0.69 ≈ 0.044. If I assume that the effect of depression is similar to that of a “long-term emotional or nervous condition that requires treatment” and that psychotherapy cures 40% of depression cases, then psychotherapy removes 40% of the −0.08 effect, a benefit of 0.4 * 0.08 = 0.032. So the effect of psychotherapy on a partner, relative to that of doubling income, is about 73% (0.032 / 0.044). Applying this to the 86% CT spillover would get us a 63% spillover ratio for psychotherapy.
You could compare income and mental health effects on wellbeing in other studies—but I haven’t had time to do so (and I’m not really sure of how informative this is).
Powdthavee & Vignoles (2008) found that the effect of a mother’s distress in the previous period on her children is 14% of the effect that the child’s own wellbeing in the previous period had on their present wellbeing. But it also seems to give weirdly small (and non-significant) coefficients for the effect of log-income on life satisfaction (0.054 for fathers, −0.132 for mothers).
Early-life exposure to a parent’s low mental health seems plausibly related to very long-term wellbeing effects through a higher likelihood of worse parenting (abuse, fewer resources to support the child; Zheng et al., 2021).
I’m unsure if positive and negative shocks spill over in the same way. Negativity seems more contagious than positivity. For instance, in Hurd et al. (2014) the spillover effects of re-employment seemed smaller than the harms of unemployment. Also see Adam et al. (2014)[4]; I’m sure there’s much more on this topic. This makes me think that it may not be wild to see a relatively smaller gap between the spillovers of cash transfers and psychotherapy than we may initially expect.
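The back-of-the-envelope comparison based on Mcnamee et al. (2021) above can be reproduced in a few lines. This is only a sketch of the arithmetic under the assumptions stated in the text (0.69 log-units per doubling, a 40% cure rate, an 86% cash transfer spillover); the variable names are mine.

```python
# Inputs taken from the text; variable names are illustrative.
income_coef = 0.064            # effect of log household income on LS
partner_mh_coef = -0.08        # effect of a partner's long-term condition on LS
log_units_per_doubling = 0.69  # the text's rounding of ln(2)
cure_rate = 0.40               # assumed share of depression cases cured
ct_spillover = 0.86            # household spillover estimated for cash transfers

# Effect on LS of doubling household income.
doubling_effect = income_coef * log_units_per_doubling

# Benefit to the partner if psychotherapy removes 40% of the negative effect.
therapy_effect = cure_rate * abs(partner_mh_coef)

# Psychotherapy effect relative to doubling income, then scaled by CT spillover.
relative_effect = therapy_effect / doubling_effect
therapy_spillover = relative_effect * ct_spillover

print(f"doubling income: {doubling_effect:.3f}")
print(f"psychotherapy:   {therapy_effect:.3f}")
print(f"relative effect: {relative_effect:.1%}")
print(f"implied psychotherapy spillover: {therapy_spillover:.1%}")
```

Without intermediate rounding the ratio comes out a little under the 73% quoted above, and the implied spillover a little under 63%, so the last digit depends on where you round.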
Most of these studies are in HICs. It seems plausible that spillovers for any intervention could be different and I suspect higher in LMICs than HICs. I assume emotional contagion is a function of spending time together, and spending more time together seems likelier when houses are smaller (you can’t shut yourself in your room as easily), transportation is relatively more expensive, difficult, and dangerous – and you may have fewer reasons to go elsewhere. One caveat is that household sizes are larger, so there may be less time directly spent with any given household member, so that’s a factor that could push in the other direction.
4. What’s next?
I think James and I probably agree that making sense of the observational evidence is tricky, to say the least, and that a high-quality RCT would be very welcome for informing our views and settling our disagreements. I think some further insight could come sooner rather than later. As I was writing this, I wondered if there was any possibility of household spillovers in Barker et al. (2022), a recent study about the effects of CBT on the general population in Ghana that looked into community spillovers of psychotherapy (−33% the size of the treatment effect but non-significant, but that’s a problem for another time).
In section 3.2 the paper reads, “At the endline we administered only the ‘adult’ survey, again to both the household head and their spouse… In our analysis of outcomes, we include the responses of both adults in control households; in households where an individual received CBT, we only include treated individuals.”
This means that, while Barker et al. didn’t look into it, we should be able to estimate the spousal mental health spillover of having your partner receive CBT. In further good news, the replication materials are public. But I’ll leave this as a teaser while I try to figure out how to run the analysis.
Why a 25% discount? I guess partners are likelier to share tendencies towards a given level of wellbeing, but I think this “birds of a feather” effect is smaller than the genetic effects. Starting from a 50% guess for genetic effects (noted in the footnote on the 50% discount), I thought that the assortative mating effects would be about half the magnitude, or 25%.
How did I impute the parent/child effect? The study was ambiguous about the household relations being analysed. So I assumed that it was 50-50 parents and children and that the spouse-to-spouse spillover was one quarter that of the parent-to-child spillover.
Why a 50% discount? There appears to be an obvious genetic factor between a parent’s and a child’s levels of wellbeing that could confound these estimates. Jami et al. (2021) reviews ~30 studies that try to disentangle the genetic and environmental links in family members’ affective mental health. My reading is that environmental (pure contagion) effects dominate the transmission of anxiety, and that genetic and environmental factors seem roughly balanced for depression. Since we mostly consider psychotherapy to treat depression, I only reference the depression results when coming up with the 50% figure.
“When positive posts were reduced in the News Feed, the percentage of positive words in people’s status updates decreased by B = −0.1% compared with control [t(310,044) = −5.63, P < 0.001, Cohen’s d = 0.02], whereas the percentage of words that were negative increased by B = 0.04% (t = 2.71, P = 0.007, d = 0.001). Conversely, when negative posts were reduced, the percent of words that were negative decreased by B = −0.07% [t(310,541) = −5.51, P < 0.001, d = 0.02] and the percentage of words that were positive, conversely, increased by B = 0.06% (t = 2.19, P < 0.003, d = 0.008).”
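The 25% and 50% discounts in the footnotes above are applied multiplicatively to the raw spillover estimates, at least as I read the table (e.g., a raw 7% with a 25% discount gives 5.25%, and a raw 33% with a 50% discount gives 16.5%). A minimal sketch, where the function name `apply_discount` is hypothetical:

```python
def apply_discount(raw_spillover: float, discount: float) -> float:
    """Shrink a raw spillover ratio by a confounding discount in [0, 1]."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return raw_spillover * (1.0 - discount)


# Spouse-to-spouse estimate with the 25% assortative-mating discount.
print(f"{apply_discount(0.07, 0.25):.2%}")

# Parent-to-child estimate with the 50% genetic-confounding discount.
print(f"{apply_discount(0.33, 0.50):.2%}")
```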
Strong upvote for both James and Joel for modeling a productive way to do this kind of post—show the organization a draft of the post first, and give them time to offer comments on the draft + prepare a comment for your post that can go up shortly after the post does.
Given that this post has been curated, I wanted to follow up with a few points I’d like to emphasise that I forgot to include in the original comment.
To my knowledge, we were the first to attempt to estimate household spillovers empirically. In hindsight, it shouldn’t be too surprising that it’s been a messy enterprise. I think I’ve updated towards “messiness will continue”.
One hope of ours in the original report was to draw more attention to the yawning chasm of good data on this topic.
“The lack of data on household effects seems like a gap in the literature that should be addressed by further research. We show that including household spillovers can change the relative cost-effectiveness of two interventions, which demonstrates the need to account for the impact of interventions beyond the direct recipient.”
Relatedly, spillovers don’t have to be huge to be important. If you have a household of 5, with 1 recipient and 4 household non-recipients, the spillover to each non-recipient only needs to be 25% of the recipient effect for the total spillover to equal the recipient effect. I’m still pretty confident we omit an important parameter when we fail to estimate household spillovers.
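The household-of-5 example generalises easily. Here is a small sketch of the break-even arithmetic, assuming every non-recipient receives the same spillover; the function names are mine and purely illustrative.

```python
def breakeven_spillover_ratio(household_size: int, recipients: int = 1) -> float:
    """Per-person spillover ratio at which total spillover equals the
    total direct effect on recipients."""
    non_recipients = household_size - recipients
    if recipients < 1 or non_recipients < 1:
        raise ValueError("need at least one recipient and one non-recipient")
    return recipients / non_recipients


def total_spillover(recipient_effect: float, spillover_ratio: float,
                    household_size: int, recipients: int = 1) -> float:
    """Sum of effects on all non-recipients in the household."""
    non_recipients = household_size - recipients
    return recipient_effect * spillover_ratio * non_recipients


# Household of 5 with 1 recipient: each of the 4 others needs 25%.
print(breakeven_spillover_ratio(5))

# With a recipient effect of 1.0 and a 25% ratio, spillovers sum to 1.0.
print(total_spillover(1.0, 0.25, 5))
```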
So I’m pleased with this conversation and hopeful that spillovers for all outcomes in the global health and wellbeing space will be given more empirical consideration as a consequence.
There are probably relatively cost-effective ways to gather more data regarding psychotherapy spillovers in particular.
I’ve heard that some people working with Vida-Plena are trying to find funding for an RCT that includes spillovers — but I haven’t spoken to Joy about this recently.
StrongMinds could be willing to do more work here. I think they’re planning an RCT — if they get it funded, I think adding a module for household surveys shouldn’t be too expensive.
There’s also a slew of meta-analyses of interventions aimed at families that didn’t always seem to target parents and children jointly, and these may include more RCTs from which we can infer spillovers. Many of these I missed before: Siegenthaler et al. (2012), Thanhäuser et al. (2017), Yap et al. (2016), Loechner et al. (2018), Lannes et al. (2018), and Havinga et al. (2021).
In general, household spillovers should be relatively cheap to estimate if they just involve surveying a randomly selected additional household member and clarifying the relationships between those surveyed.
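As a rough illustration of what that survey step could look like, here is a sketch that randomly selects one additional household member per household and records their relationship to the recipient. The household roster format, the field names, and the function `pick_additional_member` are all hypothetical; a real survey protocol would need more care (e.g., around who is at home at interview time).

```python
import random


def pick_additional_member(members, recipient_id, rng):
    """Randomly pick one non-recipient household member, or None if
    the recipient lives alone."""
    candidates = [m for m in members if m["id"] != recipient_id]
    if not candidates:
        return None
    return rng.choice(candidates)


# Hypothetical household rosters; "relationship" is relative to the recipient.
households = {
    "A": [
        {"id": "A1", "relationship": "recipient"},
        {"id": "A2", "relationship": "partner"},
        {"id": "A3", "relationship": "son"},
    ],
    "B": [
        {"id": "B1", "relationship": "recipient"},
        {"id": "B2", "relationship": "partner"},
    ],
}

rng = random.Random(2023)  # fixed seed so the selection is reproducible
for house, members in households.items():
    recipient_id = next(m["id"] for m in members
                        if m["relationship"] == "recipient")
    picked = pick_additional_member(members, recipient_id, rng)
    print(house, picked["id"], picked["relationship"])
```

Recording the relationship alongside the selection is what lets you later separate, say, partner spillovers from child spillovers.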
I still don’t have the Barker et al. RCT spillover results, but will update this comment once I know.
Is it as easy (or easy enough) to enroll participants in RCTs if you need their whole household, rather than just them, to consent to participate? Does it create any bias in the results?
I’d assume that (1) you don’t need the whole household: depending on the original sample size, it seems plausible to randomly select a subset of household members [1] (e.g., in house A you interview the recipient and son, in house B the recipient and partner, etc.), and (2) they wouldn’t need to consent to participate, just to be surveyed, no?
If these assumptions didn’t hold, I’d be more worried that this would introduce nettlesome selection issues.
I recognise this isn’t necessarily as simple as I make it out to be. I expect you’d need to be more careful with the timing of interviews to minimise the likelihood that certain household members are more likely to be missing (children at school, mother at the market, father in the fields, etc.).