A few things that stand out to me that seem dodgy and make me doubt this analysis:
One of the studies you included with the strongest effect (Araya et al. 2003 in Chile, with an effect of 0.9 Cohen's d) uses antidepressants as part of the intervention. Why did you include it? How many other studies included non-psychotherapy interventions?
Some of the studies deal with quite specific groups of people, e.g. survivors of violence, pregnant women, HIV-affected women with young children. Generalising from psychotherapy’s effects in these groups to psychotherapy in the general population seems unreasonable.
Similarly, the therapies applied across studies seem highly variable, including “Antenatal Emotional Self-Management Training”, group therapy, and one-on-one peer mentors. Lumping these together and drawing conclusions about “psychotherapy” in general seems unreasonable.
Given the difficulty of blinding patients to psychotherapy, there seems to be room for the Hawthorne effect to skew the results of each of the 39 studies, with patients who are aware that they’ve received therapy feeling obliged to say that it helped.
Other minor things:
- Multiple references to Appendix D. Where is Appendix D?
- Maybe I’ve missed it, but do you properly list the studies you used somewhere? “Husain, 2017” is not enough info to go by.
Hi Henry,

I addressed the variance in the primacy of psychotherapy across the studies in response to Nick’s comment, so I’ll respond to your other points here.
> Some of the studies deal with quite specific groups of people, e.g. survivors of violence, pregnant women, HIV-affected women with young children. Generalising from psychotherapy’s effects in these groups to psychotherapy in the general population seems unreasonable.
I agree this would be a problem if we only had evidence from one quite specific group. But when we have evidence from multiple groups, and we don’t have strong reasons to think psychotherapy affects these groups differently from the general population, I think it’s better to include them than to exclude them.
I didn’t show enough robustness checks like this, which is a mistake I’ll remedy in the next version. I categorised the population of every study as involving “conflict or violence”, “general”, or “HIV”. Running these trial characteristics as moderating factors suggests that, if anything, including these additional populations leads us to underestimate the efficacy. But this is a point worth returning to.
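For readers who want to see what this kind of check looks like mechanically: a moderator analysis of this sort can be sketched as an inverse-variance weighted regression of study effect sizes on population-category dummies. The sketch below uses entirely hypothetical effect sizes and variances (not the report's actual data), with a plain fixed-effect weighting for simplicity rather than the random-effects model a full analysis would use:

```python
# Minimal sketch of a meta-regression with a categorical population moderator.
# All numbers are hypothetical placeholders, purely for illustration.
import numpy as np

# Hypothetical Cohen's d estimates and their sampling variances for six trials.
d = np.array([0.9, 0.5, 0.4, 0.6, 0.3, 0.7])
v = np.array([0.04, 0.02, 0.03, 0.05, 0.02, 0.04])
# Population category per trial ("general" serves as the reference level).
pop = np.array(["general", "conflict", "general", "HIV", "conflict", "HIV"])

# Design matrix: intercept plus dummies for the non-reference categories.
levels = ["conflict", "HIV"]
X = np.column_stack(
    [np.ones(len(d))] + [(pop == lv).astype(float) for lv in levels]
)

# Inverse-variance weighted least squares (fixed-effect meta-regression):
# solve (X'WX) beta = X'W d, where W weights each trial by 1/variance.
W = np.diag(1.0 / v)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ d)

# Intercept = pooled effect in the reference ("general") population;
# the dummy coefficients estimate how each other population differs from it.
print(dict(zip(["intercept (general)"] + levels, np.round(beta, 3))))
```

A dummy coefficient near zero would suggest that trials in that population don't pull the pooled estimate away from the general-population effect, which is the substance of the robustness check described above.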
> Similarly, the therapies applied across studies seem highly variable, including “Antenatal Emotional Self-Management Training”, group therapy, and one-on-one peer mentors. Lumping these together and drawing conclusions about “psychotherapy” in general seems unreasonable.
I’m less concerned about variation in the type of therapy not generalising because, as I say in the report (page 5), “...different forms of psychotherapy share many of the same strategies. We do not focus on a particular form of psychotherapy. Previous meta-analyses find mixed evidence supporting the superiority of any one form of psychotherapy for treating depression (Cuijpers et al., 2019).”
Because most types of psychotherapy seem about as effective, and expertise doesn’t seem to be of first-order importance, I formed the view that if you regularly get someone to talk about their problems in a semi-structured way, it’ll probably be pretty good for them. This isn’t a view I’d defend to the death, but I held it strongly enough to justify (at least to myself and the team) doing the simpler version of the analysis I performed.
> Given the difficulty of blinding patients to psychotherapy, there seems to be room for the Hawthorne effect to skew the results of each of the 39 studies, with patients who are aware that they’ve received therapy feeling obliged to say that it helped.
Right, but this is the case with most interventions (e.g., cash transfers). So long as the Hawthorne effect is balanced across interventions (which I’m not implying is assured), then we should still be able to compare their cost-effectiveness using self-reports.
Furthermore, only 8 of the trials had waitlist or do-nothing controls. In the rest, the control groups received some form of “care as usual” or a placebo like “HIV education”. Presumably these more active controls could also elicit a Hawthorne effect or response bias?
Hi Henry. Thanks for your feedback! I’ll let Joel respond to the substantive comments but just wanted to note that I’ve changed the “Appendix D” references to “Appendix C”. Thanks very much for letting us know about that.
I’m not sure why Appendix B has hyperlinks for some studies but not for others. I’ll check with Joel about that and add links to all the papers as soon as I can. In future, I plan to convert some of our data tables into embedded AirTables so that readers can reorder by different columns if they wish.