To fully disclose my biases: I’m not part of EA, I’m Greg’s younger sister, and I’m a junior doctor training in psychiatry in the UK. I’ve read the comments, the relevant areas of HLI’s website and the Ozler study registration, and have spent more time than needed looking at the dataset in the Google doc and clicking through random papers.
I’m not here to pile on, and my brother doesn’t need me to fight his corner. I would inevitably undermine any statistics I tried to back up due to my lack of talent in this area. However, this is personal to me, not only because I’m wondering about the fate of my Christmas present (Greg donated to Strongminds on my behalf), but also because I’m deeply sympathetic to HLI’s stance that mental health research and interventions are chronically neglected, misunderstood and under-funded. I have a feeling I’m not going to match the tone here as I’m not part of this community (and apologise in advance for any offence caused), but perhaps I can offer a different perspective as a doctor with clinical practice in psychiatry who is on an academic fellowship (i.e. I have dedicated research time in the field of mental health).
The conflict seems to be that, on one hand, HLI has important goals related to a neglected area of work (mental health, particularly in LMICs). I also understand the precarious situation they are in financially, and the fears that undermining this research could have a disproportionate effect on HLI compared with critiquing an organisation less concerned about its longevity. There might be additional fears that further work in this area will be scrutinised to a uniquely high degree if a precedent is set that HLI’s underlying research is found to be flawed. And perhaps this concern is compounded by the statistical scrutiny from people in this thread, which is perhaps not commonly directed at other projects in the EA-sphere and might suggest there is an underlying bias against this type of work.
I think it’s fair to hold these views, but I’d argue this is likely the mechanism by which HLI has escaped scrutiny before now – people agree more work and funding should be directed to mental health and wanted to support an organisation addressing this. It possibly elevated HLI’s status in people’s minds, making it appear more revolutionary in redirecting discussions in EA as a whole. Again, Greg donated to Strongminds on my behalf and, while he might now feel a sense of embarrassment for not delving into this research beforehand, in my mind it reflects a sense of affirmation in this cause and trust in a community which prides itself on being evidence-based. I’m mentioning it because I think everyone here is united on these points, and it’s always easier to have productive discussions from the mutual understanding of shared values and goals.
However, there are serious issues in the meta-analysis which appears to underlie the CEA, and therefore in the strength of the claims made by HLI. I think it is possible to uncouple this statement from arguments against HLI or any of the above points (where I don’t see disagreement). It seems critical to acknowledge the flaws in this work given the values of EA as an objective, data-driven approach to charitable giving. Failing to do this will risk the reputation of EA, and suggest there is a lack of critical appraisal and scrutiny which is perhaps driven by personal biases, e.g. the number of reassurances in this thread that HLI is a good organisation whose members are known personally to others in the community. Good people with good intentions can produce flawed research. Similarly, from the perspective of a clinical academic in psychiatry, there is a long history in my field of poorly-conducted, misinterpreted and rushed research which has made establishing evidence-based care and attracting funding for research/interventions particularly difficult. Poor research in this area risks worsening this problem and mis-allocating very limited resources – it’s fairly shocking seeing the figures quoted here in terms of funding if it is based wholly or in part on outputs such as this meta-analysis which were accepted by EA. Again, as an outsider, it’s difficult for me to judge how critical this research was in attracting this allocation of funds.
While I think the issues with the analysis and all the statistics discussions are valid critiques of this work, it’s important to establish that this is only part of the reason this study would fall down under peer review. It’s concerning to me that peer-review is not the standard for organisations supported by EA; this is not just about scrutinising how the research was conducted and arguing about statistics, but about establishing the involvement of expertise within the field of study. As someone who works in this field, the assumptions this meta-analysis makes about psychotherapy, outcome measures in mental health, etc., are problematic but perhaps not readily identifiable to those without a clinical background, and this is a much greater problem if there is an increasing interest in addressing mental health within EA. I’m not familiar with the backgrounds of people involved in HLI, but I’d be curious about who was consulted in formulating this work given the tone seems to reflect more philosophical than psychiatric/psychotherapeutic language.
The way the statistical analysis has been heavily debated in this thread likely reflects the skills-mix in the EA community (clearly stats are well-covered!), but the statistics are somewhat irrelevant if your study design and inputs into the analysis are flawed to start with. Even if the findings of this research were not so unusual (perhaps something else which could have been flagged sooner) or were based on concrete stats, the research would still be considered flawed in my field. I imagine this will prompt some reflection in EA on this topic, but peer-review as a requirement could have avoided the bad timing of these discussions and would reduce the reliance on community members to critique research. I think this thread has demonstrated that critical appraisal is time-intensive and relies on specialist skills – it’s not likely that every area of interest will have representation within the EA community, so the problems of ‘not knowing what you don’t know’, and of how you weigh the importance of voices in the community vs their amplification, would be greatly helped by peer-review reducing these blind spots. If the central goal of EA is using money to do the most good, and there is no robust system to evaluate research prior to attracting funding, this is an organisational problem rather than a specific issue with HLI/Strongminds.
My unofficial peer review.
Given inclusion/exclusion criteria aren’t stated clearly in the meta-analysis and the aim is pretty woolly, it seems the focus of the upcoming RCT and Strongminds research is evaluating:
Training non-HCPs in delivering psychotherapy in LMICs
Providing treatment to people (particularly young women and girls) with symptoms suggestive of moderate to severe depression (PHQ-9 score of 10 and above)
Measuring the efficacy of this treatment on subjective symptom rating scales, such as PHQ-9, and other secondary outcome measures which might reflect broader benefits not captured in the symptom rating scales.
Finding some way to compare the cost-effectiveness of this treatment to other interventions such as cash transfers in broader discussions of life satisfaction and wellbeing, which is obviously complicated compared to using QALYs, but important to do as the impact of mental illness is under-valued using measures geared towards physical morbidity. Or maybe it’s trying to understand the effectiveness of treating symptoms vs assumed precipitating/perpetuating factors like poverty.
Grand.
However, the meta-analysis design seems to miss the mark on developing anything which would support a CEA along these lines. Even from the perspective of favouring broad inclusion criteria, you would logically set these limits:
Population
LMIC setting, people with depressive symptoms. It’s not clear if this is about effectively treating depression with psychotherapy and extrapolating that to a comment on wellbeing, or about using psychotherapy as a tool to improve wellbeing, which for some reason is being measured as a reduction in various symptom scales for different mental health conditions and symptoms – this needs to be clearly stated. If it’s the former, what you accept as a diagnosis of depression (ICD diagnostic codes, clinical assessment by a trained professional, symptom scale cut-offs, antidepressant treatment, etc.) should be defined.
If not defining the inclusion criteria of depression as a diagnosis, it’s worth considering if certain psychiatric/medical conditions or settings should be excluded, e.g. inpatients. As a hypothetical, extracting data on depression symptom scales for non-HCP delivered psychotherapy in bipolar patients will obviously be misleading in isolation (i.e. the study likely accounted for measuring mania symptoms in its findings, but this would be lost in the meta-analysis). One study included in this analysis (Richter et al.) looked at an intervention which encouraged adherence to anti-retroviral medications via peer support for women newly diagnosed with HIV. Fortunately, this study shouldn’t have been included as it didn’t involve delivering psychotherapy, but for the sake of argument, is that fair given the neuropsychiatric complications of HIV/AIDS? Again, it’s not about preparing for every eventuality, but about having clear inclusion/exclusion criteria so there’s no argument about cherry-picking studies, because these decisions were made prior to search and analysis.
Intervention
Delivery of a specific psychotherapeutic modality (IPT, etc.) by a non-HCP. While I can agree there are shared core concepts between different modalities of psychotherapy, you absolutely have to define what you mean by psychotherapy, because a dataset containing a column labelled ‘therapyness’ (high/medium/low) undermines a lot of confidence, as do some of the interventions you’ve included as meeting the bar for psychotherapy treatment. If you want to include studies which are perhaps not focussed on treating depression and might therefore involve other forms of therapy, but still have benefit in alleviating depressive symptoms (e.g. where the presenting complaint is trauma, the intervention might be EMDR, a specific therapy for PTSD, but the authors collected a number of outcome measures including symptom rating scales for anxiety and depression as secondary outcomes), it would be logical to stratify studies in this manner as a plan for analysis: psychotherapeutic interventions with an evidence base in relieving depressive symptoms (CBT, IPT, etc.), psychotherapeutic interventions not specifically targeted at depressive symptoms (EMDR, MBT, etc.), and non-psychotherapy interventions as the control.
Several studies instead use non-psychotherapy as the intervention under study, and this confusion seems to be down to papers describing them as having a ‘psychotherapeutic approach’ or being based on principles from some area of psychotherapy. This would cover almost anything, as ‘psychotherapeutic’ as an adjective just means understanding people’s problems through their internal environment, e.g. thoughts, feelings, behaviours and experiences. In my day-to-day work, I maintain a psychotherapeutic approach in patient interactions, but I do not sit down and deliver 14-week structured IPT. You can argue that generally having a supportive environment to discuss your problems with someone who is keen to hear them is as beneficial as formal psychotherapy, but this leads to the obvious question of how data from any intervention which sounds a bit ‘psychotherapy-y’ can be used in a CEA to justify the cost of training people to specifically deliver psychotherapy.
The fundamental lack of definition or understanding of these clinical terms leads to odd issues in some of the papers I clicked on, e.g. Rojas et al. (2007) compares a multicomponent group intervention involving lots of things, but notably not delivery of any specific psychotherapy, to normal clinical care in a postnatal clinic. The next sentence describes part of normal clinical care as providing ‘brief psychotherapeutic interventions’ – perhaps non-clinicians read this as not highly ‘therapyish’, but the term is often used to describe short-term focussed CBT, or CBT-informed interventions. Not defining the intervention clearly means the control group contains patients receiving evidence-based psychotherapy of a specific modality while the treatment arm receives no specific psychotherapy, which is muddled by the MA.
Comparison
As alluded to above, you need to be clear about what is an acceptable control, and it’s simply not enough to state you are not sure what the ‘usual care’ is in the Strongminds research you have weighted so heavily. It can’t then be justified by an assumption that mental health is neglected in LMICs so usual care probably wouldn’t involve psychotherapy (with no citation). Especially as the definition of psychotherapy in this meta-analysis would deem someone visiting a pastor in church once a week as receiving psychotherapy. Without clearly defining the intervention, it’s really difficult to understand what you are comparing against what.
Outcome
This meta-analysis uses a range of symptom rating scales as acceptable outcome measures, favouring depression and anxiety rating scales, and scales measuring distress. This seems to be based on the idea that these clusters of symptoms are highly adverse to wellbeing. This makes the analysis and discussion really confused, in my opinion, and seems to be a sign the analysis, expected findings, extrapolation to wellbeing and CEA were mixed into the methodology.
To me, the issue arises from not clearly defining the aim and inclusion/exclusion criteria. This meta-analysis could be looking at psychotherapy as a treatment for depression/depressive symptoms. This would acknowledge that depression is a psychiatric illness with cognitive, psychological and biological symptoms (as captured by depression rating scales). As a clinical term, it is not just about ‘negative affect’ – low mood is not even required for a diagnosis as per ICD criteria. It absolutely does negatively affect wellbeing, as would any illness with unpleasant/distressing symptoms, but this means any estimate of how much patients’ wellbeing improves from treatment has to be specific to depression. The subsequent CEA would then need to account for this and evaluate only psychotherapies with an evidence base in depression. In the RCT design, I’d guess this is the rationale for a high PHQ cut-off—it’s a proxy for relative certainty in a clinical diagnosis of depression (or at least a high burden of symptoms which may respond to depression treatments and therefore demonstrate a treatment effect). It does not support the idea that some general negative symptoms impacting a concept of wellbeing, short of depression, will benefit from specific psychotherapy to any degree of significance, and it would be an error to take this assumption and then further assume a linear relationship between PHQ and wellbeing/impairment.
If you are looking at depressive symptom reduction, you need to include only evaluation tools for depressive symptoms (PHQ, etc.). You need to define which tools you would accept prior to the search, and check that these are validated for the population under study, as you are using them in isolation—how mental illness is understood and presents is highly culturally-bound, and these tools were almost entirely developed outside of LMICs.
If, instead, you’re looking at a range of measures you feel reflect poor mental health (including depression, anxiety and distress) in order to correlate this to a concept of wellbeing, these tools similarly have to be defined and validated. You also need to explain why some tools should be excluded, because this is unclear: e.g. in Weiss et al., a study looking at survivors of torture and militant attacks in Iraq, the primary outcome measure was a trauma symptom scale (the HTQ), yet you’ve selected the secondary outcome measures of depression and anxiety symptom scores for inclusion. I would have assumed that reducing the highly distressing symptoms of PTSD in this group would be most relevant to a concept of wellbeing, yet that is not included in favour of the secondary measures. Including multiple outcome measures with no plan to stratify/subgroup per symptom cluster or disorder seems to accept double/triple counting participants who completed multiple outcome measures from the same intervention. Importantly, you can’t then use this wide mix of various scales to make any comment on the benefits of psychotherapy for depression in improving wellbeing (as lots of the included scores are not measuring depression).
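To make the double-counting point concrete, here is a minimal sketch with entirely made-up effect sizes and standard errors (not HLI’s data): treating two outcome measures from the same trial as if they were independent studies inflates that trial’s weight and shrinks the pooled standard error in a simple inverse-variance meta-analysis.

```python
# Minimal sketch with invented numbers: entering two outcome measures from the
# same trial as independent effect sizes doubles that trial's weight and
# shrinks the pooled standard error in a naive inverse-variance meta-analysis.
import math

def pool(effects):
    """Fixed-effect inverse-variance pooling of (estimate, standard_error) pairs."""
    weights = [1 / se ** 2 for _, se in effects]
    estimate = sum(w * d for (d, _), w in zip(effects, weights)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return estimate, pooled_se

trial_a_dep = (0.60, 0.15)   # trial A, depression scale
trial_a_anx = (0.55, 0.15)   # trial A again, anxiety scale, same participants
trial_b_dep = (0.20, 0.15)   # trial B, an independent trial

# Double counted: trial A contributes two of the three effects (2/3 of the weight)
print(pool([trial_a_dep, trial_a_anx, trial_b_dep]))   # ~ (0.45, 0.087)

# Averaging trial A's outcomes first (a simplification) keeps the trials' weights equal
trial_a_avg = ((trial_a_dep[0] + trial_a_anx[0]) / 2, trial_a_dep[1])
print(pool([trial_a_avg, trial_b_dep]))                # ~ (0.39, 0.106)
```

There are principled ways to handle multiple correlated outcomes (e.g. pre-specifying one outcome per study, averaging within studies, or using models designed for dependent effect sizes), but whichever is chosen needs to be stated before the analysis.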
In both approaches, you do need to show it is accepted practice to pool these different rating scales and still answer your research question. It’s interesting to state you favour subjective symptom scores over functional scores (which are excluded), when both are well-established in evaluating psychotherapy. Other statements made by HLI suggest symptom rating scores include assessment of functioning—I’ve reproduced the PHQ-9 below for people to draw their own conclusions, but it’s safe to say I disagree with this. It’s not clear to me if it’s understood that functional scores are also commonly subjective measures, like the WSAS—patients are asked to rate how well they feel they are managing work activities, social activities etc. Ignoring functioning as a blanket rule seems to miss the concept of ‘insight’ in mental health, where people can struggle to identify symptoms as symptoms but are severely disabled due to an illness (this perhaps should also be considered in excluding scales completed by an informant or close relative, particularly thinking about studies involving children or more severe psychopathology). Incorporating functional scoring captures the holistic nature of psychotherapy, where perhaps people may still struggle with symptoms of anxiety/depression after treatment, but have made huge strides in being able to return to work. Again, you need to be clear why functional scores are excluded, and be clear this was done when extrapolating findings to discussions of life satisfaction or wellbeing. This research has made a lot of assumptions in this regard that I don’t follow.
Grouping measures and relating this to wellbeing
On that note – using a mean change in symptom scores is a reasonable evaluation of psychotherapy as a concept if you are so inclined, but I would strongly argue that this cannot be used in isolation to make any inference about how this correlates to wellbeing. As others have alluded to in this thread, symptom scores are not linear. To take depression as an example, it is deemed mild/moderate/severe based on the number of symptoms experienced, the presence of certain concerning symptoms (e.g. psychosis) and the degree of functional impact.
Measures like the PHQ-9 ask how often each of the following depressive symptoms has occurred over the past two weeks, scored from 0 (not at all) to 3 (nearly every day):
Little interest or pleasure in doing things?
Feeling down, depressed or hopeless?
Trouble falling or staying asleep, or sleeping too much?
Feeling tired or having little energy?
Poor appetite or overeating?
Feeling bad about yourself—or that you are a failure or have let yourself or your family down?
Trouble concentrating on things, such as reading the newspaper or watching television?
Moving or speaking so slowly that other people have noticed? Or the opposite—being so fidgety or restless that you have been moving around a lot more than usual?
Thoughts that you would be better off dead, or of hurting yourself in some way?
If you take the view that a symptom rating score has a linear relationship to ‘negative affect’ or suffering in depression, you would imagine that the PHQ-9 outcomes (no depression, mild, moderate, severe) would be evenly distributed across the 0-27 score, i.e. a score of 0-6 should be no depression, 7-13 mild depression, 14-20 moderate depression and 21-27 severe. This is not the case: the actual PHQ-9 bands are 0-4 no depression, 5-9 mild, 10-14 moderate, 15-19 moderately severe and 20-27 severe. This is because the symptoms asked about in the PHQ are diagnostic for depression – it’s not an attempt at gauging how happy or sad someone is on a scale from 0-27 (in fact 0 just indicates ‘no depression symptoms’, not happiness or fulfilment, and it’s likely people with very serious depression will not be able to complete a PHQ-9). Hopefully it’s clear from the PHQ-9 why the cut-offs are low and why the severity increases so sharply; the symptoms in question are highly indicative of pathology if occurring frequently. It’s also understood that a PHQ-9 would be administered when there is clinical suspicion of depression, to establish severity or to evaluate treatment (i.e. in some contexts, like bereavement, experiencing these symptoms would be considered normal, or if symptoms are better explained by another illness the PHQ is unhelpful), and it’s not used for screening (vs the Edinburgh score for postnatal depression, which is a screening tool and features heavily in the included studies). Critically, it’s why you can’t assume it’s valid to lump all symptom scales together, especially across disorders/symptom clusters as in this meta-analysis.
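To illustrate the cut-off point numerically, here is a small sketch (using only the severity bands quoted above) comparing the published PHQ-9 bands with what a naive linear reading of the 0-27 range would imply:

```python
# Severity bands as quoted above (0-4, 5-9, 10-14, 15-19, 20-27) versus a
# hypothetical even split of the 0-27 range, to show the mapping from total
# score to clinical severity is not a simple linear scale of 'badness'.
def phq9_severity(score: int) -> str:
    if score <= 4:
        return "no depression"
    if score <= 9:
        return "mild"
    if score <= 14:
        return "moderate"
    if score <= 19:
        return "moderately severe"
    return "severe"

def even_split_severity(score: int) -> str:
    # What an evenly divided 0-27 range would imply (four equal-width bands)
    if score <= 6:
        return "no depression"
    if score <= 13:
        return "mild"
    if score <= 20:
        return "moderate"
    return "severe"

for s in (4, 10, 15, 21):
    print(f"PHQ-9 {s}: clinical band = {phq9_severity(s)!r}, even split = {even_split_severity(s)!r}")
# e.g. a score of 10 already meets the clinical 'moderate' threshold,
# while an even split of the range would still label it 'mild'.
```

A score of 10 already meets the clinical threshold for moderate depression, whereas an even split of the range would still call it mild – one way of seeing why a one-point change on the scale can’t be treated as a constant unit of wellbeing.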
Search strategy
I feel this should go without saying, but once you’ve ironed out these issues to have a research question you could feasibly inform with a meta-analysis, you then need to develop a search strategy and conduct a systematic review. It’s guaranteed that papers have been missed with the approach used here, and I’ve never read a peer-reviewed meta-analysis where a (10-hour) time constraint was used as part of this strategy. While I agree the funnel plot is funky, it likely reflects the errors of not conducting this search systematically rather than indicating publication bias – it’s likely the highly cited papers etc. were more easily found using this approach and therefore account for the p-clustering. If the search were a systematic review with objective inclusion/exclusion criteria and the funnel plot looked like that, you could make an argument for publication bias. As it stands, the outputs are only as good as the inputs, i.e. you can’t out-analyse a poor study design/methodology to produce reliable results.
Simply put, the most critical problem here is that, even without getting into the problems with the data extraction I found or the analysis as discussed in this thread, a study design which doesn’t seek to justify why any of these decisions were made means any analytic outputs are destined to be unreliable. How much of this was deliberate on the part of HLI can’t be determined, as there is no possible way of replicating the search strategy they used (this is the reason to have a robust strategy as part of your study design). I think if you want to call this a back-of-napkin scoping review to generate some speculative numbers, you could describe what you found as early signals that psychotherapy could be more cost-effective than assumed, and therefore that there’s a need to conduct a rigorous SR/MA. It perhaps may have been more useful in a shallow review to actually exclude the Strongminds study and evaluate existing research through the framework of (1) do the SM results make sense in the context of previous studies and (2) can we explain any differences, in a narrative review. It seems instead this work generated figures which were treated as useful or reliable and fed into a CEA, which was further skewed by how this was discussed by HLI.
TL;DR
This is obviously very long and not going to be read in any detail on an online forum, but from the perspective of someone within this field, there seem to be a raft of problems with how this research was conducted and evaluated by HLI and EA. I’m not considered the Queen Overlord of Psychiatry, I don’t have a PhD, but I suppose I’m trying to demonstrate that having a different background raises different questions, which seems particularly relevant if there is a recognition of the importance of peer-review (hopefully, I’m assuming, outside of EA literature). I’m also going to caveat this by saying I’ve not pored over HLI’s work, it’s just what immediately stood out to me, and I haven’t made any attempt to cite my own knowledge derived from my practice – to me this is a post on a forum I’m not involved with rather than an ‘official’ attempt at peer review, so I’m not holding myself to the same standard, just commenting in good faith.
I get the difficult position HLI are in with reputational salvage, but there is a similar risk to EA’s reputation if there are no checks in place, given this has been accessible information for some time and did not raise questions earlier. While this might feel like Greg’s younger sister joining in to dunk on HLI, and I see from comments in this thread that criticism delivered passionately can be construed as hostile online, I don’t think this is anyone’s intent. Incredibly ironically given our genetic aversion to team sports, perhaps critique is intended as a fellow teammate begging a striker to get off the field when injured, as they are hurting themselves and the team. Letting that player limp on is not being a supportive teammate. Personally, I hope this thread drives discussions in HLI and EA which provide scope for growth.
In my unsolicited and unqualified opinion, I would advise withdrawing the CEA and drastically modifying the weight HLI puts on this work so it does not appear to be foundational to HLI as an organisation. Journals are encouraging the submission of meta-analysis study protocols for peer-review and publication (BJPsych Open is one – to be transparent, I have acted as a peer reviewer for this journal) in order to improve the quality of research. While conducting a whole SR/MA and publication takes time, which could allow further loss of reputation, this is a quick way of acknowledging the issues here and taking concrete steps to rectify them. It’s not acceptable, to me, for the same people to offer a re-analysis or review of this work, because I suspect this would just produce another flawed output, and it seems there is a real need to involve expertise from the field of study (i.e. in formal peer review) at an earlier stage to right the ship.
Again, I do think the aims of HLI are important and I do wish them the best; and I’m interested to see how these discussions evolve in EA, as it seems to be straying into a subject I’m passionate about. I come in peace and this feedback is genuinely meant constructively, so in the spirit of EA and younger-sibling disloyalty, I’m happy to offer HLI help beyond what’s already provided if they would like it.
[Edit for clarity mostly under ‘outcomes’ and ‘grouping measures’, corrected my horrid formatting/typos, and included the PHQ-9 for context. Kept my waffle and bad jokes for accountability, and was using the royal ‘you’ vs directing any statements at OP(!)]
Strongly upvoted for the explanation and demonstration of how important peer-review by subject matter experts is. I obviously can’t evaluate either HLI’s work or your review, but I think this is indeed a general problem of EA, where the culture is, for some reason, averse to standard practices of scientific publishing. This has to be rectified.
Out of curiosity @LondonGal, have you received any follow-ups from HLI in response to your critique? I understand you might not be at liberty to share all details, so feel free to respond as you feel appropriate.
Redo the meta-analysis with a psychiatrist involved in the design, and get external review before publishing.
Have some sort of sensitivity analysis which demonstrates to donors how the effect size varies based on different weightings of the StrongMinds studies (a rough sketch of what this could look like follows this list).
(I still strongly support funding HLI, not least so they can actually complete these recommended next steps)
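For the second suggestion, here is a rough sketch of what such a sensitivity analysis could look like, using fixed-effect inverse-variance pooling with entirely invented effect sizes and standard errors; HLI would substitute their actual dataset and preferred (e.g. random-effects) model.

```python
# Hypothetical sketch only: recompute a fixed-effect, inverse-variance pooled
# effect while scaling the weight given to the StrongMinds-related effect
# sizes. All effects and standard errors below are invented for illustration.
other_studies = [(0.30, 0.10), (0.15, 0.12), (0.25, 0.09)]  # (effect, se)
strongminds   = [(1.10, 0.20), (0.95, 0.25)]                # (effect, se)

def pooled_effect(sm_weight_multiplier: float) -> float:
    terms = [(d, 1 / se ** 2) for d, se in other_studies]
    terms += [(d, sm_weight_multiplier / se ** 2) for d, se in strongminds]
    total_weight = sum(w for _, w in terms)
    return sum(d * w for d, w in terms) / total_weight

for m in (1.0, 0.5, 0.1, 0.0):
    print(f"StrongMinds weight x{m}: pooled effect = {pooled_effect(m):.2f}")
```

Publishing the pooled effect at a few down-weightings like this would let donors see directly how much the headline number depends on the StrongMinds-related studies.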
[Speaking from a UK perspective with much less knowledge of non-medical psychotherapy training]
I think the important thing is having a strong mental health research background, particularly in systematic review and meta-analysis. If you have an expert in this field then the need for clinical experience becomes less important (perhaps; it depends on HLI’s intended scope).
It’s fair to say psychology and psychiatry do commonly blur boundaries with psychotherapy, as there are different routes of qualification—it can be with a PhD through a psychology/therapy pathway, or there is a specialism in psychotherapy that can be obtained as part of psychiatry training (a bit like how neurologists are qualified through specialism in internal medicine training). Psychotherapists tend to be qualified in specific modalities in order to practise them independently, e.g. you might achieve accreditation in psychoanalytic psychotherapy, etc. There are a vast number of different professionals (me included, during my core training in psychiatry) who deliver psychotherapy under supervision of accredited practitioners, so the definition of therapist is blurry.
Psychotherapy is similarly researched both through the perspective of delivering psychotherapy, which perhaps has more of a psychology focus, and as a treatment of various psychiatric illnesses (+/- in combination or comparison with medication, or novel therapies like psychedelics), which perhaps is closer to psychiatric research. Diagnosis of psychiatric illnesses like depression and directing treatment tends to remain the responsibility of doctors (psychiatrists or primary care physicians), and so psychiatry training requires the development of competencies in psychotherapy, even if delivery of psychotherapy does not always form the bulk of day-to-day practice, as it relates to formulating treatment plans for patients with psychiatric illness.
The issues I raise relate to the clinical presentation of depression as it pertains to impairment/wellbeing, diagnosis of depression, symptom rating scales, psychotherapy as a defined treatment, etc., as well as the wide range of psychopathology captured in the dataset. My feeling is the breadth of this would benefit from a background in psychiatry, given the assumptions I made about HLI’s focus for the meta-analysis. However, if what matters is depth of understanding of IPT as an intervention, or perhaps the holistic outcomes of psychotherapy particularly related to young women/girls in LMICs, then you might want a psychotherapist (PhD or psychiatrist) with accreditation in that modality or experience with the population of interest. If you found someone who regularly publishes systematic reviews and meta-analyses of psychotherapy efficacy then that would probably trump both regardless of clinical background. Or perhaps all three is best.
You’re both right to clarify this, though—I was giving my opinion from my background in clinical/academic psychiatry and so I talk about it a lot! When I mention the field of study etc, I meant mental health research more broadly given it depends on HLI’s aims/scope to know what specific area this would be.
[Edit—Sorry, I’ve realised my lack of digging into the background of HLI members/contributors to this research could render the above highly offensive if there are individuals from this field on staff, and also makes me appear extremely arrogant. For clarity, it’s possible all of my concerns were actually fully-rationalised, deliberate choices by HLI that I’ve not understood from my quick sense-check, or I might disagree with but are still valid.
[However, my impression from the work, in particular the design and methodology, is that there is a lack of psychiatric and/or psychotherapy knowledge (given the questions I had from a clinical perspective); and a lack of confidence in systematic review and meta-analysis from how far this deviates from Cochrane/PRISMA that I was trying to explain in more accessible terms in my comment without being exhaustive. It’s possible contributors to this work did have experience in these areas but were not represented in the write-up, or not involved at the appropriate times in the work, etc. I’m not going to seek out whether or not that is the case as I think it would make this personal given the size of the organisation, and I’m worried that if I check I might find a psychotherapy professor on staff I’ve now crossed (jk ;-)).
[It’s interesting to me either way, as both seem like problems—HLI not identifying they lacked appropriate skills to conduct this research, or seemingly not employing those with the relevant skills appropriately to conduct or communicate it—and it has relevance outside of this particular meta-analysis in the consideration of further outputs from HLI, or evaluation of orgs by EA. In any case, peer-review offers reassurance to the wider EA community that external subject-matter expertise has been consulted in whatever field of interest (with the additional benefit of shutting people like me down very quickly), and provides an opportunity for better research if deficits identified from peer-review suggest skills need to be reallocated or additional skills sought in order to meet a good standard.]
Hi everyone,
To fully disclose my biases: I’m not part of EA, I’m Greg’s younger sister, and I’m a junior doctor training in psychiatry in the UK. I’ve read the comments, the relevant areas of HLI’s website, Ozler study registration and spent more time than needed looking at the dataset in the Google doc and clicking random papers.
I’m not here to pile on, and my brother doesn’t need me to fight his corner. I would inevitably undermine any statistics I tried to back up due to my lack of talent in this area. However, this is personal to me not only wondering about the fate of my Christmas present (Greg donated to Strongminds on my behalf), but also as someone who is deeply sympathetic to HLI’s stance that mental health research and interventions are chronically neglected, misunderstood and under-funded. I have a feeling I’m not going to match the tone here as I’m not part of this community (and apologise in advance for any offence caused), but perhaps I can offer a different perspective as a doctor with clinical practice in psychiatry and on an academic fellowship (i.e. I have dedicated research time in the field of mental health).
The conflict seems to be that, on one hand, HLI has important goals related to a neglected area of work (mental health, particularly in LMICs). I also understand the precarious situation they are in financially, and the fears that undermining this research could have a disproportionate effect on HLI vs critiquing an organisation which is not so concerned with their longevity. There might be additional fears that further work in this area will be scrutinised to a uniquely high degree if there is a precedent set that HLI’s underlying research is found to be flawed. And perhaps this concern is compounded by the stats from people in this thread, which perhaps is not commonly directed to other projects in the EA-sphere, and might suggest there is an underlying bias against this type of work.
I think it’s fair to hold these views, but I’d argue this is likely the mechanism by which HLI has escaped scrutiny before now – people agree more work and funding should be directed to mental health and wanted to support an organisation addressing this. It possibly elevated the status of HLI in people’s minds, appearing more revolutionary in redirecting discussions in EA as a whole. Again, Greg donated to Strongminds on my behalf and, while he might now feel a sense of embarrassment for not delving into this research prior, in my mind I think it reflects a sense of affirmation in this cause and trust in this community which prides itself on being evidence-based. I’m mentioning it, because I think everyone here is united on these points and it’s always easier to have productive discussions from the mutual understanding of shared values and goals.
However, there are serious issues in the meta-analysis which appears to underlie the CEA, and therefore the strength of claims made by HLI. I think it is possible to uncouple this statement from arguments against HLI or any of the above points (where I don’t see disagreement). It seems critical to acknowledge the flaws in this work given the values of EA as an objective, data-driven approach to charitable giving. Failing to do this will risk the reputation of EA, and suggest there is a lack of critical appraisal and scrutiny which perhaps is driven by personal biases i.e. the number of reassurances in this thread that HLI is a good organisation where members are known personally to others in the community. Good people with good intentions can produce flawed research. Similarly, from the perspective of a clinical academic in psychiatry, there is a long history in my field of poorly-conducted, misinterpreted and rushed research which has meant establishing evidence-based care and attracting funding for research/interventions particularly difficult. Poor research in this area risks worsening this problem and mis-allocating very limited resources – it’s fairly shocking seeing the figures quoted here in terms of funding if it is based wholly or in part on outputs such as this meta-analysis which were accepted by EA. Again, as an outsider, it’s difficult for me to judge how critical this research was in attracting this allocation of funds.
While I think the issues with the analysis and all the statistics discussions are valid critiques of this work, it’s important to establish that this is only part of the reason this study would fall down under peer review. It’s concerning to me that peer-review is not the standard for organisations supported by EA; this is not just about scrutinising how the research was conducted and arguing about statistics, but establishes the involvement of expertise within the field of study. As someone who works in this field, the assumptions this meta-analysis makes about psychotherapy, outcome measures in mental health, etc, are problematic but perhaps not readily identified to those without a clinical background, and this is a much greater problem if there is an increasing interest in addressing mental health within EA. I’m not familiar with the backgrounds of people involved in HLI, but I’d be curious about who was consulted in formulating this work given the tone seems to reflect more philosophical vs psychiatric/psychotherapeutic language.
The way the statistical analysis has been heavily debated in this thread likely reflect the skills-mix in the EA community (clearly stats are well-covered!), but the statistics are somewhat irrelevant if your study design and inputs into the analysis are flawed to start with. Even if the findings of this research were not so unusual (perhaps something else which could have been flagged sooner) or were based on concrete stats, the research would still be considered flawed in my field. I imagine this will prompt some reflection in EA on this topic, but peer-review as a requirement could have avoided the bad timing of these discussions and would reduce the reliance on community members to critique research. I think this thread has demonstrated that critical appraisal is time-intensive and relies on specialist skills – it’s not likely that every area of interest will have representation within the EA community so the problem of ‘not knowing what you don’t know’ or how you weight the importance of voices in the community vs their amplification would be greatly helped by peer-review and reduce these blind spots. If the central goal of EA is using money to do the most good, and there is no robust system to evaluate research prior to attracting funding, this is an organisational problem rather than a specific issue with HLI/Strongminds.
My unofficial peer review.
Given inclusion/exclusion criteria aren’t stated clearly in the meta-analysis and the aim is pretty woolly, It seems the focus of the upcoming RCT and Strongminds research is evaluating:
Training non-HCPs in delivering psychotherapy in LMICs
Providing treatment (particularly to young women and girls) with symptoms suggestive of moderate to severe depression (PHQ-9 score of 10 and above)
Measuring the efficacy of this treatment on subjective symptom rating scales, such as PHQ-9, and other secondary outcome measures which might reflect broader benefits not captured in the symptom rating scales.
Finding some way to compare the cost-effectiveness of this treatment to other interventions such as cash transfers in broader discussions of life satisfaction and wellbeing which it obviously complicated compared to using QALYs, but important to do as the impact of mental illness is under-valued using measures geared towards physical morbidity. Or maybe it’s trying to understand effectiveness of treating symptoms vs assumed precipitating/perpetuating factors like poverty.
Grand.
However, the meta-analysis design seems to miss the mark on developing anything which would support a CEA along these lines. Even from the perspective of favouring broad inclusion criteria, you would logically set these limits:
Population
LMIC setting, people with depressive symptoms. It’s not clear if this is about effectively treating depression with psychotherapy and extrapolating that to a comment on wellbeing; or using psychotherapy as a tool to improve wellbeing, which for some reason is being measured in a reduction in various symptom scales for different mental health conditions and symptoms – this needs to be clearly stated. If it’s the former, what you accept as a diagnosis of depression (ICD diagnostic codes, clinical assessment by trained professional, symptom scale cut-offs, antidepressant treatment, etc) should be defined.
If not defining the inclusion criteria of depression as a diagnosis, it’s worth considering if certain psychiatric/medical conditions or settings should be excluded e.g. inpatients. As a hypothetical, extracting data on depression symptom scales for a non-HCP delivered psychotherapy in bipolar patients will obviously be misleading in isolation (i.e. the study likely accounted for measuring mania symptoms in their findings, but would be lost in this meta-analysis). One study included in this analysis (Richter et al) looked at an intervention which encouraged adherence to anti-retroviral medications via peer support for women newly diagnosed with HIV. Fortunately, this study shouldn’t have been included as it didn’t involve delivering psychotherapy, but for the sake of argument, is that fair given the neuropsychiatric complications of HIV/AIDS? Again, it’s not about preparing for every eventuality, but it’s having clear inclusion/exclusion criteria so there’s no argument about cherry-picking studies because this has been discussed prior to search and analysis.
Intervention
Delivery of a specific psychotherapeutic modality (IPT, etc) by a non-HCP. While I can agree there are shared core concepts between different modalities of psychotherapy, you absolutely have to define what you mean by psychotherapy because your dataset containing a column labelled ‘therapyness’ (high/medium/low) undermines a lot of confidence, as do some of the interventions you’ve included as meeting the bar for psychotherapy treatment. If you want to include studies which perhaps are not focussed on treating depression and might therefore involve other forms of therapy but still have benefit in alleviating depressive symptoms e.g. where the presenting complaint is trauma, the intervention might be EMDR (a specific therapy for PTSD) but the authors collected a number of outcome measures including symptom rating scales for anxiety and depression as secondary outcomes, it would be logical to stratify studies in this manner as a plan for analysis. I.e. psychotherapeutic intervention with evidence-base in relieving depressive symptoms (CBT, IPT, etc), psychotherapeutic intervention not specifically targeted at depressive symptoms (EMDR, MBT etc), with non-(psychotherapy) intervention as the control.
Several studies instead use non-psychotherapy as the intervention under study and this confusion seems to be down to papers describing them as having a ‘psychotherapeutic approach’ or being based on principles in any area of psychotherapy. This would cover almost anything as ‘psychotherapeutic’ as an adjective just means understanding people’s problems through their internal environment e.g. thoughts, feelings, behaviours and experiences. In my day-to-day work, I maintain a psychotherapeutic approach in patient interactions, but I do not sit down and deliver 14-week structured IPT. You can argue that generally having a supportive environment to discuss your problems with someone who is keen to hear them is equally beneficial to formal psychotherapy, but this leads to the obvious question of how you can use the idea of any intervention which sounds a bit ‘psychotherapy-y’ to justify the cost of training people to specifically deliver psychotherapy in a CEA from this data.
The fundamental lack of definition or understanding of these clinical terms leads to odd issues in some of the papers I clicked on i.e. Rojas et al (2007) compares a multicomponent group intervention involving lots of things but notably not delivery of any specific psychotherapy, to normal clinical care in a postnatal clinic. The next sentence describes part of normal clinical care to be providing ‘brief psychotherapeutic interventions’ – perhaps this is understood by non-clinicians as not highly ‘therapyish’ but this term is often used to describe short-term focussed CBT, or CBT-informed interventions. Not defining the intervention clearly means the control group contains patients receiving evidence-based psychotherapy of a specific modality and a treatment arm of no specific psychotherapy which is muddled by the MA.
Comparison
As alluded to above, you need to be clear about what is an acceptable control and it’s simply not enough to state you are not sure what the ‘usual care’ is in research by Strongminds you have weighted so heavily. It can’t be then justified by an assumption mental health is neglected in LMICs so probably wouldn’t involve psychotherapy (with no citation). Especially as the definition of psychotherapy in this meta-analysis would deem someone visiting a pastor in church once a week as receiving psychotherapy. Without clearly defining the intervention, it’s really difficult to understand what you are comparing against what.
Outcome
This meta-analysis uses a range of symptom rating scales as acceptable outcome measures, favouring depression and anxiety rating scales, and scales measuring distress. This seems to be based on idea that these clusters of symptoms are highly adverse to wellbeing. This makes the analysis and discussion really confused, in my opinion, and seems to be a sign the analysis, expected findings, extrapolation to wellbeing and CEA was mixed into the methodology.
To me, the issue arises from not clearly defining the aim and inclusion/exclusion criteria. This meta-analysis could be looking at psychotherapy as a treatment for depression/depressive symptoms. This would acknowledge that depression is a psychiatric illness with congitive, psychological and biological symptoms (as captured by depression rating scales). As a clinical term, it is not just about ‘negative affect’ - low mood is not even required for a diagnosis as per ICD criteria. It absolutely does negatively affect wellbeing, as would any illness with unpleasant/distressing symptoms, but this therefore means generating some idea for how much patients’ wellbeing improves from treatment has to be specific to depression. The subsequent CEA would then need to account for this and evaluate only psychotherapies with an evidence-base in depression. In the RCT design, I’d guess this is the rationale for a high PHQ cut-off—it’s a proxy for relative certainty in a clinical diagnosis of depression (or at least a high burden of symptoms which may respond to depression treatments and therefore demonstrate a treatment effect); it’s not supporting the idea that some general negative symptoms impacting a concept of wellbeing, short of depression, will likely benefit from specific psychotherapy to any degree of significance, and it would be an error to take this assumption and then further assume a linear relationship between PHQ and wellbeing/impairment.
If you are looking at depressive symptom reduction, you need to only include evaluation tools for depressive symptoms (PHQ, etc). You need to define which tools you would accept prior to the search and that these are validated for the population under study as you are using them in isolation—how mental illness is understood and presents is highly culturally-bound and these tools almost entirely developed outside of LMICs.
If, instead, you’re looking at a range of measures you feel reflect poor mental health (including depression, anxiety and distress) in order to correlate this to a concept of wellbeing, these tools similarly have to be defined and validated. You also need to explain why some tools should be excluded, because this is unclear e.g. in Weiss et al, a study looking at survivors of torture and militant attacks in Iraq, the primary outcome measure was a trauma symptom scale (the HTQ), yet you’ve selected the secondary outcome measures of depression and anxiety symptom scores for inclusion. I would have assumed that reducing the highly distressing symptoms of PTSD in this group would be most relevant to a concept of wellbeing, yet that is not included in favour of the secondary measures. Including multiple outcome measures with no plan to stratify/subgroup per symptom cluster or disorder seems to accept double/triple counting participants who completed multiple outcome measures from the same intervention. Importantly, you can’t then use this wide mix of various scales to make any comment on the benefits of psychotherapy for depression in improving wellbeing (as lots of the included scores are not measuring depression).
In both approaches, you do need to show it is accepted to pool these different rating scales to still answer your research question. It’s interesting to state you favour subjective symptom scores over functional scores (which are excluded), when both are well-established in evaluating psychotherapy. Other statements made by HLI suggest symptom rating scores include assesment of functioning—I’ve reproduced the PHQ-9 below for people to draw their own conclusions, but it’s safe to say I disagree with this. It’s not clear to me if it’s understood functional scores are also commonly subjective measures, like the WSAS—patients are asked how rate how well they feel they are managing work activities, social activities etc. Ignoring functioning as a blanket rule seems to miss the concept of ‘insight’ in mental health, where people can struggle identifying symptoms as symptoms but are severely disabled due to an illness (this perhaps should also be considered in excluding scales completed by an informant or close relative, particularly thinking about studies involving children or more severe psychopathology). Incorporating functional scoring captures the holistic nature of psychotherapy, where perhaps people may still struggle with symptoms of anxiety/depression after treatment, but have made huge strides in being able to return to work. Again, you need to be clear why functional scores are excluded and be clear this was done when extrapolating findings to discussions of life satisfaction or wellbeing. This research has made a lot of assumptions in this regard that I don’t follow.
x. Grouping measures and relating this to wellbeing:
On that note – using a mean change in symptoms scores is a reasonable evaluation of psychotherapy as a concept if you are so inclined but I would strongly argue that this cannot be used in isolation to make any inference about how this correlates to wellbeing. As others have alluded to in this thread, symptom scores are not linear. To isolate depression as an example, this is deemed mild/moderate/severe based on the number of symptoms experienced, the presence of certain concerning symptoms (e.g. psychosis) and the degree of functional impact.
Measures like the PHQ-9 score the number of depressive symptoms present and how often they occur from 0 (not at all) to 3 (nearly every day) over the past two weeks:
Little interest or pleasure in doing things?
Feeling down, depressed or hopeless?
Trouble falling or staying asleep, or sleeping too much?
Feeling tired or having little energy?
Poor appetite or overeating?
Feeling bad about yourself—or that you are a failure or have let yourself or your family down?
Trouble concentrating on things, such as reading the newspaper or watching television?
Moving or speaking so slowly that other people have noticed? Or the opposite—being so fidgety or restless that you have been moving around a lot more than usual?
Thoughts that you would be better off dead, or of hurting yourself in some way?
If you take the view that a symptom rating score has a linear relationship to ‘negative affect’ or suffering in depression, you would then imagine that the outcomes of PHQ-9 (no depression, mild, moderate, severe) would be regularly distributed in the 27-item score i.e. a score of 0-6 should be no depression, 7-13 mild depression, 14-20 moderate depression and 21-27 severe. This is not the case as the actual PHQ-9 scores are 0-4 no depression, 5-9 mild depression, 10-14 moderate, 15-19 moderately severe, 20-27 severe. This is because the symptoms asked about in the PHQ are diagnostic for depression – it’s not an attempt at trying to gather how happy or sad someone is on a scale from 0-27 (in fact 0 just indicates ‘no depression symptoms’, not happiness or fulfilment, and it’s likely people with very serious depression will not be able to complete a PHQ-9). Hopefully it’s clear from the PHQ-9 why the cut-offs are low and why the severity increases so sharply; the symptoms in question are highly indicative of pathology if occuring frequently. It’s also in the understanding that a PHQ-9 would be administered when there is clinical suspicion of depression to elicit severity or in evaluation of treatment (i.e. in some contexts, like bereavement, experiencing these symptoms would be considered normal, or if symptoms are better explained by another illness the PHQ is unhelpful) and it’s not used for screening (vs the Edinburgh score for postnatal depression which is a screening tool and features heavily in included studies). Critically, it’s why you can’t assume it’s valid to lump all symptom scales together, especially cross-disorders/symptom clusters as in this meta-analysis.
x. Search strategy
I feel this should go without saying, but once you’ve ironed out these issues to have a research question you could feasibly inform with meta-analysis, you then need to develop a search strategy and conduct a systematic review. It’s guaranteed that papers have been missed with the approach used here, and I’ve never read a peer-reviewed meta-analysis where a (10-hour) time constraint was used as part of this strategy. While I agree the funnel plot is funky, it’s likely reflecting the errors in not conducting this search systematically rather than assuming publication bias – it’s likely the highly cited papers etc were more easily found using this approach and therefore account for the p-clustering. If the search was a systematic review and there were objective inclusion/exclusion criteria and the funnel plot looked like that, you can make an argument for publication bias. As it stands, the outputs are only as good as the inputs i.e. you can’t out-analyse a poor study design/methodology to produce reliable results.
Simply put, the most critical problem here is that without even getting into the problems with the data extraction I found, or the analysis as discussed in this thread, from this study design which doesn’t seek to justify why any of these decisions were made, any analytic outputs are destined to be unreliable. How much of this was deliberate on the part of HLI can’t be determined as there is no possible way of replicating the search strategy they used (this is the reason to have a robust strategy as part of your study design). I think if you want to call this a back-of-napkin scoping review to generate some speculative numbers, you could describe what you found as there being early signals that psychotherapy could be more cost-effective than assumed and therefore there’s need to conduct a vigorous SR/MA. It perhaps may have been more useful in a shallow review to actually exclude the Strongminds study and evaluate existing research through the framework of (1) do the SM results make sense in the context of previous studies and (2) can we explain any differences in a narrative review. It seems instead this work generated figures which were treated as useful or reliable and fed into a CEA, which was further skewed by how this was discussed by HLI.
TL;DR
This is obviously very long and not going to be read in any detail on an online forum, but from the perspective of someone within this field, there seems to be a raft of problems with how this research was conducted and evaluated by HLI and EA. I’m not the Queen Overlord of Psychiatry, I don’t have a PhD, but I suppose I’m trying to demonstrate that having a different background raises different questions, which seems particularly relevant if there is a recognition of the importance of peer review (hopefully, I’m assuming, outside of EA literature too). I’ll also caveat this by saying I’ve not pored over HLI’s work – this is just what immediately stood out to me – and I haven’t made any attempt to cite sources for knowledge derived from my own practice. To me this is a post on a forum I’m not involved with rather than an ‘official’ attempt at peer review, so I’m not holding myself to the same standard, just commenting in good faith.
I get the difficult position HLI are in with regard to reputational salvage, but there is a similar risk to EA’s reputation if there are no checks in place, given this information has been accessible for some time and did not raise questions earlier. While this might feel like Greg’s younger sister joining in to dunk on HLI, and I see from comments in this thread that criticism made passionately can be construed as hostile online, I don’t think that is anyone’s intent. Incredibly ironically given our genetic aversion to team sports: perhaps critique is intended as a fellow teammate begging an injured striker to get off the field because they are hurting themselves and the team; letting that player limp on is not being a supportive teammate. Personally, I hope these discussions drive changes in HLI and EA which provide scope for growth.
In my unsolicited and unqualified opinion, I would advise withdrawing the CEA and drastically reducing the weight HLI puts on this work so it does not appear to be foundational to HLI as an organisation. Journals are encouraging the submission of meta-analysis study protocols for peer review and publication (BJPsych Open is one – to be transparent, I have acted as a peer reviewer for that journal) in order to improve the quality of research. While conducting and publishing a full SR/MA takes time, during which reputation could be lost further, submitting a protocol is a quick way of acknowledging the issues here and taking concrete steps to rectify them. It is not acceptable, to me, for the same people to offer a re-analysis or to review this work, because I’m not convinced that wouldn’t produce another flawed output; there seems to be a real need to involve expertise from the field of study (i.e. formal peer review) at an earlier stage to right the ship.
Again, I do think the aims of HLI are important and I do wish them the best, and I’m interested to see how these discussions evolve in EA as they stray into a subject I’m passionate about. I come in peace and this feedback is genuinely meant constructively, so in the spirit of EA and younger-sibling disloyalty, I’m happy to offer HLI help beyond what’s already provided if they would like it.
[Edit for clarity mostly under ‘outcomes’ and ‘grouping measures’, corrected my horrid formatting/typos, and included the PHQ-9 for context. Kept my waffle and bad jokes for accountability, and was using the royal ‘you’ vs directing any statements at OP(!)]
Strongly upvoted for the explanation and demonstration of how important peer review by subject matter experts is. I obviously can’t evaluate either HLI’s work or your review, but I think this is indeed a general problem in EA, where the culture is, for some reason, averse to standard practices of scientific publishing. This has to be rectified.
I think it’s because the standard practices of scientific publishing are very laborious and EA wants to be a bit more agile.
Having said that, I strongly agree that more peer review is called for in EA, even if we don’t move all the way to the extreme of the academic world.
Out of curiosity @LondonGal, have you received any followups from HLI in response to your critique? I understand you might not be at liberty to share all details, so feel free to respond as you feel appropriate.
Nope, I’ve not heard from any current HLI members regarding this in public or private.
Strongly upvoted.
My recommended next steps for HLI:
Redo the meta-analysis with a psychiatrist involved in the design, and get external review before publishing.
Have some sort of sensitivity analysis which demonstrates to donors how the effect size varies based on different weightings of the StrongMinds studies (see the sketch after this list).
(I still strongly support funding HLI, not least so they can actually complete these recommended next steps)
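On the second recommendation, here is a minimal sketch of the kind of weighting sensitivity analysis meant above, using simple inverse-variance pooling; the study names, effect sizes and variances are hypothetical placeholders rather than HLI's data, and a real analysis would use a proper random-effects model.

```python
# A minimal sketch of a weighting sensitivity analysis with placeholder data.
import numpy as np

studies = ["StrongMinds-related study", "Study B", "Study C", "Study D"]
effects = np.array([1.10, 0.35, 0.50, 0.40])     # hypothetical SMDs
variances = np.array([0.04, 0.02, 0.03, 0.05])   # hypothetical sampling variances

def pooled_effect(effects, variances, weight_scale=None):
    """Inverse-variance weighted mean, optionally re-scaling individual study weights."""
    weights = 1.0 / variances
    if weight_scale is not None:
        weights = weights * weight_scale
    return np.sum(weights * effects) / np.sum(weights)

# Show how the pooled estimate moves as the StrongMinds-related study is down-weighted.
for w in (1.0, 0.5, 0.1, 0.0):
    scale = np.ones(len(studies))
    scale[0] = w
    print(f"StrongMinds weight x{w}: pooled effect = "
          f"{pooled_effect(effects, variances, scale):.2f}")
```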
A professional psychotherapy researcher, or even just a psychotherapist, would be more appropriate than a psychiatrist, no?
[Speaking from a UK perspective with much less knowledge of non-medical psychotherapy training]
I think the important thing is having a strong mental health research background, particularly in systematic review and meta-analysis. If you have an expert in this field then the need for clinical experience becomes less important (perhaps; it depends on HLI’s intended scope).
It’s fair to say psychology and psychiatry do commonly blur boundaries with psychotherapy as there are different routes of qualification—it can be with a PhD through a psychology/therapy pathway, or there is a specialism in psychotherapy that can be obtained as part of psychiatry training (a bit like how neurologists are qualified through specialism in internal medicine training). Psychotherapists tend to be qualified in specific modalities in order to practise them independently, e.g. you might achieve accreditation in psychoanalytic psychotherapy, etc. There are a vast number of different professionals (me included, during my core training in psychiatry) who deliver psychotherapy under the supervision of accredited practitioners, so the definition of therapist is blurry.
Psychotherapy is similarly researched from the perspective of delivering psychotherapy, which perhaps has more of a psychology focus, and as a treatment of various psychiatric illnesses (+/- in combination or comparison with medication, or novel therapies like psychedelics), which perhaps is closer to psychiatric research. Diagnosis of psychiatric illnesses like depression and directing treatment tends to remain the responsibility of doctors (psychiatrists or primary care physicians), and so psychiatry training requires the development of competencies in psychotherapy, even if delivery of psychotherapy does not always form the bulk of day-to-day practice, as it relates to formulating treatment plans for patients with psychiatric illness.
The issues I raise relate to the clinical presentation of depression as it pertains to impairment/wellbeing, diagnosis of depression, symptom rating scales, psychotherapy as a defined treatment, etc.; as well as the wide range of psychopathology captured in the dataset. My feeling is that the breadth of this would benefit from a background in psychiatry, given the assumptions I made about the focus of HLI’s meta-analysis. However, if the priority is depth of understanding of IPT as an intervention, or perhaps the holistic outcomes of psychotherapy particularly related to young women/girls in LMICs, then you might want a psychotherapist (PhD or psychiatrist) with accreditation in the modality or experience with the population of interest. Someone who regularly publishes systematic reviews and meta-analyses of psychotherapy efficacy would probably trump both, regardless of clinical background. Or perhaps all three is best.
You’re both right to clarify this, though—I was giving my opinion from my background in clinical/academic psychiatry and so I talk about it a lot! When I mention the field of study etc., I meant mental health research more broadly, since it depends on HLI’s aims/scope what the specific area would be.
[Edit—Sorry, I’ve realised my lack of digging into the background of HLI members/contributors to this research could render the above highly offensive if there are individuals from this field on staff, and also make me appear extremely arrogant. For clarity, it’s possible all of my concerns were actually fully rationalised, deliberate choices by HLI that I’ve not understood from my quick sense-check, or that I disagree with but which are still valid.
[However, my impression from the work, in particular the design and methodology, is that there is a lack of psychiatric and/or psychotherapy knowledge (given the questions I had from a clinical perspective), and a lack of confidence in systematic review and meta-analysis, given how far this deviates from Cochrane/PRISMA guidance, which I was trying to explain in more accessible terms in my comment without being exhaustive. It’s possible contributors to this work did have experience in these areas but were not represented in the write-up, or were not involved at the appropriate times in the work, etc. I’m not going to seek out whether or not that is the case, as I think it would make this personal given the size of the organisation, and I’m worried that if I check I might find a psychotherapy professor on staff I’ve now crossed (jk ;-)).
[It’s interesting to me either way, as both seem like problems—HLI not identifying that they lacked the appropriate skills to conduct this research, or seemingly not employing those with the relevant skills appropriately to conduct or communicate it—and it has relevance outside of this particular meta-analysis when considering further outputs from HLI, or EA’s evaluation of orgs. In any case, peer review offers reassurance to the wider EA community that external subject-matter expertise has been consulted in whatever field of interest (with the additional benefit of shutting people like me down very quickly), and provides an opportunity for better research if deficits identified through peer review suggest skills need to be reallocated or additional skills sought in order to meet a good standard.]