Thanks for this! Useful to get some insight into the FP thought process here.
The effect sizes observed are very large, but it’s important to place in the context of StrongMinds’ work with severely traumatized populations. Incoming PHQ-9 scores are very, very high, so I think … 2) I’m not sure that our general priors about the low effectiveness of therapeutic interventions are likely to be well-calibrated here.
(emphasis added)
Minor nitpick (I haven’t personally read FP’s analysis / work on this): Appendix C (pg 31) details the recruitment process, where they teach locals about what depression is prior to recruitment. The groups they sample from are those engaging in some form of livelihood / microfinance programme, such as hairdressers. Other groups include churches and people in public health clinic waiting areas. It’s not clear to me based on that description that we should take at face value that the reason for very, very high incoming PHQ-9 scores is that these groups are “severely traumatised” (though it’s clearly a possibility!)
RE: priors about low effectiveness of therapeutic interventions—if the group is severely traumatised, then while I agree this might make us feel less skeptical about the astounding effect size, it should also make us more skeptical about the high success rates, unless we have reason to believe that severe depression in severely traumatised populations in this context is easier to treat than moderate / mild depression.
Thank you for linking to that appendix describing the recruitment process. Could the initial high scores be driven by demand effects from SM recruiters describing depression symptoms and then administering the PHQ-9 questionnaire? This process of SM recruiters describing symptoms to participants before administering the tests seems reminiscent of old social psychology experiments (e.g. power posing being driven in part by demand effects).
No worries! Yeah, I think that’s definitely plausible, as is something like this (“People in targeted communities often incorrectly believe that StrongMinds will provide them with cash or material goods and may therefore provide misleading responses when being diagnosed”). See this comment for another perspective.
I think the main point I was making is just that it’s unclear to me that high PHQ-9 scores in this context necessarily indicate a history of severe trauma etc.
While StrongMinds runs a programme that explicitly targets refugees, who’re presumably much more likely to be traumatized, this made up less than 8% of their budget in 2019.
However, some studies seem to find very high rates of depression prevalence in Uganda (one non-representative meta-analysis found a prevalence of 30%). If a rate like this did characterise the general population, then I wouldn’t be surprised that the communities they work in (which are typically poorer / rural / many are in Northern Uganda) have very high incoming PHQ scores for reasons genuinely related to high psychological distress.
Whether they are a hairdresser or an entrepreneur living in this context seems like it could be pulling on our weakness for the conjunction fallacy. I.e., it seems less likely that someone has a [insert normal sounding job] and trauma while living in an ex-warzone than what we’d guess if we only knew that someone was living in an ex-warzone.
Oh that’s interesting RE: refugees! I wonder what SM results are in that group—do you know much about this?
Iirc, the conjunction fallacy is something like:
For the following list of traits / attributes, is it more likely that “Jane Doe is a librarian” or “Jane Doe is a librarian + a feminist”? And it’s illogical to pick the latter because it’s a perfect subset of the former, despite it forming a more coherent story for system 1.
But in this case, using the conjunction fallacy as a defence is like saying “I’m going to recruit from the ‘librarian + feminist’ subset for my study, and this is equivalent to sampling all librarians”, which doesn’t make sense to me. Clearly there might be something about being both a librarian + feminist that makes you different to the population of librarians, even if it’s more likely for any given person to be a librarian than a ‘librarian + feminist’ by definition.
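One way to write the distinction down, as a rough sketch: the symbols are illustrative and not from the thread. Let L be “is a librarian”, F be “is a feminist”, and Y be whatever outcome the study measures. The conjunction rule only constrains how probable the subset is; it says nothing about whether the subset looks like the wider group on Y.

```latex
% Conjunction rule: the subset can never be more probable than the set containing it.
P(L \cap F) \le P(L)

% Sampling claim: this is a separate question, and in general the two need not be equal.
\mathbb{E}[\,Y \mid L \cap F\,] \;\ne\; \mathbb{E}[\,Y \mid L\,] \quad \text{(in general)}
```

So avoiding the conjunction fallacy tells us the first line, but whether recruiting from the subset is representative depends on the second.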
I might be totally wrong and misunderstanding this though! But also to be clear, I’m not actually suggesting that just because someone’s a hairdresser or a churchgoer they can’t have a history of severe trauma. I’m saying when Matt says “The effect sizes observed are very large, but it’s important to place in the context of StrongMinds’ work with severely traumatized populations”, I’m interpreting this to mean that due to the population having a history of severe trauma, we should expect larger effect sizes than in other populations with similar PHQ-9 scores. But clearly there are different explanations for high initial PHQ-9 scores that don’t involve severe trauma, so it’s not clear that I should assume there’s a history of severe trauma based on just the PHQ-9 score or the recruitment methodology.
The StrongMinds pre-post data I have access to (2019) indicates that the Refugee programme has a pre-post mean difference in PHQ-9 of 15.6, higher than the core programme (13.8) or their peer / volunteer-delivered and youth programmes (13.1 and 12). It also started with the highest baseline PHQ-9: 18.1, compared to 15.8 in the core programme.
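For anyone who wants to put those two programmes side by side, here is a quick back-of-the-envelope sketch. It uses only the four figures quoted above and assumes (this is my assumption, not something the data states) that each baseline and mean difference refer to the same participants, so that baseline minus mean difference approximates the endline mean. The function and variable names are made up for illustration.

```python
# Figures quoted in the comment above (StrongMinds 2019 pre-post data).
# Only the refugee and core programmes have a quoted baseline.
programmes = {
    "refugee": {"baseline": 18.1, "mean_drop": 15.6},
    "core": {"baseline": 15.8, "mean_drop": 13.8},
}


def implied_endline(baseline: float, mean_drop: float) -> float:
    """Approximate endline mean, ASSUMING both figures cover the same sample."""
    return round(baseline - mean_drop, 1)


def drop_as_share_of_baseline(baseline: float, mean_drop: float) -> float:
    """Mean drop expressed as a fraction of the baseline score."""
    return round(mean_drop / baseline, 2)


summary = {
    name: {
        "implied_endline": implied_endline(**figures),
        "drop_share": drop_as_share_of_baseline(**figures),
    }
    for name, figures in programmes.items()
}

for name, row in summary.items():
    print(name, row)
```

Under that assumption the absolute drop is larger in the refugee programme, but as a share of baseline the two come out about the same, so the bigger raw difference may mostly reflect the higher starting point.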