I would like to push back slightly on your second point: Secondly, isn’t it a massive problem that you only look at the 27% that completed the program when presenting results?
By restricting to the people who completed the program, we get to understand the effect that the program itself has. This is important for understanding its therapeutic value.
Retention is also important—it is usually the biggest challenge for online or self-help mental health interventions, and it is practically a given that many people will not complete the course of treatment. 27% tells us a lot about how “sticky” the program was. It lies between the typical retention rates of pure self-help interventions and face-to-face therapy, as we would expect for an in-between intervention like this.
More important than effect size and retention, I would argue, is the top-line cost-effectiveness, e.g. depression averted per $1,000 or some similar metric. We can estimate this from the retention rate, the effect size, and the cost per treatment.
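For concreteness, here is a minimal back-of-the-envelope sketch in Python of how such an estimate could be put together. Apart from the 27% retention figure, every number, as well as the function name and the choice of metric, is hypothetical and not from the post.

```python
# A rough sketch, not the OP's method: scale the completer-only effect by the
# retention rate (so dropouts count as 0), then divide by the cost per enrollee.

def cases_averted_per_1000_usd(retention_rate, effect_per_completer, cost_per_enrollee_usd):
    """Back-of-the-envelope 'depression averted per $1,000' estimate."""
    effect_per_enrollee = retention_rate * effect_per_completer  # dropouts counted as 0
    return 1000 * effect_per_enrollee / cost_per_enrollee_usd

# Hypothetical inputs: 27% retention, 0.5 cases of depression averted per completer,
# and $20 of program cost per enrolled participant.
print(cases_averted_per_1000_usd(0.27, 0.5, 20.0))  # ≈ 6.75 cases averted per $1,000
```

The simplification doing the work here is that dropouts are counted as contributing zero effect, which is exactly the assumption discussed further down the thread.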
By restricting to the people who completed the program, we get to understand the effect that the program itself has. This is important for understanding its therapeutic value.
I disagree with this. If this were a biomedical intervention where we gave a pill regimen, and two-thirds of the participants dropped out of the evaluation before the end because the pills had no effect (or had negative side effects, for that matter), it would not be right to evaluate the effect of the pills by looking only at the remaining third that stuck with it. That said, I do agree that it’s impressive and relevant that 27% complete the treatment, and that this is evidence of its relative effectiveness given the norm for such programmes.
I also wholeheartedly agree that the topline cost-effectiveness is what matters in the end.
The vast majority of psychotherapy drop-out happens between session 1 and 2. You’d expect people to give it at least two sessions before concluding their symptoms aren’t reducing fast enough. I think you’re attributing far too large a proportion of drop-out to ineffectiveness.
This is fair; we don’t know why people drop out. But it seems much more plausible to me that looking only at the completers, with no control group, biases the estimate heavily in favor of the intervention.
I could spin the opposite story, of course: it works so well that people drop out early because they are cured, and we never hear from them. My gut feeling is that this is unlikely to balance out, but again, we don’t know, and I contend this is a big problem. And I don’t think it’s the kind of issue you can hand-wave away and then proceed to casually present the results for completers as if they represent the effect of the program as a whole. (To be clear, this post does not claim this, but I think it might easily be read like this by a naive reader.)
There are all sorts of other stories you could spin as well. For example, have the completers recently solved some other issue, e.g. gotten a job or resolved a health problem? Are they at the tail end of a typical depression peak? Are the completers in general higher in conscientiousness and thus more likely to resolve their issues on their own regardless of the programme? Given the information presented here, we just don’t know.
Qualitative interviews with the completers only get you so far: people are terrible at attributing cause and effect, and that’s before factoring in the social pressure to report positive results in an interview. It’s not no evidence, but it is again biased in favor of the intervention.
Completers are a highly selected subset of the participants, and while I appreciate that in these sorts of programmes you have to make some judgement calls given the very high drop-out rate, I still think it is a big problem.
The best meta-analysis for deterioration (i.e. negative effects) rates of guided self-help (k = 18, N = 2,079) found that deterioration was lower in the intervention condition, although they did find a moderating effect where participants with low education didn’t see this decrease in deterioration rates (but nor did they see an increase)[1].
So, on balance, I think it’s very unlikely that any of the dropped-out participants were worse off for having tried the programme, especially since the counterfactual in low-income countries is almost always no treatment. And given that your interest is top-line cost-effectiveness, only counting completed participants for effect-size estimates likely underestimates cost-effectiveness if anything, since churned participants would be counted as 0.
[1] Ebert, D. D., et al. (2016). Does Internet-based guided self-help for depression cause harm? An individual participant data meta-analysis on deterioration rates and its moderators in randomized controlled trials. Psychological Medicine, 46, 2679–2693.
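As a minimal illustration of that last point, here is a hedged sketch with made-up numbers (the 27% retention aside, the effect sizes are purely illustrative PHQ-9 point reductions): so long as the dropouts’ true average effect is at least zero, assigning them zero gives a lower-bound estimate per enrollee.

```python
# Illustrative only: average effect per enrollee under different assumptions
# about the (unobserved) effect among dropouts.

def effect_per_enrollee(retention, effect_completers, effect_dropouts=0.0):
    """Weighted average effect across completers and dropouts."""
    return retention * effect_completers + (1 - retention) * effect_dropouts

# Made-up numbers: 27% retention, a 5-point PHQ-9 reduction among completers.
conservative = effect_per_enrollee(0.27, 5.0)                      # dropouts assumed 0
optimistic = effect_per_enrollee(0.27, 5.0, effect_dropouts=1.0)   # slightly positive
print(conservative, optimistic)  # ≈ 1.35 vs ≈ 2.08 -> assuming 0 gives the lower estimate
```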
Yes, this makes sense if I understand you correctly. If we set the effect size to 0 for all the dropouts, while having reasonable grounds for thinking it might be slightly positive, this would lead us to underestimate top-line cost-effectiveness.
I’m mostly reacting to the choice of presenting the results for the completer subgroup, which might be conflated with all participants in the program. Even the OP themselves seem to mix this up in the text.
Context: To offer a few points of comparison, two studies of therapy-driven programs found that 46% and 57.5% of participants experienced reductions of 50% or more, compared to our result of 72%. For the original version of Step-by-Step, it was 37.1%. There was an average PHQ-9 reduction of 6 points compared to our result of 10 points.
As far as I can tell, they are talking about completers in this paragraph, not participants. @RachelAbbott could you clarify this?
When reading the introduction again I think it’s pretty balanced now (possibly because it was updated in response to the concerns). Again, thank you for being so receptive to feedback @RachelAbbott!