[Own views]
0) I don't know what the bar should be for calling something a "cause area" or "EA interest", but I think this bar should be above (e.g.) "promising new drug treatment for bipolar disorder", even though this is unequivocally a good thing. Wherever exactly this bar falls (I don't think it needs to be "as promising as global health"), I don't think psychedelics meet it.
1) My scepticism about the mental health benefits of psychedelics mainly relies on second-order causes for concern, namely:
1.1) There's some weak wisdom of nature prior that blasting one of your neurotransmitter pathways for a short period is unlikely to be helpful. This objection is pretty weak, given existing psychiatric drugs are similarly crude (although one of their advantages by the lights of this consideration is they generally didn't come to human attention by previous recreational use).
1.2) I get more sceptical as the number of (fairly independent) "upsides" of a proposed intervention increases. The OP notes psychedelics could help with anxiety and depression and OCD and addiction and PTSD, which looks remarkably wide-ranging and gives suspicion of a "cure looking for a disease". (That they are often mooted as having still other benefits on people without mental health issues, such as improving creativity and empathy, deepens my suspicion.) Likewise, a cause that is proposed to be promising on long-termism and its negation pings suspicious convergence worries.
1.3) (Owed to Scott Alexander's recent post). The psychedelic literature mainly comprises small studies generally conducted by "true believers" in psychedelics and often (but not always) on self-selected and motivated participants. This seems well within the territory of scientific work vulnerable to replication crises.
1.4) Thus my impression is that although I wouldn't be shocked if psychedelics are somewhat beneficial, I'd expect them to regress at least as far down as the efficacies observed in existing psychopharmacology, probably worse, and plausibly to zero. Adding to the armamentarium of therapy for mental illness (in expectation) is worthwhile, but not enough for a big slice of EA opinion: it being a promising candidate for further exploration relies on "neartermism" and (conditional on this) the belief that mental health is similarly promising to standard global health interventions on NTDs etc.
2) On the "longtermism" side of the argument, I agree it would be good (and good enough to be an important "cause") if there were ways of further enhancing human capital. (I bracket here the proposed mental health benefits, as my scepticism above applies even more strongly to the case that psychedelics are promising based on their benefits to EA community members' mental health alone).
My impression is most of the story for "how do some people perform so well?" will be a mix of traits/"unmodifiable" factors (e.g. intelligence, personality dispositions, propitious upbringing); very boring advice (e.g. "Sleep enough", "exercise regularly"); and happenstance/good fortune. I'd guess there will be some residual variance left on the table after these have taken the lion's share, and these scraps would be important to take. Yet I suspect a lot of this will be pretty idiographic/reducible to boring advice (e.g. anecdotally, novelists have their own peculiar habits for writing: IIRC Nabokov used index cards, Pullman has a writing shed, Gaiman a "novel writing pen" - maybe "having a ritual for dedicated work" matters, but which one is a matter of taste).
The evidence for psychedelic "enhancement" is even thinner than for psychedelic therapy, and labours under a more adverse prior. I agree the case for psychedelics here is comparable to CFAR/Paradigm/rationality training, but I would rule both out, not in.
3) I agree with agdfoster that psychedelics have reputational costs. This "bad rap" looks unfair to me (notwithstanding the above, I'm confident that an "MDMA habit" is much better for you than an alcohol, smoking, extreme sports, or social media one, none of which attract similar opprobrium), but it is decision-relevant all the same. If the upside were big enough, these costs would be worth paying, but I don't think it is.
1.1) There's some weak wisdom of nature prior that blasting one of your neurotransmitter pathways for a short period is unlikely to be helpful.
The data doesn't support this, and generally suggests that 1-3 psychedelic experiences can have beneficial effects lasting 6 months or longer. See for example Carhart-Harris et al. 2018:
"Although limited conclusions can be drawn about treatment efficacy from open-label trials, tolerability was good, effect sizes large and symptom improvements appeared rapidly after just two psilocybin treatment sessions and remained significant 6 months post-treatment in a treatment-resistant cohort."
Griffiths et al. 2016:
"High-dose psilocybin produced large decreases in clinician- and self-rated measures of depressed mood and anxiety, along with increases in quality of life, life meaning, and optimism, and decreases in death anxiety. At 6-month follow-up, these changes were sustained, with about 80% of participants continuing to show clinically significant decreases in depressed mood and anxiety."
Johnson et al. 2017:
"All 15 participants completed a 12-month follow-up, and 12 (80%) returned for a long-term (≥16 months) follow-up, with a mean interval of 30 months (range = 16-57 months) between target-quit date (i.e., first psilocybin session) and long-term follow-up. At 12-month follow-up, 10 participants (67%) were confirmed as smoking abstinent. At long-term follow-up, nine participants (60%) were confirmed as smoking abstinent."
I get more sceptical as the number of (fairly independent) "upsides" of a proposed intervention increases. The OP notes psychedelics could help with anxiety and depression and OCD and addiction and PTSD, which looks remarkably wide-ranging and gives suspicion of a "cure looking for a disease".
I would push back against the idea that these upsides are as independent as they may seem. Depression and anxiety are often comorbid (Hirschfeld 2001) and often comorbid with addiction (Quello 2005), OCD (Tukel 2002) and eating disorders (Marucci 2018). It seems that similar neurological states and cognitive processes underlie these mental disorders, which is why psychedelics can effectively treat them all.
Carhart-Harris et al 2017, for example, suggest "connectedness" as the mechanism:
"A sense of disconnection is a feature of many major psychiatric disorders, particularly depression, and a sense of connection or connectedness is considered a key mediator of psychological well-being, as well as a factor underlying recovery of mental health. One of the most curious aspects of the growing literature on the therapeutic potential of psychedelics is the seeming general nature of their therapeutic applicability, i.e. they have shown promise not just for the treatment of depression but for addictions, anxiety and obsessive-compulsive disorder. This raises the question of whether psychedelic therapy targets a core factor underlying mental health. We believe that it does, and that connectedness is the key."
A secondary point here is that substances with different pharmacological and phenomenological effects are all grouped under the term "psychedelic". MDMA, for example, works and feels differently from ketamine, which works and feels differently from "classical" psychedelics like LSD, psilocybin, and DMT. So while it may seem unlikely that psychedelics (understood as one uniform thing) could have a range of benefits, it makes more sense when psychedelics are understood as a category that includes different substances.
1.1) There's some weak wisdom of nature prior that blasting one of your neurotransmitter pathways for a short period is unlikely to be helpful.
I think that the wisdom of nature prior would say that we shouldn't expect blasting a neurotransmitter pathway to be evolutionarily adaptive on average. If we know why something wouldn't be adaptive, then it seems like it doesn't apply. This prior would argue against claims like "X increases human capital", but not claims like "X increases altruism", since there's a clear mechanism whereby being much more altruistic than normal is bad for inclusive genetic fitness.
1.2) I get more sceptical as the number of (fairly independent) "upsides" of a proposed intervention increases. The OP notes psychedelics could help with anxiety and depression and OCD and addiction and PTSD, which looks remarkably wide-ranging and gives suspicion of a "cure looking for a disease".
I would worry about this more if the OP were referring to a specific intervention rather than a class of interventions. I think that the concern about being good on longterm and shortterm perspectives is reasonable, though there is a proposed mechanism (healing emotional blocks) that is related to both.
1.4) Thus my impression is that although I wouldn't be shocked if psychedelics are somewhat beneficial, I'd expect them to regress at least as far down as the efficacies observed in existing psychopharmacology, probably worse, and plausibly to zero.
Normal drug discovery seems to be based on coming up with hypotheses, then testing many chemicals to find statistically significant effects. In contrast, these trials are investigating chemicals that people are already taking for their effects. Running many trials then continuing the investigations that find significance is a good way to generate false positives, but that doesn't seem to be the case here, and I would be surprised to find zero effect (as opposed to shorter or different effects) if it were investigated more thoroughly.
2) On the "longtermism" side of the argument, I agree it would be good (and good enough to be an important "cause") if there were ways of further enhancing human capital.
...
My impression is most of the story for "how do some people perform so well?" will be a mix of traits/"unmodifiable" factors (e.g. intelligence, personality dispositions, propitious upbringing); very boring advice (e.g. "Sleep enough", "exercise regularly"); and happenstance/good fortune. I'd guess there will be some residual variance left on the table after these have taken the lion's share, and these scraps would be important to take. Yet I suspect a lot of this will be pretty idiographic/reducible to boring advice.
I also think that improving human capital is important, and am not convinced that this is a clear and unambiguous winner for that goal. I'm curious about what evidence would make you more optimistic about the possibility of large improvements to human capital.
1.3) (Owed to Scott Alexander's recent post). The psychedelic literature mainly comprises small studies generally conducted by "true believers" in psychedelics and often (but not always) on self-selected and motivated participants. This seems well within the territory of scientific work vulnerable to replication crises.
I think small studies are also more vulnerable to publication bias.
On the flip side, it may be possible that the "true believers" actually are on to something, but they have a hard time formalizing their procedure into something that can be replicated on a massive scale. So if larger studies fail to replicate the results from the small studies, this may be the reason why.
On the flip side, it may be possible that the "true believers" actually are on to something, but they have a hard time formalizing their procedure into something that can be replicated on a massive scale. So if larger studies fail to replicate the results from the small studies, this may be the reason why.
Do you have any examples of this actually happening? I have seen it as an excuse for things that never pan out many times, but I don't recall an instance of it actually delivering. E.g. in Many Labs 2 and other mass reproducibility efforts, you don't find a minority of experimenters with a "knack" who get the effect but can't pass it on to others.
I don't have data either way, but "knacks" for psychotherapy feel more plausible to me than "knacks" for producing the effects in Many Labs 2 (just skimming over the list of effects here). Like, the strongest version of this claim is that no one is more skilled than anyone else at anything, which seems obviously false.
Suppose we conduct a study of the Feynman problem-solving algorithm: "1. Write down the problem. 2. Think real hard. 3. Write down the solution." An n=1 study of Richard Feynman finds the algorithm works great, but it fails to replicate on a larger sample. What is your conclusion: that the n=1 result was spurious, or that Feynman has useful things to teach us but the 3-step algorithm didn't capture them?
I haven't read enough studies on psychedelics to know how much room there is in the typical procedure for a skilled therapist to make a difference though.
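Wouldn't 1.2), 1.3), and 1.4) point towards funding more psychedelic research?
(To prove or disprove the benefits found in the early-stage trials?)
It does, but although that's enough to make it worthwhile on the margin of existing medical research, that is not enough to make it a priority for the EA community.
Are you saying that EA shouldn't fund confirmatory research, in general?
Or are you saying that there's something in particular about this research, such that EA shouldn't fund confirmatory research in this case?
The latter. EA shouldn't fund most research, but whether it is confirmatory or not is irrelevant. Psychedelics shouldn't make the cut if (as I argue above) we expect a lot of failure to replicate and regression, and the true effect to be unexceptional in the context of existing mental health treatment.
Got it, thanks!
I feel confused about why you think psychedelics shouldn't make the cut. The present state of research (several small-n studies finding very large effect sizes) seems consistent with both: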
The world in which psychedelics are in fact a promising intervention
The world in which the current promise of psychedelics is an artifact of our academic knowledge-generating process
It seems like the only way to know which world we're in is to do confirmatory research.
That sounds a bit like the argument "either this claim is right, or it's wrong, so there's a 50% chance it's true."
One needs to attend to base rates. Our bad academic knowledge-generating process throws up many, many illusory interventions with purported massive effects for each amazing intervention we find, and the amazing interventions that we do find disproportionately were easier to show (with the naked eye, visible macro-correlations, consistent effects with well-powered studies, etc).
People are making similar arguments about cold fusion, psychic powers (of many different varieties), many environmental and nutritional contaminants, brain training, carbon dioxide levels, diets, polyphasic sleep, assorted purported nootropics, many psychological/parenting/educational interventions, etc.
Testing how your prior applies across a spectrum of other cases (past and present) is helpful for model checking. If psychedelics are a promising EA cause, how many of those others qualify? If many do, then any one isn't so individually special, although one might want to have a systematic program of rigorous testing of all the wacky claims of large impact that can be tested cheaply.
If not, then it would be good to explain what exactly makes psychedelics different from the rest.
I think the case for psychedelics the OP has made doesn't pass this standard yet, so doesn't meet the standard for an EA cause area.
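To make the base-rate point above a bit more concrete, here is a minimal sketch of the usual Bayes' rule calculation for "what share of positive findings reflect a real effect?". The function name and all three inputs (`prior`, `power`, `false_positive_rate`) are illustrative assumptions, not figures from this thread or from the psychedelic literature.

```python
def share_of_real_effects(prior, power, false_positive_rate):
    """Fraction of positive findings that reflect a real effect (Bayes' rule).

    prior: assumed share of tested interventions that truly work
    power: assumed chance a study detects a real effect
    false_positive_rate: assumed chance a study "finds" an effect that is not
        there (nominally 0.05, plausibly higher with p-hacking or publication bias)
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs only: vary the assumed base rate and the assumed
# false-positive rate to see how much the answer moves.
for prior in (0.01, 0.1, 0.3):
    for fpr in (0.05, 0.3):
        share = share_of_real_effects(prior, power=0.8, false_positive_rate=fpr)
        print(f"prior={prior:.2f} fpr={fpr:.2f} -> share real={share:.2f}")
```

The only point of the sketch is that the same positive literature supports very different conclusions depending on the assumed base rate and false-positive rate, which is why the choice of reference class matters so much in this disagreement.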
From what I understand, effect size is one of the better ways to predict whether a study will replicate. For example, this paper found that 77% of replication effect sizes reported were within a 95% prediction interval based on the original effect size.
As a spot check, you say that brain training has massive purported effects. I looked at the research page of Lumosity, a company which sells brain training software. I expect their estimates of the effectiveness of brain training to be among the most optimistic, but their highlighted effect size is only d = 0.255.
A caveat is that if an effect size seems implausibly large, it might have arisen due to methodological error. (The one brain training study I found with a large effect size has been subject to methodological criticism.) Here is a blog post by Daniel Lakens where he discusses a study which found that judges hand out much harsher sentences before lunch:
If hunger had an effect on our mental resources of this magnitude, our society would fall into minor chaos every day at 11:45. Or at the very least, our society would have organized itself around this incredibly strong effect of mental depletion... we would stop teaching in the time before lunch, doctors would not schedule surgery, and driving before lunch would be illegal.
However, I think psychedelic drugs arguably do pass this test. During the 60s, before they became illegal, a lot of people kind of were talking about how society would reorganize itself around them. And forget about performing surgery or driving while you are tripping.
The way I see it, if you want to argue that an effect isn't real, there are two ways to do it. You can argue that the supposed effect arose through random chance/p-hacking/etc., or you can argue that it arose through methodological error.
The random chance argument is harder to make if the studies have large effect sizes. If the true effect is 0, it's unlikely we'll observe a large effect by chance. If researchers are trying to publish papers based on noise, you'd expect p-values to cluster just below the p < 0.05 threshold (see p-curve analysis)... they're essentially going to publish the smallest effect size they can get away with.
The methodological error argument could be valid for a large effect size, but if this is the case, confirmatory research is not necessarily going to help, because confirmatory research could have the same issue. So at that point your time is best spent trying to pinpoint the actual methodological flaw.
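As a rough check on the random-chance argument above, one can simulate two-arm studies where the true effect is exactly zero and count how often the observed standardised effect (Cohen's d) still comes out "large". This is only a sketch under stated assumptions: normally distributed outcomes, equal arm sizes, and d >= 0.8 as the cutoff for "large". The function names and the sample sizes are illustrative, not taken from any of the studies discussed.

```python
import random
import statistics


def cohens_d(a, b):
    """Standardised mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def share_of_large_null_effects(n_per_arm, threshold=0.8, trials=2000, seed=0):
    """Share of simulated null studies whose |d| reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Both arms are drawn from the same distribution: the true effect is 0.
        treated = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        if abs(cohens_d(treated, control)) >= threshold:
            hits += 1
    return hits / trials


for n in (10, 50):
    print(n, share_of_large_null_effects(n))
```

Under these assumptions, how unlikely a large chance effect is depends a lot on the sample size: it is rare with 50 per arm and noticeably less rare with 10 per arm, so the argument is strongest when the small studies are not too small.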
The random chance argument is harder to make if the studies have large effect sizes. If the true effect is 0, it's unlikely we'll observe a large effect by chance.
This is exactly what p-values are designed for, so you are probably better off looking at p-values rather than effect size if that's the scenario you're trying to avoid.
I suppose you could imagine that p-values are always going to be just around 0.05, and that for a real and large effect size people use a smaller sample because that's all that's necessary to get p < 0.05, but this feels less likely to me. I would expect that with a real, large effect you very quickly get p < 0.01, and researchers would in fact do that.
(I don't necessarily disagree with the rest of your comment, I'm more unsure on the other points.)
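A quick back-of-the-envelope version of the "you very quickly get p < 0.01" point: for a two-arm study with equal arms, an observed standardised effect d corresponds to a test statistic of roughly d * sqrt(n / 2), where n is the number of participants per arm. The sketch below uses a normal approximation rather than an exact t-test, and the effect size and sample sizes are made-up illustrations, not numbers from any trial mentioned here.

```python
import math


def approx_two_sided_p(d, n_per_arm):
    """Approximate two-sided p-value for an observed effect size d.

    Assumes two equal arms and treats the test statistic as standard normal,
    which slightly understates p for very small samples.
    """
    z = abs(d) * math.sqrt(n_per_arm / 2.0)
    # For a standard normal Z, P(|Z| >= z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


for n in (13, 26, 52):
    print(n, round(approx_two_sided_p(0.8, n), 4))
```

On this approximation, an observed d of 0.8 sits just under p = 0.05 with 13 participants per arm, and doubling the sample to 26 per arm already puts it below p = 0.01.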
This is exactly what p-values are designed for, so you are probably better off looking at p-values rather than effect size if that's the scenario you're trying to avoid.
This comment is a wonderful crystallisation of the "defensive statistics" of Andrew Gelman, James Heathers and other great epistemic policemen. Thanks!
That sounds a bit like the argument "either this claim is right, or it's wrong, so there's a 50% chance it's true."
I'm not claiming this. I'm claiming that given the research to date, more psychedelic research would be very impactful in expectation. (I'm at like 30-40% that the beneficial effects are real.)
If not, then it would be good to explain what exactly makes psychedelics different from the rest.
I haven't read the literatures for all the examples you gave. For psychic powers & cold fusion, my impression is that confirmatory research was done and the initial results didn't replicate.
So one difference is that the main benefits of psychedelic therapy haven't yet failed to replicate.
Scott referred to some failures to replicate in his post.
Scott referred to one failure to replicate, for a finding that a psychedelic experience increased trait openness. This isn't one of the benefits cited by the OP.
[Erritzoe et al. 2018 found that psilocybin increased Openness in a population of depressed people, which SSRIs do not do.] Maclean et al. 2011, an analysis of psilocybin given to healthy-typed people, also found a persisting increase in Openness. However, Griffiths et al. 2017, also psilocybin for healthy-typed people, found no persisting increase in Openness. So maybe psilocybin causes greater Openness but only sometimes? As always more research is needed.
Also:
Why would increasing Big-Five Openness matter? Erritzoe [et al. 2018] engages with that too:
"... the facets Openness to Actions and to Values significantly increased in our study. The facet Openness to Actions pertains to not being set in one's way, and instead, being ready to try and do new things. Openness to Values is about valuing permissiveness, open-mindedness, and tolerance. These two facets therefore reflect an active approach on the part of the individual to try new ways of doing things and consider other peoples' values and/or worldviews."
And:
"It is well established that trait Openness correlates reliably with liberal political perspective... The apparent link between Openness and a generally liberal worldview may be attributed to the notion that people who are more open to new experiences are also less personally constrained by convention and that this freedom of attitude extends into every aspect of a person's life, including their political orientation."
Right, so you would want to show that 30-40% of interventions with similar literatures pan out.
I think we have a disagreement about what the appropriate reference class here is.
The reference class I'm using is something like "results which are supported by 2-3 small-n studies with large effect sizes."
I'd expect roughly 30-40% of such results to hold up after confirmatory research.
Somewhat related: 62% of results assessed by Camerer et al. 2018 replicated.
It's a bit complicated to think about replication re: psychedelics because the intervention is showing promise as a treatment for multiple indications (there are a couple of studies showing large effect sizes for depression, a couple showing large effect sizes for anxiety, and a couple showing large effect sizes for addictive disorders).
Could you say a little more about what reference class you're using here?
[Own views]
0) I donât know what the bar should be for calling something a âcause areaâ or âEA interestâ should be, but I think this bar should be above (e.g.) âpromising new drug treatment for bipolar disorderâ, even though this is unequivocally a good thing. Wherever exactly this bar falls (I donât think it needs to be âas promising as global healthâ), I donât think psychedelics meet it.
1) My scepticism on the mental health benefits of psychedelics mainly rely on second-order causes for concern, namely:
1.1) Thereâs some weak wisdom of nature prior that blasting one of your neurotransmitter pathways for a short period is unlikely to be helpful. This objection is pretty weak, given existing psychiatric drugs are similarly crude (although one of their advantages by the lights of this consideration is they generally didnât come to human attention by previous recreational use).
1.2) I get more sceptical as the number of (fairly independent) âupsidesâ of a proposed intervention increases. The OP notes psychedelics could help with anxiety and depression and OCD and addiction and PTSD, which looks remarkably wide-ranging and gives suspicion of a âcure looking for a diseaseâ. (That they are often mooted as having still other benefits on people without mental health issues such as improving creativity and empathy deepens my suspicion). Likewise, a cause that is proposed to be promising on long-termism and its negation pings suspicious convergence worries.
1.3) (Owed to Scott Alexanderâs recent post). The psychedelic literature mainly comprises small studies generally conducted by âtrue believersâ in psychedelics and often (but not always) on self-selected and motivated participants. This seems well within the territory of scientific work vulnerable to replication crises.
1.4) Thus my impression is that although I wouldnât be shocked if psychedelics are somewhat beneficial, Iâd expect them to regress at least as far down to efficicacies observed in existing psychopharmacology, probably worse, and plausibly to zero. Adding to the armamentarium of therapy for mental illness (in expectation) is worthwhile, but not enough for a big slice of EA opinion: it being a promising candidate for further exploration relies on âneartermismâ and (conditional on this) the belief that mental health is similarly promising to standard global health interventions on NTDs etc.
2) On the âlongtermismâ side of the argument, I agree it would be goodâand good enough to be an important âcauseâ - if there were ways of further enhancing human capital. (I bracket here the proposed mental health benefits, as my scepticism above applies even more strongly to the case that psychedelics are promising based on their benefits to EA community membersâ mental health alone).
My impression is most of the story for âhow do some people perform so well?â will be a mix of traits/ââunmodifiableâ factors (e.g. intelligence, personality dispositions, propitious upbringing); very boring advice (e.g. âSleep enoughâ, âexercise regularlyâ); and happenstance/âgood fortune. Iâd guess there will be some residual variance left on the table after these have taken the lionâs share, and these scraps would be important to take. Yet I suspect a lot of this will be pretty idiographic/âreducible to boring advice (e.g. anecdotally, novelists have their own peculiar habits for writing: IIRC Nabokov used index cards, Pullman has a writing shed, Gaiman a ânovel writing penâ - maybe âhaving a ritual for dedicated workâ matters, but which one is a matter of taste).
The evidence for psychedelic âenhancementâ is even thinner than psychedelic therapy, and labours under a more adverse prior. I agree the case for psychedelics here is comparable to CFAR/âParadigm/ârationality training, but I would rule both out, not in.
3) I agree with agdfoster that psychedelics have reputational costs. This âbad rapâ looks unfair to me (notwithstanding the above, Iâm confident that an âMDMA habitâ is much better for you than an alcohol, smoking, extreme sports, or social media one, none of which attract similar opprobrium), but it is decision-relevant all the same. If the upside was big enough, these costs would be worth paying, but I donât think they are.
The data doesnât support this, and generally suggests that 1-3 psychedelic experiences can have beneficial effects lasting 6 months or longer. See for example Carhart-Harris et al. 2018:
âAlthough limited conclusions can be drawn about treatment efficacy from open-label trials, tolerability was good, effect sizes large and symptom improvements appeared rapidly after just two psilocybin treatment sessions and remained significant 6 months post-treatment in a treatment-resistant cohort.â
Griffiths et al. 2016:
âHigh-dose psilocybin produced large decreases in clinician- and self-rated measures of depressed mood and anxiety, along with increases in quality of life, life meaning, and optimism, and decreases in death anxiety. At 6-month follow-up, these changes were sustained, with about 80% of participants continuing to show clinically significant decreases in depressed mood and anxiety.â
Johnson et al. 2017:
âAll 15 participants completed a 12-month follow-up, and 12 (80%) returned for a long-term (â„16 months) follow-up, with a mean interval of 30 months (range = 16 â 57 months) between target-quit date (i.e., first psilocybin session) and long-term follow-up. At 12-month follow-up, 10 participants (67%) were confirmed as smoking abstinent. At long-term follow-up, nine participants (60%) were confirmed as smoking abstinent.â
I would push back against the idea that these upsides are as independent as they may seem. Depression and anxiety are often comorbid (Hirschfeld 2001) and often comorbid with addiction (Quello 2005), OCD (Tukel 2002) and eating disorders (Marucci 2018). It seems that similar neurological states and cognitive processes underly these mental disorders, which is why psychedelics can effectively treat them all.
Carhart-Harris et al 2017, for example, suggest âconnectednessâ as the mechanism:
âA sense of disconnection is a feature of many major psychiatric disorders, particularly depression, and a sense of connection or connectedness is considered a key mediator of psychological well-being, as well as a factor underlying recovery of mental health. One of the most curious aspects of the growing literature on the therapeutic potential of psychedelics is the seeming general nature of their therapeutic applicability, i.e. they have shown promise not just for the treatment of depression but for addictions, anxiety and obsessive-compulsive disorder. This raises the question of whether psychedelic therapy targets a core factor underlying mental health. We believe that it does, and that connectedness is the key.â
A secondary point here is that substances with different pharmacological and phenomenological effects are all grouped under the term âpsychedelicâ. MDMA, for example, works and feels differently from ketamine, which works and feels differently from âclassicalâ psychedelics like LSD, psilocybin, and DMT. So while it may seem unlikely that psychedelics (understood as one uniform thing) could have a range of benefits, it makes more sense when psychedelics are understood as a category that includes different substances.
I think that the wisdom of nature prior would say that we shouldnât expect blasting a neurotransmitter pathway to be evolutionarily adaptive on average. If we know why something wouldnât be adaptive, then it seems like it doesnât apply. This prior would argue against claims like âX increases human capitalâ, but not claims like âX increases altruismâ, since thereâs a clear mechanism whereby being much more altruistic than normal is bad for inclusive genetic fitness.
I would worry about this more if the OP were referring to a specific intervention rather than a class of interventions. I think that the concern about being good on longterm and shortterm perspectives is reasonable, though there is a proposed mechanism (healing emotional blocks) that is related to both.
Normal drug discovery seems to be based off of coming up with hypotheses, then testing many chemicals to find statistically significant effects. In contrast, these trials are investigating chemicals that people are already taking for their effects. Running many trials then continuing the investigations that find significance is a good way to generate false positives, but that doesnât seem to be the case here, and I would be surprised to find zero effect (as opposed to shorter or different effects) if it were investigated more thoroughly.
I also think that improving human capital is important, and am not convinced that this is a clear and unambiguous winner for that goal. Iâm curious about what evidence would make you more optimistic about the possibility of large improvements to human capital.
I think small studies are also more vulnerable to publication bias.
On the flip side, it may be possible that the âtrue believersâ actually are on to something, but they have a hard time formalizing their procedure into something that can be replicated on a massive scale. So if larger studies fail to replicate the results from the small studies, this may be the reason why.
Do you have any examples of this actually happening? I have seen it as an excuse for things that never pan out many times, but I donât recall an instance of it actually delivering. E.g. in Many Labs 2 and other mass reproducibility efforts, you donât find a minority of experimenters with a âknackâ who get the effect but canât pass it on to others.
I donât have data either way, but âknacksâ for psychotherapy feel more plausible to me than âknacksâ for producing the effects in Many Labs 2 (just skimming over the list of effects here). Like, the strongest version of this claim is that no one is more skilled than anyone else at anything, which seems obviously false.
Suppose we conduct a study of the Feynman problem-solving algorithm: "1. Write down the problem. 2. Think real hard. 3. Write down the solution." An n=1 study of Richard Feynman finds the algorithm works great, but it fails to replicate on a larger sample. What is your conclusion: that the n=1 result was spurious, or that Feynman has useful things to teach us but the 3-step algorithm didn't capture them?
I haven't read enough studies on psychedelics to know how much room there is in the typical procedure for a skilled therapist to make a difference though.
Wouldnât 1.2), 1.3), and 1.4) point towards funding more psychedelic research?
(To prove or disprove the benefits found in the early-stage trials?)
It does, but although that's enough to make it worthwhile on the margin of existing medical research, that is not enough to make it a priority for the EA community.
Are you saying that EA shouldn't fund confirmatory research, in general?
Or are you saying that there's something in particular about this research, such that EA shouldn't fund confirmatory research in this case?
The latter. EA shouldn't fund most research, but whether it is confirmatory or not is irrelevant. Psychedelics shouldn't make the cut if we expect (as I argue above) a lot of failure to replicate and regression, and the true effect to be unexceptional in the context of existing mental health treatment.
Got it, thanks!
I feel confused about why you think psychedelics shouldn't make the cut. The present state of research (several small-n studies finding very large effect sizes) seems consistent with both:
The world in which psychedelics are in fact a promising intervention
The world in which the current promise of psychedelics is an artifact of our academic knowledge-generating process
It seems like the only way to know which world we're in is to do confirmatory research.
That sounds a bit like the argument "either this claim is right, or it's wrong, so there's a 50% chance it's true."
One needs to attend to base rates. Our bad academic knowledge-generating process throws up many, many illusory interventions with purported massive effects for each amazing intervention we find, and the amazing interventions that we do find disproportionately were easier to show (with the naked eye, visible macro-correlations, consistent effects with well-powered studies, etc).
People are making similar arguments about cold fusion, psychic powers (of many different varieties), many environmental and nutritional contaminants, brain training, carbon dioxide levels, diets, polyphasic sleep, assorted purported nootropics, many psychological/parenting/educational interventions, etc.
Testing how your prior applies across a spectrum of other cases (past and present) is helpful for model checking. If psychedelics are a promising EA cause, how many of those others qualify? If many do, then any one isn't so individually special, although one might want a systematic program of rigorous testing for all the wacky claims of large impact that can be tested cheaply.
If not, then it would be good to explain what exactly makes psychedelics different from the rest.
I think the case for psychedelics the OP has made doesn't pass this standard yet, so doesn't meet the standard for an EA cause area.
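To make the base-rate point concrete, here is a minimal sketch of the standard Bayes calculation for how often a "positive" finding reflects a real effect. The function name and all three input numbers (prior share of real effects, power, and false positive rate) are hypothetical and chosen only for illustration; none of them come from the psychedelic literature or from this thread.

```python
def positive_predictive_value(prior: float, power: float,
                              false_positive_rate: float) -> float:
    """Share of positive findings that correspond to a real effect.

    prior: assumed fraction of tested interventions that truly work
    power: chance a study detects an effect that is real
    false_positive_rate: chance a study "finds" an effect that is not real
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios: (prior, power, false positive rate).
scenarios = {
    "optimistic prior": (0.30, 0.80, 0.05),
    "sceptical prior": (0.05, 0.80, 0.05),
    "sceptical prior, biased literature": (0.05, 0.80, 0.20),
}

for label, (prior, power, fpr) in scenarios.items():
    ppv = positive_predictive_value(prior, power, fpr)
    print(f"{label}: {ppv:.2f}")
```

The disagreement in this thread is largely about which prior and which effective false positive rate belong in that function, so the sketch only shows how sensitive the answer is to those choices.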
From what I understand, effect size is one of the better ways to predict whether a study will replicate. For example, this paper found that 77% of replication effect sizes reported were within a 95% prediction interval based on the original effect size.
As a spot check, you say that brain training has massive purported effects. I looked at the research page of Lumosity, a company which sells brain training software. I expect their estimates of the effectiveness of brain training to be among the most optimistic, but their highlighted effect size is only d = 0.255.
A caveat is that if an effect size seems implausibly large, it might have arisen due to methodological error. (The one brain training study I found with a large effect size has been subject to methodological criticism.) Here is a blog post by Daniel Lakens where he discusses a study which found that judges hand out much harsher sentences before lunch:
However, I think psychedelic drugs arguably do pass this test. During the 60s, before they became illegal, a lot of people kind of were talking about how society would reorganize itself around them. And forget about performing surgery or driving while you are tripping.
The way I see it, if you want to argue that an effect isn't real, there are two ways to do it. You can argue that the supposed effect arose through random chance/p-hacking/etc., or you can argue that it arose through methodological error.
The random chance argument is harder to make if the studies have large effect sizes. If the true effect is 0, it's unlikely we'll observe a large effect by chance. If researchers are trying to publish papers based on noise, you'd expect p-values to cluster just below the p < 0.05 threshold (see p-curve analysis)... they're essentially going to publish the smallest effect size they can get away with.
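The random chance argument can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not an analysis of any real trial: it assumes two equal groups, normally distributed outcomes with unit variance, a true effect of exactly 0, and a hypothetical cutoff of |d| >= 0.8 for a "large" observed effect. It ignores selective publication and methodological error entirely.

```python
import random


def observed_cohens_d(n_per_group: int, true_d: float,
                      rng: random.Random) -> float:
    """Simulate one two-arm study and return the observed Cohen's d."""
    treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    mean_t = sum(treated) / n_per_group
    mean_c = sum(control) / n_per_group
    var_t = sum((x - mean_t) ** 2 for x in treated) / (n_per_group - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (n_per_group - 1)
    pooled_sd = ((var_t + var_c) / 2.0) ** 0.5
    return (mean_t - mean_c) / pooled_sd


def share_of_large_effects(n_per_group: int, true_d: float,
                           threshold: float = 0.8,
                           n_studies: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated studies whose |observed d| reaches the threshold."""
    rng = random.Random(seed)
    hits = sum(
        abs(observed_cohens_d(n_per_group, true_d, rng)) >= threshold
        for _ in range(n_studies)
    )
    return hits / n_studies


# True effect is zero; how often does a "large" effect show up anyway?
for n in (10, 20, 50):
    print(n, share_of_large_effects(n, true_d=0.0))
```

Under these assumptions the share of spuriously large effects falls as the per-group sample size grows, so how much weight a large effect size deserves depends heavily on how small the study was.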
The methodological error argument could be valid for a large effect size, but if this is the case, confirmatory research is not necessarily going to help, because confirmatory research could have the same issue. So at that point your time is best spent trying to pinpoint the actual methodological flaw.
This is exactly what p-values are designed for, so you are probably better off looking at p-values rather than effect size if that's the scenario you're trying to avoid.
I suppose you could imagine that p-values are always going to be just around 0.05, and that for a real and large effect size people use a smaller sample because that's all that's necessary to get p < 0.05, but this feels less likely to me. I would expect that with a real, large effect you very quickly get p < 0.01, and researchers would in fact do that.
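How quickly a real, large effect reaches p < 0.01 can also be sketched by simulation. This is a rough illustration under assumptions that are mine rather than the commenter's: a true standardized effect of 0.8, normally distributed outcomes, equal group sizes, and a normal approximation to the two-sample test (a t-test would give slightly larger p-values at small n).

```python
import random
from statistics import NormalDist


def approx_p_value(treated: list, control: list) -> float:
    """Two-sided p-value for a difference in means, normal approximation."""
    n_t, n_c = len(treated), len(control)
    mean_t = sum(treated) / n_t
    mean_c = sum(control) / n_c
    var_t = sum((x - mean_t) ** 2 for x in treated) / (n_t - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (n_c - 1)
    z = (mean_t - mean_c) / (var_t / n_t + var_c / n_c) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def share_below(n_per_group: int, true_d: float, alpha: float,
                n_studies: int = 2000, seed: int = 0) -> float:
    """Fraction of simulated studies with p-value below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        treated = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        if approx_p_value(treated, control) < alpha:
            hits += 1
    return hits / n_studies


# Assumed true effect of d = 0.8 at a few hypothetical sample sizes.
for n in (10, 20, 50):
    print(n, share_below(n, 0.8, 0.05), share_below(n, 0.8, 0.01))
```

Comparing the two columns at each sample size shows how much room there is between "just under 0.05" and "comfortably under 0.01" when the effect is real.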
(I don't necessarily disagree with the rest of your comment, I'm more unsure on the other points.)
Yes, this is a better idea.
This comment is a wonderful crystallisation of the "defensive statistics" of Andrew Gelman, James Heathers and other great epistemic policemen. Thanks!
I'm not claiming this. I'm claiming that given the research to date, more psychedelic research would be very impactful in expectation. (I'm at like 30-40% that the beneficial effects are real.)
I haven't read the literatures for all the examples you gave. For psychic powers & cold fusion, my impression is that confirmatory research was done and the initial results didn't replicate.
So one difference is that the main benefits of psychedelic therapy haven't yet failed to replicate.
> I'm at like 30-40% that the beneficial effects are real.)
Right, so you would want to show that 30-40% of interventions with similar literatures pan out. I think the figure is less.
Scott referred to [edit: one] failure to replicate in his post.
Scott referred to one failure to replicate, for a finding that a psychedelic experience increased trait openness. This isn't one of the benefits cited by the OP.
More on psychedelics & Openness:
Also:
I think we have a disagreement about what the appropriate reference class here is.
The reference class I'm using is something like "results which are supported by 2-3 small-n studies with large effect sizes."
I'd expect roughly 30-40% of such results to hold up after confirmatory research.
Somewhat related: 62% of results assessed by Camerer et al. 2018 replicated.
It's a bit complicated to think about replication re: psychedelics because the intervention is showing promise as a treatment for multiple indications (there are a couple studies showing large effect sizes for depression, a couple studies showing large effect sizes for anxiety, a couple studies showing large effect sizes for addictive disorders).
Could you say a little more about what reference class you're using here?