Hi Gregory, Thanks for the detailed response. I understand where you are coming from: if the tables were turned, I would have posted a similar comment. I’d be happy to go over the science in greater detail with you; perhaps we can start another thread to cover this, as I expect our science discussion to be very long, detailed and technical. Right now, we have an important question to answer: is it worth spending 500K$ USD in an attempt to replicate Samuel et al 2010 with more patients and proper controls?
The 500K$ USD figure is from a detailed budget produced by a credible university-affiliated clinical research team eager to start this study. I reviewed this budget with them, and it is reasonable.
Zeke estimates the direct financial upside of a successful replication to be about 33B$/year. This is a 66000:1 ratio (33B/500K = 66000). We need to assign probabilities to the following explanations of Samuel et al 2010’s results:
1. Their results are correct: itraconazole cures Crohn’s.
2. Their results are a fluke: itraconazole isn’t affecting Crohn’s symptoms, and natural waxing and waning of symptoms made it look like itraconazole cures Crohn’s.
Itraconazole is a cheap, widely available off-patent broad-spectrum antifungal drug. The main immunological signature of Crohn’s disease is the presence of antibodies against conserved fungal sugars (mannan, beta-glucan, and chitin). By the principle of parsimony, this means that Crohn’s patients’ immune systems are likely fighting a fungus which is the root cause of Crohn’s disease. There are a number of other possible explanations for the above observations, but these are more complex and difficult to prove.
(A) What are the odds that Samuel et al 2010’s results will replicate? X
(B) At what odds is this replication project a good candidate for EA funding? Y
(C) If X > Y and X is low, can funding agencies and foundations tolerate the risk of failure, or must we find these funds in a less conventional manner?
I am getting a much better understanding of Y with the help of people on this forum, thanks to Zeke and Ryan (I have no experience in doing these estimates). I was hoping to get a better idea of the value of X too. In most situations, people round down the value of X to zero before starting their analysis. This means they consider further effort evaluating X or Y to be largely futile (in Bayesian terms, prior probabilities of zero cannot be changed by further analysis). EA folks are used to dealing with low X values, so I thought they’d be less likely to round down to zero.
If X < Y, then I will move on to other things. If X > Y, then I will do all I can to fund this study, as this is likely the highest-impact charitable project available to me.
I’d very much appreciate it if Gregory, Ryan, Zeke, and Aaron could help me quantify X and Y. Other diseases listed here can somewhat decrease Y, but calculating by how much is complex, so let’s stick to Crohn’s for now. I included other diseases in this post because they were the main focus of my research for six years, and they might well have the same fungal etiology as Crohn’s disease. I realize that despite this prior research *strengthening* the case for a fungal etiology in Crohn’s, many people instinctively *reduce* the prior probability of any of this being correct due to the unusually large scope of this project.
A cheaper alternative (also by about an order of magnitude) is to do a hospital record study where you look at subsequent Crohn’s admissions or similar proxies of disease activity in those recently prescribed antifungals versus those who weren’t.
I also imagine it would get better data than a poorly powered RCT.
This might be naive and I have only skimmed this thread, but wouldn’t a cheap study using a mouse model be best here? Maybe contact the authors of the papers cited in this paper “Mouse models of inflammatory bowel disease for investigating mucosal immunity in the intestine” to collaborate on such a study.
Hi Hauke, Thank you very much for this suggestion. Yes, animal models would be another category 2 option. You might know that Barry Marshall had much trouble developing animal models of Helicobacter pylori-induced gastritis, so this approach is hit-and-miss at best, and it is hard to know ahead of time what the probability of a “hit” would be. It is also less ethical than the other solutions, and for this reason, I’d prefer avoiding animal models (if possible).
Hi Gregory, Great suggestion! The main issue with this approach is that it seems long-term use of itraconazole is required (>3 months), which rarely occurs in practice. Most on-label uses of itraconazole are for much shorter periods, which is one reason why Samuel et al 2010 was such an exception: histoplasmosis is only prevalent in the Midwest, and requires a very long course of itraconazole.
A second problem is that once treatment is discontinued, Crohn’s symptoms seem to return after a few months (again per Samuel et al 2010). This is very much like dandruff (caused by the fungus Malassezia): once antifungal shampoos are discontinued, Malassezia returns, and so does dandruff! So we’d have to be able to test, using the medical database, whether these Crohn’s patients got a flare or not during the treatment period, as compared to properly selected controls (getting comparable, unbiased controls using this method is not trivial).
In addition, Samuel et al 2010 was very well positioned to detect the effect of itraconazole, because they stopped giving their patients immunosuppressants, so they were expecting severe flares during treatment. Most cases found in medical databases would not have had their immunosuppressants stopped, so the same contrast is not expected there.
Finally, I don’t think medical database studies like this can be used to change medical practice. Would the FDA allow a new indication without an RCT? I doubt it. So running a database study could not reach the stated impact.
How many patients do you think we would need in an RCT to have sufficient power? The researchers I am working with think itra=20, placebo=20 would be sufficient. I don’t have the expertise to evaluate this. Samuel et al 2010 noticed a marked effect on 5 patients, although there were no controls, so they were judging this using their clinical experience. FWIW, the last author of Samuel et al 2010 is one of the top Crohn’s researchers in the world.
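For a rough sense of what 20 versus 20 can detect, here is a minimal sketch of a two-proportion power calculation using the normal approximation. The remission rates used in the example calls (20% on placebo, 60% or 40% on itraconazole) are purely hypothetical placeholders, not figures from Samuel et al 2010 or from the proposed study, and the function names are illustrative.

```python
from statistics import NormalDist


def power_two_proportions(p_control, p_treat, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Uses the normal approximation with equal arm sizes and ignores the
    (negligible) chance of a significant result in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    p_bar = (p_control + p_treat) / 2
    # Standard error under the null (pooled) and under the alternative.
    se_null = (2 * p_bar * (1 - p_bar) / n_per_arm) ** 0.5
    se_alt = ((p_control * (1 - p_control) + p_treat * (1 - p_treat))
              / n_per_arm) ** 0.5
    diff = abs(p_treat - p_control)
    return 1 - z.cdf((z_crit * se_null - diff) / se_alt)


def min_n_per_arm(p_control, p_treat, target_power=0.8, alpha=0.05,
                  n_max=10_000):
    """Smallest arm size reaching the target power, or None if above n_max."""
    for n in range(2, n_max + 1):
        if power_two_proportions(p_control, p_treat, n, alpha) >= target_power:
            return n
    return None


# HYPOTHETICAL remission rates, for illustration only.
for p_treat in (0.6, 0.4):
    pw = power_two_proportions(0.2, p_treat, n_per_arm=20)
    n80 = min_n_per_arm(0.2, p_treat)
    print(f"placebo 20% vs itra {p_treat:.0%}: "
          f"power at n=20 is {pw:.2f}, n per arm for 80% power is {n80}")
```

Under these made-up rates, 20 per arm is only adequate if the true effect is very large; a moderate effect would need several times as many patients. The real answer depends on the remission rates the clinical team expects, which the thread does not state.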
Hi Gregory, here are some more options:
X = odds that Samuel et al 2010’s results will replicate (range 0 − 1).
Category 1 options: studies which can bring X’s value close to 1.
(1a) A well powered RCT testing itraconazole in Crohn’s (success = curing Crohn’s).
Category 2 options: cheaper studies which can increase X, but not bring it close enough to 1 to change clinical practice. However, they would raise awareness that Crohn’s might be caused by a fungus, and thus might be cured by itraconazole. Hopefully someone will do (1a) based on the results of these category 2 options.
(2a) Test Samuel et al 2010 by using a larger medical database than that available at the Mayo Clinic in 2010 (ideally in the Midwest where histoplasmosis is endemic).
(2b) Antibodies against Malassezia are associated with psoriasis (Squiquera et al 1994; Liang et al 2003). We could try replicating these studies in Crohn’s disease.
(2c) In psoriasis, white blood cells release interferon gamma when exposed to Malassezia antigens (Kanda et al 2002), likely because T cells are specifically targeting Malassezia on the skin. We could replicate this study in Crohn’s disease.
(2d) We could replicate Kellermayer et al 2012 or Richard 2018, who found extremely strong associations between Malassezia and IBD.
Note that (2c) will likely be successful because vedolizumab is known to cause psoriasis in ~10% of Crohn’s patients by sending T cells from the gut to the skin (Tadbiri et al 2018).
Other ideas are welcome!
The idea of doing an intermediate piece of work is so one can abandon the project if it is negative whilst having spent less than 500k. Even independent of the adverse indicators I note above, the prior on case series findings replicating in an RCT is very low.
Another cheap option would be talking to the original investigators. They may have reasons why they haven’t followed this finding up themselves.
I attempted to contact them, but they did not reply. These are top Crohn’s researchers, and must be heavily solicited from all sides, so their lack of response is expected.
(2b), (2c), and (2d) are being run right now by different groups. I don’t know how long it will take for them to publish (best guess ~1-2 years).
What numerical value do you assign to the probability of replication of Samuel et al 2010 (variable X)?
~3% (Standard significance testing means there’s a 2.5% chance of a false positive result favouring the treatment group under the null).
Hi Gregory, Thank you for helping try to establish these probabilities. I am not sure I follow the math (I’m not used to doing these calculations). Could you explain how you calculated it? Thanks again!
If you use a two-tailed test and find a positive effect with p<0.05, it’s <0.025 likely you’d get a positive effect that big by chance. If you don’t understand that, then you should look up two-tailed tests.
OK, I will. I don’t have your input data, nor the assumptions on which you based your analysis to apply the two-tailed test. These are necessary to understand your results.
He’s just saying he thinks there’s a 0.005 chance of detecting a real effect.
Hi Ryan, I need to know what input data and assumptions he used to be able to verify/replicate/interpret his math. Without this information, I cannot comment further. Thanks!
In hope but little expectation:
You could cast about for various relevant base-rates (“What is the chance of any given proposed conjecture in medical science being true?” “What is the chance of a given medical trial giving a positive result?”). Crisp data on these questions are hard to find, but the proportion for either is comfortably less than even. (Maybe ~5% for the first, ~20% for the second).
From something like this one can make further adjustments based on the particular circumstances, which are generally in the adverse direction:
Typical trials have more than n=6 non-consecutive case series behind them, and so this should be less likely to replicate than the typical member of this class.
(Particularly, heterodox theories of pathogenesis tend to do worse, and on cursory search I can find alternative theories of Crohn’s which seem about as facially plausible as this).
The wild theory also imposes a penalty: even if the minimal prediction doesn’t demand the wider ‘Malassezia causes it etc.’, that the hypothesis is generated through these means is a further cost.
There’s also information I have from medical training which speaks against this (i.e. if antifungals had such dramatic effects as proposed, it probably would have risen to attention somewhat sooner).
All the second order things I noted in my first comment.
As Ryan has explained, standard significance testing puts a floor of 2.5% on the chance of a (false) positive result in any trial even if the true effect is zero. There is some chance the ground truth really is that itraconazole cures Crohn’s (given some evidence of TNFa downstream effects, background knowledge of fungal microbiota dysregulation, and the very slender case series), which gives it a small boost above this, although this in itself is somewhat discounted by the limited power of the proposed study (i.e. even if itraconazole works, the study might miss it).
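One way to read the numbers in this sub-thread (~3%, 2.5%, 0.005) is as a two-branch sum: the chance that the effect is real and the trial detects it, plus the chance that there is no effect and the trial favours treatment by luck. The sketch below is a reconstruction under that reading, not necessarily Gregory’s own method. The prior (1%) and power (50%) are hypothetical values, chosen only so that their product matches the 0.005 Ryan mentions.

```python
def prob_positive_trial(prior_true, power, alpha_two_sided=0.05):
    """Chance that a trial reports a significant result favouring treatment."""
    # Branch 1: the effect is real and the trial detects it.
    true_positive = prior_true * power
    # Branch 2: no real effect; only half of the two-sided alpha lies on
    # the side that favours the treatment group.
    false_positive = (1 - prior_true) * (alpha_two_sided / 2)
    return true_positive + false_positive


# HYPOTHETICAL inputs: 1% prior that itraconazole works, 50% power.
p = prob_positive_trial(prior_true=0.01, power=0.5)
print(f"P(positive trial) is about {p:.1%}")
```

With these inputs the real-effect branch contributes 0.005 and the chance branch just under 0.025, which is how a figure near 3% can arise while most of it comes from the false-positive floor.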
Hi Gregory, Thanks for the detailed answer. I’m still not clear on how the numbers quoted above (0.005, 3%, 2.5%) were calculated, nor how they affect the probability of Samuel et al 2010 replicating successfully. It is worthwhile to break down the problem into two parts:
(I) Does Samuel et al 2010 give us any information to support the hypothesis that Crohn’s might be cured by itraconazole? If so, how much?
(II) How large does an RCT need to be to properly test this hypothesis?
Answering these two questions is essential to determine if Samuel et al 2010 should be replicated or not (obviously with proper controls this time). This is what I am trying to determine with this forum post: should we raise ~500K$ to replicate it or not? What is the expected return on giving for this experiment?
>Zeke estimates the direct financial upside of a successful replication to be about 33B$/year. This is a 66000:1 ratio (33B/500K = 66000).
This is not directly relevant, because the money is being saved by other people and governments, who do not normally use their money very well. EAs’ money is much more valuable, as it is used much more efficiently than the money of Western people and governments usually is. NB: this is also the reason why EA should generally be considered a funder of last resort.
If the study has a 0.5% (??? I have no idea) chance of leading to global approval and effective treatment, then it’s 35k QALY in expectation per my estimate, which means a point estimate of $14/QALY. iirc, that’s comparable to global poverty interventions but at a much lower robustness of evidence; some other top EA efforts with a similar degree of robustness will presumably have a much higher EV. Of course the other diseases you can work on may be much worse causes.
Also that $33B comes from a study on the impact of the disease. Just because you replicate well doesn’t mean the treatment truly works, and is approved globally, etc. Hence the 0.5% number being very low.
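The $14/QALY point estimate follows from dividing the study cost by the expected QALYs. A minimal sketch using only figures quoted in this thread (500K$ cost, 0.5% success chance, 35k expected QALYs); the QALYs-if-successful figure is back-calculated here and is an assumption, since Zeke does not state it directly.

```python
cost_usd = 500_000        # study budget quoted in the thread
p_success = 0.005         # Zeke's 0.5% guess
expected_qalys = 35_000   # Zeke's expected QALYs at that probability

# ASSUMPTION: back-calculated total QALYs if the treatment works and is adopted.
qalys_if_success = expected_qalys / p_success

cost_per_qaly = cost_usd / expected_qalys
print(f"point estimate: ${cost_per_qaly:.2f} per QALY")


def cost_per_qaly_at(p):
    """Cost per expected QALY if the success probability were p instead."""
    return cost_usd / (qalys_if_success * p)


# The estimate scales inversely with the assumed success probability.
for p in (0.001, 0.005, 0.02):
    print(f"p = {p:.1%}: ${cost_per_qaly_at(p):.2f} per QALY")
```

Because the result is inversely proportional to the success probability, the "??? I have no idea" attached to the 0.5% carries straight through to the cost-per-QALY figure.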
Hi Zeke, Thanks for the clarification and the estimate for Y. If I understand correctly:
(1) Minimum success probability for project viability is ~0.5% (Y=0.5%)
(2) Upside following success is 33B$*10 years = 330B$ (per your earlier estimate, this needs to be adjusted for many different reasons, both up and down, but these adjustments are beyond my capabilities).
(3) Cost is 500K$.
(4) Expected ROI is = (330B$ * 0.5%) / 500K$ = 3300.
So this means if you find a 100$ bill on the sidewalk and giving it away to someone else statistically gives them ~300K$, you will keep it, but if it statistically gives them 400K$ you will give it away. Is that right?
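The arithmetic in steps (1) to (4) and the 100$ analogy can be checked in a few lines. This is a sketch using only the thread’s own figures; the variable names are illustrative, and the 10-year horizon is the one assumed in step (2).

```python
annual_upside_usd = 33e9   # Zeke's estimate of the yearly financial upside
years = 10                 # horizon assumed in step (2)
p_success = 0.005          # Y = 0.5%
cost_usd = 500e3           # study budget

# Undiscounted one-year upside relative to cost (the 66000:1 ratio).
upside_ratio = annual_upside_usd / cost_usd

# Expected return per dollar spent, as in step (4).
expected_roi = annual_upside_usd * years * p_success / cost_usd

# The sidewalk analogy: expected value passed on per 100$ given away.
expected_per_100 = 100 * expected_roi

print(f"upside ratio: {upside_ratio:,.0f}:1")
print(f"expected ROI: {expected_roi:,.0f}")
print(f"expected value per 100$: {expected_per_100:,.0f}$")
```

At Y = 0.5% the analogy’s break-even point sits at about 330K$ per 100$ bill, which falls between the ~300K$ and 400K$ figures in the question.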
Only if this project is assumed to be the best available use of funds. Other things may be better.