> I’m at like 30-40% that the beneficial effects are real.)
Right, so you would want to show that 30-40% of interventions with similar literatures pan out. I think the figure is less.
Scott referred to [edit: one] failure to replicate in his post.
That sounds a bit like the argument ‘either this claim is right, or it’s wrong, so there’s a 50% chance it’s true.’
One needs to attend to base rates. Our bad academic knowledge-generating process throws up many, many illusory interventions with purported massive effects for each amazing intervention we find, and the amazing interventions that we do find disproportionately were easier to show (with the naked eye, visible macro-correlations, consistent effects with well-powered studies, etc).
People are making similar arguments about cold fusion, psychic powers (of many different varieties), many environmental and nutritional contaminants, brain training, carbon dioxide levels, diets, polyphasic sleep, assorted purported nootropics, many psychological/parenting/educational interventions, etc.
Testing how your prior applies across a spectrum of other cases (past and present) is helpful for model checking. If psychedelics are a promising EA cause, how many of those others qualify? If many do, then any one isn’t so individually special, although one might want a systematic program of rigorous testing of all the wacky claims of large impact that can be tested cheaply.
If not, then it would be good to explain what exactly makes psychedelics different from the rest.
I think the case for psychedelics the OP has made doesn’t pass this standard yet, so doesn’t meet the standard for an EA cause area.
On the flip side, it may be possible that the “true believers” actually are on to something, but they have a hard time formalizing their procedure into something that can be replicated on a massive scale. So if larger studies fail to replicate the results from the small studies, this may be the reason why.
Do you have any examples of this actually happening? I have seen it as an excuse for things that never pan out many times, but I don’t recall an instance of it actually delivering. E.g. in Many Labs 2 and other mass reproducibility efforts, you don’t find a minority of experimenters with a ‘knack’ who get the effect but can’t pass it on to others.
Recent large sample within-family data does seem to establish causal effects of brain size on intelligence and educational attainment. The genetic correlation is ~0.4, so most of the genetic variance isn’t working through overall brain size.
Some kinds of features that could contribute to genetic variance in humans, but not scale for arbitrary differences across species:
Mutation load (the rate at which this is trimmed back, and thus the equilibrium load, depends on the strength of selection for cognitive abilities)
Motivation: attention to learning, play, imitation, and language comes at the expense of attention to other things
Pleiotropy with other selection combined with evolutionary limits (selection for lower aggression also causes white patches in fur via changes in neural crests, and retention of a variety of juvenile features), e.g. selection for disease resistance changing pathways so as to accidentally impair brain function (with the change surviving because of its benefits)
Alleles that provide resistance to disease (genetic variance is maintained in a Red Queen’s Race) that damages the brain would be a source of genetic variance, likewise variants affecting nutrition or other environmental influences
Thank you for this excellent and detailed post, I expect to use it in the future as a go-to reference for explaining this point. You might be interested in an old paper where Nick Bostrom and I went through some of this reasoning (with similar conclusions but much less explanation) in the course of discussing the implications of anthropic theories for the possible difficulty of evolving intelligence.
I am not so sure about the specific numerical estimates you give, as opposed to the ballpark being within a few orders of magnitude for SIA and ADT+total views (plus auxiliary assumptions), i.e. the vicinity of “(roughly the largest value that doesn’t make the Fermi observation too unlikely, as shown in the next two sections”. But that’s compatible with much or most of our expected value on the total view coming from scenarios where we don’t overlap with aliens much.
“However, varying the planet formation rate at particular times in the history of the Universe can make a large difference.”
We also update our uncertainty about this sort of temporal structure to some extent from our observation of late existence. Ideally we would want to let as much as possible vary so that we don’t asymmetrically immunize some parameters against update.
“For this reason I will ignore scenarios where life is extraordinarily unlikely to colonise the Universe, by making fs loguniform between 10^-4 and 1.”
This seems overall too pessimistic to me as a pre-anthropic prior for colonization (~10% credence).
I don’t think you can define aging research so narrowly and get the same expected impact. E.g. De Grey’s SENS includes curing cancer as one of many subgoals, and radical advances in stem cell biology and genetic engineering, massive fields that don’t fall under ‘aging research.’ The more dependent progress in an area is on advances from outside that field, the less reliable this sort of projection will be.
I saw your request for commentary on Facebook, so here are some off-the-cuff comments (about 1 hour’s worth so take with appropriate grains of salt, but summarizing prior thinking):
My prior take on metformin was that it seems promising for its space (albeit with mixed evidence, and prior longevity drug development efforts haven’t panned out, but the returns would be very high for medical research if true), although overall the space looks less promising than x-risk reduction to me; the following comments will be about details of the analysis where I would currently differ
The suggestion of this trial moving forward LEV by 3+ years through an icebreaker effect boosting research looks wildly implausible to me
LEV is not mainly bottlenecked on ‘research on aging,’ e.g. de Grey’s proposals require radical advances in generally medically applicable stem cell and genetic engineering technologies that already receive massive funding and are quite challenging; the ability to replace diseased cells with genetically engineered stem cell derived tissues is already a major priority, and curing cancer is a small subset of SENS
Much of the expected gain in biomedical technology is not driven by shifts within biology, and advances within a particular medical field are heavily driven by broader improvements (e.g. computers, CRISPR, genome sequencing, PCR, etc); if LEV is far off and heavily dependent on other areas, then developments in other fields will make it comparatively easy for aging research to benefit from ‘catch up growth’ reducing the expected value of immediate speedup (almost all of which would have washed away if LEV happens in the latter half of the century)
In particular, if automating R&D with AI is easier than LEV, and would moot prior biomedical research, then that adds an additional discount factor; I would bet that this happens before LEV through biomedical research
Getting approval to treat ‘aging’ isn’t actually particularly helpful relative to approval for ‘diseases of aging’ since all-cause mortality requires larger trials and we don’t have great aging biomarkers; and the NIH has taken steps in that direction regardless
Similar stories have been told about other developments and experiments, which haven’t had massive icebreaker effects
Combined, these effects look like they cost a couple orders of magnitude
From my current epistemic state the expected # of years added by metformin looks too high
Re the Guesstimate model: the statistical power of the trial is tightly tied to effect size; the larger the effect size, the fewer people you need to show results; that raises the returns of small trials, but means you have diminishing returns for larger ones (you are spending more money to detect smaller effects, so marginal cost-effectiveness goes a lot lower than average cost-effectiveness, reflecting high VOI of testing the more extravagant possibility)
Likewise the proportion using metformin conditional on a positive result is also correlated with effect size (which raises average EV, but shifts marginal EV lower proportionate to average EV); also the proportion of users seems too low to me conditional on success
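To make the power point above concrete, here is a rough sketch of how required trial size scales with effect size. It uses the textbook normal approximation for comparing two group means, which is an assumption made for illustration (a real mortality trial would use a time-to-event design), and the effect sizes, `alpha`, and `power` values are hypothetical rather than taken from the Guesstimate model.

```python
from statistics import NormalDist


def n_per_arm(effect_size, alpha=0.05, power=0.8):
    """Approximate participants needed per arm in a two-arm trial.

    Normal approximation for comparing two means:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2
    where d is the standardized effect size. This is an illustrative
    simplification, not the design of any actual metformin trial.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return 2 * (z_alpha + z_power) ** 2 / effect_size ** 2


# Hypothetical standardized effect sizes, each half the previous one.
for d in (0.4, 0.2, 0.1):
    print(f"effect size {d}: about {n_per_arm(d):.0f} participants per arm")
```

Under this approximation, halving the effect size you want to detect quadruples the required sample, which is the sense in which extra spending on a larger trial buys sensitivity only to smaller, less valuable effects.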
One issue I would add to your theoretical analysis: the problem with assigning 1000+ QALYs to letting someone reach LEV is that people commonly don’t claim linear utility in lifespan, i.e. they would often prefer to live to 80 with certainty rather than die at 20 with 90% probability and live to 10,000 with 10% probability.
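The arithmetic behind that example can be sketched as follows. The two lotteries use the numbers from the paragraph above; the logarithmic utility function is purely an assumption chosen to illustrate one concave alternative to linear utility, not a claim about what people's preferences actually are.

```python
import math


def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, lifespan_years) pairs."""
    return sum(p * utility(years) for p, years in lottery)


certain_80 = [(1.0, 80)]
gamble = [(0.9, 20), (0.1, 10_000)]  # die at 20 with 90%, live to 10,000 with 10%


def linear(years):
    """Utility proportional to years lived (the implicit QALY-style assumption)."""
    return years


def concave(years):
    """Log utility: an illustrative assumption with diminishing returns to lifespan."""
    return math.log(years)


for name, u in (("linear", linear), ("log", concave)):
    eu_certain = expected_utility(certain_80, u)
    eu_gamble = expected_utility(gamble, u)
    preferred = "gamble" if eu_gamble > eu_certain else "certain 80"
    print(f"{name}: certain={eu_certain:.2f}, gamble={eu_gamble:.2f}, prefers {preferred}")
```

With linear utility the gamble wins easily, since its expected lifespan is over a thousand years; with the assumed log utility the certain 80 years is preferred, matching the stated intuition.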
I agree it’s worth keeping the chance that people will be able to live much longer in the future in mind when assessing benefits to existing people (I would also add the possibility of drastic increases in quality of life through technology). I’d guess most of this comes from broader technological improvements (e.g. via AI) rather than reaching LEV through biomedical approaches, but not with extreme confidence.
However, I don’t think it has very radical implications for cause prioritization since, as you note, deaths for any reason (including malaria and global catastrophes) deny those people a chance at LEV. LEV-related issues are also mainly a concern for existing humans, so to the extent one gives a boost for enormous impacts on nonhuman animals and the existence of future generations, LEV speedup won’t reap much of those boosts.
Within the field of biomedical research, aging looks relatively promising, and I think on average the best-targeted biomedical research does well for current people compared to linear charity in support of deployment (e.g. gene drives vs bednets). But it’s not a slam dunk because the problems are so hard (including ones receiving massive investment). I don’t see it as strongly moving most people who prefer to support bednets over malaria gene drives, farmed animal welfare over gene drives, or GCR reduction over gene drives.
“Oh, is the concern that they’re looking at a more biased subset of possible effects (by focusing primarily on effects that seem positive)?”
Yes. It doesn’t mention other analyses that have come to opposite conclusions by considering effects on wild animals and long-term development.
If you’re going to select interventions specifically to reduce the human population and have downstream consequences, it seems absolutely essential to take a broader view of the empirical consequences than in the linked report. E.g. among others, effects on wild animals (not mentioned but most immediate animal effects of this change will be on wild animals), future technological advancement, and global catastrophic risks have good cases for being far larger and plausibly of opposite sign to the effects discussed in the report but are not mentioned even as areas for further investigation.
What about a report along the lines of ‘I am donating in support of X, for highly illegible reasons relating to my intuition from looking at their work, and private information I have about them personally’?
This is a good point, and worth highlighting in discussion of reports (especially as we get more data on the effects of winning on donation patterns). On the other hand, the average depth and quality of investigation by winners (and the access they got) does seem higher than what they would otherwise have done, whilst lower than that of expert donors.
I don’t think this is true. The probabilities and payouts are the same for any given participant, regardless of what others do, so people who are unlikely to write up a report don’t reduce the average number of reports produced by those who would.
Except that the pot size isn’t constrained by the participation of small donors: the CEA donor lottery has fixed pot sizes guaranteed by large donors, and the largest donors could be ~risk-neutral over lotteries with pots of many millions of dollars. So there is no effect of this kind, and there is unlikely to ever be one except at ludicrously large scales (where one could use derivatives or the like to get similar effects).
Yes, the main effect balances out like that.
But insofar as the lottery enhances the effectiveness of donors (by letting them invest more in research if they win, amortized against a larger donation), then you want donors doing good to be enhanced and donors doing bad not to be enhanced. So you might want to try to avoid boosting pot size available to bad donors, and ensure good donors have large pots available. The CEA lottery is structured so that question doesn’t arise.
There is also the minor issue of correlation with other donors in the same block mentioned in the above comment, although you could ask CEA for a separate block if some unusual situation meant your donation plans would change a lot if you found out another block participant had won.
> but also in the other 80% of worlds you have a preference for your money being allocated by people who are more thoughtful.
For the CEA donor lottery, the pot size is fixed independent of one’s entry, as the guarantor (Paul Christiano last year, the regranting pool I am administering this year) puts in funds for any unclaimed tickets. So the distribution of funding amounts for each entrant is unaffected by other entrants. It’s set up this way specifically so that people don’t even have to think about the sort of effect you discuss (the backstop fund has ~linear value of funds over the relevant range, so that isn’t an impact either).
The only thing that participating in the same lottery block as someone else matters for is correlations between your donations and theirs. E.g. if you would wind up choosing a different charity to give to depending on whether another participant won the lottery. But normally the behavior of one other donor wouldn’t change what you think is the best opportunity.
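A toy sketch of the fixed-pot structure described above may help show why other entrants don’t affect your odds. The pot size, contribution amounts, and participant names are all hypothetical, and `lottery_block` is an illustrative helper, not CEA’s actual implementation; the only features taken from the description are that the pot is fixed and a guarantor covers unclaimed tickets.

```python
from fractions import Fraction


def lottery_block(pot_size, contributions):
    """Win probabilities for a fixed-pot lottery with a guarantor backstop.

    contributions maps entrant name -> integer amount put in. Each entrant
    wins the whole pot with probability amount / pot_size; the guarantor
    holds whatever share of tickets is left unclaimed.
    """
    claimed = sum(contributions.values())
    if claimed > pot_size:
        raise ValueError("contributions exceed the fixed pot size")
    odds = {name: Fraction(amount, pot_size) for name, amount in contributions.items()}
    odds["guarantor"] = Fraction(pot_size - claimed, pot_size)
    return odds


POT = 100_000  # hypothetical fixed pot size

alone = lottery_block(POT, {"you": 5_000})
crowded = lottery_block(POT, {"you": 5_000, "other_a": 20_000, "other_b": 40_000})

# Your chance of winning, and the pot you would direct, are the same either way.
print(alone["you"], crowded["you"])
# Your expected amount directed equals your contribution in both cases.
print(alone["you"] * POT, crowded["you"] * POT)
# Only the guarantor's share shrinks as more tickets are claimed.
print(alone["guarantor"], crowded["guarantor"])
```

In this sketch, additional entrants only displace the guarantor’s share, so each entrant’s distribution over outcomes depends solely on their own contribution and the fixed pot.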
What happened in those cases?
I would love to see a canonical post making this argument; conflating EA with the benefits of maxing out personal warm fuzzies is one of my pet peeves.
I actually happen to think that the report was too dismissive of more leveraged climate change interventions that I expected could be a lot better than the estimates for Cool Earth (especially efficient angles on scientific research and political activity in the climate space), but the OP is suggesting that the original Cool Earth numbers (which indicate much lower cost-effectiveness than charities recommended by EAs in other areas with more robust data) were overstated, not understated (as the original report would suggest due to regression to the mean and measurement error).