Yeah, these are interesting questions, Eli. I’ve worked on a few big RCTs and they’re really hard and expensive to do. It’s also really hard to adequately power experiments for small effect sizes in noisy environments (e.g., productivity of remote/in-person work). Your suggestions to massively scale up those interventions and to do things online would make things easier. As Ozzie mentioned, the health ones have such long and slow feedback loops that I think RCTs might not be better than (statistically) well-controlled observational alternatives. I used to think RCTs were the only way to get definitive causal data. The problem is that, because of biases that can be almost impossible to eliminate (https://sites.google.com/site/riskofbiastool/welcome/rob-2-0-tool), RCTs are seldom perfect causal data. Conversely, with good adjustment for confounding, observational data can provide very strong causal evidence (think smoking; I recommend my PhD students do this course for this reason: https://www.coursera.org/learn/crash-course-in-causality). For the ones with fast feedback loops, I think some combination of “priors + best available evidence + lightweight tests in my own life” works pretty well to see if I should adopt something.
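To put rough numbers on the power point above, here is a minimal sketch. It assumes a two-arm trial with equal group sizes, a two-sided test, a continuous outcome, and the normal approximation (a t-test would need slightly more people). The function name and the defaults (alpha = 0.05, power = 0.8) are my own illustrative choices, not taken from any particular study.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect_size_d, alpha=0.05, power=0.8):
    """Approximate participants needed per arm in a two-arm trial.

    effect_size_d is the standardized mean difference (Cohen's d).
    Uses the normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    Illustrative helper only; assumes equal arms and a two-sided test.
    """
    if effect_size_d <= 0:
        raise ValueError("effect_size_d must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    n = 2 * (z_alpha + z_power) ** 2 / effect_size_d ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical effect sizes, chosen only to show how n scales with 1/d^2.
    for d in (0.5, 0.2, 0.1):
        print(f"d = {d}: about {sample_size_per_arm(d)} participants per arm")
```

Because the required sample grows with 1/d², halving the effect size you want to detect roughly quadruples the number of participants, which is a big part of why small effects in noisy settings get so expensive.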
At a meta-level, in an ideal world, the NSF and NIH (and global equivalents) are probably designed to fund people to address the questions that are most important and have the highest potential. There are probably dietetics/sleep/organisational psychology experts who have dedicated their careers to questions #1-4 above, and you’d hope that those people are getting funded if those questions are indeed critical to answer. In reality, science funding probably does not get distributed based on criteria that maximise impartial welfare, so maybe that’s why #1-4 would get missed. As mentioned in a recent forum post, I think the mega-org could be better focused on nudging scientific incentives towards those questions rather than working on those questions ourselves: https://forum.effectivealtruism.org/posts/JbddnNZHgySgj8qxj/improving-science-influencing-the-direction-of-research-and
Really appreciate hearing your perspective!
On causal evidence of RCTs vs. observational data: I’m intuitively skeptical of this, but the sources you linked seem interesting and worth thinking about more before setting an org up for this. (Edited to add:) Hearing your view already substantially updates mine, but I’d be really curious to hear more perspectives from others with lots of experience working on this type of stuff; if they agree, I’d update further. If you have impressions of how much consensus there is on this question, that would be valuable too.
On nudging scientific incentives to focus on important questions rather than working on them ourselves: this seems pretty reasonable to me. I think building an app to do this still seems plausibly very valuable and I’m not sure how much I trust others to do it, but maybe we combine the ideas: build an app, then nudge other scientists to use it to do important studies.
I should clarify: RCTs are obviously generally >> even a very well-controlled, propensity-score-matched quasi-experiment, but I just don’t think RCTs are ‘bulletproof’ anymore. An RCT should update your priors more, but if you look at the variability among studies in meta-analyses, even among low-risk-of-bias RCTs, I’m now much less easily swayed by any single one.