I’m interested in people’s thoughts on:
How valuable would it be for more academics to do research into forecasting?
How valuable would it be for more non-academics to do academic-ish research into forecasting? (E.g., for more people in think tanks or EA orgs to do research on forecasting that’s closer in levels of “rigour” to the average paper than to the average EA Forum post.)
What questions about forecasting should be researched by academics, or by non-academics using academic approaches?
I imagine this could involve psychological experiments, historical research, conceptual/theoretical/mathematical work, political science literature reviews, etc.
(Note that I don’t mean “What questions should people forecast?” For that, there’s this post.)
Context
Many EAs have appreciated and drawn on Phil Tetlock’s research into forecasting. Some people in the EA community or related communities are now doing non-academic work on forecasting, such as:
building tools for forecasting (e.g. Foretold)
running forecasting projects (e.g. Foretell [no relation])
experimenting/playing around with different forecasting techniques and methods (e.g. here and here)
researching and writing about forecasting (e.g. here and here)
And some people in the EA community or related communities are doing academic work:
To forecast specific things (e.g. Grace et al.; Manheim)
About forecasting itself (e.g. Beard et al.; Baum [see also])
But I’m not aware of much academic research on forecasting itself from EAs. I imagine there might be room for more. And two things I’m considering trying to do in future are:
A PhD in psychology, focusing on forecasting or other things related to improving institutional decision-making
Research on forecasting in a job in an EA organisation or with a grant
But I don’t know how valuable that would be—e.g., maybe non-EA academics are covering this area well already, or maybe what’s really needed is just people in government/business trying to actually implement or make use of forecasting projects. Nor do I know what the important open questions or topics would be. And I haven’t engaged much with the academic literature on forecasting, other than reading Superforecasting, Expert Political Judgement, and various summaries by EAs.
So this leads me to ask the above questions, both for my own benefit and (hopefully) for the benefit of other people who might be a good fit for research into this topic. (To help capture that benefit, I’ve added this post to A central directory for open research questions.)
The 80,000 Hours profile on improving institutional decision-making has some great analysis and ideas on this, but it’s now almost 3 years old, and I’m interested in additional perspectives.
(I also posted these three questions to the Improving institutional decision-making Facebook group, and there was some discussion there.)
How open are various decision-makers to actually paying attention to forecasts?
How likely are they to just make the same decision anyway, referring to forecasts when they justify this decision and ignoring them the rest of the time?
How does this vary for different decision-makers and contexts (e.g., politicians vs civil servants vs funders vs business leaders)?
How does this vary by different approaches to forecasting (e.g. those surveyed by Beard et al.), different presentations of forecasting, and different topics?
I’m guessing a good amount of research will have already been done on these topics, but I’ve been surprised about such things before.
I imagine these questions could be answered through a mixture of:
surveys of relevant people
lab experiments
interviews with people who’ve tried implementing forecasting in relevant settings
literature reviews of potentially relevant work in political science and political economy (e.g., on what kinds of info are drawn on in political decision-making more generally)
probably also other approaches
It seems plausible that these sorts of questions aren’t as well answered by academic research as by people just actually trying to implement forecasting in relevant institutions, get relevant decision-makers to pay attention to forecasts, etc.
(This is one of my own ideas about a cluster of questions that might warrant academic/academic-style research. I’d be interested in people’s thoughts on this cluster and the cluster I suggest in a different comment, including regarding how important, tractable, and neglected these questions are.)
Not really answering your question, but there is some recent work attempting to forecast clinical trial results that may be relevant: Can Oncologists Predict the Efficacy of Treatments in Randomized Trials? Kimmelman (the senior author) is doing other work on the topic too (e.g. here). I’m not aware of much published work in this space in a biomedical context.
My guess is that key decision-makers in medicine (e.g. funders of trials) would not be very open to paying attention to forecasts (even if those forecasts were shown to be accurate to some degree), as there is a very strong culture of relying on data and in particular on RCTs.
Greenberg 2018 lists and evaluates forecasting scoring rules. Research into additional, more complex metrics that take into account, e.g.:
the importance of the questions forecasted on (perhaps using an interest score like Metaculus's)
the number of questions forecasted on (the Brier score can be misleading when based on only a few forecasts)
relative performance compared to other forecasters on similar sets of questions
might be useful for setting the right incentives in forecasting tournaments. Prediction markets address the first point through logarithmic subsidising.
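To make those metrics concrete, here is a minimal sketch in Python of an unweighted Brier score, an importance-weighted variant, and a simple relative-skill measure. The weighting scheme, function names, and toy numbers are my own illustrative assumptions, not definitions from Greenberg 2018 or from any particular platform.

```python
import numpy as np

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return np.mean((probs - outcomes) ** 2)

def weighted_brier(probs, outcomes, importance):
    """Brier score with each question weighted by an importance / interest score."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    w = np.asarray(importance, dtype=float)
    return np.sum(w * (probs - outcomes) ** 2) / np.sum(w)

def relative_skill(forecaster_probs, comparison_probs, outcomes):
    """Average Brier improvement over a comparison forecast (e.g. the crowd)
    on the same questions; positive means better than the comparison."""
    f = np.asarray(forecaster_probs, dtype=float)
    c = np.asarray(comparison_probs, dtype=float)
    o = np.asarray(outcomes, dtype=float)
    return np.mean((c - o) ** 2 - (f - o) ** 2)

# Toy example: three binary questions
probs      = [0.8, 0.3, 0.9]   # forecaster's probabilities
crowd      = [0.6, 0.5, 0.7]   # crowd forecast on the same questions
outcomes   = [1,   0,   1  ]   # resolutions
importance = [3.0, 1.0, 2.0]   # e.g. an interest-style score per question

print(brier_score(probs, outcomes))              # unweighted Brier
print(weighted_brier(probs, outcomes, importance))
print(relative_skill(probs, crowd, outcomes))    # skill relative to the crowd
```

Something like relative_skill is one simple way to compare forecasters who answered overlapping question sets; a real tournament metric would also need to handle differing question sets and small sample sizes.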
How does the resolution and calibration of forecasts vary by the forecasts’ “range” (e.g., whether the forecast is for an event 6 months away vs 3 years away vs 20 years away)?
How does this vary between topics, types of forecasters, etc.?
Do people who are superforecasters for short- or medium-range questions (e.g., those that resolve within 3 years) still do better than average for long-range forecasts?
Are there approaches to forecasting that are especially useful for long-range forecasts? (Approaches that could be considered include those surveyed by Beard et al.)
All the same questions, but now for “extreme” or “extremely rare” vs more “mundane” or “common” events (e.g., forecasts of global catastrophes vs forecasts of more minor disruptions).
All the same questions, but now for forecasts that are long-range and about extreme/extremely rare outcomes.
Background for this cluster of questions: How Feasible Is Long-range Forecasting?
One rationale for this cluster of questions would be to improve our ability to forecast global catastrophic and existential risks, and/or inform how we interpret those forecasts and how much weight we give them.
This research could use very similar methodologies to those used by Tetlock, just with different forecasts. However, a very important practical limitation is that the research may inherently take many years or decades.
(This post says “Fortunately, we might soon be in a position to learn more about long-range forecasting from the EPJ [Expert Political Judgement] data, since most EPJ forecasts (including most 25-year forecasts) will have resolved by 2022”. This might reduce the marginal value of further work on this, but I imagine the marginal value could remain quite high.)
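As a rough illustration of what "calibration" and "resolution" mean in these questions: calibration (reliability) asks whether events forecast at p% happen about p% of the time, while resolution asks how much forecasts distinguish events that happen from events that don't. Below is a minimal sketch of the standard Murphy decomposition of the Brier score, run on simulated short-range and long-range forecasts; the simulated data and all numbers are purely illustrative assumptions, not results from any of the sources above.

```python
import numpy as np

def murphy_decomposition(probs, outcomes, n_bins=10):
    """Decompose the Brier score into reliability (calibration), resolution,
    and uncertainty by binning forecasts. Brier ~= reliability - resolution
    + uncertainty (approximate when forecasts within a bin differ)."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    n = len(probs)
    base_rate = outcomes.mean()

    # Assign each forecast to a probability bin (0.0-0.1, 0.1-0.2, ...)
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)

    reliability = resolution = 0.0
    for k in range(n_bins):
        mask = bins == k
        n_k = mask.sum()
        if n_k == 0:
            continue
        mean_forecast = probs[mask].mean()
        observed_freq = outcomes[mask].mean()
        reliability += n_k * (mean_forecast - observed_freq) ** 2
        resolution += n_k * (observed_freq - base_rate) ** 2
    reliability /= n
    resolution /= n
    uncertainty = base_rate * (1 - base_rate)

    return {"reliability": reliability,
            "resolution": resolution,
            "uncertainty": uncertainty,
            "brier_from_decomposition": reliability - resolution + uncertainty}

# Simulated data: well-calibrated, informative "short-range" forecasts vs
# "long-range" forecasts that carry no signal about the outcomes.
rng = np.random.default_rng(0)
short_probs = rng.uniform(0, 1, 500)
short_outcomes = rng.random(500) < short_probs   # outcomes track the forecasts
long_probs = rng.uniform(0, 1, 500)
long_outcomes = rng.random(500) < 0.5            # outcomes ignore the forecasts

print(murphy_decomposition(short_probs, short_outcomes))
print(murphy_decomposition(long_probs, long_outcomes))
```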
I have an (unfinished) essay on the topic using Metaculus and PredictionBook data. The relation between range and accuracy is negative, even within forecasts on individual questions. Specifically, a linear regression of Brier score on range in days gives 0.0019x + 0.0105. Of course, I'll look into better statistical analyses if I find time.
Thanks for sharing this; I'll try to look over it soon.
Just beware that I got feedback from two different people that it's difficult to understand.
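(For concreteness, here is a minimal sketch of the kind of regression described in the comment above, i.e. Brier score regressed on forecast range in days. The data are simulated stand-ins and the coefficients are not the real ones; the actual analysis would use resolved Metaculus and PredictionBook forecasts.)

```python
import numpy as np

# Simulated stand-in data: each entry is one resolved forecast.
rng = np.random.default_rng(1)
range_days = rng.uniform(1, 3 * 365, 1_000)                      # forecast horizon in days
brier = 0.10 + 0.0001 * range_days + rng.normal(0, 0.05, 1_000)  # made-up relationship

# Ordinary least squares: Brier ~ slope * range_days + intercept.
slope, intercept = np.polyfit(range_days, brier, 1)
print(f"Brier ≈ {slope:.4f} * range_days + {intercept:.4f}")

# A more careful analysis would, e.g., add question fixed effects to estimate the
# relation *within* individual questions, as the comment above describes.
```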