How does the resolution and calibration of forecasts vary by the forecasts’ “range” (e.g., whether the forecast is for an event 6 months away vs 3 years away vs 20 years away)?
How does this vary between topics, types of forecasters, etc.?
Do people who are superforecasters for short- or medium-range questions (e.g., those that resolve within 3 years) still do better than average for long-range forecasts?
Are there approaches to forecasting that are especially useful for long-range forecasts? (Approaches that could be considered include those surveyed by Beard et al.)
All the same questions, but now for “extreme” or “extremely rare” vs more “mundane” or “common” events (e.g., forecasts of global catastrophes vs forecasts of more minor disruptions).
All the same questions, but now for forecasts that are long-range and about extreme/extremely rare outcomes.
Background for this cluster of questions: How Feasible Is Long-range Forecasting?
One rationale for this cluster of questions would be to improve our ability to forecast global catastrophic and existential risks, and/or to inform how we interpret those forecasts and how much weight we give them.
This research could use very similar methodologies to those used by Tetlock, just with different forecasts. However, a very important practical limitation is that the research may inherently take many years or decades.
(This post says “Fortunately, we might soon be in a position to learn more about long-range forecasting from the EPJ [Expert Political Judgement] data, since most EPJ forecasts (including most 25-year forecasts) will have resolved by 2022”. This might reduce the marginal value of further work on this, but I imagine the marginal value could remain quite high.)
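To make the first question concrete: "calibration" and "resolution" can be read off the Murphy decomposition of the Brier score, computed separately for each range band (6 months, 3 years, 20 years, etc.). The sketch below is not any particular study's code, just a minimal illustration of the decomposition on hypothetical probability forecasts and binary outcomes:

```python
import numpy as np

def brier_decomposition(probs, outcomes, n_bins=10):
    """Murphy decomposition of the Brier score:
    BS = reliability - resolution + uncertainty.
    Lower reliability = better calibration; higher resolution = forecasts
    that discriminate more between events and non-events."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    base_rate = outcomes.mean()
    uncertainty = base_rate * (1 - base_rate)
    # Bucket forecasts into probability bins (e.g. [0, 0.1), [0.1, 0.2), ...).
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    reliability = 0.0
    resolution = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        w = mask.mean()             # fraction of forecasts in this bin
        p_bar = probs[mask].mean()  # mean stated probability in bin
        o_bar = outcomes[mask].mean()  # observed frequency in bin
        reliability += w * (p_bar - o_bar) ** 2
        resolution += w * (o_bar - base_rate) ** 2
    return reliability, resolution, uncertainty
```

Running this on each range band of a forecast dataset would show whether, say, 20-year forecasts lose calibration, resolution, or both relative to 6-month forecasts.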
How does the resolution and calibration of forecasts vary by the forecasts’ “range” (e.g., whether the forecast is for an event 6 months away vs 3 years away vs 20 years away)?
I have an (unfinished) essay on the topic using Metaculus and PredictionBook data. The relation between range and accuracy is negative within forecasts on a single question. Specifically, a linear regression of Brier score on range in days gives 0.0019x + 0.0105. Of course, I’ll look into better statistical analyses if I find time.
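The kind of fit described above can be sketched in a few lines of numpy. This is not the essay's actual analysis: the (range, Brier score) pairs below are made up for illustration, so the fitted coefficients will not reproduce the 0.0019x + 0.0105 figure:

```python
import numpy as np

# Hypothetical (forecast range in days, Brier score) pairs, standing in
# for per-forecast data from Metaculus / PredictionBook.
range_days = np.array([7, 30, 90, 180, 365, 730], dtype=float)
brier = np.array([0.05, 0.08, 0.12, 0.15, 0.21, 0.25])

# Ordinary least-squares line: brier ~ slope * days + intercept.
slope, intercept = np.polyfit(range_days, brier, 1)
print(f"brier ≈ {slope:.4f} * days + {intercept:.4f}")
```

A positive slope would mean Brier scores rise (accuracy falls) as range grows, which is the negative range–accuracy relation the comment reports.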
Thanks for sharing this, I’ll try to look it over soon.
Just be aware that I’ve gotten feedback from two different people that it’s difficult to understand.