Thanks for the questions—your experience certainly sounds interesting as well (coming from someone with a smidgeon of past experience in the UK)!
As for the link between decision-relevance and forecaster activity: I think it bears repeating just how actively we had to manage our partnerships to not end up with virtually every question being long-term, which:
a) while obviously not instantly destroying decision relevance, is at least heuristically tied to lowering it (insofar as there are, by default, fewer incentives to act on information about the far future than on more immediate data points);
b) presents a fundamental obstacle both for evaluating forecast accuracy itself (as the questions simply linger unresolved) and for the tournament model, which seeks to reward that accuracy or a proxy thereof.
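To make "accuracy or a proxy thereof" a bit more concrete, here is a minimal sketch of the kind of proper scoring rule (a plain Brier score) that tournament rewards typically bottom out in; this is a generic Python illustration, not the exact scoring we used:

```python
# Minimal Brier-score sketch (generic illustration, not FORPOL's exact scoring).

def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.
    Lower is better; a permanent 50% forecast scores 0.25."""
    return (forecast - outcome) ** 2

def average_brier(forecasts: list[float], outcomes: list[int]) -> float:
    """Average score over *resolved* questions only -- which is exactly why
    long-term questions are a problem: they never make it into this list."""
    pairs = list(zip(forecasts, outcomes))
    return sum(brier_score(p, o) for p, o in pairs) / len(pairs)

# Three resolved binary questions from one forecaster:
print(average_brier([0.8, 0.3, 0.6], [1, 0, 1]))  # ≈ 0.097
```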
That being said, from the discussions we had I feel at least somewhat confident in making two claims:
a) forecasters definitely cared about who would use the predictions and to what effect, though there didn’t seem to be significant variance in turnout or accuracy (insofar as we can measure it), bar a few outlier questions (which were duds on our part).
b) as a result, and based on our exit interviews with the top forecasters, I would think of decision-relevance as a binary or categorical variable rather than a continuous one. If the forecasting body continuously builds credibility by presenting questions well and relaying feedback from the institutions, it activates the “I’m not shouting into the void” mode in forecasters and delivers whatever benefits that mode carries.
At the same time, it is possible that none of our questions rose to the level of a truly immediate, high-stakes one (“Is Bin Laden hiding in the compound...”), where a threshold would be crossed and an even more desirable mode of thinking and evaluating evidence would suddenly kick in. It’s questionable, however, whether, even if such a threshold exists, a sustainable forecasting ecosystem could be built on the other side of it (though that would be the dream scenario, of course).
As for training: in the previous tournament we ran, there was a compulsory training course on basics such as base rates, fermisation (Fermi estimation), etc. Given that many FORPOL participants had already taken part in its predecessor, and that our sign-ups indicated most were familiar with these concepts from having read Superforecasting or from forecasting elsewhere, we kept an updated version of the short training course available, but no longer made it compulsory. There was no directed training after the tournament, as we did not observe demand for it.
Lastly, perhaps one nugget of personal experience you might find relevant/relatable: when working with the institutions, it was definitely not rare to feel that the causal inference aspects (and even just eliciting cognitive models of how the policy variables interact) deserved a whole project of their own.