Forecasting in the Czech public administration—preliminary findings

TL;DR

  • When discussing forecasting with policy partners, it is useful to have a good understanding of other foresight methods as well (horizon scanning, backcasting, scenario planning, etc.). The pool of practically forecastable policy questions is somewhat smaller than we expected, but other methods can often be used instead.

  • Forecasting side: It might be more effective to split the current forecasting model into 1/ tournaments that identify talent using short-term, attractive questions and 2/ an advisory group of proven forecasters that swiftly predicts policy-relevant questions (which are often more demanding, less attractive, and long-term).

  • Policymaking side: Epistemically curious public officials who think about the future in probabilistic terms (scout mindset + future mindset + probabilistic mindset) are crucial for success. Without them, bringing forecasting into policy might still be possible indirectly, e.g. by raising awareness through the media or NGOs.

Intro

In this post, we discuss some of our preliminary findings from the FORPOL (Forecasting for Policy) project run by Czech Priorities in cooperation with Metaculus. Our main goal is to provide other organizations with real-life experiences and recommendations aimed at making forecasting more policy-relevant and accepted by the public administration.

At Czech Priorities (an EA-aligned NGO), we consider ourselves to be in a good position to advance this knowledge for three main reasons: 1/ we have experience in organizing large-scale local forecasting tournaments, 2/ the Czech Republic is small enough that we have developed good connections with government analysts and politicians at most public institutions, and 3/ our team has strong consulting expertise.

Based on expert consultations and preparatory work in the summer of 2022, we developed a strategy of individually reaching out to a large number of selected policy partners (in forecasting-relevant areas), working with them intensively, and monitoring the interactions.

Contrary to what might be expected, we did not work with organizations in the realms of foreign policy or the local intelligence community, as we wanted to explore possible use cases beyond these areas and we have stronger links to potential “champions” of forecasting (i.e. senior officials) in other governmental ministries, departments, and agencies.

The expected value of our approach lies in delivering impact and success stories and, more importantly, in understanding precisely what works in various situations and interactions. This will be the main content of our study, which will come out in September 2023. Until then, some of our previous suggestions on how to use forecasting tournaments in policy can also be found in our methodology manual (České priority, 2021).

Our work so far

These are the Czech institutions (12 public inst., 2 NGOs, 1 foreign ministry) that we have partnered with so far, and the topics we forecasted for them (2-4 predictions per topic):

  1. Ministry of Regional Development: Accommodation for Ukrainian refugees

  2. Ministry of Regional Development (different section): Urbanization

  3. Ministry of Education: Deferrals in Czech elementary schools

  4. Π Institute (Institute of the Pirate political party): Digitization of society

  5. Ministry of Health: The fade-out of Covid-19 (tests, vaccines, & hospitalizations)

  6. (undisclosed): Labor market

  7. STEM (public polling agency & research institute): Perceptions of security & trust

  8. Ministry of Education: The demand for teachers and principals

  9. Czechitas (NGO providing IT courses for women): Women in the IT industry

  10. Slovak Ministry of Finance: The future of the European economy

  11. Ministry of Labour and Social Affairs: Insolvency & foreclosures

  12. Ministry of Trade and Industry: The future of industry

  13. Ministry of Education: Lifelong learning

  14. Technological Agency (main public R&I funder): Financing science and R&I

  15. Ministry of Justice: Forensic experts

On a broader scale, we discussed the potential and the need for foresight methods in public policy with many other stakeholders. This included the newly elected president Petr Pavel (the conversations took place during his campaign), who remains interested in using participatory foresight methods to anticipate national risks & opportunities.

Our tournament is still running and we are still helping selected policy partners leverage the predictions, but we wanted to share some findings early on. Our model of cooperation with policy partners has three main phases, which structure the remainder of this post.

  1. Scoping—Multiple meetings to specify the real needs, motivations, and bottlenecks, explain the Theory of Change, and discuss the future use of the predictions.

  2. Forecasting—Submitting questions (2-4 in each thematic “challenge”) to the platform, where an average of 100 forecasters discuss & predict for 3 weeks.

  3. Implementation—Delivering a written report with aggregate predictions and the main identified causalities or drivers, and organizing “implementation follow-ups”.
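As context for the forecasting and reporting phases, here is a minimal sketch of how individual probabilities on a binary question can be pooled into a single aggregate for a report. Both the pooling method (geometric mean of odds, alongside a simple median) and the example probabilities are illustrative assumptions; the post does not specify which aggregation our platform actually uses.

```python
import statistics

def geo_mean_odds(probs):
    """Pool binary-event probabilities via the geometric mean of odds,
    a common aggregation for crowd forecasts (an assumption here, not
    necessarily the platform's method)."""
    odds = [p / (1 - p) for p in probs]
    pooled_odds = statistics.geometric_mean(odds)
    return pooled_odds / (1 + pooled_odds)

# Hypothetical forecaster probabilities for one yes/no question
probs = [0.30, 0.45, 0.55, 0.40, 0.35]

median = statistics.median(probs)   # simple, robust aggregate
pooled = geo_mean_odds(probs)       # odds-based aggregate, ~0.41 here
print(round(median, 3), round(pooled, 3))
```

The geometric mean of odds tends to be less sensitive to forecasters clustered near 0% or 100% than a plain mean of probabilities, which is one reason it is often preferred for pooling.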

1. Scoping

  • In the scoping process, it seems useful to discuss the partner’s needs before explaining how forecasting questions should look.

    • When discussing that later, it also helps to show examples of questions that might already be relevant to them. Partners are also often curious about the characteristics of the forecasters as a group (mostly male graduates aged 30-40), but in our experience, this was never a reason for rejection.

  • The predictions should feed into ongoing analytical or strategic work.

    • In cases where partners identified topics that would be generally useful for their work or support their argumentation, the incentive to use the prediction was usually not strong enough to be translated into action. Until forecasting itself becomes a widely requested analytical input, injecting probabilistic forecasts and rationales into contracted analytical work seems reasonable.

    • For example, when the Russian invasion of Ukraine started, it became clear that many of the refugees would flee to the Czech Republic. Our forecasters quickly made a forecast, and we used it to create 2 scenarios (300k and 500k incoming refugees). We then used these scenarios in our joint analysis with PAQ Research on how to effectively integrate the Ukrainian refugees. The study was then used by multiple Czech ministries in creating programs of support for housing, education, and employment. This happened at a time when widely circulated estimates spoke of tens of thousands of such arrivals by the end of 2022. In reality, it was over 430k people.

  • A trusting personal relationship is very helpful and a lot of guidance is necessary, especially when the partner did not actively reach out.

    • Our proactive approach (reaching out to policymakers) was necessary for our project, but it demands a lot of work in the scoping phase. In addition, if a partner spends considerable time with us scoping the problem and developing good questions only to find them not very actionable at the moment, there is a risk of creating an aversion to forecasting.

    • This proactive approach has a few benefits (e.g. quick feedback loops), but in the long term, we would suggest also cultivating demand for actionable forecasts by other means, such as upskilling individual analysts or raising the prestige of communicating in probabilistic terms within and across institutions.

  • An understanding of other foresight methods is important for the quality of advice and support, as well as for communication purposes.

    • After the first series of meetings with potential policy partners, we realized the need for an even deeper understanding of other foresight methods. We leveraged our other team, who created this foresight manual, and we strengthened the framing that judgemental forecasting is only one of many foresight methods.

    • Our experience suggests that robust foresight expertise helps in proposing (and delivering) the right method when appropriate, but it might also be a useful "gateway" topic, providing better access to officials who are already convinced of the need for better foresight but don't yet understand forecasting (which was the case for the majority of our partners). We are aware that this finding might be slightly EU-specific, as "foresight" is a buzzword increasingly used within European Union structures.

    • We became part of the European JRC network of foresight practitioners, which, aside from useful experience sharing, enabled us to disseminate forecasting internationally, e.g. by organizing a calibration workshop for the EU Competence Centre on Participatory and Deliberative Democracy. We previously built our foresight reputation mainly on the 2021 study of Global megatrends, which now serves as a basis for, e.g., the national R&I funding strategy (and where we actually used forecasting tournaments as a supplementary method to reduce noise).
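The refugee example above (turning a crowd forecast into 300k/500k planning scenarios) can be sketched as reading a central and a pessimistic quantile off the distribution of forecasts. The quantile choices and the per-forecaster estimates below are hypothetical; the post does not state how the two scenario figures were actually derived.

```python
import statistics

def scenario_points(estimates, mid_q=0.5, high_q=0.9):
    """Turn per-forecaster point estimates (here: incoming refugees, in
    thousands) into a central and a pessimistic planning scenario.
    Quantile choices are illustrative assumptions."""
    # statistics.quantiles with n=100 returns the 99 percentile cut points
    qs = statistics.quantiles(estimates, n=100)
    mid = qs[int(mid_q * 100) - 1]    # e.g. the median (50th percentile)
    high = qs[int(high_q * 100) - 1]  # e.g. the 90th percentile
    return mid, high

# Hypothetical forecaster estimates, in thousands of refugees
estimates = [250, 280, 300, 310, 330, 360, 400, 440, 480, 550]
central, pessimistic = scenario_points(estimates)
```

Publishing a central and a pessimistic scenario, rather than a single number, gave the ministries concrete planning anchors while communicating the uncertainty in the forecast.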

2. Forecasting

  • It is difficult to maintain engagement in relatively long, policy-relevant forecasting tournaments.

    • Drop-off across the six months of our tournament is around 75% (from ~160 to ~40 weekly active participants), which is interestingly similar to the Salem Center/CSPI forecasting tournament. In our case, it is probably caused by three main factors: the tournament is long, many questions are complex (including a few conditional questions), and some questions are long-term (not resolvable against ground truth in time to reward participation or accuracy). Even though our top forecasters are mostly not financially motivated, the most effective explicit reward mechanism for keeping most forecasters engaged still seems to be prizes for the most informative rationales in each thematic "challenge" (every 3 weeks).

    • Predictions that produce actual impact are often narrowly focused on very specific indicators. With such niche questions, the amount of research (and especially time investment) needed to feel comfortable forecasting is often discouraging, which limits discussion and increases the role of forecasters with a pre-existing knowledge base about the subject, who write most of the rationales.

    • Shorter tournaments and/or periodic public sharing of results can help (a public leaderboard alone might not be enough); however, most questions do not resolve within days or weeks. Some questions might therefore be designed to resolve soon after opening to provide early feedback. These questions should be prepared by facilitators (ideally in advance) and should not draw too much attention away from the other, more important questions. Once we have data for the whole duration of the tournament, we will look into, among other things, the role negative scores play in participation.

    • The broad range of topics covered over the course of the tournament required forecasters to acquire substantial knowledge of different subjects, which might have been a challenge for some. However, it also gave us rather strong evidence of the curiosity and "forecasting mindset" (the cognitive style) of those who persisted and did well.

  • Know your forecasters (motivations, interests, etc.) to, e.g., eliminate policy questions or topics where the final aggregate predictions might not be very reliable because forecasters lack the inherent motivation to delve deeper.

    • It is worthwhile to explore the real motivations of the forecasters and to learn more about which topics they individually consider attractive, so that the questions asked in the tournament can be made more responsive to that. Moreover, understanding the main driver of the forecasters' activity likely relates directly to the eventual drop-off. Even strong financial incentives or the prestige of being dubbed a professional forecaster might not be enough if other needs are not met.

    • Some forecasters respond negatively to a perceived lack of control over their forecasts, e.g. in setting the granularity of probabilities or the question boundaries, even in cases where this has no material effect on the scoring or ranking within a tournament. If it can be corroborated that this trait is correlated with the ‘forecaster’ cognitive style to a significant level, this might warrant a rethinking of current forecast question generation approaches.
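Since we mention the possible role of negative scores in participation, a toy example may help: under a logarithmic scoring rule measured against a 50% baseline, a confident wrong forecast loses considerably more than an equally confident right one gains, which can sting over a long tournament. This rule is purely illustrative; it is not necessarily the scoring used on our platform.

```python
import math

def log_score(p, outcome):
    """Logarithmic score relative to a 50% baseline: positive when the
    forecast beats chance, negative when it does worse than chance.
    Illustrative assumption, not the platform's actual rule."""
    p_outcome = p if outcome else 1 - p   # probability assigned to what happened
    return math.log2(p_outcome) - math.log2(0.5)

print(log_score(0.8, True))   # ~ +0.68: modest reward for being confidently right
print(log_score(0.8, False))  # ~ -1.32: much larger penalty for being confidently wrong
```

This asymmetry rewards calibration, but it also means that a forecaster who takes a few confident, unlucky positions early on can sit on a visibly negative score for months, which plausibly contributes to drop-off.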

3. Implementation

  • The involvement of domain experts was useful, especially for increasing trust and preventing future public criticism.

    • We organized two online consultations with the top domain experts, most of whom were not familiar with forecasting. These consultations were open to forecasters, who could listen to their opinions and then ask questions.

    • When individually reaching out to experts, some were (predictably) unwilling to give their probabilistic predictions and some argued against the very feasibility of doing so. Some were very skeptical of the benefits of having their expertise combined with the opinions of a group of forecasters.

    • The involvement of experts probably did not substantially increase the trust of our direct policy partners in the quality of the predictions. The names of respected experts on the final report were, however, important when communicating the report to other people (e.g. other relevant departments at the ministry) not familiar with forecasting.

  • Aside from sending a written report, it is very important to organize a follow-up meeting for clarifications and planning of the use of the prediction. It helps to come up with concrete next steps and ideas for sharing the results further. We sometimes even draft e-mails about our results for the partners to forward.

  • We will report more findings from this phase in the later stages of the project.

Our next steps

Starting in Q2/​23, we plan to split our forecasting process into two separate mechanisms. We are now considering a few design choices for both the recruiting mechanism (long ongoing vs. short intensive tournaments) and the expert forecasting mechanism (general vs. topic-specific, publicly visible vs. on-demand access, etc).

We are in the process of working with policy partners, conducting individual talks with the 30+ expert forecasters who will participate in our expert forecasting team, and mapping the demand for "rapid predictions" as a product even outside of the policymaking sector. Recently, we also had the opportunity to publish forecasts and commentary in one of the largest Czech weeklies, Respekt (podcast in Czech), and to work with a network of private schools (SCIO) on running tournaments for elementary and high school students, both of which are streams we want to continue working on.

Any advice, recommendations, or shared experience would be appreciated; let me know! We're also happy to help or consult with folks starting similar projects.