Research Summary: Prediction Markets

Note: I initially published this summary as a blog post here. The blog has more information about the context of this post and my reasoning transparency. In brief, this post was me summarizing what I learned about prediction markets, a specific flavor of forecasting, while trying to understand how it could be applied to forecasting global catastrophic risks. If I cite a source, it means I read it in full. If I relay a claim, it means it’s made in the cited source and that I found it likely to be true (>50%). I do not make any original arguments (that’s done elsewhere on the blog), but I figured this might be helpful in jumpstarting other people’s understanding of the topic or directing them towards sources that they weren’t aware of. Any and all feedback is most welcome.

The text should all be copied exactly from my original post, but I redid the footnotes/citations to use the EA forum formatting. If any of those are missing/seem incorrect please let me know.

Prediction markets can have a variety of different structures and mechanisms, but at their core they enable large numbers of people to bet with each other on a particular outcome. A simple example is a market to predict whether a candidate will be elected president. If shares in the market pay out at $1 if this candidate is elected and $0 if they are not, and enough people trade these shares in the market, then the current trading price reflects a sort of consensus on the probability of this candidate being elected. In this case, a 33¢ last traded price reflects a 33% consensus belief in the likelihood of the outcome. Economists tend to believe that with enough trading volume the accuracy of this prediction will approach the best possible given the available information. Different mechanisms can allow these markets to be applied to predicting dates or scalar values instead of probabilities of binary outcomes.^[1]
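
To make the price-as-probability reading concrete, here is a minimal Python sketch. The function names and the trader with a 40% private estimate are hypothetical, introduced only for illustration, and the sketch ignores fees, interest, and risk aversion.

```python
def implied_probability(price: float, payout: float = 1.0) -> float:
    """Consensus probability implied by the price of a binary share."""
    if payout <= 0 or not 0 <= price <= payout:
        raise ValueError("price must lie between 0 and the payout")
    return price / payout


def expected_profit(price: float, believed_probability: float, payout: float = 1.0) -> float:
    """Expected profit per share for a buyer holding a private probability estimate."""
    return believed_probability * payout - price


# A share last traded at 33 cents implies a 33% consensus probability.
consensus = implied_probability(0.33)

# A hypothetical trader who believes the true chance is 40% expects to gain
# about 7 cents per share, which is the incentive to buy and push the price up.
edge = expected_profit(0.33, believed_probability=0.40)
print(f"consensus: {consensus:.0%}, expected profit per share: ${edge:.2f}")
```

A trader whose estimate is below the price has a negative expected profit from buying and would sell instead, which is how disagreement moves the price toward a consensus.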

To date, prediction markets have most commonly been used in research, entertainment and corporate contexts. Though they were originally designed to benefit from the incentive of financial gain for accurate forecasters, market designs using “play money” have also been shown to be effective. Companies’ internal real-money prediction markets on things like sales or project deadlines, with only 20-60 traders, were found to be better forecasters than the existing internal processes. Early experiments with prediction markets found them to be roughly as accurate as experts and more accurate than polls or simple averages of large groups. This held across domains from political elections to sports betting to box office openings.^[1] Prediction markets were even used to predict the outcome of replication attempts for academic research in the social sciences, correctly predicting the binary outcome 73% of the time across 103 replication attempts, which was superior to a simple survey of participants.^[2]

The incentive of financial gain, status or fun seems to drive participants to seek out information relevant to a particular outcome and use it to calibrate their own private forecast in the form of a bet. Furthermore, since a bet has both a direction and a volume of shares, a forecaster’s confidence is an indirect input into the aggregation mechanism of the market. Though some experiments have found prediction markets to be inferior to prediction polling^[3], when a scoring rule that better accommodated low liquidity markets was used in a true apples-to-apples comparison with comparable forecasters, the prediction market forecasts were just as accurate as prediction polling with the best discovered aggregation algorithm. Furthermore, success in forecasting via markets was just as predictive of forecasting skill as accuracy in prediction polling.^[4]

This is somewhat surprising! Rather than directly asking participants for their estimate and confidence, we’ve introduced an intermediate step of betting with particular volumes, and we depend on a market for aggregation rather than an algorithm. You might expect trading skill to be a confounder, so that returns, and therefore the funds available for future bets, flow to people who aren’t necessarily the best forecasters, or that the market mechanism itself could confuse participants into providing less accurate forecasts. But it seems that either the aggregation power of the market cancels these effects out, or they were not significant or valid concerns in the first place, at least in the realm of geopolitical forecasting with well incentivized participants.

The nature of prediction markets makes them ideal for predicting continually changing situations, as they output a continually updating real time consensus forecast. This also makes them unsuitable for some situations, as forecasters must see the consensus forecast in order to make their own, preventing them from working fully independently or from this technology being used to produce forecasts that are themselves sensitive information.

The accuracy of prediction markets also degrades the farther in the future the market’s outcome will be determined. This is partially inherent to forecasting, but the problem is amplified in prediction markets because any money placed as a bet is tied up until the market resolves. The opportunity cost is every other way that money could have been invested, which over long time periods will surely be superior to the prediction market. One proposed solution to this drawback is investing traded funds on the forecaster’s behalf, which seems logistically challenging^[5]. Another is to chain the resolution criteria of prediction markets so that a combination of shorter term markets collectively produces a long term forecast^[6], but to my knowledge this has never been tested.
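
The opportunity cost argument can be illustrated with a rough Python sketch. It assumes a risk-neutral trader, a fixed annual return on an alternative investment (the 5% figure is an assumption, not a number from the cited sources), and a hypothetical helper name. Under those assumptions, the most a trader should pay for a share is its expected payout discounted by what the money could have earned elsewhere.

```python
def break_even_price(probability: float, years: float, annual_return: float,
                     payout: float = 1.0) -> float:
    """Highest price at which buying a share matches the alternative investment.

    Assumes a risk-neutral trader and a constant annual return elsewhere;
    both are simplifying assumptions made for illustration.
    """
    if not 0 <= probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    return probability * payout / (1.0 + annual_return) ** years


# Even a trader who is certain of the outcome will not pay the full $1
# for a share that only resolves many years from now.
for horizon in (0, 1, 5, 10):
    price = break_even_price(probability=1.0, years=horizon, annual_return=0.05)
    print(f"resolves in {horizon:>2} years: break-even price ${price:.2f}")
```

In this toy setup the gap between the break-even price and the trader’s actual belief widens with the time horizon, so the price of a long-dated market stops reading cleanly as a probability.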

Users of prediction markets in experiments have reported low satisfaction with them and low trust in the resulting consensus forecasts^[7], but I think it’s likely this is a product of the experimental context, with its minimal training and the poor UI of early prediction market software. A more modern internal prediction market at Google with very high engagement seems to have benefitted from a vastly improved user experience.^[8]

A more fundamental limitation of prediction markets arises when you try to apply them to conditional scenarios, a potentially very valuable aspect of forecasting. An example would be predicting the impacts of a hypothetical policy with questions targeting a particular metric in worlds either with or without that policy being implemented. Unfortunately, it seems that prediction markets can only predict correlation, with many possible confounding factors obscuring any conditional insight into causation.^[9] In my mind, this implies the need for an intersubjective scoring and incentive system like those discussed in Research: Prediction Polling.
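
A toy calculation can show how confounding produces this gap between correlation and causation. Every number below is invented for illustration: a hidden factor (say, the state of the economy) makes both the policy and a good outcome more likely, while the policy itself has no causal effect on the metric at all.

```python
# Each row: (P(hidden state), P(policy adopted | state), P(metric target hit | state)).
# The hit probability does not depend on the policy, so its causal effect is zero.
# All values are hypothetical.
SCENARIOS = [
    (0.5, 0.8, 0.9),  # strong economy: policy likely, target likely hit
    (0.5, 0.2, 0.3),  # weak economy: policy unlikely, target unlikely hit
]


def conditional_hit_rate(policy_adopted: bool) -> float:
    """P(target hit | policy adopted or not), summing over the hidden state."""
    joint_hit = 0.0
    joint_all = 0.0
    for p_state, p_policy, p_hit in SCENARIOS:
        p_branch = p_state * (p_policy if policy_adopted else 1.0 - p_policy)
        joint_all += p_branch
        joint_hit += p_branch * p_hit
    return joint_hit / joint_all


with_policy = conditional_hit_rate(True)
without_policy = conditional_hit_rate(False)
print(f"P(hit | policy) = {with_policy:.2f}, P(hit | no policy) = {without_policy:.2f}")
```

Here a pair of well-calibrated conditional markets would settle near 0.78 and 0.42, which looks like strong evidence for the policy even though it does nothing in this setup. The prices only reveal that policy adoption and good outcomes tend to occur in the same worlds.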

However, the greatest impediment to advancing the use of prediction markets in forecasting is a legal one. The Commodity Futures Trading Commission (CFTC) has effectively banned the creation of real money prediction markets in the US despite academic pushback for roughly the last two decades. Though this is a solvable problem, with some signs of political progress, recent results for prediction market platforms have been grim.^[10]^[11]^[12] Intense media blowback to the prospect of government funded research prediction markets in the early 2000s also reveals a separate potential risk should the government legalize them, depending on public perception and partisan politicization.^[13]

^[1] Wolfers, Justin, and Eric Zitzewitz. “Prediction Markets.” Journal of Economic Perspectives 18.2 (2004): 107-126.
^[2] Gordon, Michael, et al. “Predicting replicability—Analysis of survey and prediction market data from large-scale forecasting projects.” PLoS ONE 16.4 (2021): e0248780.
^[3] Dana, Jason, et al. “Are markets more accurate than polls? The surprising informational value of ‘just asking’.” Judgment and Decision Making 14.2 (2019): 135-147.
^[4] Atanasov, Pavel, et al. “Crowd Prediction Systems: Markets, Polls, and Elite Forecasters.” Proceedings of the 23rd ACM Conference on Economics and Computation. 2022.
^[5] Antweiler, Werner. “Long-term prediction markets.” The Journal of Prediction Markets 6.3 (2012): 43-61.
^[6] Alexander, Scott. “Mantic Monday: Let Me Google That for You.” Astralcodexten.substack.com, astralcodexten.substack.com/p/mantic-monday-let-me-google-that. Accessed 13 Feb. 2023.
^[7] Graefe, Andreas. Prediction markets versus alternative methods: empirical tests of accuracy and acceptability. Diss., Univ. Karlsruhe, 2009.
^[8] “Design Patterns in Google’s Prediction Market on Google Cloud.” Google Cloud Blog, cloud.google.com/blog/topics/solutions-how-tos/design-patterns-in-googles-prediction-market-on-google-cloud. Accessed 18 Feb. 2023.
^[9] dynomight. “Prediction Market Does Not Imply Causation.” DYNOMIGHT, 6 Oct. 2022, dynomight.net/prediction-market-causation/. Accessed 18 Feb. 2023.
^[10] Hanania, Richard. “How to Legalize Prediction Markets.” Richardhanania.substack.com, richardhanania.substack.com/p/how-to-legalize-prediction-markets. Accessed 18 Feb. 2023.
^[11] Arrow, Kenneth J., et al. “The promise of prediction markets.” Science 320.5878 (2008): 877-878.
^[12] Alexander, Scott. “The Passage of Polymarket.” Astralcodexten.substack.com, astralcodexten.substack.com/p/the-passage-of-polymarket. Accessed 13 Feb. 2023.
^[13] Hanson, Robin. “Policy Analysis Market Archive.” Mason.gmu.edu, mason.gmu.edu/~rhanson/policyanalysismarket.html. Accessed 14 Feb. 2023.