What would the Brier score be for forecasts made well before the event (say 6 months, 1 year, or 2 years out)?
Hi Gideon,
The mean Brier scores of Metaculus’ predictions (and Metaculus’ community predictions) are (from here):
For all the questions:
At resolve time (N = 1,710), 0.087 (0.092).
For 1 month prior to resolve time (N = 1,463), 0.106 (0.112).
For 6 months (N = 777), 0.109 (0.127).
For 1 year (N = 334), 0.111 (0.145).
For 3 years (N = 57), 0.104 (0.133).
For 5 years (N = 8), 0.182 (0.278).
For the questions of the category artificial intelligence:
At resolve time (N = 46), 0.128 (0.198).
For 1 month prior to resolve time (N = 40), 0.142 (0.205).
For 6 months (N = 21), 0.119 (0.240).
For 1 year (N = 13), 0.107 (0.254).
For 3 years (N = 1), 0.007 (0.292).
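For readers unfamiliar with the metric, here is a minimal sketch of how a mean Brier score like the ones above can be computed for binary questions. The function name and the four forecast/outcome pairs are hypothetical and chosen only for illustration; they are not Metaculus data, and this is not necessarily how the linked analysis was implemented.

```python
def mean_brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0 is a perfect score, and always predicting 0.5 gives 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("at least one question is required")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts and resolutions (1 = resolved yes, 0 = resolved no).
example_forecasts = [0.9, 0.2, 0.5, 0.7]
example_outcomes = [1, 0, 1, 0]

example_score = mean_brier_score(example_forecasts, example_outcomes)
# Baseline: always predicting 0.5 on the same questions.
coin_flip_score = mean_brier_score([0.5] * len(example_outcomes), example_outcomes)

print(round(example_score, 4), round(coin_flip_score, 4))
```

The 0.25 baseline from always predicting 0.5 is the reference point used in the notes that follow.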
Note:
For the questions of the category artificial intelligence:
Metaculus’ community predictions made more than 6 months before resolve time perform as badly as or worse than always predicting 0.5, as their mean Brier score is similar to or higher than 0.25 (0.254 at 1 year, 0.292 at 3 years).
Metaculus’ predictions perform considerably better than Metaculus’ community predictions at every horizon listed (for example, 0.107 versus 0.254 at 1 year).
Questions whose Brier score can be assessed further ahead of resolve time, i.e. the ones with longer lifespans, tend to have lower base rates (I found a correlation of −0.129 among all questions). A lower base rate makes it easier to achieve a lower Brier score:
Predicting 0.5 for a question whose base rate is 0.5 will lead to a Brier score of 0.25 (= 0.5*(0.5 − 1)^2 + (1 − 0.5)*(0.5 − 0)^2).
Predicting 0.1 for a question whose base rate is 0.1 will lead to a Brier score of 0.09 (= 0.1*(0.1 − 1)^2 + (1 − 0.1)*(0.1 − 0)^2).
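The two worked examples above follow the same formula, which can be sketched in a few lines. The function name `expected_brier` is my own for illustration; it treats the base rate as the probability that a question resolves positively.

```python
def expected_brier(forecast, base_rate):
    """Expected Brier score of a constant forecast on questions that
    resolve positively with probability `base_rate`."""
    return base_rate * (forecast - 1) ** 2 + (1 - base_rate) * (forecast - 0) ** 2


# Forecast equal to the base rate, as in the two examples above.
at_half = expected_brier(0.5, 0.5)
at_tenth = expected_brier(0.1, 0.1)

# Always predicting 0.5 on questions whose base rate is 0.1.
half_on_tenth = expected_brier(0.5, 0.1)

print(round(at_half, 3), round(at_tenth, 3), round(half_on_tenth, 3))
```

When the forecast equals the base rate p, the expression simplifies to p*(1 − p), which shrinks as p moves away from 0.5. Always predicting 0.5, by contrast, scores 0.25 whatever the base rate, so the 0.25 benchmark gets easier to beat as base rates fall.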
Thanks for this. What do the scores further out from resolution look like for the community predictions?
You are welcome, and thanks for the follow-up question! I have added the community predictions inside parentheses above. I have also added a new bullet commenting on the community predictions for AI.