If one had access to the individual predictions, one could take 1000 random bootstrap samples of size 1 from all the predictions, then 1000 bootstrap samples of size 2, and so on, and measure how accuracy changes as the sample size grows. This might also be possible with data from other prediction sites.
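The bootstrap idea could be sketched like this (assuming access to the individual predictions as an array of probabilities; the data here is synthetic, and the aggregation by simple mean is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_brier(predictions, outcome, k, n_boot=1000):
    """Mean Brier score of the average of k bootstrap-resampled predictions."""
    samples = rng.choice(predictions, size=(n_boot, k), replace=True)
    crowd = samples.mean(axis=1)  # aggregate forecast per bootstrap replicate
    return np.mean((crowd - outcome) ** 2)

# Toy example: 50 noisy forecasts of an event that resolved "yes" (1).
preds = rng.uniform(0.5, 0.9, size=50)
for k in (1, 2, 4, 8):
    print(k, round(bootstrap_brier(preds, 1, k), 3))
```

With a mean-aggregated crowd, larger samples mainly reduce the variance term of the Brier score, so the score should improve as k grows.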
I discussed this with Charles. It’s not possible to do exactly this with the API, but we can approximate this by looking at the final predictions just before close.
We can see that:

1. Questions with more predictors have better Brier scores (regardless of the number of predictors sampled)
2. Performance improves with the number of predictors sampled, up to ~100 predictors
To account for the different Brier scores across groups of questions, I have normalized by subtracting off the performance at 8 sampled predictors. This makes point 2 above easier to see.
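The normalization step amounts to something like the following (assuming mean Brier scores keyed by number of sampled predictors; the numbers are made up for illustration):

```python
# Hypothetical mean Brier scores by number of sampled predictors (illustrative only).
brier_by_k = {8: 0.160, 16: 0.145, 32: 0.138, 64: 0.134, 128: 0.133}

# Subtract the 8-predictor baseline so curves for easier and harder
# question groups become directly comparable.
baseline = brier_by_k[8]
normalized = {k: score - baseline for k, score in brier_by_k.items()}
```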
When I discussed this with Charles, he suggested that questions whose probability is near 0 or 1 are more popular and therefore look easier. Excluding those questions, the charts look as follows:
Amazingly, this seems to account for essentially all of the effect that makes more popular questions look “easier”!
(NB: there are only 22 questions with >= 256 predictors and 5% < p < 95%, so the error bars on that cyan line should be quite wide.)
Hi Simon, I’m working on a follow-up to this post that uses individual-level data. Could you please give some detail on how you “sampled” k predictors? As in, did you have access to individual data and could actually do the sampling? I’m not entirely sure what the x-axis in your plot means, or what the difference between “>N predictors” and “k predictors” is. Thank you!
IIRC, there is access to the histogram, which tells you how many people predicted each percentage. I then sampled k predictors from that distribution.

“k predictors” is the number of samples I drew.

“>N predictors” is the total number of people who predicted on a given question.
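A minimal sketch of that sampling procedure, assuming the histogram is given as counts per predicted probability (the histogram values and the resolution here are made up):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical histogram: predicted probability -> number of predictors at that value.
histogram = {0.05: 3, 0.20: 10, 0.35: 25, 0.50: 12, 0.80: 4}

probs = np.array(list(histogram.keys()))
counts = np.array(list(histogram.values()), dtype=float)
weights = counts / counts.sum()

def sample_k_predictors(k):
    """Draw k individual predictions from the histogram distribution."""
    return rng.choice(probs, size=k, p=weights)

def brier(forecast, outcome):
    return (forecast - outcome) ** 2

# Aggregate k sampled predictors by taking their mean forecast,
# then score it against the (hypothetical) resolution.
crowd = sample_k_predictors(8).mean()
print(brier(crowd, outcome=0))
```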