Thanks for sharing this! I’ve been forecasting myself for 5 months now (1,005 resolved predictions so far), and I adopted a slightly different strategy to increase the number of samples: I only predict in the range [50%-100%]. After all, there doesn’t seem to be any probabilistically or cognitively relevant difference between [predicting X will happen with 20% probability] and [predicting not-X will happen with 80% probability].
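Concretely, the flip I have in mind is just this (a toy sketch; the helper name is made up for illustration):

```python
def fold(statement, p):
    """Restate a prediction so its probability lies in [0.5, 1.0]."""
    if p < 0.5:
        # P(X) = p is the same claim as P(not-X) = 1 - p
        return ("NOT " + statement, 1.0 - p)
    return (statement, p)

print(fold("X happens", 0.20))  # -> ('NOT X happens', 0.8)
```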
What do you folks think about this?
Thanks! That’s a reasonable strategy if you can choose question wording. I agree there’s no difference mathematically, but I’m not so sure that’s true cognitively. I’ve occasionally seen asymmetric calibration curves that look fine above 50% but show systematic overprediction below 50%. That suggests it’s easier to stay calibrated on the subset of questions you think are more likely to happen than not. This is good news for your strategy! However, note that this is based on a few anecdotal observations, so I’d caution against updating too strongly on it.
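In case it’s useful, here’s roughly how I’d check for that asymmetry (a minimal sketch; the function name, the `forecasts` input of (probability, resolved_true) pairs, and the 10%-wide bins are all my own assumptions, not anything from our internal tooling):

```python
from collections import defaultdict

def calibration_table(forecasts, width=0.1):
    """forecasts: iterable of (predicted_probability, resolved_true) pairs.
    Returns rows of (bin_start, mean_prediction, observed_frequency, n)."""
    n_bins = round(1 / width)
    bins = defaultdict(list)
    for p, outcome in forecasts:
        # small epsilon guards against float boundaries; clamp p == 1.0 into the top bin
        b = min(int(p / width + 1e-9), n_bins - 1)
        bins[b].append((p, outcome))
    table = []
    for b in sorted(bins):
        group = bins[b]
        mean_p = sum(p for p, _ in group) / len(group)
        freq = sum(1 for _, o in group if o) / len(group)
        table.append((b * width, mean_p, freq, len(group)))
    return table
```

The asymmetry I described would show up as `mean_p` exceeding `freq` in the bins below 0.5 while the bins above 0.5 stay close to the diagonal.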
Thanks for your reply. The possibility of asymmetry suggests even more strongly that we shouldn’t predict over the whole [0%-100%] range, but rather stick to whichever half of the interval we feel more comfortable with. All we have to do is get in the habit of flipping the “sign” of the question (i.e., taking the complement of the event) when needed, which usually amounts to prefixing the prediction with the phrase “It’s not the case that”. This roughly doubles the number of samples per bin, and therefore gives more precise estimates of our calibration. And since we now have to map each event onto a probability interval half the size it was before, it seems plausible we’d get better at it over time.
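Here’s a self-contained sketch of that pooling effect (the names are mine, purely illustrative). Folding maps a (probability, outcome) pair with p < 0.5 onto (1 - p, not outcome), so mirrored bins such as [20%, 30%) and [70%, 80%) collapse into one:

```python
def fold_forecasts(forecasts):
    """Map every prediction into [0.5, 1.0] by taking complements."""
    return [(p, o) if p >= 0.5 else (1.0 - p, not o) for p, o in forecasts]

# Toy data: 3 low-side and 2 high-side forecasts in mirrored bins.
forecasts = [(0.25, False), (0.22, True), (0.28, False),
             (0.75, True), (0.72, True)]

print(fold_forecasts(forecasts))
# [(0.75, True), (0.78, False), (0.72, True), (0.75, True), (0.72, True)]
# All 5 samples now land in the [70%, 80%) bin instead of being split 3 + 2
# across two bins; since the standard error of the observed frequency scales
# as 1/sqrt(n), doubling n per bin shrinks it by roughly a factor of sqrt(2).
```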
Do you see any reason not to change Open Philanthropy’s approach to forecasting, besides the immense logistical effort this would imply?