Thanks! That's a reasonable strategy if you can choose question wording. I agree there's no difference mathematically, but I'm not so sure that's true cognitively. Sometimes I've seen asymmetric calibration curves that look fine >50% but tend to overpredict <50%. That suggests it's easier to stay calibrated in the subset of questions you think are more likely to happen than not. This is good news for your strategy! However, note that this is based on a few anecdotal observations, so I'd caution against updating too strongly on it.
Thanks for your reply. The possibility of asymmetry suggests even more strongly that we shouldn't predict in the whole [0%, 100%] range, but rather stick to whichever half of the interval we feel more comfortable with. All we have to do is get in the habit of flipping the "sign" of the question (i.e., taking the complement of the sample space) when needed, which usually amounts to adding the phrase "It's not the case that" in front of the prediction. This leads to roughly double the number of samples per bin, and therefore more precise estimates of our calibration. And since we now have to map an event to a set that is half the size it was before, it seems plausible we'd get better at it over time.
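To make the "sign-flipping" idea concrete, here's a minimal sketch (with a hypothetical helper name) of folding every forecast into the upper half of the probability range by taking the complement of any event predicted below 50%:

```python
def fold_prediction(prob, outcome):
    """Flip the 'sign' of the question when prob < 0.5:
    predict the complement event ("It's not the case that ...")
    and invert the resolved outcome accordingly."""
    if prob < 0.5:
        return 1.0 - prob, 1 - outcome
    return prob, outcome

# Example: a 30% forecast of an event that did not happen becomes
# a 70% forecast (of the complement) that did happen.
print(fold_prediction(0.3, 0))  # (0.7, 1)

# With every forecast folded into [0.5, 1.0], the same number of
# predictions fills half as many calibration bins, so each bin
# gets roughly double the samples.
```

The folded pairs can then be binned as usual to plot a calibration curve over [50%, 100%] only.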
Do you see any reason not to change Open Philanthropy's approach to forecasting, besides the immense logistical effort this implies?