tl;dr: even using priors, with more options and hazier probabilities, you tend to increase the number of options whose estimates are too sensitive to supporting information (or just optimistically biased because of your priors), and those options look disproportionately good. This is still an optimizer's curse in practice.
This is an issue of the models and priors. If your models and priors are not right… then you should update over your priors and use better models. Of course they can still be wrong… but that's true of all beliefs, all reasoning, etc.
If you assume from the outside (unbeknownst to the agent) that they are all fair, then you're not showing a problem with the agent's reasoning, you're just using relevant information which they lack.
In practice, your models and priors will almost always be wrong, because you lack information; there's some truth of the matter of which you aren't aware. It's unrealistic to expect us to have good guesses for the priors in all cases, especially with little information or precedent, as with the hazy probabilities that are a major point of the OP.
You'd hope that more information would tend to give you better predictions and bring you closer to the truth, but when you're optimizing, even with correctly specified likelihoods and after updating over priors as you say should be done, the prediction for the selected coin can be more biased in expectation with more information (results of coin flips). By contrast, the prediction for any fixed coin is not made any more biased in expectation by the new information, and if the prior's EV hadn't matched the true mean, it would tend to become less biased.
More information (flips) per option (coin) would reduce the bias of the selection on average, but, as I showed, more options (coins) would increase it, because more options means more chances for one of them to look unusually lucky.
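Here's a minimal simulation sketch of that claim (my own toy setup, not anything from the OP): every coin is actually fair, the agent doesn't know that and uses a uniform Beta(1, 1) prior on each coin's bias, and we look at how far the selected coin's posterior EV sits above its true mean of 0.5.

```
import numpy as np

rng = np.random.default_rng(0)

def selection_bias(n_coins, n_flips, n_trials=20_000):
    """Average amount by which the selected coin's posterior EV
    overshoots its true mean (0.5), across many simulated trials."""
    # Every coin is actually fair; the agent doesn't know this.
    heads = rng.binomial(n_flips, 0.5, size=(n_trials, n_coins))
    # Uniform Beta(1, 1) prior on each coin's bias, so the posterior
    # mean after n_flips tosses is (heads + 1) / (n_flips + 2).
    posterior_mean = (heads + 1) / (n_flips + 2)
    # Select the coin with the highest posterior EV in each trial.
    selected = posterior_mean.max(axis=1)
    return selected.mean() - 0.5

for n_coins in (2, 10, 100):
    for n_flips in (0, 5, 20, 100):
        print(f"{n_coins:>3} coins, {n_flips:>3} flips each: "
              f"bias of selection ~ {selection_bias(n_coins, n_flips):.3f}")
```

With these settings the overshoot is zero with no flips, appears as soon as there is data to select on, shrinks as flips per coin increase, and grows as coins are added, which is the pattern described above.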
My prior would not be uniform, it would be 0.5! What else could "unbiased coins" mean?
The intent here again is that you don't know the coins are fair.
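To make the difference concrete (a toy sketch under my own assumptions): a prior that is literally "the bias is 0.5" puts all its mass on one value, so no data ever moves the posterior, whereas a prior that merely expects fairness on average, like Beta(1, 1), lets the flips move the estimate, and that is where the selection effect bites.

```
def posterior_mean_point_mass(heads, flips):
    # "The coins are unbiased" taken literally: all prior mass on 0.5.
    # No amount of data moves this posterior, so there's no curse,
    # but only because the answer was assumed from the start.
    return 0.5

def posterior_mean_uniform(heads, flips):
    # "I don't know the bias": Beta(1, 1) prior, so the posterior
    # mean is (heads + 1) / (flips + 2).
    return (heads + 1) / (flips + 2)

print(posterior_mean_point_mass(9, 10))  # 0.5 regardless of the data
print(posterior_mean_uniform(9, 10))     # ~0.83; a lucky streak moves you a lot
```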
Bayesian EV estimation doesn't do hypothesis testing with p-value cutoffs. This is the same problem popping up in a different framework; yes, it will require a different solution in that context, but they are separate.
Fair enough.
The proposed solution applies here too: just do (simplistic, informal) posterior EV correction for your (simplistic, informal) estimates.
How would you do this in practice? Specifically, how would you get an idea of the magnitude of the correction you should make?
Maybe you could test your own (or your group's) prediction calibration and bias, but it's not clear how exactly you should incorporate this information, and it's likely these tests won't be very representative when you're considering the kinds of problems with hazy probabilities mentioned in the OP.
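For what it's worth, here is what that correction looks like in the simplest formal setting I know of, a normal-normal toy model (my own illustrative assumption, not something specified in this thread): the raw estimate gets shrunk toward the prior mean by a factor determined by how noisy you think the estimate is, and that noise level is exactly the hard-to-pin-down quantity when probabilities are hazy.

```
def corrected_ev(estimate, prior_mean, prior_sd, noise_sd):
    """Posterior EV in a toy normal-normal model: true value ~ N(prior_mean,
    prior_sd^2) and estimate = true value + N(0, noise_sd^2) noise."""
    shrink = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    return prior_mean + shrink * (estimate - prior_mean)

# To use this you need prior_mean, prior_sd and, above all, noise_sd:
# a quantitative model of how wrong your own estimate tends to be.
# The numbers below are made up purely for illustration.
print(corrected_ev(estimate=100.0, prior_mean=10.0, prior_sd=20.0, noise_sd=40.0))
# -> 28.0: the huge raw estimate gets shrunk most of the way back.
```

Doing this "informally" still amounts to guessing those parameters, which is what the question above is getting at.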