I’m probably missing something, but doesn’t the graph show OP is under-confident in the 0-10 and 10-20 bins? e.g., those data points are above the dotted grey line of perfect calibration, whereas the 90%+ bin is far below it.
I think overconfident and underconfident aren’t crisp terms to describe this. With binary outcomes, you can invert the prediction and it means the same thing (20% chance of X == 80% chance of not X). So being below the calibration line in the 90% bucket and above the line in the 10% bucket are functionally the same thing.
I’m using “overconfident” here to mean the prediction is closer to extreme confidence (0% or 100%, depending on whether it is below or above 50%, respectively) than it should be.
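Here’s a minimal sketch of the symmetry being described, using made-up forecasts (not OP’s data): complementing a binary prediction (p → 1−p, outcome → not-outcome) mirrors the calibration curve around 50%, so a point above the diagonal in the 10% bin is the same miscalibration as a point below the diagonal in the 90% bin. Both are “overconfident” in the sense above: the stated probabilities sit closer to 0 or 100% than the observed frequencies do.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
preds = rng.uniform(0, 1, n)
# Simulate an "overconfident" forecaster: the true frequency is pulled toward 50%
# relative to the stated probability (stated ~10% happens ~26%, stated ~90% happens ~74%).
true_prob = 0.5 + 0.6 * (preds - 0.5)
outcomes = rng.uniform(0, 1, n) < true_prob

def calibration_curve(p, y, n_bins=10):
    """Mean predicted probability and observed frequency for each bin."""
    bins = np.clip((p * n_bins).astype(int), 0, n_bins - 1)
    return np.array([
        (p[bins == b].mean(), y[bins == b].mean())
        for b in range(n_bins) if (bins == b).any()
    ])

original = calibration_curve(preds, outcomes)
inverted = calibration_curve(1 - preds, ~outcomes)  # complemented predictions and outcomes

# Low bins sit above the diagonal; after inversion the same data sit below the
# diagonal in the high bins -- one miscalibration, described two ways.
for mean_pred, freq in original[:2]:
    print(f"original bin: predicted {mean_pred:.2f}, observed {freq:.2f} (above line)")
for mean_pred, freq in inverted[-2:]:
    print(f"inverted bin: predicted {mean_pred:.2f}, observed {freq:.2f} (below line)")
```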