I am so dumb, I was mistakenly using odds instead of probs to compute the Brier score :facepalm:
And yes, you are right, we should extremize before aggregating. Otherwise, the method is equivalent to geo mean of odds.
It’s still not very good though
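For reference, a minimal sketch of the fix, assuming binary outcomes coded as 0/1 and forecasts stored as odds. The helper names (`odds_to_prob`, `prob_to_odds`, `brier_score`, `geo_mean_odds`) are hypothetical and not taken from the actual notebook. The point is only that the Brier score has to be computed on probabilities, so odds need converting first, and that the geo mean of odds baseline pools in odds space and converts back at the end.

```python
import math


def odds_to_prob(odds):
    """Convert odds o = p / (1 - p) back to a probability p."""
    return odds / (1.0 + odds)


def prob_to_odds(p):
    """Convert a probability p (strictly between 0 and 1) to odds."""
    return p / (1.0 - p)


def brier_score(probs, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def geo_mean_odds(probs):
    """Pool forecasts for one question via the geometric mean of their odds.

    Returns the pooled forecast as a probability.
    """
    log_odds = [math.log(prob_to_odds(p)) for p in probs]
    pooled_odds = math.exp(sum(log_odds) / len(log_odds))
    return odds_to_prob(pooled_odds)


# Illustrative numbers only (made up): three forecasters, one question.
forecasts = [0.6, 0.7, 0.9]
pooled = geo_mean_odds(forecasts)

# Correct: score the pooled probability.
score_ok = brier_score([pooled], [1])

# The bug: scoring the pooled odds as if they were a probability.
score_bug = brier_score([prob_to_odds(pooled)], [1])
```

Since odds are unbounded above, `score_bug` can go way over 1, which is a quick sanity check that something is off.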