Notice that in the limit it’s obvious we should expect the forecasting frequency to affect the average daily Brier score: Suppose Alice makes a new forecast every day while Bob only makes a single forecast (which is equivalent to him making an initial forecast and then blindly making the same forecast every day until the question closes).
Re: the limit: a nice example. Note that Bob makes his forecast on a uniformly random day, so when you take the expectation over the day on which he forecasts, you get the average of the scores over all days, as if he had forecast every day.
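Let $N$ be the total number of days, $P_d = \frac{1}{N}$ be the probability that Bob forecasted on day $d$, and $\mathrm{Brier}_d$ be the Brier score of the forecast made on day $d$:
$$\mathbb{E}[\text{avg. Brier}] = \sum_d P_d \times \mathrm{Brier}_d \times \frac{\text{num. days forecast will be active}}{\text{total num. of active days}} = \sum_d P_d \times \mathrm{Brier}_d \times \frac{N-d}{N-d} = \sum_d P_d \times \mathrm{Brier}_d = \frac{\sum_d \mathrm{Brier}_d}{N}.$$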
I am a bit surprised that it worked out here, because it breaks the assumption that the expected number of days a forecast will be active is equal. The lack of this assumption will matter when aggregating over multiple questions [weighted by the number of active days]. Still, I hope this example gives helpful intuitions.
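A minimal sketch of the calculation above, for anyone who wants to check it numerically. The per-day scores are made-up illustrative values, and days are assumed to be indexed d = 0, ..., N-1 so that a forecast made on day d is active for N-d days. The function names `expected_avg_brier` and `pooled_brier` are hypothetical. The second function is only one possible reading of the "weighted by the number of active days" aggregation, not something stated in the thread.

```python
from fractions import Fraction


def expected_avg_brier(brier_by_day):
    """Expected average daily Brier score for a single forecast
    made on a uniformly random day d (P_d = 1/N)."""
    n = len(brier_by_day)
    p_d = Fraction(1, n)
    total = Fraction(0)
    for d, brier_d in enumerate(brier_by_day):
        active_days = n - d  # assumed: the forecast stays active on days d..n-1
        # The same forecast is scored on each active day, so its average
        # over its own active days is just brier_d.
        avg_over_active = Fraction(brier_d) * active_days / active_days
        total += p_d * avg_over_active
    return total


def pooled_brier(brier_by_day):
    """Assumed alternative: weight each forecast by its number of active
    days before normalising, as when pooling forecast-days together."""
    n = len(brier_by_day)
    p_d = Fraction(1, n)
    weighted_score = sum(p_d * b * (n - d) for d, b in enumerate(brier_by_day))
    expected_active_days = sum(p_d * (n - d) for d in range(n))
    return weighted_score / expected_active_days


# Hypothetical per-day Brier scores for N = 4 days.
scores = [Fraction(1, 4), Fraction(4, 25), Fraction(9, 100), Fraction(1, 25)]
print(expected_avg_brier(scores), sum(scores) / len(scores))
print(pooled_brier(scores))
```

The first line prints the same value twice, which is the identity in the formula. The pooled version gives a different number because early forecasts, which stay active longer, count for more.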
I don’t think this formal argument conflicts with the claim that we should expect the forecasting frequency to affect the average daily Brier score. In the example that Flodorner gave where the forecast is essentially resolved before the official resolution date, Alice will have perfect daily Brier scores: $\mathrm{Brier}_d = 0$ for any $d > N'$, while in those days Bob will have imperfect Brier scores: $\mathrm{Brier}_d = \mathrm{Brier}_{N'}$.
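To make the early-resolution example concrete, here is a rough sketch with made-up numbers. It assumes a binary question scored with the squared-error Brier score, N = 10 days, the outcome becoming clear after day N' = 3, and both forecasters starting at 0.6. All names and values are illustrative assumptions.

```python
def brier(p, outcome):
    """Brier score of a single binary forecast p against outcome 0 or 1."""
    return (p - outcome) ** 2


def average_daily_brier(daily_forecasts, outcome):
    """Mean of the per-day Brier scores over the whole question lifetime."""
    return sum(brier(p, outcome) for p in daily_forecasts) / len(daily_forecasts)


N = 10        # total number of days (assumed)
N_PRIME = 3   # last day before the outcome is effectively known (assumed)
OUTCOME = 1   # the question resolves positively (assumed)

# Alice updates daily: 0.6 until day N', then 1.0 once the outcome is clear.
alice = [0.6] * N_PRIME + [1.0] * (N - N_PRIME)
# Bob forecasts once and his 0.6 is carried forward unchanged.
bob = [0.6] * N

print(average_daily_brier(alice, OUTCOME))
print(average_daily_brier(bob, OUTCOME))
```

Alice's average comes out lower than Bob's even though their forecasts were identical up to day N', which is the frequency effect being discussed.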
I didn’t follow that last sentence.
Thanks for the explanation!
Thanks for challenging me :) I wrote my takes after this discussion above.