I wonder how these compare with fitting a Beta distribution and using one of its statistics? I’m imagining treating each forecast (assuming they are probabilities) as an observation, and maximizing the Beta likelihood. The resulting Beta is your best guess distribution over the forecasted variable.
It would be nice to have an aggregation method which gave you info about the spread of the aggregated forecast, which would be straightforward here.
It’s not clear to me that “fitting a Beta distribution and using one of its statistics” is different from just taking the mean of the probabilities.
I fitted a beta distribution to Metaculus forecasts and looked at:
Median forecast
Mean forecast
Mean log-odds / Geometric mean of odds
Fitted beta median
Fitted beta mean
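For concreteness, here is a sketch of how these five aggregates could be computed from a list of probability forecasts. It assumes SciPy is available for the maximum likelihood beta fit; the function name `aggregate` and the sample forecasts are mine, not from the original analysis.

```python
import numpy as np
from scipy import stats

def aggregate(ps):
    """Compute the five aggregation statistics for probability forecasts ps."""
    ps = np.asarray(ps, dtype=float)
    # Geometric mean of odds, mapped back to a probability.
    odds = ps / (1 - ps)
    geo_odds = np.exp(np.mean(np.log(odds)))
    # Maximum likelihood beta fit, with location/scale fixed to [0, 1].
    a, b, _, _ = stats.beta.fit(ps, floc=0, fscale=1)
    return {
        "median": float(np.median(ps)),
        "mean": float(np.mean(ps)),
        "geo_mean_odds": geo_odds / (1 + geo_odds),
        "beta_median": float(stats.beta.median(a, b)),
        "beta_mean": a / (a + b),
    }
```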
Scattering these 5 values against each other I get:
We can see the fitted values are closely aligned with the mean and the mean log-odds, but not with the median. (Unsurprising when you consider the approximate parametric formulas for the beta mean and median.)
The performance is as follows:

method                    Brier    log score    questions
geo_mean_odds_weighted    0.116    0.37         856
beta_median_weighted      0.118    0.378        856
median_weighted           0.121    0.38         856
mean_weighted             0.122    0.391        856
beta_mean_weighted        0.123    0.396        856
My intuition for what is going on here is that the beta-median is an extremized form of the beta-mean / mean, which is an improvement.
Looking more recently (as the community became more calibrated), the beta-median’s performance edge seems to have reduced:
For a quick foray into this, we can see what would happen if we use as our estimate the mean of the maximum likelihood beta distribution implied by the sample of forecasts $p_1, \ldots, p_N$.

The log-likelihood to maximize is then

$$\log L(\alpha, \beta) = (\alpha - 1)\sum_i \log p_i + (\beta - 1)\sum_i \log(1 - p_i) - N \log B(\alpha, \beta)$$
The Wikipedia article on the beta distribution discusses this maximization problem in depth, pointing out that although no closed form exists, if $\alpha$ and $\beta$ can be assumed to be not too small, the maximum likelihood estimates can be approximated as

$$\hat{\alpha} \approx \frac{1}{2} + \frac{\hat{G}_X}{2\left(1 - \hat{G}_X - \hat{G}_{1-X}\right)} \quad \text{and} \quad \hat{\beta} \approx \frac{1}{2} + \frac{\hat{G}_{1-X}}{2\left(1 - \hat{G}_X - \hat{G}_{1-X}\right)},$$

where $\hat{G}_X = \prod_i p_i^{1/N}$ and $\hat{G}_{1-X} = \prod_i (1 - p_i)^{1/N}$.

The mean of a beta with these maximum likelihood parameters is

$$\frac{\hat{\alpha}}{\hat{\alpha} + \hat{\beta}} = \frac{1 - \hat{G}_{1-X}}{\left(1 - \hat{G}_X\right) + \left(1 - \hat{G}_{1-X}\right)}.$$

By comparison, the geometric mean of odds estimate is

$$p = \frac{\prod_{i=1}^N p_i^{1/N}}{\prod_{i=1}^N p_i^{1/N} + \prod_{i=1}^N (1 - p_i)^{1/N}} = \frac{\hat{G}_X}{\hat{G}_X + \hat{G}_{1-X}}.$$
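Translating the approximation into code, a minimal sketch (the function names are mine):

```python
import math

def _geo_means(ps):
    """G_X and G_{1-X}: geometric means of the p_i and the (1 - p_i)."""
    n = len(ps)
    g_x = math.exp(sum(math.log(p) for p in ps) / n)
    g_1x = math.exp(sum(math.log(1 - p) for p in ps) / n)
    return g_x, g_1x

def beta_mle_approx(ps):
    """Closed-form approximation to the beta MLE, valid when alpha and beta
    are not too small (equivalently, G_X + G_{1-X} < 1)."""
    g_x, g_1x = _geo_means(ps)
    denom = 2 * (1 - g_x - g_1x)
    return 0.5 + g_x / denom, 0.5 + g_1x / denom

def beta_mean_approx(ps):
    """Mean alpha/(alpha+beta) of the approximate MLE beta; algebraically
    equal to (1 - G_{1-X}) / ((1 - G_X) + (1 - G_{1-X}))."""
    a, b = beta_mle_approx(ps)
    return a / (a + b)

def geo_mean_odds(ps):
    """Geometric mean of odds, as a probability: G_X / (G_X + G_{1-X})."""
    g_x, g_1x = _geo_means(ps)
    return g_x / (g_x + g_1x)
```

Note that both aggregates are functions of the same two quantities $\hat{G}_X$ and $\hat{G}_{1-X}$; they differ only in how those are combined.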
Here are two examples of how the two methods compare when aggregating five forecasts.
I originally did this to convince myself that the two aggregates were different. And they seem to be! The beta-mean method seems to be close to the arithmetic mean in this example. Let’s see what happens when we extremize one of the predictions:
We have made $p_3$ one hundred times smaller. The geometric mean of odds is suitably affected. The maximum likelihood beta mean stays close to the arithmetic mean, unperturbed.
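The experiment can be reproduced with a short sketch. The five forecasts below are hypothetical stand-ins, not the ones from the original tables, but they show the same qualitative behavior:

```python
import math

def _geo_means(ps):
    """G_X and G_{1-X}: geometric means of the p_i and the (1 - p_i)."""
    n = len(ps)
    g_x = math.exp(sum(math.log(p) for p in ps) / n)
    g_1x = math.exp(sum(math.log(1 - p) for p in ps) / n)
    return g_x, g_1x

def geo_mean_odds(ps):
    """Geometric mean of odds, as a probability: G_X / (G_X + G_{1-X})."""
    g_x, g_1x = _geo_means(ps)
    return g_x / (g_x + g_1x)

def beta_mean(ps):
    """Mean of the approximate max likelihood beta:
    (1 - G_{1-X}) / ((1 - G_X) + (1 - G_{1-X}))."""
    g_x, g_1x = _geo_means(ps)
    return (1 - g_1x) / ((1 - g_x) + (1 - g_1x))

ps = [0.4, 0.5, 0.3, 0.6, 0.45]        # hypothetical forecasts
ps_ext = [0.4, 0.5, 0.003, 0.6, 0.45]  # p3 made one hundred times smaller
mean_ext = sum(ps_ext) / len(ps_ext)

# The geometric mean of odds drops sharply; the beta mean stays
# much closer to the arithmetic mean of the extremized forecasts.
print(geo_mean_odds(ps_ext), beta_mean(ps_ext), mean_ext)
```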
This makes me a bit less excited about this method, but I would be excited about people poking around with this method and related ones!