Hmm good question.
For a quick foray into this, we can see what would happen if we use as our estimate the mean of the maximum likelihood beta distribution implied by the sample of forecasts $p_1, \ldots, p_N$.
The log-likelihood to maximize is then
$$\log L(\alpha, \beta) = (\alpha - 1) \sum_i \log p_i + (\beta - 1) \sum_i \log(1 - p_i) - N \log B(\alpha, \beta)$$
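For concreteness, here is a minimal sketch of doing this maximization numerically with scipy, using some made-up forecasts (the values are purely illustrative):

```python
# Minimal sketch: numerically maximize the beta log-likelihood above.
# The forecast values are hypothetical, chosen only for illustration.
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln  # log B(alpha, beta)

forecasts = np.array([0.1, 0.2, 0.3, 0.4, 0.5])  # hypothetical p_1, ..., p_N
N = len(forecasts)

def neg_log_likelihood(params):
    alpha, beta = params
    return -((alpha - 1) * np.sum(np.log(forecasts))
             + (beta - 1) * np.sum(np.log(1 - forecasts))
             - N * betaln(alpha, beta))

# Keep alpha and beta strictly positive during the search.
result = minimize(neg_log_likelihood, x0=[1.0, 1.0],
                  method="L-BFGS-B", bounds=[(1e-6, None), (1e-6, None)])
alpha_hat, beta_hat = result.x
print(alpha_hat, beta_hat, alpha_hat / (alpha_hat + beta_hat))
```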
The Wikipedia article on the beta distribution discusses this maximization problem in depth, pointing out that, although no closed-form solution exists, if $\alpha$ and $\beta$ can be assumed to be not too small the maximum likelihood estimates can be approximated as $\hat{\alpha} \approx \frac{1}{2} + \frac{\hat{G}_X}{2(1 - \hat{G}_X - \hat{G}_{1-X})}$ and $\hat{\beta} \approx \frac{1}{2} + \frac{\hat{G}_{1-X}}{2(1 - \hat{G}_X - \hat{G}_{1-X})}$, where $\hat{G}_X = \prod_i p_i^{1/N}$ and $\hat{G}_{1-X} = \prod_i (1 - p_i)^{1/N}$.
The mean of a beta with these maximum likelihood parameters is $\frac{\hat{\alpha}}{\hat{\alpha} + \hat{\beta}} = \frac{1 - \hat{G}_{1-X}}{(1 - \hat{G}_X) + (1 - \hat{G}_{1-X})}$.
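In code, the approximation works out to something like this (again a sketch with hypothetical forecasts, not any particular real dataset):

```python
# Sketch of the closed-form approximation to the max likelihood beta,
# and the mean it implies. Forecast values are hypothetical.
import numpy as np

forecasts = np.array([0.1, 0.2, 0.3, 0.4, 0.5])  # hypothetical p_1, ..., p_N

G_X = np.exp(np.mean(np.log(forecasts)))        # geometric mean of the p_i
G_1mX = np.exp(np.mean(np.log(1 - forecasts)))  # geometric mean of the (1 - p_i)

denom = 2 * (1 - G_X - G_1mX)
alpha_hat = 0.5 + G_X / denom
beta_hat = 0.5 + G_1mX / denom

# Mean of the approximate max likelihood beta; algebraically this equals
# (1 - G_1mX) / ((1 - G_X) + (1 - G_1mX)).
beta_mle_mean = alpha_hat / (alpha_hat + beta_hat)
print(alpha_hat, beta_hat, beta_mle_mean)
```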
By comparison, the geometric mean of odds estimate is:
$$p = \frac{\prod_{i=1}^N p_i^{1/N}}{\prod_{i=1}^N p_i^{1/N} + \prod_{i=1}^N (1 - p_i)^{1/N}} = \frac{\hat{G}_X}{\hat{G}_X + \hat{G}_{1-X}}$$
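That quantity is easy to compute directly; a small helper (a sketch, with hypothetical inputs) might look like:

```python
# Geometric mean of odds aggregate: G_X / (G_X + G_{1-X}).
import numpy as np

def geometric_mean_of_odds(forecasts):
    p = np.asarray(forecasts, dtype=float)
    G_X = np.exp(np.mean(np.log(p)))        # geometric mean of the p_i
    G_1mX = np.exp(np.mean(np.log(1 - p)))  # geometric mean of the (1 - p_i)
    return G_X / (G_X + G_1mX)

print(geometric_mean_of_odds([0.1, 0.2, 0.3, 0.4, 0.5]))  # hypothetical forecasts
```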
Here are two examples of how the two methods compare when aggregating five forecasts:
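The sketch below uses made-up forecast values (not the original ones), comparing the arithmetic mean, the geometric mean of odds, and the approximate maximum likelihood beta mean, before and after making $p_3$ one hundred times smaller:

```python
# Sketch comparing three aggregates on five hypothetical forecasts,
# before and after making p_3 one hundred times smaller.
import numpy as np

def aggregates(forecasts):
    p = np.asarray(forecasts, dtype=float)
    G_X = np.exp(np.mean(np.log(p)))        # geometric mean of the p_i
    G_1mX = np.exp(np.mean(np.log(1 - p)))  # geometric mean of the (1 - p_i)
    return {
        "arithmetic mean": p.mean(),
        "geometric mean of odds": G_X / (G_X + G_1mX),
        "max likelihood beta mean": (1 - G_1mX) / ((1 - G_X) + (1 - G_1mX)),
    }

forecasts = [0.1, 0.2, 0.3, 0.4, 0.5]         # hypothetical p_1, ..., p_5
extremized = [0.1, 0.2, 0.3 / 100, 0.4, 0.5]  # same, with p_3 one hundred times smaller

print(aggregates(forecasts))
print(aggregates(extremized))
```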
I originally did this to convince myself that the two aggregates were different. And they seem to be! The maximum likelihood beta method seems to stay close to the arithmetic mean in this example. Let’s see what happens when we extremize one of the predictions:
We have made $p_3$ one hundred times smaller. The geometric mean is suitably affected. The maximum likelihood beta mean stays close to the arithmetic mean, unperturbed.
This makes me a bit less excited about this method, but I would be excited about people poking around with this method and related ones!