I wonder how these compare with fitting a Beta distribution and using one of its statistics? I’m imagining treating each forecast (assuming they are probabilities) as an observation, and maximizing the Beta likelihood. The resulting Beta is your best guess distribution over the forecasted variable.
It would be nice to have an aggregation method which gave you info about the spread of the aggregated forecast, which would be straightforward here.
It’s not clear to me that “fitting a Beta distribution and using one of its statistics” is different from just taking the mean of the probabilities.
I fitted a beta distribution to Metaculus forecasts and looked at:
Median forecast
Mean forecast
Mean log-odds / Geometric mean of odds
Fitted beta median
Fitted beta mean
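For concreteness, here is a sketch of how these five aggregates could be computed from a list of probability forecasts. It assumes SciPy is available for the maximum likelihood beta fit; the function name `aggregate` and the sample forecasts are mine, not from the original analysis.

```python
import numpy as np
from scipy import stats

def aggregate(ps):
    """Compute the five aggregation statistics for probability forecasts ps."""
    ps = np.asarray(ps, dtype=float)
    # Geometric mean of odds, mapped back to a probability.
    odds = ps / (1 - ps)
    geo_odds = np.exp(np.mean(np.log(odds)))
    # Maximum likelihood beta fit, with location/scale fixed to [0, 1].
    a, b, _, _ = stats.beta.fit(ps, floc=0, fscale=1)
    return {
        "median": float(np.median(ps)),
        "mean": float(np.mean(ps)),
        "geo_mean_odds": geo_odds / (1 + geo_odds),
        "beta_median": float(stats.beta.median(a, b)),
        "beta_mean": a / (a + b),
    }
```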
Scattering these 5 values against each other I get:
We can see the fitted values are closely aligned with the mean and the mean log-odds, but not with the median. (Unsurprising when you consider the approximate parametric formulas for the beta mean and median.)
The performance is as follows:

method                    Brier    log score    questions
geo_mean_odds_weighted    0.116    0.37         856
beta_median_weighted      0.118    0.378        856
median_weighted           0.121    0.38         856
mean_weighted             0.122    0.391        856
beta_mean_weighted        0.123    0.396        856
My intuition for what is going on here is that the beta-median is an extremized form of the beta-mean / mean, which is an improvement.
Looking more recently (as the community became more calibrated), the beta-median’s performance edge seems to have reduced:
For a quick foray into this, we can see what would happen if we use as our estimate the mean of the maximum likelihood beta distribution implied by the sample of forecasts $p_1, \ldots, p_N$.

The log-likelihood to maximize is then

$$\log L(\alpha, \beta) = (\alpha - 1)\sum_i \log p_i + (\beta - 1)\sum_i \log(1 - p_i) - N \log B(\alpha, \beta)$$
The Wikipedia article on the beta distribution discusses this maximization problem in depth, pointing out that although no closed form exists, if $\alpha$ and $\beta$ can be assumed to be not too small, the maximum likelihood estimates can be approximated as

$$\hat{\alpha} \approx \frac{1}{2} + \frac{\hat{G}_X}{2\left(1 - \hat{G}_X - \hat{G}_{1-X}\right)} \quad \text{and} \quad \hat{\beta} \approx \frac{1}{2} + \frac{\hat{G}_{1-X}}{2\left(1 - \hat{G}_X - \hat{G}_{1-X}\right)},$$

where $\hat{G}_X = \prod_i p_i^{1/N}$ and $\hat{G}_{1-X} = \prod_i (1 - p_i)^{1/N}$.

The mean of a beta with these maximum likelihood parameters is

$$\frac{\hat{\alpha}}{\hat{\alpha} + \hat{\beta}} = \frac{1 - \hat{G}_{1-X}}{\left(1 - \hat{G}_X\right) + \left(1 - \hat{G}_{1-X}\right)}.$$

By comparison, the geometric mean of odds estimate is

$$p = \frac{\prod_{i=1}^N p_i^{1/N}}{\prod_{i=1}^N p_i^{1/N} + \prod_{i=1}^N (1 - p_i)^{1/N}} = \frac{\hat{G}_X}{\hat{G}_X + \hat{G}_{1-X}}.$$
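Translating the approximation into code, a minimal sketch (the function names are mine):

```python
import math

def _geo_means(ps):
    """G_X and G_{1-X}: geometric means of the p_i and the (1 - p_i)."""
    n = len(ps)
    g_x = math.exp(sum(math.log(p) for p in ps) / n)
    g_1x = math.exp(sum(math.log(1 - p) for p in ps) / n)
    return g_x, g_1x

def beta_mle_approx(ps):
    """Closed-form approximation to the beta MLE, valid when alpha and beta
    are not too small (equivalently, G_X + G_{1-X} < 1)."""
    g_x, g_1x = _geo_means(ps)
    denom = 2 * (1 - g_x - g_1x)
    return 0.5 + g_x / denom, 0.5 + g_1x / denom

def beta_mean_approx(ps):
    """Mean alpha/(alpha+beta) of the approximate MLE beta; algebraically
    equal to (1 - G_{1-X}) / ((1 - G_X) + (1 - G_{1-X}))."""
    a, b = beta_mle_approx(ps)
    return a / (a + b)

def geo_mean_odds(ps):
    """Geometric mean of odds, as a probability: G_X / (G_X + G_{1-X})."""
    g_x, g_1x = _geo_means(ps)
    return g_x / (g_x + g_1x)
```

Note that both aggregates are functions of the same two quantities $\hat{G}_X$ and $\hat{G}_{1-X}$; they differ only in how those are combined.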
Here are two examples of how the two methods compare when aggregating five forecasts.
I originally did this to convince myself that the two aggregates were different. And they seem to be! The beta-mean method seems to be close to the arithmetic mean in this example. Let’s see what happens when we extremize one of the predictions:
We have made $p_3$ one hundred times smaller. The geometric mean of odds is suitably affected. The maximum likelihood beta mean stays close to the arithmetic mean, unperturbed.
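The experiment can be reproduced with a short sketch. The five forecasts below are hypothetical stand-ins, not the ones from the original tables, but they show the same qualitative behavior:

```python
import math

def _geo_means(ps):
    """G_X and G_{1-X}: geometric means of the p_i and the (1 - p_i)."""
    n = len(ps)
    g_x = math.exp(sum(math.log(p) for p in ps) / n)
    g_1x = math.exp(sum(math.log(1 - p) for p in ps) / n)
    return g_x, g_1x

def geo_mean_odds(ps):
    """Geometric mean of odds, as a probability: G_X / (G_X + G_{1-X})."""
    g_x, g_1x = _geo_means(ps)
    return g_x / (g_x + g_1x)

def beta_mean(ps):
    """Mean of the approximate max likelihood beta:
    (1 - G_{1-X}) / ((1 - G_X) + (1 - G_{1-X}))."""
    g_x, g_1x = _geo_means(ps)
    return (1 - g_1x) / ((1 - g_x) + (1 - g_1x))

ps = [0.4, 0.5, 0.3, 0.6, 0.45]        # hypothetical forecasts
ps_ext = [0.4, 0.5, 0.003, 0.6, 0.45]  # p3 made one hundred times smaller
mean_ext = sum(ps_ext) / len(ps_ext)

# The geometric mean of odds drops sharply; the beta mean stays
# much closer to the arithmetic mean of the extremized forecasts.
print(geo_mean_odds(ps_ext), beta_mean(ps_ext), mean_ext)
```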
This makes me a bit less excited about this method, but I would be excited about people poking around with this method and related ones!