I wonder though: how would this affect expected value calculations? Doesn’t this have far-reaching consequences?
One thing I have always wondered about is how to aggregate predicted values that differ by orders of magnitude. E.g. person A’s best guess is that the value of x will be 10, while person B’s guess is that it will be 10,000. Saying that the expected value of x is ~5,000 seems to lose a lot of information. For simple monetary betting, this seems fine. For complicated decision-making, I’m less sure.
Let’s work this example through together! (but I will change the quantities to 10 and 20 for numerical stability reasons)
One thing we need to be careful with is not to mix up the implied beliefs with the object-level claims.
In this case, person A’s claim that the value is $m_A = 10$ is more accurately a claim that the beliefs of person A can be summed up as some distribution over the positive numbers, e.g. a lognormal with parameters $\mu_A = \log m_A$ and $\sigma_A$. So the density of A’s beliefs is $$f_A(x) = \frac{1}{x \sigma_A \sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu_A)^2}{2\sigma_A^2}\right]$$ (and similarly for person B, with $m_B = 20$). The scale parameters $\sigma_A, \sigma_B$ intuitively represent the uncertainty of person A and person B.
Taking $\sigma_A = \sigma_B = 0.1$, the two densities are narrow peaks centered near their respective medians of 10 and 20.
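Here is a minimal sketch to reproduce these density plots (assuming NumPy and Matplotlib; the function `lognormal_pdf` and the variable names are my own):

```python
import numpy as np
import matplotlib.pyplot as plt

def lognormal_pdf(x, mu, sigma):
    """Density of a lognormal with median exp(mu) and shape parameter sigma."""
    return np.exp(-(np.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (x * sigma * np.sqrt(2 * np.pi))

mu_A, mu_B = np.log(10), np.log(20)   # medians 10 and 20
sigma = 0.1                           # shared uncertainty, sigma_A = sigma_B

xs = np.linspace(5, 30, 1000)
plt.plot(xs, lognormal_pdf(xs, mu_A, sigma), label="person A")
plt.plot(xs, lognormal_pdf(xs, mu_B, sigma), label="person B")
plt.xlabel("x")
plt.ylabel("density")
plt.legend()
plt.show()
```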
Note that the mean of these distributions is slightly displaced upwards from the median $\exp \mu$. Concretely, the mean is computed as $\exp\left[\mu + \frac{\sigma^2}{2}\right]$, and equals 10.05 and 20.10 for person A and person B respectively.
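As a quick check with the numbers above: $\mathbb{E}[x_A] = \exp[\ln 10 + 0.1^2/2] = 10\, e^{0.005} \approx 10.05$, and likewise $\mathbb{E}[x_B] = 20\, e^{0.005} \approx 20.10$.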
To aggregate the distributions, we can use the generalization of the geometric mean of odds referred to in footnote [1] of the post.
According to that, the aggregated distribution has density $$f = \frac{\sqrt{f_A \cdot f_B}}{\int \sqrt{f_A \cdot f_B}\, dx}.$$
The plot of the aggregated density shows a single peak lying between the two individual distributions.
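Continuing the sketch from above, the pooled density and its expected value can be computed by direct numerical integration (assuming SciPy's `quad`, integrating over $(0, \infty)$):

```python
from scipy.integrate import quad

# Unnormalized pooled density: pointwise geometric mean of the two densities
def g(x):
    return np.sqrt(lognormal_pdf(x, mu_A, sigma) * lognormal_pdf(x, mu_B, sigma))

Z, _ = quad(g, 0, np.inf)   # normalizing constant

def f_pooled(x):
    """Aggregated (pooled) density."""
    return g(x) / Z

E, _ = quad(lambda x: x * f_pooled(x), 0, np.inf)
print(E)                    # approx. 14.21

plt.plot(xs, f_pooled(xs))
plt.xlabel("x")
plt.ylabel("pooled density")
plt.show()
```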
I actually notice that I am very surprised about this—I expected the aggregate distribution to be bimodal, but here it seems to have a single peak.
For this particular example, a numerical approximation of the expected value comes out to around 14.21, which exactly equals the geometric mean of the means.
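As a sanity check, the geometric mean of the two means is $\sqrt{10.05 \times 20.10} \approx 14.21$.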
I am not taking away any solid conclusions from this exercise. I notice I am still very confused about what the aggregated distribution looks like, and I encountered serious numerical stability issues when changing the parameters, which makes me suspect a bug.
Maybe a Monte Carlo approach for estimating the expected value would solve the stability issues—I’ll see if I can get around to that at some point.
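One way such a Monte Carlo estimate could go is self-normalized importance sampling: draw samples from person A's lognormal and weight them by $\sqrt{f_B / f_A}$, so that the weighted samples follow the pooled density. This avoids computing the normalizing integral explicitly, which may help with the stability issues. A minimal sketch (the choice of proposal is my own, reusing `lognormal_pdf` from above):

```python
rng = np.random.default_rng(0)

# Draw proposal samples from person A's lognormal
samples = rng.lognormal(mean=mu_A, sigma=sigma, size=1_000_000)

# Importance weights: sqrt(f_A * f_B) / f_A = sqrt(f_B / f_A)
weights = np.sqrt(lognormal_pdf(samples, mu_B, sigma) / lognormal_pdf(samples, mu_A, sigma))

# Self-normalized estimate of the pooled expected value
E_mc = np.sum(weights * samples) / np.sum(weights)
print(E_mc)   # approx. 14.21
```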
EDIT: Diego Chicharro has pointed out to me that the expected value can be easily computed analytically in Mathematica.
The resulting expected value of the aggregated distribution is $$\exp\left[\frac{\mu_A \sigma_B^2 + \mu_B \sigma_A^2 + \sigma_A^2 \sigma_B^2}{\sigma_A^2 + \sigma_B^2}\right].$$
In the case where $\sigma_A^2 = \sigma_B^2 = \sigma^2$, we then have that the expected value is $$\exp\left[\frac{\mu_A \sigma^2 + \mu_B \sigma^2 + \sigma^2 \sigma^2}{\sigma^2 + \sigma^2}\right] = \exp\left[\frac{\mu_A + \mu_B + \sigma^2}{2}\right] = \sqrt{\exp\left[\mu_A + \sigma^2/2\right]}\, \sqrt{\exp\left[\mu_B + \sigma^2/2\right]},$$ which is exactly the geometric mean of the expected values of the individual predictions.
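Plugging in the example values $\mu_A = \ln 10$, $\mu_B = \ln 20$ and $\sigma = 0.1$ gives $\exp\left[\frac{\ln 10 + \ln 20 + 0.01}{2}\right] = \sqrt{200}\, e^{0.005} \approx 14.21$, which matches the numerical estimate above.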
Interesting! Seems intuitively right.