Given 2 distributions X1 and X2 which are independent estimates of the distribution X, X can be estimated with the inverse-variance method from:
X = (X1/V(X1) + X2/V(X2)) / (1/V(X1) + 1/V(X2)), where V denotes the variance.
Under which conditions is this a good approach? For example, for which types of distributions? These questions might be relevant for determining:
A posterior distribution based on distributions for the prior and estimate.
A distribution which combines estimates of different theories.
Some notes:
The inverse-variance method minimises the variance of a weighted mean of X1 and X2.
Calculating the mean and variance of X according to the above formula would result in values equal to those derived in this analysis from Dario Amodei, which explains how to combine X1 and X2 following a Bayesian approach if these follow normal distributions.
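As a minimal sketch of the two notes above, the snippet below computes the inverse-variance weighted mean and its variance, and scans candidate weights to check that the inverse-variance weight minimises the variance of w*X1 + (1 - w)*X2 when X1 and X2 are independent. The function names and the input means and variances are hypothetical, chosen only for illustration.

```python
def inverse_variance_combine(means, variances):
    """Combine independent estimates by inverse-variance weighting."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    mean = sum(w * m for w, m in zip(weights, means)) / total
    variance = 1.0 / total
    return mean, variance


def weighted_mean_variance(w, v1, v2):
    """Variance of w*X1 + (1 - w)*X2 for independent X1 and X2."""
    return w**2 * v1 + (1.0 - w) ** 2 * v2


# Hypothetical inputs: X1 has mean 0 and variance 1, X2 has mean 2 and variance 4.
mean, variance = inverse_variance_combine([0.0, 2.0], [1.0, 4.0])

# Scan weights in [0, 1] to find the one giving the smallest variance.
candidates = [i / 1000 for i in range(1001)]
best_w = min(candidates, key=lambda w: weighted_mean_variance(w, 1.0, 4.0))

print(mean, variance, best_w)
```

For normal X1 and X2 these two numbers match the mean and variance of the normalised product of the two densities; for other distributions they are only the moments of the minimum-variance weighted mean.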
If you assume both X1 and X2 are normal then the only difference between them comes from their moments, so you can use the inverse variance formula. But that leans directly on the formula for the product of normal densities, which is again normal. The product of two general densities does not have such a clean form. So while I don’t have a rigorous argument for this, I would be shocked if you could do the same for any two PDFs X1 and X2 with no change to the formula.
I do not know if this is really necessary for the uses you name, though. Bayes' rule determines the posterior distribution regardless of whether it follows an inverse variance formula or not.
Thanks for the reply!
I also think the above formula does not formally apply to non-normal distributions, but I was wondering whether it was a good enough approximation.
Is there a simple way of applying Bayes' rule to two arrays X1 and X2 of Monte Carlo samples? I believe this is analogous to considering that all elements of X1 and X2 are equiprobable.
I don’t think I follow. Monte Carlo sampling is done from a distribution, which I assume you want to use as the basis of your likelihood function? In this case, you can just calculate the likelihood function from this distribution, and combine it with your prior to get a posterior distribution.
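The combination described in this reply can be sketched numerically on a grid: evaluate the prior density and the likelihood at each grid point, multiply, and normalise. The grid bounds, step, and the two normal distributions below are assumptions made for illustration; any pair of callables could be passed instead, which is the point of doing it numerically.

```python
from statistics import NormalDist


def grid_posterior(prior_pdf, likelihood, grid):
    """Return normalised posterior probabilities on an evenly spaced grid."""
    unnormalised = [prior_pdf(x) * likelihood(x) for x in grid]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def grid_moments(grid, probs):
    """Mean and variance of a discrete distribution on the grid."""
    mean = sum(x * p for x, p in zip(grid, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(grid, probs))
    return mean, variance


# Hypothetical setup: evenly spaced grid from -10 to 12 with step 0.01.
grid = [-10.0 + 0.01 * i for i in range(2201)]
prior = NormalDist(mu=0.0, sigma=1.0)      # variance 1
estimate = NormalDist(mu=2.0, sigma=2.0)   # variance 4, used as the likelihood

posterior = grid_posterior(prior.pdf, estimate.pdf, grid)
post_mean, post_var = grid_moments(grid, posterior)
print(post_mean, post_var)
```

With two normal inputs the result should be close to the inverse-variance values (mean 0.4, variance 0.8 for these inputs). With non-normal inputs the same code still applies Bayes' rule, but there is no reason to expect agreement with the inverse-variance formula.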
I was thinking about cases in which X1 and X2 are non-linear functions of arrays of Monte Carlo samples generated from distributions of different types (e.g. loguniform and lognormal). To calculate E(X1), I can simply compute the mean of the elements of X1. I was looking for a similar simple formula to combine X1 and X2, without having to work with the original distributions used to compute X1 and X2.
A concrete simple example would be combining the following:
According to estimate 1, X is as likely to be 1, 2, 3, 4 or 5: X1 = [1, 2, 3, 4, 5].
According to estimate 2, X is as likely to be 2, 4, 6, 8 or 10: X2 = [2, 4, 6, 8, 10].
The generation mechanisms of estimates 1 and 2 are not known.
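For this example, the inverse-variance formula can be applied directly to the sample means and variances of the two arrays, as in the rough sketch below. The function name is hypothetical, and using the population variance (rather than the sample variance) is an assumption that follows from treating all elements as equiprobable. The output is only a combined mean and variance, not a full distribution for X, and it is not an exact application of Bayes' rule.

```python
from statistics import fmean, pvariance

X1 = [1, 2, 3, 4, 5]
X2 = [2, 4, 6, 8, 10]


def combine_samples(a, b):
    """Inverse-variance combination of two arrays of equiprobable samples."""
    weight_a = 1.0 / pvariance(a)
    weight_b = 1.0 / pvariance(b)
    mean = (weight_a * fmean(a) + weight_b * fmean(b)) / (weight_a + weight_b)
    variance = 1.0 / (weight_a + weight_b)
    return mean, variance


combined_mean, combined_variance = combine_samples(X1, X2)
print(combined_mean, combined_variance)  # approximately 3.6 and 1.6
```

Because X1 has a quarter of the variance of X2, it receives four times the weight, so the combined mean sits much closer to the mean of X1 (3) than to the mean of X2 (6).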
How are both X1 and X2 estimates of X when they are different distributions? At this point I am out of my depth so I do not have an informative answer for you.
I will try to illustrate what I mean with an example:
X could be the total number of confirmed and suspected monkeypox cases in Europe as of July 1, 2022.
X1 could be a distribution fitted to 3 quantiles predicted for X by forecaster A (as in Metaculus’ questions which do not involve forecasting probabilities).
X2 could be a distribution fitted to 3 quantiles predicted for X by forecaster B.
Meanwhile, I have realised the inverse-variance method minimises the variance of a weighted mean of X1 and X2 (and have updated the question above to reflect this).