Thanks for sharing! Monte Carlo simulations are useful for working with distributions of different types, but I also wanted to note there are some nice results one can use when working with just lognormals or normals. If nothing else, these may be useful for Fermi estimates.
Uncertainty of the product between independent lognormal distributions without Monte Carlos
If Y is the product of independent lognormal distributions X_1, X_2, …, and X_N, and r_i is the ratio between the values of 2 quantiles of X_i (e.g. r_i = “95th percentile of X_i”/“5th percentile of X_i”), I think the ratio R between the 2 same quantiles of Y (e.g. R = “95th percentile of Y”/“5th percentile of Y”) is e^((ln(r_1)^2 + … + ln(r_N)^2)^0.5)[1]. For the particular case where all input distributions have the same uncertainty, r_i = r, and therefore R = r^(N^0.5). This illustrates your point that performing point estimates with pessimistic and optimistic values overestimates uncertainty:
If the ratio between the 95th and 5th percentile of 3 independent lognormal distributions was r = 100, the naive approach would suggest the product would have an uncertainty (ratio between 95th and 5th percentile) of 100^3 = 10^6.
However, the actual uncertainty of the product would be 100^(3^0.5) = 2.91*10^3, which is only 0.3 % of the above.
The naive approach would only make sense if the input distributions were perfectly (or very highly) correlated.
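A minimal Python sketch of the formula for R, reproducing the example with 3 inputs and r = 100. The function name product_quantile_ratio is hypothetical, and the sketch assumes the inputs are independent lognormals whose ratios all refer to the same pair of quantiles.

```python
import math

def product_quantile_ratio(ratios):
    """Quantile ratio R of a product of independent lognormals,
    given the quantile ratio r_i of each input (same pair of quantiles)."""
    return math.exp(math.sqrt(sum(math.log(r) ** 2 for r in ratios)))

naive = 100 ** 3                                   # multiplying the 3 ratios directly
actual = product_quantile_ratio([100, 100, 100])   # equals 100^(3^0.5)

print(f"naive: {naive:.3g}, actual: {actual:.3g}, share: {actual / naive:.2%}")
```

With equal inputs the function reduces to r^(N^0.5), so the same call can be used to check the general formula against the special case.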
Sum and product of independent normal distributions without Monte Carlos
If X_1, X_2, …, and X_N are independent normal distributions, and Y = X_1 + X_2 + … + X_N:
E(Y) = E(X_1) + E(X_2) + … + E(X_N).
V(Y) = V(X_1) + V(X_2) + … + V(X_N).
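As a rough illustration, the two sum rules can be sketched in Python with the standard library's statistics.NormalDist. The helper name sum_of_normals and the input numbers are made up for the example.

```python
from statistics import NormalDist

def sum_of_normals(means, variances):
    """Distribution of the sum of independent normals:
    the means add up, and so do the variances."""
    return NormalDist(mu=sum(means), sigma=sum(variances) ** 0.5)

# Hypothetical inputs: three independent normals.
total = sum_of_normals(means=[10, 20, 30], variances=[4, 9, 36])

p5 = total.inv_cdf(0.05)  # 5th percentile of the sum
print(total.mean, total.stdev, round(p5, 2))
```

Since the sum of independent normals is itself normal, the percentile here is exact rather than an approximation.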
If X_1, X_2, …, and X_N are independent normal distributions, and Y = X_1 X_2 … X_N:
E(Y) = E(X_1 X_2 … X_N) = E(X_1) E(X_2) … E(X_N).
V(Y) = E((X_1 X_2 … X_N)^2) - E(Y)^2 = E(X_1^2) E(X_2^2) … E(X_N^2) - E(Y)^2 = (V(X_1) + E(X_1)^2) (V(X_2) + E(X_2)^2) … (V(X_N) + E(X_N)^2) - E(Y)^2.
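After getting the mean and variance of Y, one could approximate e.g. its 5th percentile in Google Sheets using NORMINV(0.05, E(Y), V(Y)^0.5). This treats Y as normal, which is only an approximation, since the product of normal distributions is not in general normal.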
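A short Python sketch of the mean and variance formulas for the product, assuming independent inputs. The function name product_moments and the numbers are hypothetical.

```python
import math

def product_moments(means, variances):
    """Mean and variance of a product of independent random variables,
    using E(X_i^2) = V(X_i) + E(X_i)^2 for each factor."""
    mean_y = math.prod(means)
    second_moment = math.prod(v + m ** 2 for m, v in zip(means, variances))
    return mean_y, second_moment - mean_y ** 2

# Hypothetical inputs: X_1 with mean 2 and variance 1, X_2 with mean 3 and variance 4.
mean_y, var_y = product_moments(means=[2.0, 3.0], variances=[1.0, 4.0])
print(mean_y, var_y)
```

These two formulas only need independence and finite variances, so they give the exact mean and variance of the product whatever the shape of the inputs.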
Product of independent lognormal distributions without Monte Carlos
If X_1, X_2, …, and X_N are independent lognormal distributions, and Y = X_1 X_2 … X_N, ln(Y) = ln(X_1) + ln(X_2) + … + ln(X_N), so:
E(ln(Y)) = E(ln(X_1)) + E(ln(X_2)) + … + E(ln(X_N)).
V(ln(Y)) = V(ln(X_1)) + V(ln(X_2)) + … + V(ln(X_N)).
One would obtain E(ln(X_i)) and V(ln(X_i)) in the same way one would get the mean and variance of a normal distribution given 2 points, but using their logarithms. If a_i and b_i are 2 values of X_i such that P(X_i <= a_i) = P(X_i >= b_i) (for example, a_i and b_i could be the 5th and 95th percentile, 10th and 90th percentile, 25th and 75th percentile, etc.):
E(ln(X_i)) = (ln(a_i) + ln(b_i))/2.
V(ln(X_i)) = ((ln(b_i) - ln(a_i))/(2*NORMINV(P(X_i <= b_i), 0, 1)))^2. The denominator is the distance between ln(a_i) and ln(b_i) expressed in standard deviations of ln(X_i), so the fraction is the standard deviation of ln(X_i), and squaring it gives the variance.
After getting the mean and variance of ln(Y), one could obtain e.g. the 5th percentile of Y in Google Sheets using EXP(NORMINV(0.05, E(ln(Y)), V(ln(Y))^0.5)).
I used this formula here.
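For anyone who prefers code to spreadsheets, the lognormal steps can be sketched in Python as below. This is only an illustration under the stated assumptions (independent lognormal inputs, each described by a symmetric pair of quantiles). The names log_params and product_quantile are hypothetical, and NormalDist().inv_cdf(p) stands in for NORMINV(p, 0, 1).

```python
import math
from statistics import NormalDist

def log_params(a, b, p_upper=0.95):
    """Mean and variance of ln(X), given a and b with
    P(X <= a) = 1 - p_upper and P(X <= b) = p_upper."""
    z = NormalDist().inv_cdf(p_upper)            # half the a-to-b distance, in standard deviations
    mean = (math.log(a) + math.log(b)) / 2
    sd = (math.log(b) - math.log(a)) / (2 * z)
    return mean, sd ** 2

def product_quantile(intervals, q, p_upper=0.95):
    """Quantile q of the product of independent lognormals,
    each given as an (a, b) pair of symmetric quantiles."""
    params = [log_params(a, b, p_upper) for a, b in intervals]
    mean_ln_y = sum(m for m, _ in params)        # means of the logs add up
    var_ln_y = sum(v for _, v in params)         # variances of the logs add up
    return math.exp(NormalDist(mean_ln_y, math.sqrt(var_ln_y)).inv_cdf(q))

# Hypothetical inputs: 3 lognormals, each with 5th percentile 1 and 95th percentile 100.
intervals = [(1, 100)] * 3
p5 = product_quantile(intervals, 0.05)
p95 = product_quantile(intervals, 0.95)
print(p5, p95, p95 / p5)
```

The ratio p95 / p5 from this example should match the 100^(3^0.5) figure from the first section, which is a handy consistency check between the two approaches.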