Expected value and uncertainty without full Monte Carlo simulations
One can often calculate the expected value and uncertainty of expressions involving distributions without running full Monte Carlo simulations. If nothing else, the following results may be useful for Fermi estimates. In the next sections:
The input distributions $X_1$, $X_2$, …, and $X_N$ are independent, as is often assumed in Monte Carlo simulations.
E and V are the expected value and variance.
Uncertainty of the product between independent lognormal distributions
If $Y = X_1 X_2 \dots X_N$, and $r_i$ is the ratio between the values of 2 quantiles of $X_i$ (e.g. $r_i$ = “95th percentile of $X_i$”/“5th percentile of $X_i$”), I think the ratio $R$ between the 2 same quantiles of $Y$ (e.g. $R$ = “95th percentile of $Y$”/“5th percentile of $Y$”) is $R = e^{(\ln(r_1)^2 + \dots + \ln(r_N)^2)^{0.5}}$. For the particular case where all input distributions have the same uncertainty, $r_i = r$, and therefore $R = r^{N^{0.5}}$. This illustrates the point that performing point estimates with pessimistic and optimistic values overestimates uncertainty:
If the ratio between the 95th and 5th percentile of 3 independent lognormal distributions is r = 100, the naive approach will suggest the product would have an uncertainty (ratio between 95th and 5th percentile) of 100^3 = 10^6.
However, the actual uncertainty of the product will be 100^(3^0.5) = 2.91*10^3, which is only 0.291 % of the above.
The naive approach would only make sense if the input distributions were perfectly (or very highly) correlated.
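The numbers above can be reproduced with a short Python sketch. It assumes every input is lognormal, so that ln(Xi) is normal with some standard deviation si, ln(ri) is proportional to si (with the same factor for every input, since the same 2 quantiles are used), and the variances of the logarithms add up for the product. The function name `product_quantile_ratio` is made up for this illustration.

```python
import math

def product_quantile_ratio(ratios):
    """Quantile ratio R of a product of independent lognormal inputs.

    Each entry of `ratios` is r_i, the ratio between the same 2 quantiles
    (e.g. 95th and 5th percentile) of the lognormal input X_i.
    """
    return math.exp(math.sqrt(sum(math.log(r) ** 2 for r in ratios)))

naive = 100 ** 3                                   # multiplying the input ratios
actual = product_quantile_ratio([100, 100, 100])   # equals 100^(3^0.5)
print(f"naive: {naive:.3g}, actual: {actual:.3g}, share: {actual / naive:.3%}")
```

With a single input the function returns that input's own ratio, which is a quick sanity check on the formula.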
Sum of independent distributions
If $Y = X_1 + X_2 + \dots + X_N$:
$E(Y) = E(X_1) + E(X_2) + \dots + E(X_N)$.
$V(Y) = V(X_1) + V(X_2) + \dots + V(X_N)$.
Product of independent distributions
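If $Y = X_1 X_2 \dots X_N$:
$E(Y) = E(X_1) E(X_2) \dots E(X_N)$.
$V(Y) = E((X_1 X_2 \dots X_N)^2) - (E(X_1 X_2 \dots X_N))^2 = E(X_1^2) E(X_2^2) \dots E(X_N^2) - E(X_1)^2 E(X_2)^2 \dots E(X_N)^2 = (V(X_1) + E(X_1)^2)(V(X_2) + E(X_2)^2) \dots (V(X_N) + E(X_N)^2) - E(X_1)^2 E(X_2)^2 \dots E(X_N)^2$.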
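As a minimal sketch, the product formulas can be written as a small Python function that takes the means and variances of the inputs. The function name and the input values are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math

def product_mean_var(means, variances):
    """E(Y) and V(Y) for Y = X_1 * ... * X_N with independent X_i."""
    mean_y = math.prod(means)
    # E(X_i^2) = V(X_i) + E(X_i)^2, and second moments multiply under independence.
    second_moment_y = math.prod(v + m ** 2 for m, v in zip(means, variances))
    return mean_y, second_moment_y - mean_y ** 2

# Hypothetical inputs: E(X1) = 2, V(X1) = 1, E(X2) = 3, V(X2) = 4.
mean_y, var_y = product_mean_var([2.0, 3.0], [1.0, 4.0])
print(mean_y, var_y)
```

Here E(Y) = 2*3 = 6 and V(Y) = (1 + 4)*(4 + 9) − 36 = 29.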
Weighted sum of independent distributions
If $Y = w_1 X_1 + w_2 X_2 + \dots + w_N X_N$, where $w_i$ are constants (which often add up to 1):
$E(Y) = w_1 E(X_1) + w_2 E(X_2) + \dots + w_N E(X_N)$.
$V(Y) = w_1^2 V(X_1) + w_2^2 V(X_2) + \dots + w_N^2 V(X_N)$.
Other expressions
If Y is built from the inputs using only sums, weighted sums and products, with no input appearing more than once, one can calculate E(Y) and V(Y) by applying the results of the 3 previous sections. For example, for $Y = 0.75 X_1 X_2 + 0.25 X_3 X_4$:
$E(Y) = 0.75 E(X_1) E(X_2) + 0.25 E(X_3) E(X_4)$.
$V(Y) = 0.75^2 ((V(X_1) + E(X_1)^2)(V(X_2) + E(X_2)^2) - E(X_1)^2 E(X_2)^2) + 0.25^2 ((V(X_3) + E(X_3)^2)(V(X_4) + E(X_4)^2) - E(X_3)^2 E(X_4)^2)$.
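The example can be checked numerically. The sketch below evaluates the 2 expressions for hypothetical means and variances, and compares them with a brute-force simulation. The normal inputs are an arbitrary choice for the check, since the formulas only use means and variances. All names and values are made up for illustration.

```python
import random

def product_moments(m1, v1, m2, v2):
    """Mean and variance of the product of 2 independent variables."""
    mean = m1 * m2
    return mean, (v1 + m1 ** 2) * (v2 + m2 ** 2) - mean ** 2

# Hypothetical (mean, variance) pairs for X1, X2, X3 and X4.
x1, x2, x3, x4 = (2.0, 1.0), (3.0, 4.0), (1.0, 0.25), (5.0, 1.0)

# Closed-form result: product formulas first, then the weighted sum.
m12, v12 = product_moments(*x1, *x2)
m34, v34 = product_moments(*x3, *x4)
mean_y = 0.75 * m12 + 0.25 * m34
var_y = 0.75 ** 2 * v12 + 0.25 ** 2 * v34

# Brute-force check with independent normal draws.
rng = random.Random(0)

def draw(mean_and_var):
    mean, var = mean_and_var
    return rng.gauss(mean, var ** 0.5)

n = 100_000
samples = [0.75 * draw(x1) * draw(x2) + 0.25 * draw(x3) * draw(x4)
           for _ in range(n)]
mc_mean = sum(samples) / n
mc_var = sum((s - mc_mean) ** 2 for s in samples) / n

print(f"closed form: E = {mean_y:.3f}, V = {var_y:.3f}")
print(f"simulation:  E = {mc_mean:.3f}, V = {mc_var:.3f}")
```

The closed-form values are exact, whereas the simulated ones carry sampling noise and should only agree approximately.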
Otherwise, it is probably better to run a full Monte Carlo simulation. That being said, one can also combine the results of the 3 previous sections with estimates obtained from Monte Carlo simulations, each of which involves only a single variable. To do this:
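1. Write E(Y) as a function of $E(f_1(X_1))$, $E(f_2(X_2))$, …, and $E(f_N(X_N))$. For example, for $Y = \frac{1}{X_1 X_2^2 \dots X_N^N}$:
$E(Y) = E\left(\frac{1}{X_1}\right) E\left(\frac{1}{X_2^2}\right) \dots E\left(\frac{1}{X_N^N}\right)$.
2. Write V(Y) as a function of the above, $V(f_1(X_1))$, $V(f_2(X_2))$, …, and $V(f_N(X_N))$. For the example above:
$V(Y) = \left(V\left(\frac{1}{X_1}\right) + E\left(\frac{1}{X_1}\right)^2\right) \left(V\left(\frac{1}{X_2^2}\right) + E\left(\frac{1}{X_2^2}\right)^2\right) \dots \left(V\left(\frac{1}{X_N^N}\right) + E\left(\frac{1}{X_N^N}\right)^2\right) - E\left(\frac{1}{X_1}\right)^2 E\left(\frac{1}{X_2^2}\right)^2 \dots E\left(\frac{1}{X_N^N}\right)^2$.
3. Generate random samples of $X_i$ (e.g. with Guesstimate or Squiggle), and then compute $E(f_i(X_i))$ and $V(f_i(X_i))$. For the example above, $E\left(\frac{1}{X_1}\right)$, $E\left(\frac{1}{X_2^2}\right)$, …, $E\left(\frac{1}{X_N^N}\right)$, $V\left(\frac{1}{X_1}\right)$, $V\left(\frac{1}{X_2^2}\right)$, …, and $V\left(\frac{1}{X_N^N}\right)$.
4. Determine E(Y) and V(Y) using the expressions of steps 1 and 2 with the results obtained in step 3.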
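The steps above can be sketched in Python for N = 3, with the standard `random` module standing in for Guesstimate or Squiggle. The lognormal parameters are hypothetical, and lognormal inputs are assumed only because they allow comparing the result with a closed-form value.

```python
import math
import random

rng = random.Random(0)
N_SAMPLES = 100_000

# Hypothetical lognormal inputs: (mu, sigma) of ln(X_i) for X_1, X_2 and X_3.
params = [(0.0, 0.3), (0.5, 0.2), (1.0, 0.1)]

def mean_var(values):
    """Sample mean and (population) variance."""
    mean = sum(values) / len(values)
    return mean, sum((v - mean) ** 2 for v in values) / len(values)

# Step 3: one single-variable simulation per input, for f_i(X_i) = 1 / X_i^i.
moments = []
for i, (mu, sigma) in enumerate(params, start=1):
    f_samples = [1 / rng.lognormvariate(mu, sigma) ** i
                 for _ in range(N_SAMPLES)]
    moments.append(mean_var(f_samples))

# Step 4: combine the per-input moments with the expressions of steps 1 and 2.
mean_y = math.prod(m for m, _ in moments)
var_y = math.prod(v + m ** 2 for m, v in moments) - mean_y ** 2

print(f"E(Y) = {mean_y:.5f}, V(Y) = {var_y:.3e}")
```

For these particular inputs Y is itself lognormal, so the estimates can be compared with the exact mean and variance. In general no such closed form exists, which is where the single-variable simulations are useful.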