Thanks so much for commenting! Huge fan of your work!
This is also a general issue wherever your simulation involves ratios. A positive denominator whose lower bound is close to zero will introduce huge and often implausible numbers. These are the situations where the divergence between E(x)/E(y) and E(x/y) will be the largest.
Yes, I think this happens in general whenever the denominator spans multiple orders of magnitude (usually because it can be close to 0). In some models I saw, it could even be negative (e.g. a normal distribution with 90% of its mass between 1 and 10), which led to even more confusing results.
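To make the "normal with 90% between 1 and 10" case concrete, here is a rough sketch using only the Python standard library. It assumes the interval is a central 90% interval, and the fixed numerator `cost` is a made-up value purely for illustration.

```python
import random
from statistics import NormalDist, fmean

# Assumption: "90% between 1 and 10" means a central 90% interval of a normal.
lo, hi = 1.0, 10.0
z95 = NormalDist().inv_cdf(0.95)  # one-sided 95% quantile of the standard normal
y_dist = NormalDist(mu=(lo + hi) / 2, sigma=(hi - lo) / (2 * z95))

# Share of the denominator's probability mass that falls below zero.
p_negative = y_dist.cdf(0.0)

rng = random.Random(0)
ys = [rng.gauss(y_dist.mean, y_dist.stdev) for _ in range(100_000)]

cost = 100.0  # hypothetical fixed numerator
ratios = [cost / y for y in ys]

print(f"P(y < 0)       = {p_negative:.3f}")
print(f"cost / mean(y) = {cost / fmean(ys):.1f}")
print(f"mean(cost / y) = {fmean(ratios):.1f}")  # dominated by draws near zero
print(f"min, max ratio = {min(ratios):.0f}, {max(ratios):.0f}")
```

A small share of the draws is negative, and the draws closest to zero produce ratios that are enormous in either direction, so the mean of the ratios can change a lot from one seed to the next.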
To report point estimates as calculated from point estimates
Isn’t that also misleading in many cases? e.g. mean(log(cash)) ≠ log(mean(cash)), and I think we care about the former.
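A quick way to see the size of that gap is to simulate it. The sketch below assumes `cash` is lognormal with made-up parameters; for a lognormal the gap between log(mean(cash)) and mean(log(cash)) should come out close to sigma²/2.

```python
import math
import random
from statistics import fmean

rng = random.Random(1)

# Hypothetical parameters: median cash of 1000, sigma of 1 on the log scale.
mu, sigma = math.log(1000), 1.0
cash = [rng.lognormvariate(mu, sigma) for _ in range(200_000)]

mean_of_log = fmean(math.log(c) for c in cash)
log_of_mean = math.log(fmean(cash))

print(f"mean(log(cash)) = {mean_of_log:.3f}")
print(f"log(mean(cash)) = {log_of_mean:.3f}")
print(f"gap             = {log_of_mean - mean_of_log:.3f}")
```

Since log is concave, log(mean(cash)) is always at least mean(log(cash)); the more spread out `cash` is, the larger the gap.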
first check if mean(x/y) approximates mean(x)/mean(y). If so, then we use the uncertainty generated.
I think if y does not vary too much on a log scale this is good enough, and that is true in most cases (not sure how best to express this; maybe that y has a low relative variance?).
Otherwise, I guess you can try to consider y/x instead and still be informative (e.g. lives per $1,000 instead of dollars per life).
Sad to hear you don’t know of a good general solution :(
Do you think showing point clouds, like you do in the HLI reports, helps with this?
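One way to make "does not vary too much on a log scale" checkable is to simulate how far mean(x/y) drifts from mean(x)/mean(y) as the log-scale spread of y grows. This is only a sketch: it assumes x and y are independent lognormals with made-up parameters, in which case the ratio of the two quantities should be close to exp(sigma_y²). The helper `ratio_gap` is a hypothetical name for illustration.

```python
import random
from statistics import fmean

def ratio_gap(sigma_y, n=100_000, seed=2):
    """Return mean(x/y) divided by mean(x)/mean(y) for independent lognormals."""
    rng = random.Random(seed)
    xs = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]      # hypothetical numerator
    ys = [rng.lognormvariate(0.0, sigma_y) for _ in range(n)]  # hypothetical denominator
    mean_of_ratio = fmean(x / y for x, y in zip(xs, ys))
    ratio_of_means = fmean(xs) / fmean(ys)
    return mean_of_ratio / ratio_of_means

# sigma_y is the standard deviation of log(y).
gaps = {sigma_y: ratio_gap(sigma_y) for sigma_y in (0.1, 0.5, 1.0)}
for sigma_y, gap in gaps.items():
    print(f"sigma_y = {sigma_y}: mean(x/y) is {gap:.2f} times mean(x)/mean(y)")
```

With a small log-scale spread the two quantities nearly coincide; once y spans about an order of magnitude they differ by a large factor.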
EDIT: mean(x)/mean(y) has some variance, but it’s not quite what we’re after
Isn’t the variance 0? Since mean(x)/mean(y) is a number and not a distribution?
x/y does have a variance in most cases, though not always (e.g. a Cauchy distribution apparently has no defined variance, but the math seems harder than what we need).
Thanks again for your comment!
Hey!
No, I don’t think that’s correct. I take it that by “mean(x)” and “mean(y)” you mean the sample averages of x and y. In this case, these means will have variances equal to Var(x)/N and Var(y)/N. Consequently, the ratio of mean(x) and mean(y) will also have a variance. See here and here.
Thanks for commenting, this is interesting!
(For other people that had never heard of this and are curious about the derivation)
So the variance would be $\mathrm{Var}(\bar{X}/\bar{Y}) = \mathrm{Var}(\sum X / \sum Y)$, which can be estimated by running a simulation or with these fancy formulas: https://en.wikipedia.org/wiki/Ratio_estimator#Variance_estimates
But now I’m still confused: why is this “not quite what we’re after” and why can’t we use it to express the uncertainty around the cost/effectiveness ratio?
But doesn’t this tend to 0 if we consider enough samples? (N very large)
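The "tends to 0" intuition can be checked with a small simulation. The sketch below assumes x and y are independent lognormals with made-up parameters; it estimates the variance of mean(x)/mean(y) across repeated samples of size N and compares it with the variance of individual x/y draws. The names `draw_x`, `draw_y` and `ratio_of_means` are hypothetical helpers.

```python
import random
from statistics import fmean, pvariance

rng = random.Random(3)

def draw_x():
    return rng.lognormvariate(0.0, 0.5)  # hypothetical cost draws

def draw_y():
    return rng.lognormvariate(0.0, 0.5)  # hypothetical effect draws

def ratio_of_means(n):
    """mean(x)/mean(y) computed from one fresh sample of size n."""
    return fmean(draw_x() for _ in range(n)) / fmean(draw_y() for _ in range(n))

reps = 500  # number of repeated samples per N
var_ratio_of_means = {
    n: pvariance([ratio_of_means(n) for _ in range(reps)])
    for n in (10, 100, 1000)
}

# Spread of the individual ratios, which does not depend on N.
var_individual_ratio = pvariance([draw_x() / draw_y() for _ in range(100_000)])

for n, v in var_ratio_of_means.items():
    print(f"N = {n:>4}: Var(mean(x)/mean(y)) = {v:.5f}")
print(f"Var(x/y) for single draws  = {var_individual_ratio:.3f}")
```

Under these assumptions the variance of mean(x)/mean(y) shrinks roughly like 1/N, so it reflects sampling noise in the simulation, while the variance of x/y stays put however many samples are drawn.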