1/E(X) is not E(1/X)
When modeling with uncertainty we often care about the expected value of our result. In cost-effectiveness analyses (CEAs), in particular, we often try to estimate $E(\text{effect}/\text{cost})$. This is different from both $1/E(\text{cost}/\text{effect})$ and $E(\text{effect})/E(\text{cost})$ (which are also different from each other).[1] The goal of this post is to make this clear.
One way to simplify this is to assume that the cost is constant. So we only have uncertainty about the effect. We will also assume at first that the effect can only be one of two values, say either 1 QALY or 10 QALYs with equal probability.
Expected value is defined as the weighted average of all possible values, where the weights are the probabilities associated with these values. In math notation, for a random variable $X$,
$$E(X) = \sum_i x_i \, P(X = x_i),$$
where $x_i$ are all of the possible values of $X$.[2] For non-discrete distributions, like a normal distribution, we replace the sum with an integral.
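Coming back to the example above, we seek the expected value of effect over cost. As the cost is constant, say $C$ dollars, we only have two possible values:
$$E\left(\frac{\text{effect}}{\text{cost}}\right) = 0.5 \cdot \frac{1}{C} + 0.5 \cdot \frac{10}{C} = \frac{5.5}{C}.$$
In this case we do have $E(\text{effect}/\text{cost}) = E(\text{effect})/E(\text{cost})$, but as we’ll soon see that’s only because the cost is constant. What about $1/E(\text{cost}/\text{effect})$?
$$\frac{1}{E(\text{cost}/\text{effect})} = \frac{1}{0.5 \cdot \frac{C}{1} + 0.5 \cdot \frac{C}{10}} = \frac{1}{0.55\,C} \approx \frac{1.82}{C},$$
which is not $\frac{5.5}{C}$, but a smaller amount.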
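The two-outcome example can be checked with exact arithmetic. This is a minimal sketch: the constant cost of 100 dollars is an assumption chosen only for illustration, and the helper name `expectation` is hypothetical.

```python
from fractions import Fraction

# Assumed constant cost in dollars (100 is purely illustrative).
C = Fraction(100)

# The two equally likely effects from the example, in QALYs: (probability, effect).
outcomes = [(Fraction(1, 2), Fraction(1)), (Fraction(1, 2), Fraction(10))]


def expectation(f):
    """Probability-weighted average of f(effect) over the possible effects."""
    return sum(p * f(effect) for p, effect in outcomes)


# E(effect / cost)
e_effect_over_cost = expectation(lambda effect: effect / C)
# E(effect) / E(cost); the cost is constant, so E(cost) = C
ratio_of_expectations = expectation(lambda effect: effect) / C
# 1 / E(cost / effect)
inverse_of_e_cost_over_effect = 1 / expectation(lambda effect: C / effect)

print(e_effect_over_cost)             # equals 5.5 / C
print(ratio_of_expectations)          # same value, because the cost is constant
print(inverse_of_e_cost_over_effect)  # equals 1 / (0.55 C), a smaller amount
```

Using `Fraction` avoids floating-point rounding, so the equality of the first two quantities is exact rather than approximate.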
The point is that generally $1/E(X) \neq E(1/X)$. In fact, for a positive random variable $X$ we always have $1/E(X) \le E(1/X)$, with equality if and only if $X$ is constant.[3]
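For readers who want to see why the inequality holds without invoking Jensen's inequality, here is one elementary route via the Cauchy-Schwarz inequality. It is a sketch that assumes $X$ is positive and that both $E(X)$ and $E(1/X)$ are finite:

```latex
1 = \left( E\!\left[ \sqrt{X} \cdot \frac{1}{\sqrt{X}} \right] \right)^{2}
  \le E(X) \, E\!\left( \frac{1}{X} \right)
\quad \Longrightarrow \quad
\frac{1}{E(X)} \le E\!\left( \frac{1}{X} \right).
```

Equality in Cauchy-Schwarz requires $\sqrt{X}$ to be proportional to $1/\sqrt{X}$ with probability one, which happens only when $X$ is constant.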
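Another common and useful example is when $X$ is lognormally distributed with parameters $\mu, \sigma^2$. That means, by definition, that $\ln X$ is normally distributed with expected value and variance $\mu$ and $\sigma^2$ respectively. The expected value of $X$ itself is a slightly more complicated expression:
$$E(X) = e^{\mu + \sigma^2/2}.$$
Now the fun part: $1/X$ is also lognormally distributed! That’s because $\ln(1/X) = -\ln X$. Its parameters are $-\mu, \sigma^2$ (why?) and so we get
$$E\left(\frac{1}{X}\right) = e^{-\mu + \sigma^2/2}.$$
In fact, we see that the ratio between these values is
$$\frac{E(1/X)}{1/E(X)} = E\left(\frac{1}{X}\right) E(X) = e^{\sigma^2}.$$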
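The lognormal case can be sanity-checked numerically. The sketch below compares the closed-form ratio with a Monte Carlo estimate; the parameter values, the seed and the sample size are arbitrary assumptions made for illustration.

```python
import math
import random

# Illustrative parameters (assumptions, not taken from the post).
mu, sigma = 1.0, 0.5

# Closed-form expectations for a lognormal X, where ln X ~ Normal(mu, sigma^2).
e_x = math.exp(mu + sigma**2 / 2)        # E(X)
e_inv_x = math.exp(-mu + sigma**2 / 2)   # E(1/X), since ln(1/X) ~ Normal(-mu, sigma^2)
closed_form_ratio = e_inv_x * e_x        # E(1/X) divided by 1/E(X)

# Monte Carlo estimate of the same ratio from simulated draws.
rng = random.Random(0)
samples = [rng.lognormvariate(mu, sigma) for _ in range(200_000)]
mean_x = sum(samples) / len(samples)
mean_inv_x = sum(1 / x for x in samples) / len(samples)
simulated_ratio = mean_inv_x * mean_x

print(closed_form_ratio, math.exp(sigma**2))  # the two closed-form values agree
print(simulated_ratio)                        # close to them, up to sampling noise
```

Note that the ratio depends only on the variance parameter of $\ln X$ and not on its mean, so rescaling $X$ leaves the gap unchanged.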
[1] See “Probability distributions of Cost-Effectiveness can be misleading” for relevant discussion. There are arguably reasons to care about the two alternatives $1/E(\text{cost}/\text{effect})$ or $E(\text{effect})/E(\text{cost})$ rather than $E(\text{effect}/\text{cost})$, which are left for a future post.
[2] One way to imagine this is that if we sample $X$ many times we will observe each possible value $x$ roughly $P(X = x)$ of the time. So the expected value would indeed generally be approximately the average value of many independent samples.
[3] Due to Jensen’s Inequality.
Commenting to help out any other people confused by the mathematical notation, because I couldn’t find this out with Google (but ChatGPT got it for me): E(X) denotes the expected value of X.
(Of course, given that I didn’t know that, I have no hope of following the entire post, but at least I now understand roughly what the claim is)
Great post! A particular issue is that E(cost/effect) is infinite or undefined if you have a non-zero probability that the effect is 0. This is very commonly the case.
Another interesting point, highlighted by your log normal example, is that higher variance will tend to increase the difference between E(1/X) and 1/E(X).
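The effect of spread on the gap can also be seen outside the lognormal case. The sketch below uses a hypothetical two-point distribution (a fixed mean plus or minus a spread, each with probability 1/2); the function name `gap_ratio` and the numbers are assumptions for illustration only.

```python
def gap_ratio(mean, spread):
    """Return E(1/X) divided by 1/E(X) for X in {mean - spread, mean + spread}.

    Each of the two values has probability 1/2, so E(X) = mean.
    """
    if not 0 <= spread < mean:
        raise ValueError("need 0 <= spread < mean so that X stays positive")
    e_x = mean
    e_inv_x = 0.5 / (mean - spread) + 0.5 / (mean + spread)
    return e_inv_x * e_x


# The mean stays at 10 while the spread (and hence the variance) grows.
for spread in [0, 2, 5, 8, 9.9]:
    print(spread, round(gap_ratio(10, spread), 3))
```

The ratio starts at 1 when there is no uncertainty and grows without bound as the lower outcome approaches zero, which is the same mechanism that makes E(cost/effect) blow up when the effect can be 0.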
E[effect/cost] will also inflate the cost-effectiveness, giving too much weight to cases where you spend little and too little weight to your costs (and opportunity costs) when you spend a lot. If there’s a 90% chance the costs are astronomical and there’s no impact or it’s net negative, the other 10% could still make it “look” cost-effective. The whole project could have negative expected effects (negative E[effect]), but positive E[effect/cost]. That would not be a project worth supporting.
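A small numerical illustration of this failure mode follows. The probabilities, costs and effects are invented for the sketch and do not describe any real project.

```python
# Hypothetical scenarios as (probability, cost in dollars, effect in arbitrary units).
scenarios = [
    (0.9, 1_000_000.0, -10.0),  # astronomical cost, net-negative impact
    (0.1, 100.0, 50.0),         # cheap and beneficial
]

e_effect = sum(p * effect for p, cost, effect in scenarios)
e_cost = sum(p * cost for p, cost, effect in scenarios)
e_ratio = sum(p * effect / cost for p, cost, effect in scenarios)

print(e_effect)           # negative: the project does harm in expectation
print(e_effect / e_cost)  # negative as well
print(e_ratio)            # positive: the cheap, good scenario dominates the ratio
```

The expected ratio is positive because dividing by a huge cost shrinks the bad scenario's contribution to almost nothing, even though that scenario carries most of the probability and most of the spending.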
You should usually be estimating E[costs]/E[effects] or E[effects]/E[costs], not the expected values of ratios.
Have you figured out exactly when “E[costs]/E[effects] or E[effects]/E[costs]” is called for? I have historically agreed with the point you are making, but my beliefs have been shaken recently. Here’s an example that has made me think twice:
You are donating $100 to a malaria charity and can choose between charity A and B. Charity A gets bednets for $1 each. Charity B does not yet know the cost of its bednets, but they will cost either $0.50 or $1.50 with equal probability.
Donating to charity A has a value of 100 bednets. Donating to charity B has expected value 133 bednets (equal chance of buying 200 or 66). But “E[costs]/E[effects] or E[effects]/E[costs]” is the same for each charity. In this case, E[effect/cost] seems like the right metric.
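The bednet numbers can be reproduced in a few lines. This sketch reads "cost" as the price per bednet, which appears to be the intended meaning here; the function names are hypothetical.

```python
budget = 100.0  # dollars donated

# Price per bednet as (probability, price in dollars) pairs.
charity_a = [(1.0, 1.00)]
charity_b = [(0.5, 0.50), (0.5, 1.50)]


def expected_price(prices):
    """Expected price of one bednet."""
    return sum(p * price for p, price in prices)


def expected_bednets(prices, budget):
    """Expected number of bednets bought when the whole budget is spent."""
    return sum(p * budget / price for p, price in prices)


for name, prices in [("A", charity_a), ("B", charity_b)]:
    print(name, expected_price(prices), round(expected_bednets(prices, budget), 1))
```

Both charities have the same expected price per bednet, yet charity B buys more bednets in expectation, because the number bought is proportional to 1/price and E(1/price) exceeds 1/E(price).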
So is the difference the fact that total costs are fixed? Someone deciding whether to start an organization or to commit to fully funding a new intervention would have to contend with variable, unknown total costs.
Is it because funding charity A involves buying 100 “shares” in an intervention, and funding charity B involves buying either 200 or 66 “shares”, which “E[costs]/E[effects] or E[effects]/E[costs]” fails to capture?
If you set costs=$100 as constant in this case, then
E[effects/costs] = E[effects]/costs = E[effects]/E[costs],
and both are right.
The cases where it matters are the ones where you don’t know how much you’ll spend, including, if you’re starting or running a charity, how much your charity will spend. For example, depending on your outputs, impact and updated expectations for cost-effectiveness, you might stop taking donations and shut down.
If you wanted specifically to buy exactly 100 bednets, then committing to B would be worse, because you’ll spend more than $100 in expectation, and each extra expected dollar could have bought another bednet from A. This would be more relevant from the perspective of a charity that doesn’t know how much it’ll need to spend to reach a specific fixed binary goal, like a government policy change. But, the ratios of expected values still seem right here.
I’m not sure there are any cases where E[costs]/E[effects] or E[effects]/E[costs] gives the wrong answer in practical applications for resource allocation, if costs is what you’ll spend on the intervention. E[effects] is what you’re trying to maximize, and you can get that by multiplying E[effects]/E[costs] and E[costs]. E[effects/costs] won’t in general give you E[effects] when you multiply by E[costs].
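The multiplication argument can be made concrete. In the sketch below the two scenarios are invented numbers in which total spending is uncertain; `summarize` is a hypothetical helper.

```python
def summarize(scenarios):
    """Return (E[costs], E[effects], E[effects/costs]) for (prob, cost, effect) triples."""
    e_costs = sum(p * cost for p, cost, effect in scenarios)
    e_effects = sum(p * effect for p, cost, effect in scenarios)
    e_ratio = sum(p * effect / cost for p, cost, effect in scenarios)
    return e_costs, e_effects, e_ratio


# Hypothetical project: spending and impact both depend on which scenario occurs.
scenarios = [(0.5, 50.0, 10.0), (0.5, 150.0, 12.0)]
e_costs, e_effects, e_ratio = summarize(scenarios)

recovered = (e_effects / e_costs) * e_costs  # gives back E[effects]
not_recovered = e_ratio * e_costs            # generally differs from E[effects]

print(e_effects, recovered, not_recovered)
```

With these numbers the expected ratio, scaled by expected costs, overstates the expected effect, because the low-cost scenario receives extra weight in the ratio.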
Ah yes, I see that.
Having now read the post that Lorenzo recommended below, I’m coming round to the majority view that the key question is “how much good could we expect from a fixed unit of cost?”.
I think in this thread there are two ways of defining costs:
- Michael considers the cost as the total amount spent.
- Stan suggests a case where the cost is the amount needed to be spent per unit of intervention.
I think this is the major source of disagreement here, right?
This discussion resembles the observation that the cost-effectiveness ratio should mostly be used in the margin. That is, in the end we care about something like (total effect)−(total cost) and when we decide where to spend the next dollar we should compute the derivatives with respect to that extra resource and choose the intervention which maximizes that increased value.
I thought about this a bit and have edited this post from last year. I’m curious if you find it useful!
There’s also lots of discussion in the comments on that post about why E[effect/cost] is better than E[effects]/E[costs] (which I originally argued for) according to most commenters (which I now agree with).
Thank you. I’d been drafting a very similar post of my own!!
It’s probably still worth posting! E.g. it seems that @MichaelStJules and commenters on my post would disagree on defaulting to E[effect/cost] vs E[effects]/E[costs]
Things to add:
- Graphical example
- Explicit discussion of GiveWell’s use of cost/effect
- Real-world examples where this causes a significant error
- Expand on the lognormal example, discuss cases where the variance is in the same order of magnitude as the expected value
- What this says about how we should model uncertainty
From my experience last year writing the post you linked to (which I now just edited), I would really dumb things down and try to make the graphical example as simple as possible for people with ~high-school math knowledge (like me)
Several smart people I asked for advice about that post seemed to find reasoning about this not trivial, gave somewhat contradicting answers, or thought that the solution I had thought of was too obvious to post, despite it being very wrong (as you explained in the comments last year, thank you!). Even after a year, tbh I’m not sure I “grok” it completely (see the appendix I added to the post). But I reassure myself with the fact that costs very rarely span orders of magnitude, and in those cases, I could try to model E(value) directly rather than E(value/cost) as a proxy.
Thanks! Yeah, I totally agree—the topic is surprisingly delicate and nonintuitive, and my examples above are too technical.
By the way, I’d love it if other people would write posts that make the exact same point but better! (or for different audiences).
I think most people who’d benefit from reading this will bounce off the equations above. Here is a concrete example that I think is easy to understand.
Consider a variable X that can be 1 or 2 with equal probability.
E(X) = (1+2)/2 = 1.5
So 1/E(X) = 1/1.5 ≈ 0.667
Whereas E(1/X) = (1/1 + 1/2)/2 = 0.75
So sometimes 1/E(X) does not equal E(1/X).