Thanks for raising this question! Following another comment, I find the use of mean(cost)/mean(effect) somewhat unsatisfactory.
Perhaps some of the confusion could be reduced by i) taking into account the number of interventions and ii) distinguishing the following two situations:
1. Epistemic uncertainty: the magic intervention will always save 1 life, or always save 100 lives, or always save 199 lives; we just don’t know which. In this case, no matter how many times one repeats the intervention, the expected cost-effectiveness remains ~$3,400/life.
2. True randomness: sometimes the magic intervention will save 1 life, sometimes 100 lives, sometimes 199 lives. What happens then if you repeat it n times? If n=1, your expectation is still ~$3,400/life (driven by the tail risk of saving a single life). But the more interventions you do, the more you converge to a combined cost-effectiveness of $100/life (see figure below), because failed interventions will probably be compensated for by very successful ones.
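R code to reproduce the plot:

    library(scales)  # provides alpha() for transparent colours
    # Number of interventions (1 to 20) for each simulated run
    X <- sample(1:20, 1000000, replace = TRUE)
    # Combined cost-effectiveness: total cost ($10,000 each) / total lives saved
    Y <- sapply(X, function(n) 10000 * n / sum(sample(c(1, 100, 199), n, replace = TRUE)))
    plot(X, Y, log = "y", pch = 19, col = alpha("forestgreen", 0.3),
         xlab = "Number of interventions", ylab = "Cost-effectiveness ($/life, log scale)",
         main = "Expected cost to save a life decreases with more interventions")
    # Mean cost-effectiveness for each number of interventions
    ns <- sort(unique(X))
    lines(ns, sapply(ns, function(x) mean(Y[X == x])), lwd = 3, col = alpha("darkgreen", 0.5))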
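To make the two numbers concrete, here is the arithmetic behind them. It assumes, as read off the R code above, a cost of c = $10,000 per intervention and equal probability for the three outcomes; X denotes the lives saved by one intervention (my notation, not from the post).

```latex
% n = 1, or epistemic uncertainty: average the ratio over the three outcomes
\[
\mathbb{E}\!\left[\frac{c}{X}\right]
  = \frac{10\,000}{3}\left(\frac{1}{1} + \frac{1}{100} + \frac{1}{199}\right)
  \approx \$3{,}383 \text{ per life}
\]
% n -> infinity under true randomness: total cost over total lives saved
\[
\frac{n\,c}{\sum_{i=1}^{n} X_i} \;\longrightarrow\; \frac{c}{\mathbb{E}[X]}
  = \frac{10\,000}{(1 + 100 + 199)/3} = \$100 \text{ per life}
\]
```

The first line is where the ~$3,400/life comes from. The second is the law-of-large-numbers limit, which is the same quantity as mean(cost)/mean(effect).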
I’m not sure how to translate this into practice, especially since you can consider EA interventions as a portfolio even if you don’t repeat the intervention 10 times yourself. But do you find this framing useful?
Thanks so much for writing this! I understood it much better than the other comments.
“do you find this framing useful?”
I do! The “epistemic uncertainty” vs “true randomness” distinction in particular is, I think, the core of the misunderstanding. I think we’re usually in the second scenario (and funding lots of different interventions), but that was indeed a very implicit assumption!
Edit: for a formalization of the sample bias shown in the plot, you might be interested in this link from another comment: https://en.wikipedia.org/wiki/Ratio_estimator#Statistical_properties
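A rough sketch of that bias, for what it's worth. It uses Jensen's inequality and the standard second-order Taylor (delta-method) approximation, not anything taken from the linked article. The notation is my own: c = $10,000 is the cost per intervention, X_i the lives saved by intervention i, and μ = 100 and σ² = (99² + 0² + 99²)/3 = 6534 the mean and variance of lives saved, assuming the three outcomes are equally likely.

```latex
% Jensen's inequality: 1/x is convex for x > 0, so the expected ratio
% is never below the ratio of expectations
\[
\mathbb{E}\!\left[\frac{n\,c}{\sum_{i=1}^{n} X_i}\right]
  = c\,\mathbb{E}\!\left[\frac{1}{\bar{X}_n}\right]
  \;\ge\; \frac{c}{\mathbb{E}[\bar{X}_n]} = \frac{c}{\mu}
\]
% Second-order Taylor (delta-method) approximation, meaningful for large n only
\[
c\,\mathbb{E}\!\left[\frac{1}{\bar{X}_n}\right]
  \approx \frac{c}{\mu} + \frac{c\,\sigma^2}{n\,\mu^3}
  = 100 + \frac{10\,000 \times 6534}{n \times 100^3}
  \approx 100 + \frac{65}{n} \quad \text{(\$ per life)}
\]
```

The approximation only describes the tail of the curve. For small n it badly understates the gap (n = 1 gives about $165, against an exact value of about $3,383), because the 1-life outcome dominates the average of 1/X.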