Thanks for sharing your thinking in a detailed and accessible way! I think this is a great example of reasoning transparency about philanthropic grantmaking, and relevant modelling.
Similarly, the impact of any given policy depends on the quality of implementation, features of the world we do not know before, as well as general political, economic and geopolitical conditions, to name a few. Again, an uncertainty of a factor of 10x seems conservative ex ante.
How are you thinking about adaptation to climate change (e.g. more air conditioning)?
If all the uncertainties are independent – meaning knowing how one would resolve tells us nothing about the others – we are right to multiply them which gives us an overall uncertainty of at least 3000, with four uncertainties layered on top of each other.
I do not think this is correct. If the component uncertainties are independent, the overall uncertainty will be narrower than that suggested by the product of the component uncertainties. If one gets a relatively high value in the 1st multiplier, one will tend to get a relatively lower value in the 2nd due to regression to the mean (even if the multipliers are independent). If the multipliers were perfectly correlated, then the overall uncertainty would be the product of the component uncertainties.
If Y is the product of independent lognormal distributions X_1, X_2, …, and X_N, and r_i is the ratio between the values of 2 quantiles of X_i (e.g. r_i = “95th percentile of X_i”/“5th percentile of X_i”), I think the ratio R between the same 2 quantiles of Y (e.g. R = “95th percentile of Y”/“5th percentile of Y”) is e^((ln(r_1)^2 + … + ln(r_N)^2)^0.5). For multipliers whose 95th percentiles are 3, 10, 10 and 10 times the 5th percentiles, the 95th percentile of the overall multiplier would be 62.6 (= e^((ln(3)^2 + 3*ln(10)^2)^0.5)) times the 5th percentile. This is much smaller than the ratio of 3,000 you mention above, but actually pretty close to the ratio of 40 between the all-things-considered expected value of organisations A and B!
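To make the arithmetic above easy to check, here is a minimal Python sketch. It evaluates the closed-form expression for the four multipliers and compares it against a quick Monte Carlo run. The function name `combined_quantile_ratio`, the sample size and the seed are my own illustrative choices, and the simulation assumes each multiplier is lognormal with median 1 (the median does not affect the ratio).

```python
import math
import numpy as np

def combined_quantile_ratio(ratios):
    """Quantile ratio of a product of independent lognormals,
    given the same quantile ratio r_i for each factor."""
    return math.exp(math.sqrt(sum(math.log(r) ** 2 for r in ratios)))

ratios = [3, 10, 10, 10]
closed_form = combined_quantile_ratio(ratios)
naive_product = math.prod(ratios)  # what multiplying the ranges would give

# Monte Carlo check. Each factor is lognormal, so ln(X_i) is normal with
# sigma_i = ln(r_i) / (2 * z95), where z95 is the standard normal 95th percentile.
Z95 = 1.6448536269514722
sigmas = np.array([math.log(r) / (2 * Z95) for r in ratios])
rng = np.random.default_rng(0)
log_y = rng.normal(0.0, sigmas, size=(200_000, len(ratios))).sum(axis=1)
p5, p95 = np.percentile(log_y, [5, 95])
simulated = math.exp(p95 - p5)

print(f"closed form: {closed_form:.1f}")
print(f"simulated:   {simulated:.1f}")
print(f"naive product of ranges: {naive_product}")
```

The simulated ratio should land close to the closed-form value, and both are far below the naive product.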
As a side note, if all multipliers have the same uncertainty r_i = r, and are:
Independent, the overall uncertainty will be R = r^(N^0.5). So the logarithm of the overall uncertainty ln(R) = N^0.5*ln(r) would increase sublinearly with the number of multipliers.
Perfectly correlated, the overall uncertainty will be R = r^N. So the logarithm of the overall uncertainty ln(R) = N*ln(r) would increase linearly with the number of multipliers.
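As a rough illustration of the two cases above, the sketch below tabulates R for a few values of N. The choice of r = 10 and the helper name `overall_ratio` are just assumptions for the example.

```python
import math

def overall_ratio(r, n, correlated):
    """Overall quantile ratio for n lognormal multipliers that each have ratio r.

    Independent multipliers give r^(n^0.5); perfectly correlated ones give r^n.
    """
    exponent = n if correlated else math.sqrt(n)
    return r ** exponent

r = 10  # illustrative per-multiplier ratio
for n in (1, 2, 4, 7):
    independent = overall_ratio(r, n, correlated=False)
    correlated = overall_ratio(r, n, correlated=True)
    print(f"N={n}: independent ~{independent:,.0f}x, "
          f"perfectly correlated {correlated:,.0f}x")
```

With 4 multipliers of 10x each, for instance, the independent case gives 100x while the perfectly correlated case gives 10,000x.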
We can see this by plotting a graph like the below: Organization B outperforms Organization A when the above plotted ratio is greater than 1; in other words, when the expected value for Organization B is greater than the expected value for Organization A. As shown in the distribution ratio above, Organization B has a greater expected value than Organization A in 91% of the simulated cases.
This might be a sensible reality check. However, I would say we should conceptually care about E(“cost-effectiveness of A”)/E(“cost-effectiveness of B”), not E(“cost-effectiveness of A”/“cost-effectiveness of B”), where E is the expected value.
In this piece, we tried to characterize the problem we face when making claims about expected impacts in a high-uncertainty environment such as climate philanthropy.
How do you think about adaptation?
With our illustrative example we demonstrated how we currently think about tackling this problem, introducing a suite of tools of varying granularity and generality along the way.
Do you have any thoughts on applying a similar methodology and developing analogous models for other areas where there is large uncertainty? I think it would be great to do for AI, bio and nuclear what you did for climate. Have you considered doing this at Founders Pledge (potentially with funding from Open Phil, which I guess would be keener to support these areas than climate)?
I would also be curious to know whether you have tried to pitch your approach to non-EA funders, potentially even outside philanthropy. There is a lot of non-EA funding going to climate, so it would be good if more of it were allocated in an effective way!
Hi Vasco,
Thanks for your thoughtful comment!
It took me a while to fully parse, but here are my thoughts. Let me know if I misunderstood something.
I/ Re the 3000x example, I think I wasn’t particularly clear in the talk and this is a misunderstanding resulting from that. You’re right to point out that the expected uncertainty is not 3000x.
I meant this more as a quick demonstration that once you put a couple of uncertainties together, it quickly becomes quite hard to evaluate whether something meets a given bar, because the range of outcomes is extremely large (even if, after accounting for regression to the mean, the uncertainty in the example goes to ~40x, as you suggest, this is still not that helpful for evaluating whether something meets a bar).
And if we did this for a realistic case, e.g. something with 7+ uncertainties, then this would be clear even taking into account regression to the mean. In our current trial runs, we definitely see differences that are significantly larger than 40x, i.e. the two options discussed in the talk do indeed look fairly similar in our model of the overall impact space.
II/ In general, we do not assume that all uncertainties are independent or that they all have the same distribution. Right now we’ve enabled normal, lognormal and uniform distributions as well as correlations between all variables. For technological change, lognormal distributions appear to be a good approximation of the data, but this is not necessarily the case for other variables.
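To illustrate how correlations between variables change the picture, here is a hedged sketch. It is not our actual model. It assumes N lognormal multipliers that all share the same 95th/5th percentile ratio r and the same pairwise correlation rho between their logarithms, in which case the overall ratio is R = r^((N + N*(N - 1)*rho)^0.5). The function names, r = 10, N = 7 and the rho values are hypothetical.

```python
import math
import numpy as np

def overall_ratio(r, n, rho):
    """Closed-form quantile ratio for n equicorrelated lognormal multipliers."""
    return math.exp(math.log(r) * math.sqrt(n + n * (n - 1) * rho))

def simulate_ratio(r, n, rho, samples=200_000, seed=0):
    """Monte Carlo estimate of the 95th/5th percentile ratio of the product."""
    z95 = 1.6448536269514722  # standard normal 95th percentile
    sigma = math.log(r) / (2 * z95)
    # Covariance of the logs: sigma^2 on the diagonal, rho * sigma^2 elsewhere.
    cov = sigma**2 * (np.full((n, n), rho) + (1 - rho) * np.eye(n))
    rng = np.random.default_rng(seed)
    log_y = rng.multivariate_normal(np.zeros(n), cov, size=samples).sum(axis=1)
    p5, p95 = np.percentile(log_y, [5, 95])
    return math.exp(p95 - p5)

r, n = 10, 7
for rho in (0.0, 0.5, 1.0):
    print(f"rho={rho}: closed-form overall ratio ~{overall_ratio(r, n, rho):,.0f}x")
print(f"rho=0.5 simulated: ~{simulate_ratio(r, n, 0.5):,.0f}x")
```

Under these assumptions, rho = 0 recovers the independent case r^(N^0.5) and rho = 1 recovers the perfectly correlated case r^N, with intermediate correlations falling in between.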
III/ “However, I would say we should conceptually care about E(“cost-effectiveness of A”)/E(“cost-effectiveness of B”), not E(“cost-effectiveness of A”/“cost-effectiveness of B”), where E is the expected value.”
Yes, in general we care about E(CE of A/CE of B). The different decomposition in the talk comes from the specific interest in that case, illustrating that even if we are quite uncertain in general about absolute values, we can make relatively confident statements about dominance relations, e.g. that in 91% of worlds a given org dominates even though the first intuitive reaction to the visualization would be “ah, this is all really uncertain, can we really know anything?”.
IV/ The less formalized versions of this overall framework have indeed already influenced a lot of other FP work in other cause areas, e.g. on bio and nuclear risk and also air pollution, and I do expect that some of the tools we are developing will indeed diffuse to other parts of the research team and potentially to other orgs (we aim to publish the code as well). This is very intentional; we try to push methodology forward in mid-to-high uncertainty contexts where little is published so far.
V/ Most of the donors to the Climate Fund are indeed not cause-neutral EAs and we mostly target non-EA audiences.
Thanks for the clarifications, Johannes!
I meant we should in theory just care about r = E(“CE of A”)/E(“CE of B”)[1], and pick A over B if the expected cost-effectiveness of A is greater than that of B (i.e. if r > 1), even if A were worse than B in e.g. 90 % of the worlds. In practice, if A is better than B in 90 % of the worlds (in which case the 10th percentile of “CE of A”/“CE of B” would be 1), r will often be higher than 1, so focussing on r or E(“CE of A”/“CE of B”) will lead to the same decisions.
If r is what matters, to investigate whether one’s decision to pick A over B is robust, the aim of the sensitivity analysis would be ensuring that r > 1 under various plausible conditions. So, instead of checking whether the CE of A is often higher than the CE of B, one should be testing whether the expected CE of A is often higher than the expected CE of B.
In practice, it might be the case that:
If r > 1 and A is better than B in e.g. 90 % of the worlds, then the conclusion that r > 1 is robust, i.e. we can be confident that A will continue to be better than B upon further investigation.
If r > 1 and A is better than B in e.g. just 25 % of the worlds, then the conclusion that r > 1 is not robust, i.e. we cannot be confident that A will continue to be better than B upon further investigation.
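The second case can actually arise with a heavy right tail. As a toy sketch under made-up assumptions (the CE of B is fixed at 1, the CE of A is lognormal, and sigma = 2 is arbitrary), one can pick the parameters so that A beats B in exactly 25 % of the worlds and still get r > 1:

```python
import math
from statistics import NormalDist
import numpy as np

share_a_better = 0.25  # hypothetical: A beats B in 25 % of the worlds
sigma = 2.0            # hypothetical spread of ln(CE of A)

# CE of B is fixed at 1, so A beats B exactly when ln(CE of A) > 0.
# Choose mu so that P(ln(CE of A) > 0) equals share_a_better.
mu = -sigma * NormalDist().inv_cdf(1 - share_a_better)

# Mean of a lognormal is exp(mu + sigma^2 / 2); E(CE of B) = 1, so this is r.
r = math.exp(mu + sigma**2 / 2)

rng = np.random.default_rng(0)
ce_a = rng.lognormal(mean=mu, sigma=sigma, size=1_000_000)
share_simulated = (ce_a > 1).mean()

print(f"r (closed form) = {r:.2f}")
print(f"simulated mean CE of A = {ce_a.mean():.2f}")
print(f"share of worlds where A beats B = {share_simulated:.3f}")
```

Here r comes out well above 1 even though A loses in most worlds, because the rare worlds where A wins are the ones where it wins by a lot.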
How do you think about adaptation (e.g. economic growth, adoption of air conditioning, and migration)? I forgot to finish this sentence in my last comment.
[1] Note that E(X/Y) is not, in general, equal to E(X)/E(Y).
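A quick numerical illustration of this note, as a sketch with arbitrary parameters: for independent lognormal X and Y, E(X/Y) exceeds E(X)/E(Y) by a factor of e^(sigma_Y^2), where sigma_Y is the standard deviation of ln(Y). The values of `sigma_x` and `sigma_y` below are made up.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
sigma_x, sigma_y = 0.5, 0.7  # arbitrary illustrative spreads of ln(X) and ln(Y)

x = rng.lognormal(mean=0.0, sigma=sigma_x, size=n)
y = rng.lognormal(mean=0.0, sigma=sigma_y, size=n)

ratio_of_means = x.mean() / y.mean()  # estimate of E(X)/E(Y)
mean_of_ratios = (x / y).mean()       # estimate of E(X/Y)

# For independent lognormals, E(X/Y) = E(X) * E(1/Y) = (E(X)/E(Y)) * exp(sigma_y^2).
expected_gap = math.exp(sigma_y**2)

print(f"E(X)/E(Y) ~ {ratio_of_means:.3f}")
print(f"E(X/Y)    ~ {mean_of_ratios:.3f}")
print(f"simulated gap ~ {mean_of_ratios / ratio_of_means:.3f}, "
      f"closed-form gap = {expected_gap:.3f}")
```

The gap grows with the uncertainty in the denominator, so the two quantities can differ a lot when Y is very uncertain.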
Thanks, Vasco, for the great comment, upvoted! I am traveling for work right now, but we’ll try to get back to you by ~mid-week.