Thanks for your feedback, Vasco. It’s led me to make extensive changes to the post:
More analysis on the pros/cons of modelling with distributions. I argue that sometimes it’s good that the crudeness of point-estimate work reflects the crudeness of the evidence available. Interval-estimate work is more honest about uncertainty, but runs the risk of encouraging overconfidence in the final distribution.
I include the lognormal mean in my analysis of means. You have convinced me that the sensitivity of lognormal means to heavy right tails is a strength, not a weakness! But the lognormal mean appears to be sensitive to the size of the confidence interval you use to calculate it—which means subjective methods are required to pick the size, introducing bias.
Overall I agree that interval estimation is better suited to the Drake equation than to GiveWell CEAs. But I’d summarise my reasons as follows:
The Drake equation really seeks to ask “how likely is it that we have intelligent alien neighbours?”, but point-estimate methods answer the question “what is the expected number of intelligent alien neighbours?”. With such high variability the expected number is virtually useless, but the distribution of this number allows us to estimate the number of alien neighbours. GiveWell CEAs probably have much less variation, and hence a point-estimate answer is relatively more useful.
Reliable research on the numbers that go into the Drake equation often doesn’t exist, so it’s not too bad to “make up” interval estimates to go into it. We know much more about the charities GiveWell studies, so made-up distributions (even those informed by reliable point-estimates) are much less permissible.
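To make the contrast between the expected number and the full distribution concrete, here is a minimal Monte Carlo sketch. The seven 90 % intervals are invented purely for illustration (they are not taken from the post or from any literature), each factor is assumed to be lognormal, and the fractions are clamped at 1; `FACTORS` and `simulate` are hypothetical names.

```python
import random
from math import log
from statistics import NormalDist, mean, median

Z90 = NormalDist().inv_cdf(0.95)  # about 1.645, half-width of a 90 % interval in sigmas

# (label, lower, upper, is_fraction): made-up 90 % intervals, for illustration only.
FACTORS = [
    ("star formation rate per year", 1, 100, False),
    ("fraction of stars with planets", 0.1, 1, True),
    ("habitable planets per system", 0.1, 1, False),
    ("fraction developing life", 1e-6, 1, True),
    ("fraction developing intelligence", 1e-3, 1, True),
    ("fraction communicating", 1e-2, 1, True),
    ("civilisation lifetime in years", 1e2, 1e8, False),
]

def simulate(n_samples, seed=0):
    """Draw n_samples of the product of all factors, each assumed lognormal."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        product = 1.0
        for _label, lower, upper, is_fraction in FACTORS:
            mu = (log(lower) + log(upper)) / 2
            sigma = (log(upper) - log(lower)) / (2 * Z90)
            value = rng.lognormvariate(mu, sigma)
            if is_fraction:
                value = min(value, 1.0)  # a fraction cannot exceed 1
            product *= value
        results.append(product)
    return results

samples = simulate(50_000)
p_alone = sum(s < 1 for s in samples) / len(samples)
print(f"mean number of civilisations:   {mean(samples):.3g}")
print(f"median number of civilisations: {median(samples):.3g}")
print(f"share of draws with fewer than one civilisation: {p_alone:.2f}")
```

Under these made-up inputs the mean is driven by a handful of extreme draws, while the share of draws with fewer than one civilisation speaks more directly to the "how likely is it" question.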
Thanks again, and do let me know what you think!
Nice, thanks for the update!
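“You have convinced me that the sensitivity of lognormal means to heavy right tails is a strength, not a weakness!”
Yes, but only as long as we think the heavy right tail is being accurately modelled! Jaime Sevilla has a post on which methods to use to aggregate forecasts.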
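“Interval-estimate work is more honest about uncertainty, but runs the risk of encouraging overconfidence in the final distribution.”
I think it is worth flagging that risk, but I would say:
In general, if a given method is more accurate, it seems reasonable to follow that method, all else being equal.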
One can always warn about not overweighting results estimated with intervals.
Intuitively, there seems to be much higher risk of being overconfident about a point estimate than about a mean estimated with intervals together with a confidence interval. For example, regarding Toby Ord’s best guess given in Table 6.1 of The Precipice for the existential risk from nuclear war between 2021 and 2120, I think it is easier to be overconfident about A than B:
A. 0.1 %.
B. 0.1 % (90 % confidence interval, 0.03 % to 0.3 %).
Toby mentions that:
“There is significant uncertainty remaining in these estimates and they should be treated as representing the right order of magnitude—each could easily be a factor of 3 higher or lower”.
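As a rough sketch of what option B conveys beyond option A, one can fit a lognormal to the 90 % interval of 0.03 % to 0.3 %. The lognormal shape is an assumption made here for illustration, not something stated in The Precipice, and the variable names are hypothetical. Note that the median implied by the interval is the geometric midpoint of the bounds, which is close to but not exactly 0.1 %.

```python
from math import exp, log
from statistics import NormalDist

lower, upper = 0.03, 0.3  # bounds of the 90 % interval, in percent

z = NormalDist().inv_cdf(0.95)                 # about 1.645 for a 90 % interval
mu = (log(lower) + log(upper)) / 2             # mean of log(risk)
sigma = (log(upper) - log(lower)) / (2 * z)    # standard deviation of log(risk)
log_risk = NormalDist(mu, sigma)               # log(risk) is normal under the assumption

median_risk = exp(mu)                          # geometric midpoint of the bounds
mean_risk = exp(mu + sigma**2 / 2)             # lognormal mean, pulled up by the right tail
p_above_1_percent = 1 - log_risk.cdf(log(1.0))

print(f"median: {median_risk:.3f} %")
print(f"mean:   {mean_risk:.3f} %")
print(f"probability that the risk exceeds 1 %: {p_above_1_percent:.5f}")
```

The point estimate in A carries none of this information: it cannot say how far the mean sits above the median, or how much probability lies above any given threshold.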
“But the lognormal mean appears to be sensitive to the size of the confidence interval you use to calculate it”
Yes, for the same median, the wider the interval, the greater the mean. If one is having a hard time linking two given estimates to a confidence interval, one can try the narrowest and widest reasonable intervals, and see whether the lognormal mean varies a lot.
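The check described above can be sketched in a few lines. For a lognormal whose central interval at a given confidence level is [lower, upper], the median is the geometric midpoint of the bounds and the mean is exp(mu + sigma^2 / 2). The function name `lognormal_mean` and the two example estimates are hypothetical, chosen only to illustrate the idea.

```python
from math import exp, log
from statistics import NormalDist

def lognormal_mean(lower, upper, confidence):
    """Mean of the lognormal whose central `confidence` interval is [lower, upper]."""
    if not (0 < lower < upper and 0 < confidence < 1):
        raise ValueError("need 0 < lower < upper and 0 < confidence < 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.645 for 90 %
    mu = (log(lower) + log(upper)) / 2              # log of the median
    sigma = (log(upper) - log(lower)) / (2 * z)     # spread implied by the interval
    return exp(mu + sigma**2 / 2)

# Two hypothetical estimates; the median is 10 whichever confidence level is assumed.
low_estimate, high_estimate = 1.0, 100.0
for confidence in (0.50, 0.90, 0.99):
    result = lognormal_mean(low_estimate, high_estimate, confidence)
    print(f"read as a {confidence:.0%} interval -> mean {result:.1f}")
```

Reading the same two estimates as a 50 % interval implies a much wider distribution than reading them as a 99 % interval, so the mean rises sharply. If the means at the two extremes of what seems reasonable are close, the choice matters little; if they differ by orders of magnitude, that is worth reporting alongside the result.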
“We know much more about the charities GiveWell studies, so made-up distributions (even those informed by reliable point-estimates) are much less permissible.”
I think people with knowledge about GiveWell’s cost-effectiveness analyses would be able to come up with reasonable distributions. A point estimate is equivalent to assigning probability 1 to that estimate, and 0 to all other outcomes, so it is easy to come up with something better (although it may well not be worth the effort).
Thanks again!
I think I have been trying to portray the point-estimate/interval-estimate trade-off as a difficult decision, but probably interval estimates are the obvious choice in most cases.
So I’ve re-done the “Should we always use interval estimates?” section to be less about pros/cons and more about exploring the importance of communicating uncertainty in your results. I have used the Ord example you mentioned.
Makes sense, thanks!