Honestly I don't really like "astronomically cost-effective" framings; I think they're misleading, because they imply too much equivalence with standard cost-effectiveness analysis, whereas if they're taken seriously then it's probably the case that many, many actions have astronomical expected impact.
On the intuition pump about human life expectancy:
Suppose that you in fact avert a death that would have occurred in any period, however small
Then this does have a pretty big impact on life expectancy!
Unless you're deferring the death to mere moments later (like the fact that you were maybe about to die would be evidence that you were in a dangerous situation, and so maybe we care about getting to a non-dangerous situation)
Thanks for the comment, Owen!
Agreed[1].
You kind of anticipated my reply with your last point. Once I condition on a given human dying in the next e.g. 1.80*10^-35 seconds (calculation in the post), I should expect there is an astronomically high cost of extending the life expectancy to the baseline e.g. 100 years (e.g. because it is super hard to avoid simulation shutdown or vacuum decay). With reasonable costs, I would only be able to increase it infinitesimally (e.g. by moving a few kilometers away from the location from which the universal collapse wave comes).
A toy example:
Suppose that we could exogenously introduce a new risk which has a 1% chance of immediately ending the universe, otherwise nothing happens
That definitely decreases the expected value of the future by 1% (assuming non-existence is zero)
Therefore, eliminating that risk definitely increases the expected value of the future by 1%
If you think the expected value of the future is astronomical, then eliminating that risk would also have astronomical value
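A minimal numerical sketch of this toy example, assuming for illustration an expected value of the future of 10^52 lives (the figure is a placeholder, not part of the example):

```python
# Toy example: an exogenous risk with a 1% chance of immediately ending
# the universe (value 0), otherwise nothing happens.
EV_FUTURE = 1e52  # placeholder expected value of the future, in lives
p = 0.01          # 1% chance the new risk fires

ev_with_risk = (1 - p) * EV_FUTURE  # the 1% branch contributes 0
gain_from_eliminating = EV_FUTURE - ev_with_risk

print(f"{gain_from_eliminating:.2e}")  # 1.00e+50, i.e. 1% of EV_FUTURE
```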
Agreed.
But it seems to me like a wild assumption to say that all of the probability mass goes onto such ["slightly-more valuable"] worlds
I know you italicised "all", but, just to clarify, I do not think all the probability mass would go to such worlds. Some non-negligible mass would go to astronomically valuable worlds, but I think it would be so small that the increase in the expected value of these would be negligible. Do you have a preferred function describing the difference between the PDF of the value of the future after and before the intervention? It is unclear to me why assuming a decaying exponential, as I did in my calculations for illustration, would be wild.
If you want to avoid the conclusion that avoiding nearterm extinction risk has astronomical value, you therefore need to take one of three branches:
Claim that nearterm extinction risk is astronomically small
I think you're kind of separately making this claim, but that it's not the main point of this post?
Claim that the situation is importantly disanalogous from the toy example I give above
Maybe you think this? But I don't understand what the mechanisms of difference would be
Claim that the future does not have astronomical expected value
Honestly I think there's some plausibility to this line, but you seem not to be exploring it
I sort of take branch 2. I agree eliminating a 1 % instantaneous risk of the universe collapsing would have astronomical benefits, but I would say it would be basically impossible. I think the cost/difficulty of eliminating a risk tends to infinity as the duration of the risk tends to 0[2], in which case the cost-effectiveness of mitigating the risk also goes to 0. In addition, eliminating a risk becomes harder as its potential impact increases, such that eliminating a risk which endangers the whole universe is astronomically harder than eliminating one that only endangers humans.
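A purely illustrative parameterisation of that claim (the functional form, the constants, and the linear scaling in the value at stake are all assumptions for the sketch, not taken from the post):

```python
# Toy model (an assumption for illustration): the cost of eliminating a
# risk blows up as the time T available to act shrinks, and grows with
# the value V at stake.
def cost_to_eliminate(T, V, T0=1.0, k=1.0, c0=1.0):
    return c0 * (T0 / T) ** k * V

def cost_effectiveness(T, V, risk=0.01):
    benefit = risk * V  # eliminating the risk recovers risk * V in expectation
    return benefit / cost_to_eliminate(T, V)

for T in (1.0, 1e-3, 1e-9, 1.80e-35):
    print(f"T = {T:.2e} -> cost-effectiveness = {cost_effectiveness(T, 1e52):.2e}")
# Cost-effectiveness falls linearly with T here, going to 0 as T -> 0.
# If cost grew superlinearly in V, larger-scope risks (e.g. the whole
# universe rather than just humans) would be even less cost-effective
# to eliminate.
```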
As the duration and scope of the risk decrease, postulating that the relative increase in the expected value of the future matches the (absolute) reduction in risk becomes increasingly less valid.
If a risk is described as binary, as in your example[3], it is more natural to assume that the probability mass which is taken out of the risk is e.g. loguniformly distributed across all the other worlds, thus meaningfully increasing the chance of astronomically valuable worlds, and resulting in risk mitigation being astronomically cost-effective. However, if a risk is described as continuous (in time and scope), it is more natural to suppose that the change in its duration and impact will be continuous, and this does not obviously result in risk mitigation being astronomically cost-effective.
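A minimal sketch of that contrast under placeholder assumptions (the loguniform support, the 1 % of relocated mass, and the 10^-10 nudge are illustrative, not from the post):

```python
import math

# Placeholder world-model (an assumption for illustration): values of
# possible futures span [1, 10^52] lives, loguniformly distributed.
v_max = 1e52
mean_value = (v_max - 1) / math.log(v_max)  # mean of a loguniform on [1, v_max]

# Binary description: mitigation moves 1% of probability mass off
# value-0 worlds and spreads it loguniformly over all other worlds,
# so some of it lands on astronomically valuable ones.
p = 0.01
gain_binary = p * mean_value

# Continuous description: mitigation instead nudges every world's value
# up by a tiny relative amount, with no mass relocated to the far tail.
epsilon = 1e-10
gain_continuous = epsilon * mean_value

print(f"binary-style gain:     {gain_binary:.2e}")      # ~8.4e+47 lives
print(f"continuous-style gain: {gain_continuous:.2e}")  # ~8.4e+39 lives
```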
On 1, I guess the probability of humans going extinct over the next 10 years is 10^-7, so nowhere near as small as needed to result in non-astronomically large cost-effectiveness when multiplied by the expected value of the future. However, you are right this is not the main point of the post.
On 3, I guess you find a non-astronomical expected value of the future more plausible due to your guess for the nearterm risk of human extinction being orders of magnitude higher than mine.
[1] Firstly, I do not think reducing the nearterm risk of human extinction being astronomically cost-effective implies it is astronomically more cost-effective than interventions not explicitly focussing on tail risk, like ones in global health and development and animal welfare.
[2] I say in the post that: "In general, any task (e.g. risk mitigation) can be made arbitrarily costly/hard by decreasing the time in which it has to be performed. At the limit, the task becomes impossible when there is no time to complete it."
[3] Either it is on or off. Either it affects the whole universe or nothing at all.
Trying to aim at the point of most confusion about why we're seeing things differently:
Why doesn't your argument show that medical intervention during birth can have only a negligible effect on life expectancy?
For the situation to be analogous, one would have to consider an intervention decreasing the risk of a death over 1.80*10^-35 seconds (see calculations in the post), in which case it does feel intuitive to me that changing the life expectancy would be quite hard. Either the risk would occur over a much longer period, and therefore the intervention would only be mitigating a minor fraction of it, or it would be a very binary risk like simulation shutdown that would be super hard to reduce.
Of course it's not intended to be a strict analogy. Rather, I take the whole shape of your argument to be a strong presumption against interventions during a short period X having an expected effect on total lifespan that is >>X.
But I claim that this happens with birth. Birth is something like 0.0001% of the duration of the whole life, but interventions during birth can easily have impacts on life expectancy which are much greater than 0.0001%. Clearly there's some mechanism in operation here which means these don't need to be tightly coupled. What is it, and why doesn't it apply to the case of extinction risk?
Obviously the numbers involved are much more extreme in the case of human extinction, but I think the birth:life duration ratio is already big enough that we could hope to learn useful lessons from that.
Thanks for clarifying. My argument depends on the specific numbers involved. Reducing the risk of human extinction over the next year only directly affects around 10^10 lives, i.e. 10^-42 (= 10^(10 - 52)) of my estimate for the expected value of the future. I see such a reduction as astronomically harder than decreasing a risk which only directly affects 10^-6 (= 0.0001 %) of the value. My main issue is that these arguments are often analysed informally, whereas I think they require looking into maths like how fast the tail of counterfactual effects decays. I have now added the following bullets to the post, which were initially not imported:
I cannot help noticing that arguments for reducing the nearterm risk of human extinction being astronomically cost-effective might share some similarities with (supposedly) logical arguments for the existence of God (e.g. Thomas Aquinas' Five Ways), although they are different in many aspects too. Their conclusions seem to mostly follow from:
Cognitive biases. In the case of the former, the following come to mind:
Authority bias. For example, in Existential Risk Prevention as Global Priority, Nick Bostrom interprets a reduction in (total/cumulative) existential risk as a relative increase in the expected value of the future, which is fine, but then treats the former as independent of the latter, which I would argue is misguided given the dependence between the value of the future and the increase in its PDF. "The more technologically comprehensive estimate of 10^54 human brain-emulation subjective life-years (or 10^52 lives of ordinary length) makes the same point even more starkly. Even if we give this allegedly lower bound on the cumulative output potential of a technologically mature civilisation a mere 1 per cent chance of being correct, we find that the expected value of reducing existential risk by a mere one billionth of one billionth of one percentage point is worth a hundred billion times as much as a billion human lives."
Nitpick. The maths just above is not right. Nick meant 10^21 (= 10^(52 - 2 - 2*9 - 2 - 9)) times as much just above, i.e. a thousand billion billion times, not a hundred billion times (10^11); see the check after this list.
Binary bias. This can manifest not only in assuming the value of the future is binary, but also in assuming that interventions reducing the nearterm risk of human extinction mostly move probability mass from worlds with value close to 0 to ones which are astronomically valuable, as opposed to just slightly more valuable.
Scope neglect. I agree the expected value of the future is astronomical, but it is easy to overlook that the increase in the probability of the astronomically valuable worlds driving that expected value can be astronomically low too, thus making the increase in the expected value of the astronomically valuable worlds negligible (see my illustration above).
Little use of empirical evidence and detailed quantitative models to catch the above biases. In the case of the former:
As far as I know, reductions in the nearterm risk of human extinction, as well as their relationship with the relative increase in the expected value of the future, are always directly guessed.
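A quick check of the arithmetic in the nitpick above, with each exponent spelled out:

```python
# Exponents from Bostrom's passage: 10^52 lives of ordinary length,
# a 1% chance of that estimate being correct, a risk reduction of one
# billionth of one billionth of one percentage point, and a benchmark
# of a billion human lives.
expected_lives = 52 - 2          # 10^52 lives x 1% chance -> 10^50
risk_reduction = -(9 + 9 + 2)    # 10^-9 * 10^-9 * 10^-2 -> 10^-20
benchmark = 9                    # a billion lives -> 10^9

print(expected_lives + risk_reduction - benchmark)
# 21: a thousand billion billion times, not a hundred billion (10^11)
```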
Sorry, I don't find this is really speaking to my question?
I totally believe that people make mistakes in thinking about this stuff for reasons along the lines of the biases you discuss. But I also think that you're making some strong assumptions about things essentially cancelling out that I think are unjustified, and talking about mistakes that other people are making doesn't (it seems to me) work as a justification.
Sorry, I don't find this is really speaking to my question?
I do not think the difficulty of decreasing a risk is independent of the value at stake. It is harder to decrease a risk when a larger value is at stake. So, in my mind, decreasing the nearterm risk of human extinction is astronomically easier than decreasing the risk of not achieving 10^50 lives of value, such that decreasing the former by e.g. 10^-10 leads to a relative increase in the latter much smaller than 10^-10.
I also think that you're making some strong assumptions about things essentially cancelling out
Could you elaborate on why you think I am making a strong assumption in terms of questioning the following?
In light of the above, I expect what David Thorstad calls rapid diminution. I see the difference between the PDF after and before an intervention reducing the nearterm risk of human extinction as quickly decaying to 0, thus making the increase in the expected value of the astronomically valuable worlds negligible. For instance:
If the difference between the PDF after and before the intervention decays exponentially with the value of the future v, the increase in the value density caused by the intervention will be proportional to v*e^-v[4].
The above rapidly goes to 0 as v increases. For a value of the future equal to my expected value of 1.40*10^52 human lives, the increase in value density will be multiplied by a factor of 1.40*10^52*e^(-1.40*10^52) = 10^(log10(1.40) + 52 - log10(e)*1.40*10^52) ≈ 10^(-6.08*10^51), i.e. it will be basically 0.
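Verifying that factor in log space (e^(-1.40*10^52) underflows ordinary floating point, so the check works with base-10 logarithms):

```python
import math

v = 1.40e52  # the expected value of the future used above, in human lives

# log10(v * e^-v) = log10(v) - v * log10(e)
log10_factor = math.log10(v) - v * math.log10(math.e)
print(f"factor = 10^({log10_factor:.3e})")  # 10^(-6.080e+51): basically 0
```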
Do you think I am overestimating how fast the difference between the PDF after and before the intervention decays? As far as I can tell, the (posterior) counterfactual impact of interventions whose effects can be accurately measured, like ones in global health and development, decays to 0 as time goes by. I do not have a strong view on the particular shape of the difference, but exponential decay is quite typical in many contexts.
I do not think the difficulty of decreasing a risk is independent of the value at stake. It is harder to decrease a risk when a larger value is at stake.
This makes sense as a kind of general prior to come in with. Although note:
It's surely observational, not causal: there's no magic at play which means that if you keep a scenario fixed except for changing the value at stake, this should impact the difficulty
One of the plausible generating mechanisms is having a broad altruistic market which takes the best opportunities, leaving no free lunches; but for some of the cases we're discussing it's unclear the market could have made it efficient
So, in my mind, decreasing the nearterm risk of human extinction is astronomically easier than decreasing the risk of not achieving 10^50 lives of value, such that decreasing the former by e.g. 10^-10 leads to a relative increase in the latter much smaller than 10^-10.
Now it looks to me as though you're dogmatically sticking with the prior. Having come across the (kinda striking) observation which says "if there's a realistic chance of spreading to the stars, then premature human extinction would forgo astronomical value", it seems like you're saying "well, that would mean that the prior was wrong, so that observation can't be quite right", and then reasoning from your prior to try to draw conclusions about the causal relationships there.
Whereas I feel that the prior reasonably justifies more scepticism in cases where more lives are at stake (and indeed, I do put a bunch of probability on "averting near-term extinction doesn't save astronomical value for some reason or another", though the reasons tend to be ones where we never actually had a shot of an astronomically big future in the first place, and I think that that's sort of the appropriate target for scepticism), but it doesn't give you anything strong enough to be confident about things.
(I certainly wouldn't be surprised if I'm somehow misunderstanding what you're doing; I'm just responding to the picture I'm getting from what you've written.)
Now it looks to me as though you're dogmatically sticking with the prior.
Are there any interventions whose estimates of (posterior) counterfactual impact do not decay to 0 in at most a few centuries? From my perspective, their absence establishes a strong prior against persistent longterm effects.
I do put a bunch of probability on "averting near-term extinction doesn't save astronomical value for some reason or another", though the reasons tend to be ones where we never actually had a shot of an astronomically big future in the first place, and I think that that's sort of the appropriate target for scepticism
This makes a lot of sense to me too.
In general our ability to measure long term effects is kind of lousy. But if I wanted to look for interventions which don't have that decay pattern it would be most natural to think of conservation work saving species from extinction. Once we've lost biodiversity, it's essentially gone (maybe taking millions of years to build up again naturally). Conservation work can stop that. And with rises in conservation work over time it's quite plausible that saving species early won't just lead to them going extinct slightly later, but to them being preserved indefinitely.
I was not clear above, but I meant (posterior) counterfactual impact under expected total hedonistic utilitarianism. Even if a species is counterfactually preserved indefinitely due to actions now, which I think would be very hard, I do not see how it would permanently increase wellbeing. In addition, I meant to ask for actual empirical evidence as opposed to hypothetical examples (e.g. of one species being saved and making an immortal conservationist happy indefinitely).
I think this is something where our ability to measure is just pretty bad, and in particular our ability to empirically detect whether the types of things that plausibly have long-lasting counterfactual impacts actually do is pretty terrible.
I respond to that by saying "ok, I guess empirics aren't super helpful for the big picture question; let's try to build mechanistic understanding of things, grounded wherever possible in empirics, as well as priors about what types of distributions occur when various different generating mechanisms are at play", whereas it sounds like you're responding by saying something like "well, as a prior we'll just use the parts of the distribution we can actually measure, and assume that generalizes unless we get contradictory data"?
I respond to that by saying "ok, I guess empirics aren't super helpful for the big picture question; let's try to build mechanistic understanding of things, grounded wherever possible in empirics, as well as priors about what types of distributions occur when various different generating mechanisms are at play", whereas it sounds like you're responding by saying something like "well, as a prior we'll just use the parts of the distribution we can actually measure, and assume that generalizes unless we get contradictory data"?
Yes, that would be my reply. Thanks for clarifying.
Yeah, so I basically think that that response feels "spiritually frequentist", and is more likely to lead you to large errors than the approach I outlined (which feels more "spiritually Bayesian"), especially in cases like this where we're trying to extrapolate significantly beyond the data we've been able to gather.