To clarify, I agree with the consequences you outline given the hypothetical. I just think this is so unlikely and unfalsifiable that it has no practical value for cause prioritisation.
I disagree with you on this one, Vasco Grilo, because I think the argument still works even when things are more stochastic.
To make things just slightly more realistic, suppose that progress, measured by q, is a continuous stochastic process, which at some point drops to zero and stays there (extinction). To capture the "endogenous endpoint" assumption, suppose that the probability of extinction in any given year is a function of q only. And to simplify things, let's assume it's a Markov process (future behaviour depends only on the current state, and is independent of past behaviour).
Suppose the current level of progress is q_0, and we're considering an intervention that will cause it to jump to q_1. We have, in the absence of any intervention:
Total future value given currently at q_0 = Value generated before we first hit q_1 + Value generated after first hitting q_1
By linearity of expectation:
E(Total future value given q_0) = E(Value before first q_1) + E(Value after first q_1)
By the Markov property (small error in this step, corrected in reply, doesn't change conclusions):
E(Total future value given q_0) = E(Value before first q_1) + E(Total future value given q_1)
So as long as E(Value before first q_1) is positive, then we decrease expected total future value by making an intervention that increases q_0 to q_1.
It's just the "skipping a track on a record" argument from the post, but I think it is actually really robust. It's not just an artefact of making a simplistic "everything moves literally one year forward" assumption.
I'm not sure how deep the reliance on the Markov property in the above is. Or how dramatically this gets changed when you allow the probability of extinction to depend slightly on things other than q. It would be interesting to look at that more.
But I think this still shows that the intuition of "once loads of noisy unpredictable stuff has happened, the effect of your intervention must eventually get washed out" is wrong.
Realised after posting that I'm implicitly assuming you will hit q_1, and not go extinct before. For interventions in progress, this probably has high probability, and the argument is roughly right. To make it more general, once you get to this line:
E(Total future value given q_0) = E(Value before first q_1) + E(Value after first q_1)
Next line should be, by Markov:
E(Total future value given q_0) = E(Value before first q_1) + P(hitting q_1) E(Total future value given q_1)
So:
E(Total future value given q_1) = (E(Total future value given q_0) - E(Value before first q_1)) / P(hitting q_1)
Still gives the same conclusion if P(hitting q_1) is close to 1, although can give a very different conclusion if P(hitting q_1) is small (would be relevant if e.g. progress was tending to decrease in time towards extinction, in which case clearly bumping q_0 up to q_1 is better!)
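To make this concrete, here is a rough Monte Carlo sketch of the corrected identity. Everything in it is made up purely for illustration: value is assumed to accrue at rate q per year, the extinction probability and drift parameters are arbitrary, and progress moves on a lattice only so that it hits q_1 exactly.

```python
import random

random.seed(0)

STEP = 0.25   # progress moves on a lattice so the walk hits q_1 exactly (illustrative)
P_UP = 0.6    # chance that progress steps up rather than down in a given year (illustrative)

def extinction_prob(q):
    # "Endogenous endpoint" assumption: annual extinction risk depends only on q.
    # The functional form is made up purely for illustration.
    return 0.01 + 0.02 / (1.0 + q)

def simulate(q_start, q_target):
    """Run one path until extinction. Returns (total value, value accrued
    before the state first equals q_target, whether q_target was reached)."""
    q, total, before, hit = q_start, 0.0, 0.0, q_start >= q_target
    while random.random() >= extinction_prob(q):  # survive this year
        total += q                                # value assumed to accrue at rate q
        if not hit:
            before += q
        q = max(0.0, q + (STEP if random.random() < P_UP else -STEP))
        if q >= q_target:
            hit = True
    return total, before, hit

N, q0, q1 = 50_000, 1.0, 2.0
runs_q0 = [simulate(q0, q1) for _ in range(N)]
runs_q1 = [simulate(q1, q1) for _ in range(N)]

E_total_q0 = sum(t for t, _, _ in runs_q0) / N
E_before   = sum(b for _, b, _ in runs_q0) / N
P_hit      = sum(h for _, _, h in runs_q0) / N
E_total_q1 = sum(t for t, _, _ in runs_q1) / N

# These two should agree up to Monte Carlo noise:
print("E(Total future value given q_0):", round(E_total_q0, 1))
print("E(Value before first q_1) + P(hitting q_1) E(Total future value given q_1):",
      round(E_before + P_hit * E_total_q1, 1))
print("P(hitting q_1):", round(P_hit, 3))
```

With these toy numbers the two printed quantities agree up to sampling noise; the decomposition holds whatever parameters you plug in, and the empirical questions are only whether E(Value before first q_1) is positive and whether P(hitting q_1) is close to 1.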
Thanks, Toby. As far as I can tell, your model makes no use of empirical data, so I do not see how it can affect cause prioritisation in the real (empirical) world. I am a fan of quantitative analyses and models, but I think these still have to be based on some empirical data to be informative.
The point of my comment was to show that a wide range of different possible models would all exhibit the property Toby Ord is talking about here, even if they involve lots of complexity and randomness. A lot of these models wouldn't predict the large negative utility change at a specific future time that you find implausible, but would still lead to the exact same conclusion in expectation.
I'm a fan of empiricism, but deductive reasoning has its place too, and can sometimes allow you to establish the existence of effects which are impossible to measure. Note this argument is not claiming to establish a conclusion with no empirical data. It is saying that if certain conditions hold (which must be determined empirically) then a certain conclusion follows.
The point of my comment was to show that a wide range of different possible models would all exhibit the property Toby Ord is talking about here, even if they involve lots of complexity and randomness.
Your model says that instantly going from q_0 to q_1 is bad, but I do not think real-world interventions allow for discontinuous changes in progress. So you would have to compare "value given q_0" with "value given q_1" + "value accumulated in the transition from q_0 to q_1". By neglecting this last term, I believe your model underestimates the value of accelerating progress.
A lot of these models wouldn't predict the large negative utility change at a specific future time that you find implausible, but would still lead to the exact same conclusion in expectation.
More broadly, I find it very implausible that an intervention today could meaningfully (counterfactually) increase/decrease (after adjusting for noise) expected total hedonistic utility more than a few centuries from now.
More broadly, I find it very implausible that an intervention today could meaningfully (counterfactually) increase/decrease (after adjusting for noise) expected total hedonistic utility more than a few centuries from now.
Causing extinction (or even some planet-scale catastrophe with thousand-year-plus consequences that falls short of extinction) would be an example of this, wouldn't it? Didn't Stanislav Petrov have the opportunity to meaningfully change expected utility for more than a few centuries?
I can only think of two ways of avoiding that conclusion:
1. Global nuclear war wouldn't actually meaningfully reduce utility for more than a few centuries from when it happens.
2. Nuclear war, or some similar-scale catastrophe, is bound to happen within a few centuries anyway, so that after a few centuries the counterfactual impact disappears. Maybe the fact that stories like Petrov's exist is what allows you to be confident in this.
I think either of these would be interesting claims, although it would now feel to me like you were the one using theoretical considerations to make overconfident claims about empirical questions. Even if (1) is true for global nuclear war, I can just pick a different human-induced catastrophic risk as an example, unless it is true for all such examples, which is an even stronger claim.
It seems implausible to me that we should be confident enough in either of these options that all meaningful change in expected utility disappears after a few centuries.
Is there a third option..?
I do actually think option 2 might have something going for it, it's just that the "few centuries" timescale maybe seems too short to me. But, if you did go down route 2, then Toby Ord's argument as far as you were concerned would no longer be relying on considerations thousands of years from now. That big negative utility hit he is predicting would be in the next few centuries anyway, so you'd be happy after all?
Your model says that instantly going from q_0 to q_1 is bad, but I do not think real-world interventions allow for discontinuous changes in progress. So you would have to compare "value given q_0" with "value given q_1" + "value accumulated in the transition from q_0 to q_1". By neglecting this last term, I believe your model underestimates the value of accelerating progress.
Sure, but the question is what do we change by speeding progress up. I considered the extreme case where we reduce the area under the curve between q_0 and q_1 to 0, in which case we lose all the value we would have accumulated in passing between those points without the intervention.
If we just go faster, but not discontinuously, we lose less value, but we still lose it, as long as that area under the curve has been reduced. The quite interesting thing is that it's the shape of the curve right now that matters, even though the actual reduction in utility is happening far in the future.
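As a toy illustration of that point (the numbers are entirely made up): suppose progress climbs linearly from q_0 = 1 to q_1 = 2 over 20 years without the intervention, or over 5 years with it, and value accrues at rate q per year. Given the Markov assumption above, expected value after reaching q_1 is the same in both cases, so the whole difference is the value accrued during the (shorter) transition:

```python
def transition_value(years, q0=1.0, q1=2.0):
    # Value accrued while progress climbs linearly from q0 to q1 over `years` years.
    return sum(q0 + (q1 - q0) * t / years for t in range(years))

baseline, sped_up = transition_value(20), transition_value(5)
print(baseline, sped_up, baseline - sped_up)  # the gap is the lost "area under the curve"
```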
Causing extinction (or even some planet-scale catastrophe with thousand-year-plus consequences that falls short of extinction) would be an example of this, wouldn't it?
I do not think so.
Didn't Stanislav Petrov have the opportunity to meaningfully change expected utility for more than a few centuries?
In my mind, not meaningfully so. Based on my adjustments to the mortality rates of cooling events by CEARCH, the expected annual mortality rate from nuclear winter is 7.32*10^-6. So, given a 1.31 % annual probability of a nuclear weapon being detonated as an act of war, the mortality rate from nuclear winter conditional on at least 1 nuclear detonation is 0.0559 % (= 7.32*10^-6/0.0131). I estimated the direct deaths would be 1.16 times as large as the ones from the climatic effects, and I think CEARCH's mortality rates only account for these. So the total mortality rate conditional on at least 1 nuclear detonation would be 0.121 % (= 5.59*10^-4*(1 + 1.16)), corresponding to 9.68 M (= 0.00121*8*10^9) deaths for today's population. Preventing millions of deaths is absolutely great, but arguably not enough to meaningfully improve the longterm future?
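Spelling the arithmetic above out in code, in case it is useful (the inputs are just my estimates described above, not independently verified figures):

```python
annual_winter_mortality = 7.32e-6  # my expected annual mortality rate from nuclear winter
p_detonation = 0.0131              # annual probability of a nuclear detonation as an act of war
climatic = annual_winter_mortality / p_detonation  # mortality rate conditional on a detonation
total = climatic * (1 + 1.16)                      # adding direct deaths, 1.16 times the climatic ones
deaths = total * 8e9                               # applied to today's population
print(f"{climatic:.4%}, {total:.3%}, {deaths / 1e6:.2f} M deaths")  # ~0.0559 %, ~0.121 %, ~9.7 M
```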
9.89 M people died from cancer in 2021, but eliminating cancer deaths over 1 year would not meaningfully improve the future in e.g. 1 k years?
The Black Death killed around half of Europe's population, i.e. 413 (= 0.5/0.00121) times my estimate for the mortality rate conditional on a nuclear detonation, but will the world be meaningfully worse 300 years from now (i.e. 1 k years after the Black Death) because of it?
Even if the world is non-infinitesimally different in 1 k years as a result of the above, would the counterfactual improvements in the world's value after 1 k years be negligible in comparison with those before 1 k years?
Why not simply assess the value of preventing wars and pandemics based on standard cost-effectiveness analyses, for which we can rely on empirically informed estimates like the above?
Sure, but the question is what do we change by speeding progress up.
The way I think about it is that interventions improve the world for a few years to centuries, and after that practically nothing changes (in the sense that the expected benefits coming from changing the longterm are negligible). If I understand correctly, frameworks like Toby's or yours, which suggest that improving the world now may be outweighed by making the world worse in the future, suppose we can reliably change the value the world will have longterm, and that this may be the driver of the overall effect.
Thanks for the detailed reply on that! You've clearly thought about this a lot, and I'm very happy to believe you're right on the impact of nuclear war, but it sounds like you are more or less opting for what I called option 1? In which case, just substitute nuclear war for a threat that would literally cause extinction with high probability (say release of a carefully engineered pathogen with a high fatality rate, long incubation period, and high infectiousness). Wouldn't that meaningfully affect utility for more than a few centuries? Because there would be literally no one left, and that effect is guaranteed to be persistent! Even if it "just" reduced the population by 99%, that seems like it would very plausibly have effects for thousands of years into the future.
It seems to me that to avoid this, you have to either say that causing extinction (or a near-extinction-level catastrophe) is virtually impossible, through any means (what I was describing as option 1), or go to the other extreme and say that it is virtually guaranteed in the short term anyway, so that counterfactual impact disappears quickly (what I was describing as option 2). Just so I understand what you're saying, are you claiming one of these two things? Or is there another way out that I'm missing?
Thanks for the kind words. I was actually unsure whether I should have followed up given my comments in this thread had been downvoted (all else equal, I do not want to annoy readers!), so it is good to get some information.
Thanks for the detailed reply on that! You've clearly thought about this a lot, and I'm very happy to believe you're right on the impact of nuclear war, but it sounds like you are more or less opting for what I called option 1? In which case, just substitute nuclear war for a threat that would literally cause extinction with high probability (say release of a carefully engineered pathogen with a high fatality rate, long incubation period, and high infectiousness). Wouldn't that meaningfully affect utility for more than a few centuries? Because there would be literally no one left, and that effect is guaranteed to be persistent! Even if it "just" reduced the population by 99%, that seems like it would very plausibly have effects for thousands of years into the future.
I think the effect of the intervention will still decrease to practically 0 in at most a few centuries in that case, such that reducing the nearterm risk of human extinction is not astronomically cost-effective. I guess you are imagining that humans either go extinct or have a long future where they go on to realise lots of value. However, this is overly binary in my view. I elaborate on this in the post I linked to at the start of this paragraph, and its comments.
It seems to me that to avoid this, you have to either say that causing extinction (or a near-extinction-level catastrophe) is virtually impossible, through any means (what I was describing as option 1), or go to the other extreme and say that it is virtually guaranteed in the short term anyway, so that counterfactual impact disappears quickly (what I was describing as option 2). Just so I understand what you're saying, are you claiming one of these two things? Or is there another way out that I'm missing?
I guess the probability of human extinction in the next 10 years is around 10^-7, i.e. very unlikely, but far from impossible.
I'm at least finding it useful figuring out exactly where we disagree. Please stop replying if it's taking too much of your time, but not because of the downvotes!
I guess you are imagining that humans either go extinct or have a long future where they go on to realise lots of value.
This isn't quite what I'm saying, depending on what you mean by "lots" and "long". For your "impossible for an intervention to have counterfactual effects for more than a few centuries" claim to be false, we only need the future of humanity to have a non-tiny chance of being longer than a few centuries (not that long), and for there to be conceivable interventions which have a non-tiny chance of very quickly causing extinction. These interventions would then meaningfully affect counterfactual utility for more than a few centuries.
To be more concrete and less binary, suppose we are considering an intervention that has a risk p of almost immediately leading to extinction, and otherwise does nothing. Let U be the expected utility generated in a year, in 500 years' time, absent any intervention. If you decide to make this intervention, that has the effect of changing U to (1-p)U, and so the utility generated in that far-future year has been reduced by pU.
For this to be tiny/non-meaningful, we either need p to be tiny, or U to be tiny (or both).
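To put placeholder numbers on this (purely illustrative, not estimates):

```python
p = 0.01  # assumed chance the intervention almost immediately causes extinction
U = 1.0   # expected utility in one year, 500 years from now, absent the intervention
print(U - (1 - p) * U)  # change in that far-future year's expected utility = p * U
```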
Are you saying:
1. There are no conceivable interventions someone could make with p non-tiny.
2. U, the expected utility in a year in 500 years' time, is approximately 0.
3. Something else... my setup of the situation is wrong, or unrealistic?
1. There are no conceivable interventions someone could make with p non-tiny.
2. U, the expected utility in a year in 500 years' time, is approximately 0.
3. Something else... my setup of the situation is wrong, or unrealistic?
Option 1, in the sense that I think the change in the immediate risk of human extinction per cost is astronomically low for any conceivable intervention. Relatedly, you may want to check my discussion with Larks in the post I linked to.