Thanks for the post, Vasco!

From reading your post, your main claim seems to be: The expected value of the long-term future is similar whether it’s controlled by humans, unaligned AGI, or another Earth-originating intelligent species.
If that’s a correct understanding, I’d be interested in a more rigorous justification of that claim. Some counterarguments:
This claim seems to assume the falsity of the orthogonality thesis? (Which is fine, but I’d be interested in a justification of that premise.)
Let’s suppose that if humanity goes extinct, it will be replaced by another intelligent species, and that intelligent species will have good values. (I think these are big assumptions.) Priors would suggest that it would take millions of years for this species to evolve. If so, that’s millions of years where we’re not moving to capture universe real estate at near-light-speed, which means there’s an astronomical amount of real estate which will be forever out of this species’ light cone. It seems like just avoiding this delay of millions of years is sufficient for x-risk reduction to have astronomical value.
You also dispute that we’re living in a time of perils, though that doesn’t seem so cruxy, since your main claim above should be enough for your argument to go through either way. Still, your justification is that “I should be a priori very sceptical about claims that the expected value of the future will be significantly determined over the next few decades”. There’s a lot of literature (The Precipice, The Most Important Century, etc) which argues that we have enough evidence of this century’s uniqueness to overcome this prior. I’d be curious about your take on that.
(Separately, I think you had more to write after the sentence “Their conclusions seem to mostly follow from:” in your post’s final section?)
Thanks for the comment, Ariel!

From reading your post, your main claim seems to be: The expected value of the long-term future is similar whether it’s controlled by humans, unaligned AGI, or another Earth-originating intelligent species.
I did not intend to make that claim, and I do not have strong views about it. My main claim is the 2nd bullet of the summary. Sorry for the lack of clarity. I appreciate I am not making a very clear/formal argument, although I think the effects I am pointing to are quite important, namely that the probability of increasing the value of the future by a given amount is not independent of that amount.
You also dispute that we’re living in a time of perils, though that doesn’t seem so cruxy, since your main claim above should be enough for your argument to go through either way. Still, your justification is that “I should be a priori very sceptical about claims that the expected value of the future will be significantly determined over the next few decades”. There’s a lot of literature (The Precipice, The Most Important Century, etc) which argues that we have enough evidence of this century’s uniqueness to overcome this prior. I’d be curious about your take on that.
I do not think we are in a time of perils, in the sense that I would say the annual risk of human extinction has generally been going down until now, although with some noise[1]. A typical mammal species has a lifespan of 1 million years, which suggests an annual extinction risk of 10^-6. I have estimated values much lower than that: 5.93*10^-12 for nuclear wars, 2.20*10^-14 for asteroids and comets, 3.38*10^-14 for supervolcanoes, a prior of 6.36*10^-14 for wars, and a prior of 4.35*10^-15 for terrorist attacks. My actual best guess for the risk of human extinction over the next 10 years is 10^-7, i.e. around 10^-8 per year. However, besides this still being lower than 10^-6, it is driven by the risk from advanced AI, which I assume has some moral value (even now), so the situation would not be analogous.
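For concreteness, here is a minimal sketch in Python of the arithmetic above. The figures are the ones quoted in this reply; the conversion from a 10-year risk to an annual risk assumes a constant, independent risk in each year, which is my simplification rather than anything stated in the reply.

```python
# Minimal sketch of the extinction-risk arithmetic above (figures from this reply).

def annual_risk_from_lifespan(lifespan_years: float) -> float:
    """Constant annual extinction risk implied by an expected species lifespan."""
    return 1 / lifespan_years

def annual_risk_from_period_risk(period_risk: float, period_years: float) -> float:
    """Annual risk implied by a cumulative risk over a period, assuming
    a constant, independent risk in each year (a simplifying assumption)."""
    return 1 - (1 - period_risk) ** (1 / period_years)

# A typical mammal species lifespan of 1 million years suggests an annual risk of 10^-6.
print(annual_risk_from_lifespan(1e6))           # 1e-06

# A best guess of 10^-7 over the next 10 years is around 10^-8 per year.
print(annual_risk_from_period_risk(1e-7, 10))   # ~1e-08

# Estimated annual extinction risks cited above, all far below 10^-6.
estimates_per_year = {
    "nuclear wars": 5.93e-12,
    "asteroids and comets": 2.20e-14,
    "supervolcanoes": 3.38e-14,
    "wars (prior)": 6.36e-14,
    "terrorist attacks (prior)": 4.35e-15,
}
print(all(risk < 1e-6 for risk in estimates_per_year.values()))  # True
```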
(Separately, I think you had more to write after the sentence “Their conclusions seem to mostly follow from:” in your post’s final section?)
Thanks for noting that! I have now added the bullets following that sentence, which were initially not imported (maybe they add a little bit of clarity to the post):
I cannot help noticing that arguments for reducing the nearterm risk of human extinction being astronomically cost-effective might share some similarities with (supposedly) logical arguments for the existence of God (e.g. Thomas Aquinas’ Five Ways), although they are different in many respects too. Their conclusions seem to mostly follow from:
Cognitive biases. In the case of the former, the following come to mind:
Authority bias. For example, in Existential Risk Prevention as Global Priority, Nick Bostrom interprets a reduction in (total/cumulative) existential risk as a relative increase in the expected value of the future, which is fine, but then treats the former as independent of the latter, which I would argue is misguided given the dependence between the value of the future and the increase in its probability density function (PDF). “The more technologically comprehensive estimate of 10^54 human brain-emulation subjective life-years (or 10^52 lives of ordinary length) makes the same point even more starkly. Even if we give this allegedly lower bound on the cumulative output potential of a technologically mature civilisation a mere 1 per cent chance of being correct, we find that the expected value of reducing existential risk by a mere one billionth of one billionth of one percentage point is worth a hundred billion times as much as a billion human lives”.
Nitpick. The maths just above is not right. Nick meant 10^21 (= 10^(52 − 2 − 2*9 − 2 − 9)) times as much, i.e. a thousand billion billion times, not a hundred billion times (10^11); see the arithmetic check after these bullets.
Binary bias. This can manifest in assuming not only that the value of the future is binary, but also that interventions reducing the nearterm risk of human extinction mostly move probability mass from worlds with value close to 0 to ones which are astronomically valuable, as opposed to just slightly more valuable.
Scope neglect. I agree the expected value of the future is astronomical, but it is easy to overlook that the increase in the probability of the astronomically valuable worlds driving that expected value can be astronomically low too, thus making the increase in the expected value of the astronomically valuable worlds negligible (see my illustration above).
Little use of empirical evidence and detailed quantitative models to catch the above biases. In the case of the former:
As far as I know, reductions in the nearterm risk of human extinction, as well as their relationship with the relative increase in the expected value of the future, are always directly guessed.
Even from the start of World War 2 in 1939 to when the number of nuclear warheads peaked in 1986, the fraction of people living in democracies increased by 21.6 pp (= 0.156 + 0.183 - (0.0400 + 0.0833)).
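As referenced in the nitpick above, here is a quick arithmetic check in Python of the two figures in these bullets (the 10^21 ratio and the 21.6 pp increase); the inputs are just the numbers quoted above, not new estimates.

```python
# Check of the Bostrom nitpick: 10^52 lives, a 1% chance the estimate is correct,
# and a risk reduction of one billionth of one billionth of one percentage point,
# compared against a billion human lives.
expected_lives_saved = 10**52 * 1e-2 * (1e-9 * 1e-9 * 1e-2)
ratio_to_a_billion_lives = expected_lives_saved / 1e9
print(f"{ratio_to_a_billion_lives:.0e}")  # 1e+21 (a thousand billion billion), not 1e+11

# Check of the increase in the fraction of people living in democracies, 1939 to 1986,
# using the four shares given in the last bullet.
increase_pp = (0.156 + 0.183 - (0.0400 + 0.0833)) * 100
print(round(increase_pp, 1))  # 21.6
```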