A broader coalition of actors will be motivated to pursue extinction prevention than to pursue longtermist trajectory changes.[1] This means:
Extinction risk reduction work will be more tractable, by virtue of having broader buy-in and more allies.
Values change work will be more neglected.[2]
Is this a reasonable framing? If so, which effect dominates, or how can we reason through this?
[1] For instance, see Scott Alexander on the benefits of extinction risk as a popular meme compared to longtermism.
[2] I argued for something similar here.
I agree with the framing.
Quantitatively, the willingness to pay to avoid extinction, even just from the United States, is truly enormous. The value of a statistical life in the US — used by the US government to estimate how much US citizens are willing to pay to reduce their risk of death — is around $10 million. The willingness to pay from the US as a whole to avoid a 0.1 percentage point chance of a catastrophe that would kill everyone in the US is therefore over $1 trillion. I don’t expect these amounts to be spent on global catastrophic risk reduction, but they show how much latent desire there is to reduce global catastrophic risk, which I’d expect to become progressively mobilised with increasing indications that various global catastrophic risks, such as biorisks, are real. [I think my predictions around this are pretty different from those of some others, who expect the world to be almost totally blindsided. Timelines and the gradualness of AI takeoff are of course relevant here.]
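To spell out the arithmetic behind that figure (a rough sketch on my part; the US population of roughly 330 million is an assumption I am adding, not a number given in the comment above):

$$330{,}000{,}000 \text{ people} \times \$10{,}000{,}000 \text{ per statistical life} \times 0.001 \approx \$3.3 \text{ trillion},$$

which is comfortably above the "over $1 trillion" quoted.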
In contrast, many areas of better futures work are likely to remain extraordinarily neglected. The amount of even latent interest in, for example, ensuring that resources outside of our solar system are put to their best use, or that misaligned AI produces a somewhat-better future than it would otherwise have done even if it kills us all, is tiny, and I don’t expect society to mobilise massive resources towards these issues even if there were indications that those issues were pressing.
In some cases, what people want will be actively opposed to what is in fact best, if what’s best involves self-sacrifice on the part of those alive today, or with power today.
And then I think the neglectedness consideration beats the tractability consideration. Here are some pretty general reasons for optimism on expected tractability:
In general, tractability doesn’t vary by as much as importance and neglectedness.
In cause areas where very little work has been done, it’s hard for expected tractability to be very low. Because we know so little about tractability in unexplored cause areas, we should often put significant credence on the possibility that the cause will turn out to be fairly tractable; this is enough to warrant some investment in the cause area — at least enough to find out how tractable the area is. (A toy illustration follows this list.)
There are many distinct sub-areas within better futures work. It seems unlikely to me that tractability in all of them is very low, and unlikely that their tractability is very highly correlated.
There’s a reasonable track record of early-stage areas with seemingly low tractability turning out to be surprisingly tractable. A decade ago, work on risks from AI takeover and engineered pathogens seemed very intractable; there was very little that one could fund, and very little in the way of promising career paths. But this changed over time, in significant part because of (i) research work improving our strategic understanding, and shedding light on what interventions were most promising; (ii) scientific developments (e.g. progress in machine learning) making it clearer what interventions might be promising; (iii) the creation of organisations that could absorb funding and talent. All of these same factors could well apply to better futures work, too.
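As a toy illustration of the second reason above (the numbers are mine and purely illustrative, not drawn from the discussion): suppose you think an unexplored area has only a 20% chance of being as tractable as a typical established cause, and an 80% chance of being only a tenth as tractable. Its expected tractability is then

$$\mathbb{E}[\text{tractability}] = 0.2 \times 1 + 0.8 \times 0.1 = 0.28$$

of the established-cause baseline: low, but not nearly low enough to swamp a large advantage in importance and neglectedness, and enough to justify at least the exploratory investment needed to find out which case obtains.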
Of these considerations, it’s the last that personally moves me the most. It doesn’t feel long ago that work on AI takeover risk felt extraordinarily speculative and low-tractability, when there was almost nowhere one could work or donate outside of the Future of Humanity Institute or the Machine Intelligence Research Institute. In the early days, I was personally very sceptical about the tractability of the area. But I’ve been proved wrong. Via years of foundational work — both research figuring out the most promising paths forward and the founding of new organisations squarely focused on the goal of reducing takeover risk or biorisk, rather than on a similar but tangential goal — the area has become tractable, and now there are dozens of great organisations that one can work for or donate to.
One reason you might believe in a difference in terms of tractability is the stickiness of extinction, and the lack of stickiness attaching to things like societal values. Here’s very roughly what I have in mind, running roughshod over certain caveats and the like.
The case where we go extinct seems highly stable, of course. Extinction is forever. If you believe some kind of ‘time of perils’ hypothesis, surviving through such a time should also result in a scenario where non-extinction is highly stable. And the case for longtermism arguably hinges considerably on such a time of perils hypothesis being true, as David argues.
By contrast, I think it’s natural to worry that efforts to alter values and institutions so as to beneficially affect the very long run, by nudging us closer to the very best possible outcomes, are far more vulnerable to wash-out. The key exception would be if you suppose that there will be some kind of lock-in event.
So does the case for focusing on better futures work hinge crucially, in your view, on assigning significant confidence to lock-in events occurring within the near term?
Yeah, I think that lock-in this century is quite a bit more likely than extinction this century. (Especially if we’re talking about hitting a point of no return for total extinction.)
That’s via two pathways:
- AGI-enforced institutions (including AGI-enabled immortality of rulers).
- Defence-dominance of star systems.
I do think that “path dependence” (a broader idea than lock-in) is a big deal, but most of the long-term impact of that goes via a billiards dynamic: path dependence on X today affects some lock-in event around X down the road. (Digital rights and space governance are plausible examples here.)
I think my gut reaction is to judge extinction this century as at least as likely as lock-in, though a lot might depend on what’s meant by lock-in. But I also haven’t thought about this much!
I see this argument about the US Government’s value of a statistical life used a lot, and I’m not sure I agree. I don’t think it echoes public sentiment so much as a government’s desire to absolve itself of blame. Note how much more is spent per life on, say, air transport than on disease prevention.
Yeah, I’ve always been a bit sceptical of this as well. Surely it’s just a yardstick that a department uses to decide which investments it should make, rather than a considered (or even descriptive) “value of a life” for the US Government.
Descriptively, the US government would spend far more for a few lives if those lives were hostages of a foreign adversary, and probably has far lower willingness to pay for cheap ways it could save lives (I don’t know what these are, but there are probably examples in public health).
Basically—I don’t think it’s a number that can be meaningfully extrapolated to figure out the value of avoiding extinction or catastrophe, because the number was designed with far smaller trade-offs in mind, and doesn’t really make sense outside of its intended purpose.
A cynical and oversimplified — but hopefully illuminating — view (mine) is that trajectory changes are just longterm power grabs by people with a certain set of values (moral, epistemic, or otherwise). One argument in the other direction is that lots of people are trying to grab power — it’s all that powerful people do! And conflict with powerful people over resources is a significant kind of non-neglectedness. But very few people are trying to control the longterm future, due to (e.g.) hyperbolic discounting. So on this view, neglectedness provisionally favours trajectory changes that don’t reallocate power until the future, so that they are not in competition with people seeking power today. A similar argument would apply to other domains where power can be accrued but competitors are not power-seeking.
On the claim that “a broader coalition of actors will be motivated to pursue extinction prevention than longtermist trajectory changes”: this might vary between:
The level of the abstract memes:
I agree “reducing risk of extinction (potentially in the near term)” may be more appealing than “longtermist trajectory change”.
The level of concrete interventions:
“Promoting democracy” (or whatever one decides promotes long term value) might be more appealing than “reducing risk from AI”[1] (though there is likely significant variation within concrete interventions).
[1] Though our initial work does not suggest this.