TAI makes everything else less important.

One of my CERI fellows asked me to elaborate on a claim I made along the lines of:* “If AI timelines are shorter, then this makes (direct) nuclear risk work less important because the time during which nuclear weapons can wipe us out is shorter.”
There’s a general point here, I think, which isn’t limited to nuclear risk. Namely, shorter AI timelines not only make AI risk more important, they also make everything else less important, because the time during which the other thing (whether that be an asteroid, engineered pandemic, nuclear war, nanotech-caused grey goo scenario, etc.) matters as an x-risk is shortened.
To give the nuclear risk example (the calculation is also sketched in code below):^
If TAI is 50 years away, and per-year risk of nuclear conflict is 0.5%, then risk of nuclear conflict before TAI is 1 - (0.995^50) ≈ 22%
If TAI is 15 years away, and per-year risk of nuclear conflict is 0.5%, then risk of nuclear conflict before TAI is 1 - (0.995^15) ≈ 7%
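For concreteness, here’s a minimal Python sketch of the calculation behind those two figures, assuming the same constant 0.5% per-year risk and the year-to-year independence described in the caveats below (the function name is just illustrative):

```python
def risk_before_tai(annual_risk: float, years_to_tai: float) -> float:
    """Chance of at least one nuclear conflict before TAI arrives,
    assuming a constant, independent per-year risk (see the caveats below)."""
    return 1 - (1 - annual_risk) ** years_to_tai

# The two cases above: 0.5% per-year risk, with TAI 50 vs. 15 years away.
for years in (50, 15):
    print(f"TAI in {years} years: {risk_before_tai(0.005, years):.0%}")
# Prints roughly: 22% for 50 years, 7% for 15 years.
```

For small per-year risks this is approximately annual_risk × years, so shortening the time to TAI shrinks the cumulative non-AI risk roughly proportionally.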
This does rely on the assumption that we’ll be playing a different ball game after TAI/AGI/HLMI arrives (if not, then there’s no particular reason to view TAI or similar as a cut-off point), but to me this different ball game assumption seems fair (see, e.g., Muehlhauser, 2019).
*My background thinking behind my claim here has been inspired by conversations with Michael Aird, though I’m not certain he’d agree with everything I’ve written in this shortform.
^A couple of not-that-important caveats:
“Before TAI” refers to the default arrival time of TAI if nuclear conflict does not happen.
The simple calculations I’ve performed assume that the risk of nuclear conflict in any given year is independent of the risk in every other year.
From a skim, I agree with everything in this shortform and think it’s important, except maybe “to me this different ball game assumption seems fair”.
I’d say this “different ball game” assumption seems at least 50% likely to be at least roughly true. But—at least given the current limits of my knowledge and thinking—it doesn’t seem 99% likely to be almost entirely true, and I think the chance it may be somewhat or very untrue should factor into our cause prioritisation & our strategies. (But maybe that’s what you meant by “seems fair”.)
I expand on this in this somewhat longwinded comment. I’ll copy that in a reply here for convenience. (See the link for Ajeya Cotra replying and me replying to that.)
My comment on Ajeya Cotra’s AMA, from Feb 2021 (so probably I’d write it differently today):
“[I’m not sure if you’ve thought about the following sort of question much. Also, I haven’t properly read your report—let me know if this is covered in there.]
I’m interested in a question along the lines of “Do you think some work done before TAI is developed matters in a predictable way—i.e., better than 0 value in expectation—for its effects on the post-TAI world, in ways that don’t just flow through how the work affects the pre-TAI world or how the TAI transition itself plays out? If so, to what extent? And what sort of work?”
An example to illustrate: “Let’s say TAI is developed in 2050, and the ‘TAI transition’ is basically ‘done’ by 2060. Could some work to improve institutional decision-making be useful in terms of how it affects what happens from 2060 onwards, and not just via reducing x-risk (or reducing suffering etc.) before 2060 and improving how the TAI transition goes?”
But I’m not sure it’s obvious what I mean by the above, so here’s my attempt to explain:
The question of when TAI will be developed[1] is clearly very important to a whole bunch of prioritisation questions. One reason is that TAI—and probably the systems leading up to it—will very substantially change many aspects of how society works. Specifically, Open Phil has defined TAI as “AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution” (and Muehlhauser has provided some more detail on what is meant by that).
But I think some EAs implicitly assume something stronger, along the lines of:
The expected moral value of actions we take now is entirely based on those actions’ effects on what happens before TAI is developed and those actions’ effects on the development, deployment, etc. of TAI. That is, the expected value of the actions we take now is not partly based on how the actions affect aspects of the post-TAI world in ways unrelated to how TAI is developed, deployed, etc. This is either because we just can’t at all predict those effects or because those effects wouldn’t be important; the world will just be very shaken up and perhaps unrecognisable, and any effects of pre-TAI actions will be washed out unless they affect how the TAI transition occurs.
E.g., things we do now to improve institutional decision-making or reduce risks of war can matter inasmuch as they reduce risks before TAI and reduce risks from TAI (and maybe also reduce actual harms, increase benefits, etc.). But they’ll have no even-slightly-predictable or substantial effect on decision-making or risks of war in the post-TAI world.
But I don’t think that necessarily follows from how TAI is defined. E.g., various countries, religions, ideologies, political systems, technologies, etc., existed both before the Industrial Revolution and for decades/centuries afterwards. And it seems like some pre-Industrial-Revolution actions—e.g. pushing for democracy or the abolition of slavery—had effects on the post-Industrial-Revolution world that were probably predictably positive in advance and that weren’t just about affecting how the Industrial Revolution itself occurred.
(Though it may have still been extremely useful for people taking those actions to know that, when, where, and how the IR would occur, e.g. because then they could push for democracy and abolition in the countries that were about to become much more influential and powerful.)
So I’m tentatively inclined to think that some EAs are assuming that short timelines push against certain types of work more than they really do, and that certain (often “broad”) interventions could be useful in expectation for influencing the post-TAI world in a relatively “continuous” way. In other words, I’m inclined to think there might be less of an extremely abrupt “break” than some people seem to think, even if TAI occurs. (Though it’d still be quite extreme by many standards, just as the Industrial Revolution was.)
[1] Here I’m assuming TAI will be developed, which is questionable, though it seems to me pretty much guaranteed unless some existential catastrophe occurs beforehand.”