I find myself agreeing with Nora on temporary pauses—and I don’t really understand the model by which a 6-month, or a 2-year, pause helps, unless you think we’re less than 6 months, or 2 years, from doom.
First, my perception is that progress in AI so far has largely come from combining various advances into very large models. If the companies are working on other things in ML during those 6 months, this creates an algorithmic overhang as soon as they can put those advances together into larger models. This is separate from hardware overhang, which I think is even more concerning.
Second, there are lots of parts to putting these models together. If companies are confident the pause will last only 6 months, they will keep building infrastructure and curating datasets in the meantime. (This is in contrast to the situation if we’re building comprehensive regulatory oversight, where training the planned future large models may not be approved and further capital investments might be wasted.)
After writing this, another model occurs to me under which someone might think a short pause is useful: if we are playing a PR game, and expect that the sudden advances after the pause will galvanize the public into worrying more. (But I don’t think you’re proposing clever-sounding but fragile strategic moves, and I think this type of pause would obviously be worse than pushing for a useful governance regime.)
Edit: If the “(long) pause” you’re suggesting is actually an indefinite moratorium on larger models, I think we’re agreeing—but I think we need to build a global governance regime to make that happen, as I laid out.
> I find myself agreeing with Nora on temporary pauses—and I don’t really understand the model by which a 6-month, or a 2-year, pause helps, unless you think we’re less than 6 months, or 2 years, from doom.
This doesn’t make a lot of sense to me. If we’re 3 years away from doom, I should oppose a 2-year pause because of the risk that (a) it might not work and (b) it will make progress more discontinuous?
In real life, if smarter-than-human AI is coming that soon, then we’re almost certainly dead. More discontinuity implies more alignment difficulty, but on three-year timelines we have no prospect of figuring out alignment either way; realistically, it doesn’t matter whether the curve we’re looking at is continuous vs. discontinuous when the absolute amount of calendar time to solve all of alignment for superhuman AI systems is 3 years, starting from today.
I don’t think “figure out how to get a god to do exactly what you want, using the standard current ML toolbox, under extreme time pressure” is a workable path to humanity surviving the transition to AGI. “Governments instituting and enforcing a global multi-decade pause, giving us time to get our ducks in order” does strike me as a workable path to surviving, and it seems fine to marginally increase the intractability of unworkable plans in exchange for marginally increasing the viability of plans that might work.
If a “2-year” pause really only buys you six months, then that’s still six months more time to try to get governments to step in.
If a “2-year” pause buys you zero time in expectation, and doesn’t help establish precedents like “now that at least one pause has occurred, more ambitious pauses are in the Overton window and have some chance of occurring”, then sure, 2-year moratoriums are useless; but I don’t buy that at all.
(Edit to add: The below is operating entirely in the world where we don’t get an indefinite moratorium initially. I strongly agree about the preferability of an indefinite governance regime, though I think that during a multi-year pause with review mechanisms we’ll get additional evidence and either find that safety is possible and identify a path to it, or conclude that we need a much longer pause, or that it’s not possible at all.)
If you grant that a pause increases danger by reducing the ability of society and safety researchers to respond, and you don’t think doom is very, very likely even with extreme effort, then it’s reasonable that we would prefer, say, a 50% probability of success controlling AI given 3 years over a 10% probability of success given a 2-year pause followed by only 18 months. Of course, if you’re 99.95% sure that we’re doomed given 3 years, it makes sense to me that the extra 6 months of survival would seem worth more than the sliver of success probability lost when the pause moves doom to 99.99%. But I don’t understand how anyone gets that degree of confidence making predictions. (Superforecasters with excellent predictive accuracy and calibration tend to laugh at claims like that.)
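To make that tradeoff concrete, here is a minimal sketch in Python using only the hypothetical numbers from the paragraph above (50% vs. 10% success, 99.95% vs. 99.99% doom, and the 6 extra months of calendar time a pause buys). The payoff model, a large reward for long-term survival plus a small per-month reward for time alive before doom, and the specific `survival_value` and `month_value` parameters are my illustrative assumptions, not part of either argument:

```python
# Illustrative sketch only: the payoff model and parameter values are assumptions
# chosen for the example, not something either commenter has argued for.

def expected_value(p_success, months_until_doom,
                   survival_value=10_000, month_value=1):
    """Expected payoff: a large reward if alignment succeeds, otherwise just
    the months of calendar time lived before doom."""
    return p_success * survival_value + (1 - p_success) * months_until_doom * month_value

# Moderate view from the paragraph above:
#   no pause -> 3 years (36 months) of work, 50% chance of success
#   pause    -> 2-year pause + 18 months of work (42 months total), 10% chance
print(expected_value(0.50, 36) > expected_value(0.10, 42))
# True: the no-pause option looks better here

# Extreme-pessimist view: 0.05% vs. 0.01% chance of success. Now the extra
# 6 months of calendar time outweigh the tiny difference in success probability.
print(expected_value(0.0005, 36) > expected_value(0.0001, 42))
# False: the pause looks better to this pessimist
```

The flip in the second comparison is the point of the paragraph above: once you are nearly certain of doom either way, the guaranteed extra calendar time dominates the marginal loss in success probability, but that conclusion only follows given that extreme level of confidence.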
That said, I strongly agree that this isn’t an acceptable bet to make. We should not let anyone play Russian roulette with all of humanity, and even if you think there’s only a 0.05% probability of doom (again, people seem very obviously overconfident in their guesses about the future), that seems like a reason to insist that other people get to check your work when you claim the system is safe.
Finally, I don’t think a pause lets you buy time to get governments to step in quite the way you’re suggesting. That is, if we get a pause that then expires, we are going to need a great deal of additional evidence after that point to get an even stronger response, even once it’s in the Overton window. But that evidence presumably isn’t showing up, or isn’t showing up as quickly, because there isn’t as much progress during the pause. So either the pause is extended indefinitely without further evidence, or it expires, we see a capabilities jump, and that increases risk.
And once we see the capabilities jump after a pause expires, it seems plausible that any stronger response will be far too slow. (It might turn out OK, and governments might re-implement the pause, but I don’t see a reason for confidence in their ability or willingness to do so.) And in general, unless there are already plans in place that they can simply execute, governments react on timescales measured in years.
(Note for everyone reading that all of this assumes, as I do, that the risk is large and will become more obvious as time progresses and we see capabilities continue to outpace reliable safety. If safety gets solved during a pause, I guess it was worth it, or maybe even unnecessary. But I’m incredibly skeptical.)