Deconfusing Pauses: Long Term Moratorium vs Slowing AI

When people discuss ‘pauses’, I think they are often conflating two importantly distinct concepts: a moratorium as a theory of victory, and methods to simply slow the development of AGI. This brief post will explain what I see as the key differences between them. If people find this useful, I may go into more detail on the differences between these, and more generally discuss the possible space of options often (sometimes erroneously) called ‘pauses’.
The first is what we ‘classically’ think of as a long-term (global) moratorium as a theory of victory. States, likely the Great Powers, agree that further development towards superintelligence is so dangerous for global security that it ought to be banned outright. The steps to develop such a moratorium would be very similar to the steps to develop a single global project. Essentially, the USA and China (it may even be possible for the USA to do this unilaterally; it’s not entirely clear) would have to agree that it is in both states’ best interests, and in the interest of humanity, that AGI is not developed. A global moratorium would be established, with the weight of international law behind it. This may initially be quite weak (with little ability for enforcement), but strengthen significantly over time, preventing other powers from developing AGI. If the securitisation process were strong enough, this would be backed implicitly or explicitly with the threat of violence. Advanced (narrow) AI may be used, or developed to be used, to aid with trust, verification and possibly enforcement of the moratorium. Such a model would take inspiration from proposals to take nuclear weapons into international hands and permanently end the threat of nuclear war. Whilst proposals to do this would not be unprecedented, nothing of this scale has ever been carried out, and it would therefore be incredibly ambitious. However, it is not obvious that such a proposal is more ambitious than any other theory of victory for AGI that we currently have.
Whilst the ambitions of such a moratorium may be profound, it’s not obvious that the moratorium needs to be particularly strong initially. At first, all it needs to do is prevent AGI development under a paradigm similar to today’s, and deal with the current political challenges. In my view this is not incredibly difficult. However, what sets a global moratorium as a theory of victory apart is its intended longevity, which means it would have to be robust enough to restrict future development of AGI. This seems possible, but the regime would have to develop over time, and would probably need the help of more advanced (narrow) AI in monitoring and enforcement. The strength of enforcement would have to be much greater than what is possible under international law at present, and would have to reach not just state actors but private actors as well. This is profound, but if the great powers see the issue as significant enough, such a regime could develop, allowing for a permanent moratorium. Ultimately, this would be an international institution that would, at least in this narrow area, infringe significantly on sovereignty, although I have little sympathy for the suggestion that such an institution would necessarily increase the risk of totalitarianism.
‘Slowing’ is the second option, where we slow the development of AI in order to buy (maybe very significant amounts of) time to either ‘solve alignment’ or put in place governance measures (which may include a long-term moratorium). This may come about even when AGI is not considered an extraordinary priority: instead, safety regulation chokes the development of AGI, or the costs of developing it rise so high that states and companies are unwilling to pay them. This seems very similar to the model of ‘RSPs are pauses done right’, where states put in place regulation that is considered somewhat ‘routine’ to protect safety, and companies simply cannot prove at higher capability levels that their models are safe. It could also take the form of a ban on developing systems above a certain compute threshold, adopted unilaterally by a leading state and then agreed to by others. Controls over other levers (eg algorithmic development, data etc) may also cause this sort of slowing. If slowing occurs automatically, due to the sheer cost of developing more advanced models, then it could be maintained simply by ensuring that the development of AGI is not seen as important enough for states or companies to raise such large amounts of money. Levers over algorithmic development and data, as well as deployment regulation and liability, may all further slow development, and under certain conditions development may be forced to stop entirely. States may be motivated by worries about existential risk from AGI, but may also be motivated by other concerns (such as biosecurity, cybersecurity, fairness, the potential to spark war etc).
It is unclear how much time this could buy, but it is not implausible that such slowing measures could extend timelines by years, potentially decades. Cultural changes, where these measures make AGI a much less attractive thing to work on, could also assist. But this likely couldn’t endure in the long term as a ‘state of victory’ without a more stable arrangement to govern it. It may, however, be sufficient to buy enough time to get alignment to work, and to get governance into a place where a good future is possible.
These two approaches require quite different actions and different narratives. Whilst it seems plausible that slowing could lead to an indefinite moratorium, this transition doesn’t seem automatic. This difference seems very important. A permanent moratorium as a theory of victory does, at least in the long term, require different actions from other theories of victory. Slowing, meanwhile, is useful across a wide range of theories of victory.
There are a number of significant differences between the two that I would like to highlight, so I have constructed a table:
| Moratorium as a theory of victory | Slowing |
| --- | --- |
| Needs international agreement | Doesn’t necessarily need international agreement; unilateral actions by leading states or companies may be very significant |
| Needs to be strictly enforceable, possibly with the (implicit) threat of violence or the use of advanced AI systems | Doesn’t necessarily need to be an agreement backed up with the threat of violence |
| Undermines sovereignty in its traditional sense and leads to the creation of an unprecedented international organisation. The best precedents are the proposals for international control of nuclear weapons | Looks very similar to actions already taken in domestic/international governance. There exist many precedents, from domestic slowing (eg nuclear energy, human cloning) to international restrictions (eg bioweapons) |
| States must see the existence of AGI itself as an existential threat, such that it is macrosecuritised | Could happen in the absence of any form of securitisation, and states may not even need to be concerned with the existential threat of AGI |
| Needs to be adaptive to a great many changes in the socio-technical paradigm | Focused on restricting AGI built in a way similar to its current form |
| Focused on creating an institution that could maintain the moratorium independent of socio-technical changes | Focused on what socio-technical levers we can pull to resist further development towards AGI |
| Must be comprehensive such that it can endure in the long term | Each ‘lever’ pulled may further assist with slowing, but needn’t be comprehensive to be effective |
| May need new technical advances (eg advanced narrow AI systems) to ensure compliance and verification. In the long term these may become significant issues | Can carry out verification using existing technology |
My view is that understanding these distinctions can help us clarify both exactly what we are proposing when we discuss ‘pausing’ and why. I hope this can broadly mean we get rid of the term ‘pausing’, as I think it is unhelpful. ‘Pausing’ implies a conflation between these two concepts that often leads to discussions at cross-purposes. It can either make people assume that slowing is too hard (by suggesting it requires a strict global agreement) or make people assume that a moratorium as a theory of victory is too easy (by assuming that moving from a robust slowing strategy would get you to an institution capable of maintaining a moratorium). I also think expanding from the term ‘pause’, which often seems overly narrow, to a discussion of slowing more broadly (which could involve pausing, but also many actions that don’t look like states agreeing to ‘pause’ development) may be useful, and could help us address other concerns people have had with more simplistic ‘pause’ narratives (eg hardware overhang etc).