What’s in a Pause?
This post is part of AI Pause Debate Week. Please see this sequence for other posts in the debate.
An AI Moratorium of some sort has been discussed, but details matter—it’s not particularly meaningful to agree or disagree with a policy that has no details. A discussion requires concrete claims.
To start, I see three key questions, namely:
What does a moratorium include?
When and how would a pause work?
What are the concrete steps forward?
Before answering those, I want to provide a very short introduction and propose what is in or out of bounds for a discussion.
There seems to be a strong consensus that future artificial intelligence could be very bad. There is significant uncertainty and dispute about many of the details: how bad it could be, and when the different risks will materialize. Pausing or stopping AI progress is anywhere from completely unreasonable to obviously necessary, depending on those risks and the difficulty of avoiding them. But eliminating those uncertainties is a different discussion, and for now, I think we should agree to take the disputes and uncertainties about the risks as a given. We will need to debate and make decisions under uncertainty. So the question of whether to stop and how to do so depends on the details of the proposal, but these seem absent from most of the discussion. For that reason, I want to lay out a few of the places where I think clarification is needed, including not just what a moratorium would include and exclude, but also concrete next steps to getting there.
Getting to a final proposal means facing a few uncomfortable policy constraints that I’d also like to suggest be agreed on for this discussion. An immediate, temporary pause isn’t currently possible to monitor, much less enforce, even if it were likely that some or most parties would agree. Similarly, a single company or country announcing a unilateral halt to building advanced models is not credible without assurances, and would likely both be ineffective at addressing the broader race dynamics and differentially advantage the least responsible actors. For these reasons, the type of moratorium I think worth discussing is a multilateral agreement centered on countries and international corporations, one which addresses both current and unclear future risks. But, as I will conclude, much needs to happen more rapidly than that; international oversight should not be an excuse for inaction.
What Does a Moratorium Include?
There is at least widespread agreement on many things that aren’t and wouldn’t be included. Current systems aren’t going to be withdrawn—any ban would be targeted to systems more dangerous than those that exist. We’re not talking about banning academic research using current models, and no ban would stop research to make future systems safer, assuming that the research itself does not involve building dangerous systems. Similarly, systems that reasonably demonstrate that they are low risk would be allowed, though how that safety is shown is unclear.
Next, there are certain parts of the proposal that are contentious, but not all of it. Most critics of a moratorium agree that we should not and can’t afford to build dangerous systems—they simply disagree where the line belongs. Should we allow arbitrary plugins? Should we ban open-sourcing models? When do we need to stop? The answers are debated. And while these all seem worrying to me, the debate makes sense—there are many irreducible uncertainties, we have a global community with differing views, and actual diplomatic solutions will require people who disagree to come to some agreement.
As should be clear from my views on the need to negotiate answers, I’m not planning to dictate exactly what I think we need to ban. However, there are things that are clearly on the far side of the line that we need to draw. Models that are directly dangerous, like self-driving AIs that target pedestrians, would already be illegal to use and should be illegal to build as well. The Chemical Weapons Convention already bans the development of new chemical weapons, yet using AI to do so is already possible. The equivalent use of AI to create bioweapons is plausibly not far away, and such AI is likely also already an (unenforceable) violation of the Biological Weapons Convention (BWC). The making of models that violate treaty obligations to restrict the development of weapons is already illegal in some jurisdictions, but unenforceable rules do nothing. And looking forward, unacceptable model development includes, at the very least, models that pose a risk of being “black ball” technologies, in Bostrom’s framing. The set of dangerous technologies that AI could enable will grow rather than shrink over time.
Given that we need lines, standards should be developed that categorize future models into unacceptable, high-risk, or low-risk, much as the EU AI Act does for applications. This would map to models that are banned, need approval, or can be deployed. The question is which models belong in each category, and how that decision is reached—something that can and will be debated. Figuring out which models pose unacceptable risks is important, and non-obvious, so caution is warranted—and some of that caution should be applied to unpredictable capability gains by larger models, which may be dangerous by default.
When and How Do We Stop?
Timing is critical, and there are some things that we need to ban which can and should be banned today. However, I don’t think there’s a concrete proposal to temporarily or permanently pause that I could support: we don’t have clear criteria, we don’t have the buy-in from key actors needed to make this work, and we don’t have a reasonable way to monitor, much less enforce, any agreement. Yes, companies could voluntarily pause AI development for 6 months, which could be a valuable signal. (Or it would be, if we didn’t think it would be a smokescreen for “keep doing everything and delay releases slightly.”) But such temporary actions are neither necessary for actual progress on building governance, nor sufficient to reduce the risks from advanced AI. A temporary pause therefore seems more like a distraction than a serious proposal, albeit a distraction that successfully started to move the Overton window.
Acting too soon is costly, being too conservative in what is allowed is costly, and regulating and banning progress in areas that have the potential to deliver tremendous benefits is costly. These considerations are important. But AI is an emergency: the costs of any action are very large, and inaction, which means allowing unrestricted development, is as much of a choice as the most extreme response. Any decision, including the decision to move slowly on regulation or banning potential risks, needs to be justified, not excused.
We absolutely cannot delay responding. If there is a fire alarm for AI risk, at least one such alarm has been ringing for months. Just like a fire in the basement won’t yet burn people in the attic, AI that exists today does not pose immediate existential risks to humanity, but it’s doing significant damage already, and if you ignore the growing risks, further damage quickly becomes unavoidable. We should ask model developers to be responsible, but voluntary compliance rarely works, and in a race scenario self-governance is clearly insufficient. If we agree that we’re fighting a fire, turning on the sink isn’t the way to respond, and if we aren’t yet in agreement about the fire, we presumably still want something akin to a sprinkler system. For AI, the current risks and likely future risks mean we need concrete action to create a governance regime at least capable of banning systems. If we don’t build future risk mitigation plans, by the time there is agreement about the risks it will be too late to respond.
And any broad moratorium needs a governance mechanism for making decisions, because a fixed policy will be quickly outpaced by technology changes. We should already be building a governance body empowered to make decisions and capable of enforcing them, and securing a priori agreement that it needs to put in place restrictions on dangerous developments. And while the moratorium on each type of dangerous model development would be indefinite, it would not permanently ban all future AI technologies. Instead, we expect that AI safety experts will participate in this governance, and over time build a consensus that the safety of certain types of systems is assured enough to permit relaxation of rules in that domain. As above, I think the details need to be negotiated, and this is what global governance experts, working with domain experts, are good at.
What Are Concrete Steps Forward?
Immediate action is needed on three fronts at the national level, at the least by major countries and ideally everywhere. First, countries should be building capacity to monitor usage of AI systems and enforce current and future rules. Second, they need to provide clarity on legal authorities for banning or restricting development of models that break existing laws. And third, they should impose a concrete timeline for creating an international moratorium, including a governance regime able to restrict dangerous models.
On these different fronts, there are a number of steps needed immediately, many of which will both concretely reduce harms of extant models, and help enable or set the stage for later governance and response.
Monitoring AI Systems Now
First, to enable both near term and longer term regulation, some forms of ongoing monitoring and reporting should be required of every group developing, deploying, or using these large scale AI systems—say, larger than GPT-3. Countries might require public reporting of models, or reporting to government agencies tasked with this. And again, this is true even if you think larger risks are far away.
The registration of model training, model development, and intended applications of larger and more capable models would include registering details about their training data, the methods used to reduce bias, monitor usage, and prevent misuse, and stating the expected capabilities in advance. It seems reasonable to expect countries to monitor development and deployment of at least any AI systems larger than GPT-3, as well as models in domains expected to have significant misuse or other risks. For example, models used for fully autonomous general reasoning, such as AutoGPT, should have specific monitoring and human oversight requirements. And there is plenty of legal room for this type of requirement—for example, if a company wants to market a model to consumers, or use it inside of companies that make customer-facing decisions, this is a consumer protection issue. And again, even ignoring near-certain future risks, the harms have been known for quite a while, well before the models that were breaking laws and harming people were being called AI.
Enforcing Extant Laws
Second, some things are already illegal, and those laws need to be enforced; AI is not an excuse. Regulation and prosecution of AI misuse in the near term is both obviously needed, and important for making it clear that governments aren’t going to give companies free rein to make harmful or risky decisions. There already are places where restrictions are needed, and there are and certainly will be AI models that should be banned. We’re well on our way towards having widespread public and political understanding of this: most of the public agrees that some rules are needed.
I claimed above that the idea of banning some AI models is not controversial. Lots of things are illegal, and banning AI that does illegal things is just clarifying that the status quo won’t be swept away by technology. I even think that there is near-universal agreement that rules and limits are needed, perhaps excepting the most strident defenders of accelerationism. We do not allow bad actors to build bioweapons, saying that good biotech will beat bad biotech—and we cannot afford to allow companies to build dangerous AI systems, and say that good AI will defend against it.
Further, regardless of your views on future AI risks, there are risks today. A proof-of-concept for using AI to create chemical weapons already exists. And countries already have a treaty obligation to stop any misuse of AI for biological weapons, and such misuse seems frighteningly plausible. Similarly, regulators can already see that stock market manipulation, inciting violence, election interference, and many other illegal acts are enabled by current AI models. Saying “the AI did it” is an abdication of responsibility for foreseeable misuse, so it should not act as a get-out-of-jail-free card. Clarifying the responsibility of AI model developers, application developers, and users when models are used to break laws is necessary to ensure that everyone knows what their responsibilities are. It would also highlight a critical gap: we don’t have the capacity to investigate misuse and enforce laws that should apply to AI. This is unacceptable, and must be addressed, both because of the immediate harms and because there are larger risks on the horizon.
Plan for Future Governance and Policy
Governments must not wait for international consensus about how to mitigate risks. The types of misuse we see today are largely untraceable, because we don’t have any way to track who is building or deploying AI systems. Laws are being broken, and people are being hurt, and we as a society can’t respond because we don’t have the tools. But these tools and regulations are already technically and legally feasible. We need regulation and enforcement already, and lacking that, it is critical to at least build infrastructure enabling it as quickly as possible.
And to digress slightly, there is a debate about the extent to which we are prepared for automation and job loss. This is a policy debate that is inextricably tied to other decisions about AI risks, but not directly relevant to the class of large-scale risks we are most concerned with. Similarly, there are intellectual property rights and other issues that relate to AI which policymakers will continue to debate. But because policy is messy, these will be part of policy formulation to address AI in the near term. This will likely involve taxation, regulation of what is allowed, and other measures. For the purpose of the current debate, we should agree that addressing these issues will be part of any domestic debate on impacts of AI and responses. We should then both appreciate the need for action and clearly state that even solving the problems created by near-human or human-level AGI doesn’t address the key risks of those and more advanced systems.
Moving beyond current needs, and as a way to ensure that domestic policy doesn’t get stuck dealing with immediate economic, equity, and political issues, I think we should push for an ambitious intermediate goal to promote the adoption of international standards regarding high-risk future models. To that end, I would call for every country to pass laws today that will trigger, starting in 2025, a full ban on deploying or training AI systems larger than GPT-4 which have not been reviewed by an international regulatory body with authority to reject applications, pending international governance regimes with mandatory review provisions for potentially dangerous applications and models. This isn’t helpful for the most obvious immediate risks and economic impacts of AI, and for exactly that reason, it’s critical as a way to ensure the tremendous future risks aren’t ignored.
Most of these steps are possible today, and many or most could even be announced quickly, perhaps at the UK Summit, though it unfortunately seems like we’re not on track for countries to make such commitments and take concrete actions that quickly.
To conclude, yes, we need to stop certain uses of AI and future more risky systems; yes, steps need to be taken sooner rather than later because the most extreme risks are increasing; and yes, there are concrete things for governments to do, some of which are likely to build towards a governance regime that includes a moratorium or the equivalent. And many helpful and concrete steps can start today, and are needed to address immediate harms, even if the necessary moratorium on dangerous uses and model development and the accompanying governance regimes will take time to negotiate.
That said, we shouldn’t ignore the harms AI is doing now—and restricting already illegal or harmful uses is certainly more than justified, and I applaud work being done by AI ethicists and other groups in that direction. The people in the basement should be saved, which is sufficient justification for many of the proposed policies—but I think milquetoast policy papers don’t help that cause either, and these abuses need more drastic responses.
The obligation under the Biological Weapons Convention is clear: countries are in violation if they allow bioweapons to be developed, even if the state party itself was not involved. The requirements of the Chemical Weapons Convention are less broad, so AI used for chemical weapons development is not directly banned by the treaty, though it certainly seems worth considering how to prevent it. And as noted, it is currently technically and bureaucratically impossible to monitor or ban such uses.
In the interim, joint-and-several liability for developers, application providers, and users for misuse, copyright violation, and illegal discrimination would be a useful initial band-aid. Among other things, it gives companies a motive to help craft regulation with clear rules about what each party must do to ensure that it will not be financially liable for a given use or misuse.