Luke Muehlhauser recently posted this list of ideas. See also this List of lists of government AI policy ideas and How major governments can help with the most important century.
The full text of the post[1] is below.
About two years ago, I wrote that “it’s difficult to know which ‘intermediate goals’ [e.g. policy goals] we could pursue that, if achieved, would clearly increase the odds of eventual good outcomes from transformative AI.” Much has changed since then, and in this post I give an update on 12 ideas for US policy goals[2] that I tentatively think would increase the odds of good outcomes from transformative AI.[3]
I think the US generally over-regulates, and that most people underrate the enormous benefits of rapid innovation. However, when 50% of the experts on a specific technology think there is a reasonable chance it will result in outcomes that are “extremely bad (e.g. human extinction),” I think ambitious and thoughtful regulation is warranted.[4]
First, some caveats:
These are my own tentative opinions, not Open Philanthropy’s.[5] I might easily change my opinions in response to further analysis or further developments.
My opinions are premised on a strategic picture similar to the one outlined in my colleague Holden Karnofsky’s Most Important Century and Implications of… posts. In other words, I think transformative AI could bring enormous benefits, but I also take full-blown existential risk from transformative AI as a plausible and urgent concern, and I am more agnostic about this risk’s likelihood, shape, and tractability than e.g. a recent TIME op-ed.
None of the policy options below have gotten sufficient scrutiny (though they have received far more scrutiny than is presented here), and there are many ways their impact could turn out — upon further analysis or upon implementation — to be net-negative, even if my basic picture of the strategic situation is right.
To my knowledge, none of these policy ideas have been worked out in enough detail to allow for immediate implementation, but experts have begun to draft the potential details for most of them (not included here). None of these ideas are original to me.
This post doesn’t explain much of my reasoning for tentatively favoring these policy options. All the options below have complicated mixtures of pros and cons, and many experts oppose (or support) each one. This post isn’t intended to (and shouldn’t) convince anyone. However, in the wake of recent AI advances and discussion, many people have been asking me for these kinds of policy ideas, so I am sharing my opinions here.
Some of these policy options are more politically tractable than others, but, as I think we’ve seen recently, the political landscape sometimes shifts rapidly and unexpectedly.
With those caveats in hand, below are some of my current personal guesses about US policy options that would reduce existential risk from AI in expectation (in no order).[6]
Software export controls. Control the export (to anyone) of “frontier AI models,” i.e. models with highly general capabilities over some threshold, or (more simply) models trained with a compute budget over some threshold (e.g. as much compute as $1 billion can buy today). This will help limit the proliferation of the models which probably pose the greatest risk. Also restrict API access in some ways, as API access can potentially be used to generate an optimized dataset sufficient to train a smaller model to reach performance similar to that of the larger model.
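As a toy illustration of how a dollar-denominated compute threshold like the one above might be operationalized, here is a minimal sketch. The dollar-per-FLOP figure is a hypothetical placeholder chosen for illustration, not a number from the post, and any real regime would need to account for falling compute prices over time.

```python
# Illustrative sketch only: the price-per-FLOP figure below is a
# hypothetical placeholder, not a figure from the post.

def flop_threshold(budget_usd: float, usd_per_flop: float) -> float:
    """Convert a dollar compute budget into a training-FLOP threshold."""
    return budget_usd / usd_per_flop

# Hypothetical assumption: $2e-17 per FLOP of training compute.
USD_PER_FLOP = 2e-17
THRESHOLD = flop_threshold(1e9, USD_PER_FLOP)  # $1B budget

def is_frontier_run(training_flop: float) -> bool:
    """Would a training run of this size fall over the threshold?"""
    return training_flop >= THRESHOLD
```

Under these made-up numbers, a $1 billion budget corresponds to roughly 5e25 training FLOP; the simpler compute-based definition trades precision for verifiability, since FLOP counts are easier to audit than model capabilities.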
Require hardware security features on cutting-edge chips. Security features on chips can be leveraged for many useful compute governance purposes, e.g. to verify compliance with export controls and domestic regulations, monitor chip activity without leaking sensitive IP, limit usage (e.g. via interconnect limits), or even intervene in an emergency (e.g. remote shutdown). Some of these functions can be achieved via firmware updates to already-deployed chips, though some features would be more tamper-resistant if implemented on the silicon itself in future chips.
Track stocks and flows of cutting-edge chips, and license big clusters. Chips over a certain capability threshold (e.g. the one used for the October 2022 export controls) should be tracked, and a license should be required to bring together large masses of them (as required to cost-effectively train frontier models). This would improve government visibility into potentially dangerous clusters of compute. And without this, other aspects of an effective compute governance regime can be rendered moot via the use of undeclared compute.
Track and require a license to develop frontier AI models. This would improve government visibility into potentially dangerous AI model development, and allow more control over their proliferation. Without this, other policies like the information security requirements below are hard to implement.
Information security requirements. Require that frontier AI models be subject to extra-stringent information security protections (including cyber, physical, and personnel security), including during model training, to limit unintended proliferation of dangerous models.
Testing and evaluation requirements. Require that frontier AI models be subject to extra-stringent safety testing and evaluation, including some evaluation by an independent auditor meeting certain criteria.[7]
Fund specific genres of alignment, interpretability, and model evaluation R&D. Note that if the genres are not specified well enough, such funding can effectively widen (rather than shrink) the gap between cutting-edge AI capabilities and available methods for alignment, interpretability, and evaluation. See e.g. here for one possible model.
Fund defensive information security R&D, again to help limit unintended proliferation of dangerous models. Even the broadest funding strategy would help, but there are many ways to target this funding to the development and deployment pipeline for frontier AI models.
Create a narrow antitrust safe harbor for AI safety & security collaboration. Frontier-model developers would be more likely to collaborate usefully on AI safety and security work if such collaboration were more clearly allowed under antitrust rules. Careful scoping of the policy would be needed to retain the basic goals of antitrust policy.
Require certain kinds of AI incident reporting, similar to incident reporting requirements in other industries (e.g. aviation) or to data breach reporting requirements, and similar to some vulnerability disclosure regimes. Many incidents wouldn’t need to be reported publicly, but could be kept confidential within a regulatory body. The goal of this is to allow regulators and perhaps others to track certain kinds of harms and close-calls from AI systems, to keep track of where the dangers are and rapidly evolve mitigation mechanisms.
Clarify the liability of AI developers for concrete AI harms, especially clear physical or financial harms, including those resulting from negligent security practices. A new framework for AI liability should in particular address the risks from frontier models carrying out actions. The goal of clear liability is to incentivize greater investment in safety, security, etc. by AI developers.
Create means for rapid shutdown of large compute clusters and training runs. One kind of “off switch” that may be useful in an emergency is a non-networked power cutoff switch for large compute clusters. As far as I know, most datacenters don’t have this.[8] Remote shutdown mechanisms on chips (mentioned above) could also help, though they are vulnerable to interruption by cyberattack. Various additional options could be required for compute clusters and training runs beyond particular thresholds.
Of course, even if one agrees with some of these high-level opinions, I haven’t provided enough detail in this short post for readers to know what, exactly, to advocate for, or how to do it. If you have useful skills, networks, funding, or other resources that you might like to direct toward further developing or advocating for one or more of these policy ideas, please indicate your interest in this short Google Form. (The information you share in this form will be available to me [Luke Muehlhauser] and some other Open Philanthropy employees, but we won’t share your information beyond that without your permission.)
- ^
(Copied with permission.)
- ^
Many of these policy options would plausibly also be good to implement in other jurisdictions, but for most of them the US is a good place to start (the US is plausibly the most important jurisdiction anyway, given the location of leading companies, and many other countries sometimes follow the US), and I know much less about politics and policymaking in other countries.
- ^
For more on intermediate goals, see Survey on intermediate goals in AI governance.
- ^
This paragraph was added on April 18, 2023.
- ^
Besides my day job at Open Philanthropy, I am also a Board member at Anthropic, though I have no shares in the company and am not compensated by it. Again, these opinions are my own, not Anthropic’s.
- ^
There are many other policy options I have purposely not mentioned here. These include:
- Hardware export controls. The US has already implemented major export controls on semiconductor manufacturing equipment and high-end chips. These controls have both pros and cons from my perspective, though it’s worth noting that they may be a necessary complement to some of the policies I tentatively recommend in this post. For example, the controls on semiconductor manufacturing equipment help to preserve a unified supply chain to which future risk-reducing compute governance mechanisms can be applied. These hardware controls will likely need ongoing maintenance by technically sophisticated policymakers to remain effective.
- “US boosting” interventions, such as semiconductor manufacturing subsidies or AI R&D funding. One year ago I was weakly in favor of these policies, but recent analyses have nudged me into weakly expecting these interventions are net-negative given e.g. the likelihood that they shorten AI timelines. But more analysis could flip me back. “US boosting” by increasing high-skill immigration may be an exception here because it relocates rather than creates a key AI input (talent), but I’m unsure, e.g. because skilled workers may accelerate AI faster in the US than in other jurisdictions. As with all the policy opinions in this post, it depends on the magnitude and certainty of multiple effects pushing in different directions, and those figures are difficult to estimate.
- AI-slowing regulation that isn’t “directly” helpful beyond slowing AI progress, e.g. a law saying that the “fair use” doctrine doesn’t apply to data used to train large language models. Some things in this genre might be good to do for the purpose of buying more time to come up with needed AI alignment and governance solutions, but I haven’t prioritized looking into these options relative to the options listed in the main text, which simultaneously buy more time and are “directly” useful to mitigating the risks I’m most worried about. Moreover, I think creating the ability to slow AI progress during the most dangerous period (in the future) is more important than slowing AI progress now, and most of the policies in the main text help with slowing AI progress in the future, whereas some policies that slow AI today don’t help much with slowing AI progress in the future.
- Launching new multilateral agreements or institutions to regulate AI globally. Global regulation is needed, but I haven’t yet seen proposals in this genre that I expect to be both feasible and effective. My guess is that the way to work toward new global regulation is similar to how the October 2022 export controls have played out: the US can move first with an effective policy on one of the topics above, and then persuade other influential countries to join it.
- A national research cloud. I’d guess this is unhelpful because it accelerates AI R&D broadly and creates a larger number of people who can train dangerously large models, though the implementation details matter.
- ^
See e.g. pp. 15–16 of the GPT-4 system card for an illustration.
- ^
E.g. the lack of an off switch exacerbated the fire that destroyed a datacenter in Strasbourg; see section VI.2.1 – iv of this report.
- ^
Full text crossposted with permission.
Tyler Cowen commented on these proposals.
I’ve set up a Manifold market for each of the 12 policy ideas discussed in the post, thanks to Michael Chen’s idea (Manifold uses collective wisdom to estimate the likelihood of events). You can visit the markets here and bet on whether the US will adopt these ideas by 2028. So go ahead and place your bets, because who said politics can’t be a bit of a gamble?
Fantastic to get this update—was just finding myself complaining about the lack of good object-level AI policy proposals!
At the risk of letting perfect be the enemy of the good, I would love a top level post for each of the recommendations, going into much greater detail. Getting discussions of policy proposals into the open where they can be criticized from diverse perspectives is crucial to arrive at policies that are robustly good.
One thing I find interesting to think about is how well-funded non-governmental actors might be able to bring these policies to life. After all, I expect most progress to come out of a few influential labs. Getting a handshake agreement from those labs would achieve results not too dissimilar from national legislation.
For rapid shutdown mechanisms, for example, the bottleneck to me seems just as much to be developing the actual protocols as getting adoption. If a great protocol is developed that would allow OpenAI leadership to shut down, at the hardware level, a compute cluster running an experimental AI, and adopting the protocol doesn’t add much overhead, I feel like there’s a non-zero chance they might adopt it without any coercion. If the overhead is significant, how significant would it be? Is it within the bounds of what a wealthy actor could subsidize?
I find any regulation totally premature. We are not training AI for anything close to general intelligence. We are still training brain tissue, not animals.
https://forum.effectivealtruism.org/posts/uHeeE5d96TKowTzjA/world-and-mind-in-artificial-intelligence-arguments-against
[Quick meta comment to try to influence forum norms]
This comment was at −5 karma when I saw it, and hidden.
I disagree with Arturo’s comment and disagree-voted to indicate this disagreement. I also upvoted his comment because I appreciated that he engaged with the post to express his views and that he posted something on the forum to explain those views.
I’d like other people to do something similar. I think that we should upvote people for expressing good faith disagreement and make an effort to explain that disagreement. Otherwise, the forum will become a complete echo chamber where we all just agree with each other.
I also think that we should try particularly hard to engage with new people in the community who express reasonable disagreement. Getting lots of anonymous downvotes without useful insights generally discourages engagement in most situations, and I don’t think that this is what we want.
Of course. But that doesn’t really apply to Arturo’s comment, which expresses an attitude but doesn’t explain that attitude at all. So Arturo’s comment can’t be useful to others and is impossible to engage with, which is why nobody has replied to Arturo on the object level.
I want less unhelpful-unexplained-attitude-expressing on the Forum.
Arturo, I wish you would explain your beliefs more so we can figure out the truth together.
He linked to his post in the comment. I presume he believes that it explains why he disagrees. I’d consider that contribution deserving of not being downvoted, but I see where you’re coming from.
With that said, if he said, “I think we need regulation” and offered two lines of related thoughts and the same link, would people have downvoted his comment for not being useful and being impossible to engage with? Probably not, I suspect.
Anyway, I may be wrong in this case, but I still think that we probably shouldn’t be so quick to downvote comments like this, or should at least be a bit more charitable about them. Especially for new community members.
I see a lot of stuff on the forum get no comments at all which seems worse than getting a few comments with opinions.
I often see low-effort disagreeing comments on a post get downvoted, but similarly low-effort agreeable comments (e.g., “this sounds great”) get upvoted.
I am also influenced by other factors:
- Discussions I have had and seen, where people I know who have been involved in EA for years said that they don’t like using the forum because it is too negative or because they don’t get any engagement with what they write.
- The expectation that lots of lurkers on the forum don’t feel comfortable sharing quick thoughts or disagreements because they could get downvotes.
- My experiences writing posts that almost no one commented on, where I would have welcomed a 2-minute opinion comment made without arguments or a supporting link.
But of course other people might disagree with all of that or see different trade-offs etc.
I have written two recent posts describing my position. In the first, I argued that nuclear war plus our primitive social systems imply that we live in an age of acute existential risk, and that the substitution of our flawed governance by AI-based government is our chance of survival.
In the second one, I argue that given the kind of specialized AI we are training so far, existential risk from AI is still negligible, and regulation would be premature.
You can comment on the posts themselves, or you can comment on both of them here.
This seems aimed at regulators; I’d be more interested in a version for orgs like the CIA or NSA.
Both those orgs seem to have a lot more flexibility than regulators to more or less do what they want when national security is an issue, and AI could plausibly become just that kind of issue.
So ‘policy ideas for the NSA/CIA’ could be both more ambitious and more actionable.
Interesting. Do you know of existing sources related to ‘policy ideas for the NSA/CIA’? What can I read to learn about this?
I am (tentatively) excited about all of these ideas.
Thanks for sharing!
Does Open Phil plan to share any of the reasoning?