A Viral License for AI Safety: The GPL as a Model for Cooperation
Written by Ivan Vendrov & Nat Kozak
The GPL, or GNU General Public License, was originally written by Richard Stallman for the GNU Project. If someone publishes a codebase under the GPL, the license guarantees users’ freedom to run it, study it, change it, and redistribute copies. It also requires that anyone distributing code that includes any part of the publisher’s code release the combined codebase under the GPL as well. This means that a for-profit company cannot distribute proprietary software that contains any part of a GPL-licensed codebase. In contrast, permissive licenses like MIT or Apache do not have this restriction, so a for-profit company can use permissively licensed code inside its own proprietary codebase with few restrictions or consequences.
An interesting and important feature of the GPL is that it spreads virally: it attaches the “free software” property to any codebase that incorporates part of the licensed code, no matter how many copying steps exist between the developer who wants to use the code and the original codebase.
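The viral property described above can be pictured as reachability in a "builds on" graph. The following is a minimal sketch, not a statement of how any real license is evaluated: the function name bound_by_license, the project names, and the graph itself are all hypothetical and chosen only for illustration.

```python
from collections import deque


def bound_by_license(builds_on, licensed_roots):
    """Return every project that inherits a viral license.

    builds_on maps a project name to the projects it directly builds on.
    licensed_roots are the projects originally published under the license.
    Both are hypothetical inputs used only for illustration.
    """
    # Invert the graph: for each project, record who builds directly on it.
    dependents = {}
    for project, sources in builds_on.items():
        for source in sources:
            dependents.setdefault(source, set()).add(project)

    # Breadth-first search outward from the originally licensed projects.
    bound = set(licensed_roots)
    queue = deque(licensed_roots)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in bound:
                bound.add(child)
                queue.append(child)
    return bound


# Hypothetical ecosystem: only base_model starts under the viral license.
builds_on = {
    "base_model": [],
    "mit_library": [],
    "fine_tuned_model": ["base_model"],
    "product": ["fine_tuned_model", "mit_library"],
    "unrelated_tool": ["mit_library"],
}

bound = bound_by_license(builds_on, ["base_model"])
print(sorted(bound))
```

In this toy graph the license reaches product even though it is two steps removed from base_model, while unrelated_tool, which builds only on permissively licensed code, is unaffected.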
Instead of using the property “free software”, could we define a property we care about like “beneficial” or “safe” or at least “audited by an AI safety org” that likewise perpetuates itself? Further, how can we make sure that the codebase that defines the first AGI system satisfies this property?
Currently, AI research is extremely cumulative. Most research does not start from scratch but builds on existing codebases and libraries, and increasingly on pre-trained models. Almost all AI codebases and models are released under permissive licenses like MIT. If most research were circulated under a viral license, such that actors could use code or models only if they committed to a set of practices and processes ensuring safety and alignment, it seems likelier that the first AGI system would follow these practices.
Some Potential License Clauses
Virality: If you build on this code or use these pre-trained models, you have to publish the results under the same or more restrictive license.
Windfall clause (along the lines of the one proposed by FHI): if the company reaches a certain degree of success, it would have to redistribute some of its earnings:
To the rest of the world (to mitigate technologically driven inequality and unemployment)
To AI Safety organizations or non-profits
Join the winner: In the long-term safety clause of its charter, OpenAI commits to avoiding a competitive race toward AGI. The language it uses is: “If a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project. We will work out specifics in case-by-case agreements, but a typical triggering condition might be ‘a better-than-even chance of success in the next two years.’”
Audit: Some kind of commitment to transparency or external audit to ensure safety.
This is hard to do in the absence of an external auditing organization. Possible models for auditing could be via a for-profit auditing organization, an academic organization, or a regulatory body.
Full public transparency is likely not feasible for safety reasons: it would give malicious actors too much information about how a system is designed and optimized.
Responsible non-disclosure: if you build a model on top of this code, you need to do your due diligence before you release it to the public. For example:
An organization could be required to show their model to an external safety researcher out of a pool of “peer reviewers” to ensure that there wouldn’t be unintended consequences from public release, and that the model can’t be used by malicious actors in some catastrophic way.
Alternatively, an organization could commit to forewarning the community one week in advance of releasing the model to solicit possible concerns, making available a detailed description of the model’s workings and results achieved by the model, but not the exact code and not the exact weights.
Best practices: you must follow a fixed list of safety best practices, such as training for adversarial robustness and red-teaming.
Commitments to broad principles
There’s a class of clauses that constitutes a commitment to broad principles. Those principles could include things like:
Upholding user agency
Maintaining user privacy
Avoiding bias and discrimination
Protecting human rights such as life, liberty, and security of person, as enshrined in the United Nations’ Universal Declaration of Human Rights
Protecting the welfare of nonhuman animals
Promising adherence to principles is easy. However, it doesn’t guarantee that they will be followed in any meaningful way. Environmental treaties, for instance, seem to have been most effective when specific milestones are articulated in addition to general principles.
On the other hand, even nonbinding commitments have historically changed the behavior of state actors. Take the result of the Helsinki Accords: a signed agreement that never had treaty status nonetheless successfully cooled border tensions and granted legitimacy to Soviet dissident and liberal movements.
Overall, it seems that an effective license should contain both commitments to broad principles and specific enforceable clauses, with clear delineations between the two.
Versioning and corrigibility
The GPL has an optional “or any later version” clause that lets recipients follow the terms of a later version of the license in place of the version under which the code was originally released. An AI license may benefit from a clause like this, or from an even stricter clause that makes code licensed under any version subject to the terms of the latest version.
Such a clause may prove critical because it is likely that many important safety measures have not yet been invented. Likewise, many relevant values have not yet been discovered or articulated. However, such a restrictive versioning clause would probably adversely affect adoption and enforcement, potentially to an impractical degree.
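The difference between these versioning options can be made concrete with a small sketch. It assumes, purely for illustration, that license versions are numbered with integers and that a license carries one of three hypothetical policies: "exact" (no versioning clause), "or_later" (the optional GPL-style clause), and "always_latest" (the stricter clause discussed above). None of these names come from an existing license.

```python
def applicable_versions(licensed_under, latest, policy):
    """Return the license versions whose terms may govern a piece of code.

    licensed_under: version number the code was originally released under.
    latest: most recent published version number.
    policy: hypothetical label for the versioning clause in force.
    """
    if licensed_under > latest:
        raise ValueError("code cannot be licensed under an unpublished version")
    if policy == "exact":
        # No versioning clause: only the original terms apply.
        return [licensed_under]
    if policy == "or_later":
        # The licensee may choose the original version or any later one.
        return list(range(licensed_under, latest + 1))
    if policy == "always_latest":
        # Stricter clause: the newest terms always govern.
        return [latest]
    raise ValueError(f"unknown policy: {policy}")


# Code released under version 1 when version 3 is the latest.
print(applicable_versions(1, 3, "exact"))          # [1]
print(applicable_versions(1, 3, "or_later"))       # [1, 2, 3]
print(applicable_versions(1, 3, "always_latest"))  # [3]
```

Under "or_later" the licensee can stay on the original terms, so newly invented safety measures bind only those who opt in. Under "always_latest" they bind everyone, which is the source of the adoption and enforcement concerns noted above.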
Determining license violations
To start, we could look at the GPL enforcement mechanisms to see what they have tried, and what has and has not worked for the free software movement.
We could rely on a central authority, such as a committee that includes respected members of the AI safety and policy communities. This committee would have the authority to determine whether the license has been violated. Instead of deciding each case manually, the committee could maintain an automated alignment benchmark against which new code is evaluated.
Legal systems could represent a final fallback. If your code has been used in a way that you think has violated the license, you could sue the violator. This would not deter actors outside the legal system’s reach, such as state actors.
Enforcing violation decisions
The standard legal remedies actually seem quite compelling. The history of GPL enforcement shows large companies being forced by courts around the world to comply with the license terms and pay significant damages for violations.
Besides legal remedies, community enforcement is possible. We can draw inspiration from international treaties that create extralegal enforceable commitments.
As with the ordinary justice system, there is a question of how, if ever, a violator could return to good standing after breaking the license.
For a sufficiently drastic violation, the best available enforcement mechanism may be to entirely suspend publication of work building on licensed code and models. Potentially the license could include terms such that all participants must suspend publication.
Call for Future Work
An EA organization or group seeking to pursue this idea may want to:
Draft the actual license terms, ideally with the input of legal and AI policy experts.
Set up an organization that could enforce the license terms on violators via lawsuits and that has the resources to do so.
Individuals could contribute in a number of ways, including:
Researching the history of software license litigation to build an understanding of what could be legally enforceable. A comprehensive view of the legal landscape would be extremely helpful in setting up the license, but individuals could work on such a project independently and compare notes (via the EA Forum or a wiki project).
Researching the possible unintended consequences of creating such a license.
Do licenses deter friendly actors without affecting malicious actors?
Analyzing the comparative effectiveness of various AI organizations’ attempts to formalize safety principles, and tracking how those principles influence the behavior of those organizations.
Trying to figure out a set of principles (broad and specific) to which licensees could be held.
Articulating a versioning or corrigibility clause that allows for additions of values or safety measures.
Analyzing whether such a license would be most beneficial if it had limited scope (e.g. to countries where the enforcement mechanism can be the US legal system), or if it needs to be enforceable or effective globally.
Thanks to Jonathan Rodriguez, Guive Assadi, Todor Markov, and Jeremy Nixon for helpful comments on earlier drafts of this article.