Tony Blair Institute AI Safety Work

Link post

At TBI, we have just written up a big report on AI policy. We cover a wide range of topics, including improving advice into government, increasing state capacity to deal with emerging AI challenges, and reforming public services and our data infrastructure for the AI age.

But a big area of interest for this forum is our AI safety proposal for the UK. We propose creating AI Sentinel, a national laboratory effort focused on researching and testing safe AI, with the aim of becoming the “brain” for both a UK and an international AI regulator. Sentinel would recognise that effective regulation and control are, and will likely remain, an ongoing research problem, requiring an unusually close combination of research and regulation.

Below is the part of the report that relates to this proposal, copied here.

The UK should create a new national laboratory effort – here given a placeholder name of Sentinel – to test, understand and control safe AI, collaborating with the private sector and complementing its work. The long-term intention would be to grow this initiative into an international collaborative network. This will be catalysed by the UK embarking on a recruitment programme to attract the world’s best scientists to address AI-safety concerns.

Such an effort should be open to international collaborators who could join the scheme, similar to the EU’s Horizon Europe programme. The UK is uniquely well positioned to do this due to the headquartering of Google DeepMind in London, which has drawn exceptional talent to the city. The EU has previously considered a similar effort but does not appear to have made progress yet; a contributing factor may be that the EU lacks the UK’s depth of AI talent. Sentinel could offer incentives for international collaboration in the form of knowledge and personnel sharing.

This effort towards safe and interpretable forms of AI should be anchored by an elite public-sector physical laboratory with strong collaborative links to private companies. It would fill the space currently occupied by the Alan Turing Institute in the UK, but with a wider remit, markedly increased funding and improved governance, learning from the first New National Purpose report and Sir Paul Nurse’s recent review of the UK’s research, development and innovation landscape.

This endeavour would have three related core objectives:

Develop and deploy methods to interrogate and interpret advanced AI systems for safety, while devising regulatory approaches in tandem. This should also include developing measures to control and contain these systems, as well as designing new algorithms and models that may be more interpretable and controllable. Some starting-point evaluations already exist, but part of Sentinel’s mission would be to work out which evaluations are the right ones, to create new methods, and to determine which evaluations can be public and which must remain private (to prevent future AI models from being trained on them and thereby evading scrutiny). Built into the core mission of Sentinel is the expectation that it will focus on safety measures for the most capable current models.

Keep the UK’s and its partners’ understanding of, and capabilities in, advanced AI systems close to the cutting edge of AI-relevant technology, and serve as a trusted source of advice to these nations. Sentinel could, for example, assess whether advanced super-intelligent capabilities are likely to be achieved within a two-year window, and help coordinate a slowing of capabilities development. Crucially, the purpose of Sentinel should be to help assess and understand the frontier of current capabilities, rather than to push that frontier further absent safety improvements.

Promote a plurality of research endeavours and approaches to AI, particularly in new interpretable and controllable algorithms. There is currently a risk of excessive private-sector focus on LLMs, which may be vulnerable to misuse. As tech giants concentrate their resources on the technology that can be most effectively commercialised, the state should avoid repeating its past mistakes: it should fund other forms of AI research, seeking to invent interpretable algorithms that, if scaled, could offer similar capabilities in a safer way.

Frontier private tech companies could pledge to give the code and other details of their models to Sentinel for interrogation and analysis, or this could become a legal requirement if the approach achieves international buy-in. Long-term incentives for companies to collaborate with Sentinel could include providing them with public data for training new models where appropriate. Other measures could include making membership compulsory, through international legislation, for AI companies beyond a particular scale, and requiring AI companies to conduct Sentinel evaluations before they can supply models above a certain capability threshold to participant countries’ governments.

There are a number of critical requirements for the success of such an endeavour. The following design features should be considered red lines in Sentinel’s development, which if crossed would mean it is likely to fail.

Sentinel should:

Be sufficiently resourced to operate at the cutting edge of AI, while having the freedom to partner with commercial actors flexibly and quickly. For reference, DeepMind’s budget was approximately £1 billion per year prior to its merger with Google, which would primarily have been spent on salaries and compute. It is better not to fund something at all than to fund it in a way that cannot be globally relevant.

Be given freedom of action similar to that of the Advanced Research and Invention Agency (ARIA), along with the ability to recruit top technical talent. The legislation and funding structure for ARIA should serve as a model for the freedoms that will be required.

Be led by a technical expert, such as the leader of a frontier-industry lab, empowered with the freedom to run an effective organisation free from bureaucratic hindrance. If this requires legislation, then the government should legislate. A business-as-usual public lab will fail for the same reasons that other public laboratories struggle, as highlighted in the first New National Purpose report and the Nurse review.

Have a high level of information security, both to prevent leaks of potentially dangerous models to hostile actors and to contain potentially dangerous systems.

Institutions like the Montreal-based Mila, led by Yoshua Bengio, show that first-rate AI research can be done in publicly funded laboratories.

Without such an endeavour, there is a strong possibility that the UK simply becomes irrelevant to the progress of potentially the most transformative technology in history and has no meaningful influence over its safe development.

An effort such as Sentinel would loosely resemble a CERN for AI and would aim to become the “brain” of an international AI regulator, operating similarly to how the International Atomic Energy Agency works to ensure the safe and peaceful use of nuclear energy. Sentinel would initially focus on ensuring best practice in the top AI labs, but the five-year aim would be to provide the international regulatory function across the AI ecosystem, in preparation for the proliferation of very capable models.

Recommendation: The UK government should create a new national laboratory effort to test, understand and control AI to ensure it remains safe. This effort should be given sufficient freedom, funding and authority to empower it to succeed.

The UK should take the lead on creating this at pace, while making clear to allied countries that they are welcome to join and that their participation is an intent of the programme. The prime minister should notify bodies such as the Organisation for Economic Co-operation and Development (OECD) of the programme’s intentions and offer access to visitors from these organisations. The UK should also commit not to use membership as leverage in other political negotiations, as the UK and Switzerland have experienced with the EU’s Horizon Europe programme. Unlike the EU with Horizon Europe, the UK would not seek to extract a financial premium from participant countries in exchange for Sentinel membership.

Recommendation: The UK should seek to build an international coalition around Sentinel and use it as a platform and research hub for an international regulatory approach to advanced AI systems. However, it should launch the effort now and invite others to join, rather than waiting for buy-in before proceeding.

From an international regulatory perspective, Sentinel would give the UK a seat at the table. Having a strong public body that can interact with and speak on the same terms as OpenAI and DeepMind is also a competitive advantage. The US government would likely be hesitant to carry out public-sector research, and the EU lacks the same frontier technical community to lead in this area.

Given how rapidly AI is evolving, models cannot yet be reduced to laws and theories, and new capabilities within existing models are continually being discovered. This means that the regulation of AI and research into it must be very tightly coupled: regulation is research; research is regulation. This point has not yet entered the public discussion to any significant extent, but it is important for policymakers to realise that AI regulation will likely look very different to previous regulation in other areas and should therefore be considered an active research project.

Recommendation: As UK regulation around AI will need to be far more closely integrated with research than is usual for other areas of regulation, Sentinel will need to act in part as an advisory system for regulation based on its research. Calls for regulation are inseparable from a research effort, and a joint regulatory-research programme is therefore required.

Talent remains under-allocated to AI safety compared with AI capability. Given the importance of safety, the government should take inspiration from the way in which, in the 1940s, some of the world’s best mathematicians and physicists were directly involved in addressing security issues and made fundamental contributions to international safety.

Recommendation: Through Sentinel, the UK should initiate a recruitment programme that attracts the world’s best scientific minds to address AI-safety concerns. The solutions to these concerns will likely not be solely technical, but rather encompass a range of approaches. Sentinel should therefore also seek legal, economic and philosophical expertise to help understand the broader implications of its research and how this interacts with other non-technical forms of safety.

This recruitment drive, coupled with investment in developing better ways of auditing and evaluating AI capabilities, would make the most of the UK’s potential as home to a high-value AI assurance market. The UK’s AI assurance roadmap points to our comparative advantage in service exports, particularly in finance and insurance. Combining the UK’s strong technology-sector expertise with its service industry, in the context of Sentinel, would kickstart this market via public procurement and have enormous downstream effects in making the assurance industry sustainable.

Thoughts and engagement are very welcome!