New report on the state of AI safety in China
Link post
Concordia Consulting has released a new report (Oct 2023) on the state of AI safety in China. It’s about 160 pages long and looks quite informative.
Here’s their Executive Summary (simply copied and pasted from pages 4-6 of the report):
Amid the rapid evolution of the global artificial intelligence (AI) industry, China has emerged as a pivotal player. From advancing regulations on generative AI and calling for AI cooperation at the United Nations (UN), to pursuing technical research on AI safety and more, China’s actions on AI have global implications. However, international understanding of China’s thoughts and actions on AI safety remains limited. This report aims to close that knowledge gap by analyzing China’s domestic AI governance, international AI governance, technical AI safety research, expert views on AI risks, lab self-governance methods, and public opinion on AI risks.
China has developed powerful domestic governance tools that, while currently not used to mitigate frontier AI risks, could be employed that way in the future. Existing Chinese regulations have created an algorithm registry and safety/security reviews for certain AI functions, which could be adapted to more directly deal with frontier risks. Notably, an expert draft of China’s national AI law attempts to regulate certain AI scenarios by building upon the algorithm registry to create licenses for more risky cases, among other policy tools. The science and technology (S&T) ethics review system requires ethics reviews during the research and development (R&D) process for certain AI use-cases, though the system is still under construction and implementation details are yet to be clarified. While current domestic standards on AI safety are mostly oriented towards security and robustness concerns, China’s top AI standards body referenced alignment in a 2023 document, suggesting growing attention towards frontier capabilities.
In the international arena, China has recently intensified its efforts to position AI as a domain for international cooperation. In October 2023, President Xi Jinping announced the new Global AI Governance Initiative (全球人工智能治理倡议) at the Third Belt and Road Forum for International Cooperation, setting out China’s core positions on international cooperation on AI. The Chinese government has also indicated interest in maintaining human control over AI systems and preventing their misuse by extremist groups. However, successful cooperation with China on AI safety hinges on selecting the right international fora for exchanges, as China has expressed a clear preference for holding AI-related discussions under the aegis of the UN.
Technical research in China on AI safety has become more advanced in just the last year. Numerous Chinese labs are conducting research on AI safety, albeit with varying degrees of focus and sophistication. Chinese labs predominantly employ variants of reinforcement learning from human feedback (RLHF) techniques for specification research and have conducted internationally notable research on robustness. Some Chinese researchers have also developed safety evaluations for Chinese Large Language Models (LLMs), although they do not focus on dangerous capabilities. Additionally, several have extensively explored interpretability, particularly for computer vision. While this work diverges in certain aspects from research popular in leading AI labs based in the United States (US) and United Kingdom (UK), the surge in preprint research on AI safety by at least thirteen notable Chinese labs over the past year underscores the escalating interest of Chinese scientists.
Expert discussions around frontier AI risks have become more mainstream in the last year. While some leading Chinese experts expressed worries about risks from advanced AI systems as early as 2016, this was more the exception than the norm. The release of GPT-3 in 2020 spurred more academics to discuss frontier AI risks, but the topic was not yet common enough to merit dedicated discussion in China’s top two AI conferences, the World Artificial Intelligence Conference (WAIC) and Beijing Academy of Artificial Intelligence (BAAI) Conference. In 2023, however, frontier AI risks have become a common topic of debate, with multiple Chinese experts signing the Future of Life Institute (FLI) and Center for AI Safety (CAIS) open letters on frontier AI, and the 2023 Zhongguancun (ZGC) Forum and BAAI Conference featuring in-depth discussions on the matter. Several leading experts have also emphasized the Chinese concept “bottom-line thinking” (底线思维), which bears similarities to the precautionary principle in EU policymaking and offers a unique contribution to explorations on AI risks.
Chinese labs have largely adopted a passive approach to self-governance of frontier AI risks. While numerous labs began releasing ethics principles for AI development in 2018, these were fairly general and did not specifically address the safety of frontier models. More recent action in 2023 by a Chinese AI industry association indicates interest in AI alignment and safety/security issues. Some Chinese labs have publicized safety measures undertaken for their released LLMs, including alignment measures such as RLHF used for models published in 2023. However, the evaluations these labs have publicly stated they conducted primarily focused on truthfulness and toxic content, rather than more dangerous capabilities.
There is a significant lack of data regarding the Chinese public’s views of frontier AI. Existing public opinion surveys are outdated, have limited participation, and often lack precise survey questions. However, existing evidence weakly suggests that the Chinese public generally thinks that benefits from AI development outweigh the harms. One survey suggests that the Chinese public and AI scholars do think there are existential risks from Artificial General Intelligence (AGI), but still think AGI should be developed, suggesting that they think the risks are controllable. However, a more comprehensive exploration is essential to understand the Chinese public’s views on the significance of frontier AI risks and how to address such risks.
As this is the first report the authors are aware of that seeks to comprehensively map out the AI safety landscape in China, we see it as part of a larger, essential conversation on how China and the rest of the world should act to reduce the increasingly dangerous risks of frontier AI advancement. We hope that this will encourage other institutions to also improve our common understanding of AI safety developments in China, which we believe will be beneficial to global security and prosperity.