Understanding the diffusion of large language models

How might transformative AI technology, or the means of producing it, spread among companies, states, institutions, and even individuals? What impacts might that spread have, and how can we minimize the risks?

I think these are the central questions for the study of AI “diffusion”: the spread of artifacts from AI development among different actors, where artifacts include trained models, datasets, algorithms, and code. Diffusion can occur through a variety of mechanisms, including not only open publication and replication but also theft, leaks of information, and other means.

As a step towards understanding and beneficially shaping the diffusion of transformative AI, this sequence presents my findings from a project studying the diffusion of recent large language models specifically. The project was undertaken at Rethink Priorities, mostly during a Fellowship with the AI Governance & Strategy team. Its core was a set of case studies of nine language models similar to OpenAI’s GPT-3 (including GPT-3 itself). Beyond the case studies, the sequence also provides broader background on AI diffusion, discusses tentative implications of my research for the governance of transformative AI, and outlines questions for further investigation of AI diffusion more broadly. The sequence comprises the following posts:

Understanding the diffusion of large language models: summary

Background for “Understanding the diffusion of large language models”

GPT-3-like models are now much easier to access and deploy than to develop

The replication and emulation of GPT-3

Drivers of large language model diffusion: incremental research, publicity, and cascades

Publication decisions for large language models, and their impacts

Implications of large language model diffusion for AI governance

Questions for further investigation of AI diffusion

Conclusion and Bibliography for “Understanding the diffusion of large language models”