Big Picture AI Safety

tldr: I conducted 17 semi-structured interviews with AI safety experts about their big-picture strategic view of the AI safety landscape: how human-level AI will play out, how things might go wrong, and what the AI safety community should be doing. While many respondents held “traditional” views (e.g. that the main threat is misaligned AI takeover), there was more opposition to these standard views than I expected, and the field seems more split on many important questions than someone outside it might infer.

Each post in this sequence stands alone: you don’t need to have read an earlier post to understand a later one, so feel free to zoom straight in on whatever interests you.

I am very grateful to all of the participants for offering their time to this project. Also thanks to Vael Gates, Siao Si Looi, ChengCheng Tan, Adam Gleave, Quintin Davis, George Anadiotis, Leo Richter, McKenna Fitzgerald, Charlie Griffin and many of the participants for feedback on early drafts.

Big Picture AI Safety: Introduction

What will the first human-level AI look like, and how might things go wrong?

What should AI safety be trying to achieve?

What mistakes has the AI safety movement made?