List of AI safety courses and resources
By: Daniel del Castillo, Chris Leong, and Kat Woods
We made a spreadsheet of resources for learning about AI safety. It was for internal purposes here at Nonlinear, but we thought it might be helpful to those interested in becoming safety researchers.
Please let us know if you notice anything that we’re missing or that we need to update by commenting below. We’ll update the sheet in response to comments.
Highlights
There are a lot of courses and reading lists out there. If you’re new to the field, then out of the ones we investigated we recommend Richard Ngo’s curriculum for the AGI safety fundamentals program: it is shorter, more structured, and broader than most alternatives. You can register interest for the program when the next round starts, or simply work through the reading list on your own.
We’d also like to highlight that there is a remote AI safety reading group that might be worth looking into if you’re feeling isolated during the pandemic.
About us: Nonlinear is a new AI alignment organization founded by Kat Woods and Emerson Spartz. We are a means-neutral organization, so we are open to a wide variety of interventions that reduce existential and suffering risks. Our current top two research priorities are multipliers for existing talent and prizes for technical problems.
PS—Our autumn Research Analyst Internship is open for applications. The deadline is midnight EDT on September 7th. The application should take around ten minutes if your CV is already written.
Nice initiative, thanks!
Plugging my own list of resources (last updated April 2020, next update before the end of the year).
These aren’t entirely about AI, but Brian Tomasik’s Essays on Reducing Suffering and Tobias Baumann’s articles on S-risks are also worth reading. They contain a lot of articles related to futurism and scenarios that could result in astronomical suffering. On the topic of AI alignment, Tomasik wrote this article on the risks of a “near miss” in AI alignment, and how a slightly misaligned AI may create far more suffering than a completely unaligned AI.
Haven’t checked out your spreadsheet, but I do think these sorts of collections are good things to create! And on that note, I’ll mention my Collection of AI governance reading lists, syllabi, etc. (so that’s for AI governance, not technical AI safety stuff). I suggest that people who want to read it use the doc version, but I’ll also copy the full contents into this comment for convenience.
What is this doc, and why did I make it?
AI governance is a large, complex, important area that intersects with a vast array of other fields. Unfortunately, it’s only fairly recently that this area started receiving substantial attention, especially from specialists with a focus on existential risks and/or the long-term future. And as far as I’m aware there aren’t yet any canonical, high-quality textbooks or online courses on the topic.[1] It seems to me that this means this is an area where well-curated and well-structured reading lists, syllabi, or similar can be especially useful, helping to fill the role that textbooks otherwise could.[2]
Fortunately, when I started looking for relevant reading lists and syllabi, I was surprised by how many there were. So I decided to try to collect them all in one place. I also tried to put them in very roughly descending order of how useful I’d guess they’d be to a randomly chosen EA-aligned person interested in learning about AI governance.
I think this might help me, my colleagues, and others who are trying to “get up to speed”, for the reasons given in the following footnote.[3]
I might later turn this doc into a proper post on the EA Forum.
See also EA syllabi and teaching materials and Courses on longtermism.
How can you help?
Please comment if you know of anything potentially relevant which I haven’t included!
Please comment if you have opinions on anything listed!
The actual collection
September AGI safety fundamentals curriculum—Richard Ngo
Alignment Newsletter Database—Rohin Shah
This is more relevant to technical AI safety than to AI governance, but some of its categories are quite relevant to governance, especially “AI strategy and policy”, “Forecasting”, and “Field building”
AI Governance Reading List—SERI 2021 Summer
The author had also previously made a syllabus on the same topics: AI Governance Syllabus ’21.docx
Governance of AI Reading List – Oxford Spring 2020 - Markus Anderljung
Reading Guide for the Global Politics of Artificial Intelligence—Allan Dafoe
I’m guessing other lists made by people associated with GovAI already draw on and supersede this, but I don’t know
“Resources” section from Guide to working in artificial intelligence policy and strategy − 80,000 Hours
Note: I think the only book from there that’s available on Audible UK is The Second Machine Age.
But the description of the book sounds to me kind-of basic and not especially longtermism-relevant.
AI policy introductory reading list—Niel Bowerman (I think)
Governance of AI—Some suggested readings [v0.5, shared] - Ashwin Acharya
Drawn on for SERI’s reading list
Artificial Intelligence and International Security Syllabus [public] - Remco Zwetsloot, 2018 (I think)
Books and lecture series relevant to AI governance—me and commenters
Section on “Unaligned artificial intelligence” from Syllabus — The Precipice
Tangential critique: I personally think it’s problematic and misleading that both The Precipice and this syllabus use the heading “unaligned artificial intelligence” while seeming to imply that this covers all key aspects of AI risk, since that framing obscures some risk pathways.
AI Policy Readings Draft.docx—EA Oxford
Drawn on for SERI’s reading list
My post Crucial questions for longtermists includes a structured list of questions related to the “Value of, and best approaches to, work related to AI”, and this associated doc contains readings related to each of those questions
I haven’t updated this much since 2020
Questions listed there include:
Is it possible to build an artificial general intelligence (AGI) and/or transformative AI (TAI) system? Is humanity likely to do so?
What form(s) is TAI likely to take? What are the implications of that? (E.g., AGI agents vs comprehensive AI services)
What will the timeline of AI developments be?
How much should longtermists prioritise AI?
What forms might an AI catastrophe take? How likely is each?
What are the best approaches to reducing AI risk or increasing AI benefits?
Good resources for getting a high-level understanding of AI risk—Michael Aird
AI governance intro readings—Felipe Calero
Luke Muehlhauser’s 2013 and 2014 lists of books he’d listened to recently
I think many/most of these books were chosen for “seem[ing] likely to have passages relevant to the question of how well policy-makers will deal with AGI”
Many/most of these aren’t available as audiobooks; Luke turned them into audiobooks himself
A Contra AI FOOM Reading List – Magnus Vinding
Described in SERI’s reading list as a “List of arguments (of varied quality) against ‘fast takeoff’”
List of resources on AI and agency—Ben Pace
You could also use research agendas related to AI governance as reading lists, by following the sources they cite on various topics. Relevant agendas include:
(Note that I haven’t checked how well each of these agendas would work for this purpose. This list is taken from my central directory for open research questions.)
The Centre for the Governance of AI’s research agenda − 2018
Some AI Governance Research Ideas—the Centre for the Governance of AI, 2021
Promising research projects—AI Impacts, 2018
They also made a list in 2015; I haven’t checked how much they overlap
Cooperation, Conflict, and Transformative Artificial Intelligence (the Center on Long-Term Risk’s research agenda) - Jesse Clifton, 2019
Open Problems in Cooperative AI—Dafoe et al., 2020
Problems in AI Alignment that philosophers could potentially contribute to—Wei Dai, 2019
Problems in AI risk that economists could potentially contribute to—Michael Aird, 2021
Technical AGI safety research outside AI—Richard Ngo, 2019
Artificial Intelligence and Global Security Initiative Research Agenda—Centre for a New American Security, no date
A survey of research questions for robust and beneficial AI—Future of Life Institute, no date
“studies which could illuminate our strategic situation with regard to superintelligence”—Luke Muehlhauser, 2014 (he also made a list in 2012)
A shift in arguments for AI risk—Tom Sittler, 2019
Longtermist AI policy projects for economists—Risto Uuk (this doc was originally just made for Risto’s own use, so the ideas shouldn’t be taken as high-confidence recommendations to anyone else)
Annotated Bibliography of Recommended Materials—CHAI
I think this is much more focused on technical AI safety than AI governance
Some Rethink Priorities staff may soon make a long, tiered reading list tailored to the AI governance project ideas we may work on. If it seems to me that this would be useful to other people, I might add a link to a version of it here.
There may be additional relevant reading lists / syllabi / sections in the links given here: EA syllabi and teaching materials—EA Forum
And here: Courses on longtermism | Pablo’s miscellany
I think there was also a short reading list associated with the EA In-Depth Fellowship
A category related to reading lists is newsletters that provide summaries of and commentary on a bunch of research outputs. E.g.:
Rohin’s
Jack Clark’s
CSET’s
…
One person suggested that I or people reading this doc might also be interested in “syllabi aimed at aspiring AI technical safety researchers, such as this one: Technical AI Safety Reading List. I have a vague sense that engaging with some of this content has been helpful for my having a better broad sense of what’s going on with AI safety, which seems helpful for governance.”
Some parts of Krakovna’s AI safety resources and Maini’s AI Reading List may be quite useful for AI governance people, though I think they’re more relevant for technical AI safety people
My thanks to everyone who made these lists.
Footnotes
[1] Though there are various presumably high-quality textbooks or courses with some relevance, some high-quality non-textbook books on the topic, some in-person courses that might be high-quality (I haven’t participated in them), and some things that fill somewhat similar roles (like EA seminar series, reading groups, or fellowships).
[2] See also Research Debt and Suggestion: EAs should post more summaries and collections.
[3]
This collection should make it easier to find additional reading lists, syllabi, etc., and thus easier to find additional readings that have been evaluated as especially worth reading in general, especially worth reading on a given topic, and/or especially good as introductory resources.
This collection should make it easier to find and focus on reading lists, syllabi, etc. that are better and/or more relevant to one’s specific needs.
To help with this, please comment on this doc if you have opinions about anything listed.
Even before or without engaging with the actual items included in a given reading list, syllabus, or similar, engaging with the structure and commentary in that document itself could help one understand what the important components, divisions, concepts, etc. within AI governance are. And this collection should help people find more, better, and/or more relevant such documents.
The US standards institute NIST also produces recommendations: https://www.nist.gov/itl/ai-risk-management-framework
Here is the curriculum of ML4Good, an AGI safety camp organized by EffiSciences to train prosaic alignment researchers.
The program contains many programming exercises.
I think the Introduction to ML Safety course would be a good addition!
You can add new ones here. I would, but you probably have a clearer idea of what a good summary would be.
Oh, thanks, I’d missed that form in the sheet. It might be worth updating this forum post with the form, because it currently says: