Non-alignment project ideas for making transformative AI go well


This is a series of posts listing projects that it could be valuable for someone to work on. The unifying theme is that they are projects that:

  • Would be especially valuable if transformative AI is coming in the next 10 years or so.

  • Are not primarily about controlling AI or aligning AI to human intentions.[1]

    • Most of the projects would be valuable even if we were guaranteed to get aligned AI.

    • Some of the projects would be especially valuable if we were inevitably going to get misaligned AI.

The posts contain some discussion of how important it is to work on these topics, but not a lot. For previous discussion (especially of the objection “Why not leave these issues to future AI systems?”), see the section How ITN are these issues? in my previous memo on some neglected topics.

The lists are definitely not exhaustive. Failure to include an idea doesn’t necessarily mean I wouldn’t like it. (Similarly, although I’ve made some attempts to link to previous writings when appropriate, I’m sure to have missed a lot of good previous content.)

There’s a lot of variation in how sketched out the projects are. Most of the projects just have some informal notes and would require more thought before someone could start executing. If you’re potentially interested in working on any of them and you could benefit from more discussion, I’d be excited if you reached out to me![2]

There’s also a lot of variation in skills needed for the projects. If you’re looking for projects that are especially suited to your talents, you can search the posts for any of the following tags (including the brackets):

[ML] [Empirical research] [Philosophical/conceptual] [Survey/interview] [Advocacy] [Governance] [Writing] [Forecasting]

The projects are organized into the following categories (which are in separate posts). Feel free to skip to whatever you’re most interested in.

Acknowledgements

Few of the ideas in these posts are original to me. I’ve benefited from conversations with many people. Nevertheless, all views are my own.

For some projects, I credit someone who especially contributed to my understanding of the idea. If I do, that doesn’t mean they have read or agree with how I present the idea (I may well have distorted it beyond recognition). If I don’t, I’m still likely to have drawn heavily on discussion with others, and I apologize for any failure to assign appropriate credit.

For general comments and discussion, thanks to Joseph Carlsmith, Paul Christiano, Jesse Clifton, Owen Cotton-Barratt, Holden Karnofsky, Daniel Kokotajlo, Linh Chi Nguyen, Fin Moorhouse, Caspar Oesterheld, and Carl Shulman.

  1.

    Nor are they primarily about reducing risks from engineered pandemics.

  2.

    My email is [last name].[first name]@gmail.com
