Getting started independently in AI Safety

Drop the constraints. It’s great to have a mentor, a lab and lots of compute. But what if all you have is time and motivation?

I think too many people feel held back from doing a project on their own. Getting the prerequisites in maths, probability and software is important for being able to work on a project, but I think some people push the prerequisites further than they should. Just do a project. Drop the constraint of ensuring your project is interesting or useful to someone else.

You could spend 6 months learning more prerequisites like programming or maths, and nobody else would be interested in seeing that work. How many programming students have reimplemented quick-sort without a second thought? If you have 6 months to work on something, it might as well be an alignment research project. If your experiments fail, or the project turns out not to be interesting to anyone else, then the field is no worse off than if you had spent the time learning even more maths.

It can seem like a waste of 6 months not to have some useful output for the field at the end of it. But if you reframe the goal as learning for yourself, then the time can be very successful. You can learn quite a lot about project management and the methods of research from an otherwise failed project. If you attempt another project, you are likely to plan and manage it much better, and you are more likely to succeed.

I’m a big fan of just-in-time learning, where you come to a problem in your project and then learn the technique or formula you need to get past it. I find this kind of motivated learning far more effective than learning something out of context and just for its own sake. Similarly, I’ve read machine learning papers and thought that I understood them, until I later needed to use the techniques from the paper on another project. Even after going back to the paper, it turned out that I didn’t understand it at all, and I had a lot of trouble implementing it.

So, how do you get started in AI Safety research independently?

As many suggest, start by reading papers and posts from the field. My advice differs here in that I think you should read with the intention of getting distracted. Follow what is interesting to you rather than continuing to read the next thing on your list.

Don’t take notes or summarise the papers. You’re not going to be examined on your knowledge of them and you can always access them in full later. Write down questions that you have instead. Write down things that are missing from what you are reading. Write down the things that don’t seem clear, that you don’t understand, or that you disagree with.

Look up some of the references or try to find other resources that answer your questions. If this leads you to some other things that are interesting to you then that’s great. Keep going down the rabbit hole (don’t go too far of course).

If there aren’t any answers to your questions, or you can’t find resources that explain something clearly, you have found an opportunity. Clarifying a small part of someone else’s publication is a great way to learn for yourself and contribute something valuable to the community. There are plenty of great forum posts on “Clarifying X”, and there is room for more.

Some questions you write down may be a one-liner that is never visited again. For others, you might write a few sentences on what you are trying to ask. Others still may take up a page, including some ideas on how you would answer the questions and some experiments you could run.

If you are following what is interesting to you, then you are likely to come across a question that you can’t help but keep thinking about. This is the project you should work on. The only thing likely holding you back is the thought that no one else would be interested in it, or that the question isn’t that important. Do it anyway. It’s much harder to find a project that you are interested in than one that everyone else is interested in.

____

Also check out MIRI’s Alignment Research Field Guide and this list of lists of study guides, research agendas and other resources.

____
I’d love to get people’s feedback on this below or privately via email:
jj@aisafetysupport.org

I’m also always keen to talk to new people: calendly.com/jj-hepboin