ARC is hiring alignment theory researchers
The Alignment Research Center is hiring researchers; if you are interested, please apply!
(Update: We have wrapped up our hiring “round” for early 2022, but are still accepting researcher applications on a rolling basis here, though it may take us longer to get back to you.)
What is ARC?
ARC is a non-profit organization focused on theoretical research to align future machine learning systems with human interests. We are aiming to develop alignment strategies that would continue to work regardless of how far we scaled up ML or how ML models end up working internally.
Probably the best way to understand our work is to read Eliciting Latent Knowledge, a report describing some recent and upcoming research, which illustrates our general methodology.
We currently have 2 research staff (Paul Christiano and Mark Xu). We’re aiming to hire another 1-2 researchers in early 2022. ARC is a new organization and is hoping to grow significantly over the next few years, so early hires will play a key role in helping define and scale up our research.
Who should apply?
Most of all, you should send in an application if you feel excited about proposing the kinds of algorithms and counterexamples described in our report on ELK.
We’re open to anyone who is excited about working on alignment even if you don’t yet have any research background (or your research is in another field). You may be an especially good fit if you:
Are creative and generative (e.g. you may already have some ideas for potential strategies or counterexamples for ELK, even if they don’t work).
Have experience designing algorithms, proving theorems, or formalizing concepts.
Have a broad base of knowledge in mathematics and computer science (we often draw test cases and counterexamples from these fields).
Have thought a lot about the AI alignment problem, especially in the limit of very powerful AI systems.
Hiring will be a priority for us in early 2022 and we don’t mind reading a lot of applications, so feel free to err on the side of sending in an application.
Hiring process and details
You can apply by filling out this short form. Our hiring process involves a remote 2-hour interview followed by a day-long onsite. Where possible, we also prefer to do a longer trial, although we understand that's not practical for everyone.
We are based in Berkeley, CA and would prefer people who can work from our office, but we’re open to discussing remote arrangements for great candidates.
Full-time salaries are in the $150k–$400k/year range depending on experience. Intern salaries are $15k/month.
Q&A
We welcome any questions about what working at ARC is like, the hiring process, what we’re looking for, or whether you should apply.
How does ARC differ from other AI alignment organizations, like MIRI?
Compared to MIRI: We are trying to align AI systems trained using techniques like modern machine learning. We’re looking for solutions that (i) are competitive, i.e. don’t make the resulting AI systems much weaker, (ii) work no matter how far we scale up ML, and (iii) work for any plausible situation we can think of, i.e. don’t require empirical assumptions about what kind of thing ML systems end up learning. This forces us to confront many of the same issues as MIRI, though we are doing so in a very different style that you might describe as “algorithm-first” rather than “understanding-first.” You can read a bit about our methodology in “My research methodology” or this section of our ELK writeup.
I think that most researchers at MIRI don’t think that this goal is achievable, at least not without some kind of philosophical breakthrough. We don’t have the same intuition (perhaps we’re 50-50). Some of the reasons: it looks to us like there are a bunch of possible approaches for making progress, there aren’t really any clear articulations of fundamental obstacles that will cause those approaches to fail, and there is extremely little existing work pursuing plausible worst-case algorithms. Right now it mostly seems like people just have varying intuitions, but searching for a worst-case approach seems like it’s a good deal as long as there’s a reasonable chance it’s possible. (And if we fail we expect to learn something about why.)
Compared to everyone else: We think of a lot of possible algorithms, but we can virtually always rule them out without doing any experiments. That means we are almost always doing theoretical research with pen and paper. It’s not obvious whether a given algorithm works in practice, but it usually is obvious that there exist plausible situations where it wouldn’t work, and we are searching (optimistically) for something that works in every plausible situation.
Do you hire summer research interns? Thanks!
Where does ARC get funding from? Open Philanthropy?