Which AI Safety Org to Join?
Long TL;DR: You’re an engineer, you want to work on AI Safety, you’re not sure which org to apply to, so you’re going to apply to all of them. But, oh no, some of these orgs may actively be causing harm, and you don’t want to do that. What’s your alternative? Study AI Safety for 2 years before you apply? In this post I suggest you can collect the info you want quickly by going over specific posts.
Why do I think some orgs might be [not helping] or [actively causing harm]?
Example link. (Help me out in the comments with more?)
My suggestion:
1. Open the tag of the org you want on LessWrong
How: Search for a post related to that org. You’ll have tags on top of the post. Click the tag with the org name.
2. Sort by “newest first”
3. Open 2-3 posts
(Don’t read the post yet!)
4. In each post, look at the top 2-3 most upvoted comments
What I expect you’ll find sometimes
A post by the org, with comments trying to politely say “this is not safe”, heavily upvoted.
Bonus: Read the comments
Or even crazier: Read the post! [David Johnson thinks this is a must!]
Ok, less jokingly, this seems to me like a friendly way to start to see the main arguments without having to read too much background material (unless you find, for example, a term you don’t know).
Extra crazy optimization: Interview before research
TL;DR: First apply to lots of orgs, and then, once you know which orgs are interested[1], do your[2] research on only those orgs.
Am I saying this idea for vetting AI Safety orgs is perfect?
No, I am saying it is better than the alternative of “apply to all of them (and do no research)”, assuming you resonate with my premise of “there’s a lot of variance in effectiveness of orgs” and “that matters”.
I also hope that by posting my idea, someone will comment with something even better.
[1] However you choose to define “interested”. Maybe research the orgs that didn’t reject your CV? Maybe only research the ones that accepted you? Your call.
[2] Consider sharing your thoughts with the org. Just remember, whoever is talking to you was chosen as a person that convinces candidates to join. They will, of course, think their org is great. Beware of reasons like “the people saying we are causing harm are wrong, but we didn’t explain publicly why”. The whole point is letting the community help you with this complicated question.
I don’t endorse judging AI safety organisations by LessWrong consensus alone. I think you should at least read the posts!
Thanks for the pushback!
Added this to the post.
I think this is fairly bad advice: LessWrong commenters are wrong about a lot of things. I think this is an acceptable way to get a vibe for what the LessWrong bubble thinks though. But idk, for most of these questions the hard part is figuring out which bubble to believe. Most orgs will have some groups think they’re useless, some think they’re great, and probably some who think they’re net negative. Finding one bubble who believes one of these three doesn’t tell you much!
Thanks for the pushback!
Do you have an alternative suggestion?
I personally interpret Neel’s comment as saying this is ~not better (perhaps worse) than going in blindly. So I just wanted to highlight that a better alternative is not needed for the sake of arguing this (even if it’s a good idea to have one for the sake of future AI researchers).
Do you think that going to do capabilities work at DeepMind or OpenAI is just as impactful as going to whatever the LessWrong community recommends (as presented by their comments and upvotes)?
Possibly. As we’ve discussed privately, I think some AI safety groups which are usually lauded are actually net negative 🙃
But I was trying to interpret Neel and not give my own opinion.
My meta-opinion is that it would be better to see what others think about working on capabilities in top labs, compared to going there without even considering the downsides. What do you think? (A)
And also that before working at “AI safety groups which are usually lauded [but] are actually net negative”, it would be better to read comments of people like you. What do you think? (B)
I somewhat disagree with both statements.
(A) Sure, it’d be good to have opinions from relevant people, but on the other hand it’s non-trivial to figure out who “relevant people” are, and “the general opinion on LW” is probably not the right category. I’d look more at what (1) people actually working in the field, and (2) the broad ML community, think about an org. So maybe the Alignment Forum.
(B) I can only answer on my specific views. My opinion on [MIRI] probably wouldn’t really help individuals seeking to work there, since they probably know everything I know and have their own opinions. My opinions are more suitable for discussions on the general AI safety community culture.
By the way, I personally resonate with your advice on forming an inside view and am taking that path, but it doesn’t fit everyone. Some people don’t want all that homework; they want to get into a company and write code, and, to be clear, it is common for them to apply to all orgs [whose names they see in EA spaces] or something like that (very wide, many orgs). This is the target audience I’m trying to help.
I would probably just tell people to work in another field rather than explicitly encouraging them to goodhart their way to having a positive impact in an area with extreme variance.
Thinking about where to work seems reasonable, listening to others’ thoughts on where to work seems reasonable, this post advises both.
This post also pretty strongly suggests that LessWrong comments are the best choice of others’ thoughts, and I would like to see that claim made explicit and then argued for rather than slipped in. As a couple of other comments have noted, LessWrong is far from a perfect signal of the alignment space.
Thanks (also) for the pushback part!
Do you have an alternative to LessWrong comments that you’d suggest?
Seems like there is a big gap between “Study AI Safety for 2 years before you apply” and reading posts, rather than just the most up-voted comments.
Other feedback: I don’t understand why you call some of your suggestions “crazy”/“crazier”. Also, when you wrote “less jokingly” I had missed the joke. Perhaps your suggestions could be rewritten to be clearer without these words.
The joke is supposed to be that “reading the post” isn’t actually that crazy. I see this wasn’t understood (oops!). I’m going to start by trying to get the content right (since lots of people pushed back on it) and then try fixing this too.
I didn’t understand, could you say this in other words please?
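Or did you mean it like this: “Seems like there is a big gap between [“Study AI Safety for 2 years before you apply” and reading posts]”?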
I meant it like that, yes. Seems like there is a big gap between [“Study AI Safety for 2 years before you apply” and reading posts].
So what would you suggest? Reading a few posts about that org?
Seems worth asking in interviews “I’m concerned about advancing capabilities and shortening timelines, what actions is your organization taking to prevent that?”, with the caveat that you will be BSed.
Bonus: You can turn down roles explicitly because they’re doing capabilities work, which if it becomes a pattern may incentivize them to change their plan.
I agree, see footnote 2.