Conor Barnes
Hi there, I’d like to share some updates from the last month.
Text as of the last update (July 5):
OpenAI is a leading AI research and product company, with teams working on alignment, policy, and security. We recommend specific positions at OpenAI that we think may be high impact. We do not necessarily recommend working at other jobs at OpenAI. You can read more about considerations around working at a leading AI company in our career review on the topic.
Text as of today:
OpenAI is a frontier AI research and product company, with teams working on alignment, policy, and security. We post specific opportunities at OpenAI that we think may be high impact. We do not necessarily recommend working at other positions at OpenAI. You can read concerns about doing harm by working at a frontier AI company in our career review on the topic. Note that there have also been concerns around OpenAI’s HR practices.
The thinking behind these updates has been:
We continue to get negative updates concerning OpenAI, so it’s good for us to update our guidance accordingly.
While it’s unclear exactly what’s going on with the NDAs (are they cancelled or are they not?), it’s pretty clear that it’s in the interest of users to know there’s something they should look into with regard to HR practices.
We’ve tweaked the language to “concerns about doing harm” instead of “considerations” for all three frontier labs to indicate more strongly that these are potentially negative considerations to make before applying.
We don’t go into much detail for the sake of length / so that readers don’t gloss over it—my guess is that the current text is the right length for people to notice it and then look into it more via our newly updated AI company article and the Washington Post link.
This is thanks to discussions within 80k and thanks to some of the comments here. While I suspect, @Raemon, that we still don’t align on important things, I nonetheless appreciate the prompt to think this through more and I believe that it has led to improvements!
I interpreted the title to mean “Is it a good idea to take an unpaid UN internship?”, and it took a bit to realize that isn’t the point of the post. You might want to change the title to be clear about what part of the unpaid UN internship is the questionable part!
Update: We’ve changed the language in our top-level disclaimers: example. Thanks again for flagging! We’re now thinking about how to best minimize the possibility of implying endorsement.
(Copied from reply to Raemon)
Yeah, I think this needs updating to something more concrete. We put it up while ‘everything was happening’, but I’ve neglected to change it, which is my mistake and something I’ll probably prioritize fixing over the next few days.
Re: whether OpenAI could create a role that looks safety-focused but isn’t truly safety-focused: there have been, and continue to be, OpenAI safety-ish roles that we don’t list because we lack confidence that they’re safety-focused.
For the alignment role in question, I think the team description given at the top of the post gives important context for the role’s responsibilities:
OpenAI’s Alignment Science research teams are working on technical approaches to ensure that AI systems reliably follow human intent even as their capabilities scale beyond human ability to directly supervise them.
With the above in mind, the role responsibilities seem fine to me. I think this is all pretty tricky, but in general, I’ve been moving toward looking at this in terms of the teams:
Alignment Science: Per the above team description, I’m excited for people to work there – though, concerning the question of what evidence would shift me, this would change if the research they release doesn’t match the team description.
Preparedness: I continue to think it’s good for people to work on this team, as per the description: “This team … is tasked with identifying, tracking, and preparing for catastrophic risks related to frontier AI models.”
Safety Systems: I think roles here depend on what they address. I think the problems listed in their team description include problems I definitely want people working on (detecting unknown classes of harm, red-teaming to discover novel failure cases, sharing learning across industry, etc), but it’s possible that we should be more restrictive in which roles we list from this team.
I don’t feel confident giving a probability here, but I do think there’s a crux in that I don’t expect the above team descriptions to be straightforward lies. It’s possible that the teams will have limited resources to achieve their goals, and with the Safety Systems team in particular, I think there’s an extra risk of safety work blending into product work. However, my impression is that the teams will continue to work on their stated goals.
I do think it’s worthwhile to think of some evidence that would shift me against listing roles from a team:
If a team doesn’t publish relevant safety research within something like a year.
If a team’s stated goal is updated to have less safety focus.
Other notes:
We’re actually in the process of updating the AI company article.
The top-level disclaimer: Yeah, I think this needs updating to something more concrete. We put it up while ‘everything was happening’, but I’ve neglected to change it, which is my mistake and something I’ll probably prioritize fixing over the next few days.
Thanks for diving into the implicit endorsement point. I acknowledge this could be a problem (and if so, I want to avoid it or at least mitigate it), so I’m going to think about what to do here.
Hi, I run the 80,000 Hours job board. Thanks for writing this out!
I agree that OpenAI has demonstrated a significant level of manipulativeness, and I have lost confidence that they will prioritize existential safety work. However, we don’t conceptualize the board as endorsing organisations. The point of the board is to give job-seekers access to opportunities where they can contribute to solving our top problems or build career capital to do so (as we write in our FAQ). Sometimes these roles are at organisations whose mission I disagree with, because the role nonetheless seems like an opportunity to do good work on a key problem.
For OpenAI in particular, we’ve tightened up our listings since the news stories a month ago, and are now only posting infosec roles and direct safety work – a small percentage of jobs they advertise. See here for the OAI roles we currently list. We used to list roles that seemed more tangentially safety-related, but because of our reduced confidence in OpenAI, we limited the listings further to only roles that are very directly on safety or security work. I still expect these roles to be good opportunities to do important work. Two live examples:
Even if we were very sure that OpenAI was reckless and did not care about existential safety, I would still expect them to not want their model to leak out to competitors, and importantly, we think it’s still good for the world if their models don’t leak! So I would still expect people working on their infosec to be doing good work.
These still seem like potentially very strong roles with the opportunity to do very important work. We think it’s still good for the world if talented people work in roles like this!
This is true even if we expect them to lack political power and to play second fiddle to capabilities work, and even if that makes them weaker opportunities than comparable roles at other companies.
We also include a note on their ‘job cards’ on the job board (also DeepMind’s and Anthropic’s) linking to the Working at an AI company article you mentioned, to give context. We’re not opposed to giving more or different context on OpenAI’s cards and are happy to take suggestions!
I find the Leeroy Jenkins scenario quite plausible, though in this world it’s still important to build the capacity to respond well to public support.
Hi Remmelt,
Just following up on this — I agree with Benjamin’s message above, but I want to add that we actually did add links to the “working at an AI lab” article in the org descriptions for leading AI companies after we published that article last June.
It turns out that a few weeks ago these links were accidentally removed while we were making some related changes in Airtable, and we didn’t notice they were missing — thanks for bringing this to our attention. We’ve added them back in and think they give good context for job board users, and we’re certainly happy for more people to read our articles.
We also decided to remove the prompt engineer / librarian role from the job board, since we concluded it’s not above the current bar for inclusion. I don’t expect everyone will always agree with the judgement calls we make, but we take them seriously, and we think it’s important for people to think critically about their career choices.
I think this is a joke, but for those who have less-explicit feelings in this direction:
I strongly encourage you to not join a totalizing community. Totalizing communities are often quite harmful to members and being in one makes it hard to reason well. Insofar as an EA org is a hardcore totalizing community, it is doing something wrong.
I really appreciated reading this, thank you.
Rereading your post, I’d also strongly recommend prioritizing finding ways not to spend all your free time on it. Not only is that level of fixation one of the worst things people can do to make themselves suffer, it also makes it very hard to think straight and figure things out!
One thing I’ve seen suggested is dedicating a set amount of time each day to researching your questions. This is a compromise that frees up the rest of your time for things that don’t hurt your head. And hang out with friends who are good at distracting you!
I’m really sorry you’re experiencing this. I think it’s something more and more people are contending with, so you aren’t alone, and I’m glad you wrote this. As somebody who’s had bouts of existential dread myself, there are a few things I’d like to suggest:
With AI, we fundamentally do not know what is to come. We’re all making our best guesses—as you can tell by finding 30 different diagnoses! This is probably a hint that we are deeply confused, and that we should not be too confident that we are doomed (or, to be fair, too confident that we are safe).
For this reason, it can be useful to practice thinking through the models on your own. Start making your own guesses! I also often find the technical and philosophical details beyond me—but that doesn’t mean we can’t think through the broad strokes. “How confident am I that instrumental convergence is real?” “Do I think evals for new models will become legally mandated?” “Do I think they will be effective at detecting deception?” At the least, this might help focus your content consumption so it isn’t just an amorphous blob of dread. I say that because the invasion of Ukraine similarly sent me reading as much as I could, and developing a model by focusing on specific, concrete questions (e.g. what events would presage a nuclear strike?) helped me transform my anxiety from “Everything about this worries me” into something closer to “Events X and Y are probably bad, but event Z is probably good”.
I find it very empowering to work on the problems that worry me, even though my work is quite indirect. AI safety labs have content writing positions on occasion. I work on the 80,000 Hours job board and we list roles in AI safety. Though these are often research and engineering jobs, it’s worth keeping an eye out. It’s possible that proximity to the problem would accentuate your stress, to be fair, but I do think it trades against the feeling of helplessness!
C. S. Lewis has a take on dealing with the dread of nuclear extinction that I’m very fond of and think is applicable: ‘How are we to live in an atomic age?’ I am tempted to reply: ‘Why, as you would have lived in the sixteenth century when the plague visited London almost every year...’
I hope this helps!
I hadn’t seen the previous dashboard, but I think the new one is excellent!
Thanks for the Possible Worlds Tree shout-out!
I haven’t had capacity to improve it (and won’t for a long time), but I agree that a dashboard would be excellent. I think it could be quite valuable even if the number choice isn’t perfect.
Halifax Monthly Meetup: AI Safety Discussion
“Give a man money for a boat, he already knows how to fish” would play off of the original formulation!
Quite happy to see this on the forum!
One example I can think of with regard to people “graduating” from philosophies is the idea that people can graduate out of arguably “adolescent” political philosophies like libertarianism and socialism. Often this looks like people realizing that society is messy and that simple political philosophies don’t do a good job of capturing and addressing that messiness.
However, I think EA as a philosophy is more robust than the above: There are opportunities to address the immense suffering in the world and to address existential risk, some of these opportunities are much more impactful than others, and it’s worth looking for and then executing on these opportunities. I expect this to be true for a very long time.
In general I think effective giving is the best opportunity for most people. We often get fixated on the status of directly working on urgent problems, which I think is a huge mistake. Effective giving is a way to have a profound impact, and I don’t like to think of it as something just “for mere mortals”—I think there’s something really amazing about people giving a portion of their income every year to save lives and improve health, and I think doing so makes you as much an EA as somebody whose job itself is impactful.