I recall previously hearing there might be a final round of potential amendments in response to requests from Gavin Newsom. Was/is that accurate?
(several years late, whoops!)
Yeah, my intent here was more “be careful deciding to scale your company to the point you need a lot of middle managers, if you have a nuanced goal”, rather than “try to scale your company without middle managers.”
In the context of an EA jobs list it seems like both are pretty bad. (there’s the “job list” part, and the “EA” part)
Yeah, this does seem like an improvement. I appreciate you thinking about it and making some updates.
Can you say a bit more about:
and (2) worse in private than in public.
?
Mmm, nod. I will look into the actual history here more, but, sounds plausible. (edited the previous comment a bit for now)
Following up my other comment:
To try to be a bit more helpful rather than just complaining and arguing: when I model your current worldview, and try to imagine a disclaimer that helps a bit more with my concerns but seems like it might work for you given your current views, here’s a stab. Changes bolded.
OpenAI is a frontier AI research and product company, with teams working on alignment, policy, and security. We recommend specific opportunities at OpenAI that we think may be high impact. We recommend applicants pay attention to the details of individual roles at OpenAI, and form their own judgment about whether the role is net positive. We do not necessarily recommend working at other positions at OpenAI.
You can read considerations around working at a frontier AI company in our career review on the topic.
(it’s not my main crux, but “frontier” felt both like a more up-to-date term for what OpenAI does, and also feels more specifically like it’s making a claim about the product than generally awarding status to the company the way “leading” does)
Thanks.
Fwiw while writing the above, I did also think “hmm, I should also have some cruxes for ‘what would update me towards ‘these jobs are more real than I currently think.’” I’m mulling that over and will write up some thoughts soon.
It sounds like you basically trust their statements about their roles. I appreciate you stating your position clearly, but, I do think this position doesn’t make sense:
we already have evidence of them failing to uphold commitments they’ve made in clear cut ways. (i.e. I’d count their superalignment compute promises as basically a straightforward lie, and if not a “lie”, it at least clearly demonstrates that their written words don’t count for much. This seems straightforwardly relevant to the specific topic of “what does a given job at OpenAI entail?”, in addition to being evidence about their overall relationship with existential safety)
we’ve similarly seen OpenAI change its stated policies, such as removing restrictions on military use. Or, initially being a nonprofit and converting into “for-profit managed by nonprofit” (where the “managed by nonprofit board” part turned out to be pretty ineffectual). (not sure if I endorse this, mulling over Habryka’s comment)
Surely this at least updates you downward on how trustworthy their statements are? How many times do they have to “say things that turned out not to be true” before you stop taking them at face value? And why is that “more times than they already have”?
Separate from straightforward lies, and/or altering of policy to the point where any statements they make seem very unreliable, there are plenty of degrees of freedom in “what counts as alignment.” They are already defining alignment in a way that is pretty much synonymous with short-term capabilities. I think the plan of “iterate on ‘alignment’ with nearterm systems as best you can to learn and prepare” is not necessarily crazy. There are people I respect who endorse it, who previously defended it as an OpenAI approach, although notably most of those people have now left OpenAI (sometimes still working on similar plans at other orgs).
But, it’s very hard to tell the difference from the outside between:
“iterating on nearterm systems, contributing to AI race dynamics in the process, in a way that has a decent chance of teaching you skills that will be relevant for aligning superintelligences”
“iterating on nearterm systems, in a way that you think/hope will teach you skills for navigating superintelligence… but, you’re wrong about how much you’re learning, and whether it’s net positive”
“iterating on nearterm systems, and calling it alignment because it makes for better PR, but not even really believing that it’s particularly necessary to navigate superintelligence.”
When recommending jobs for organizations that are potentially causing great harm, I think 80k has a responsibility to actually form good opinions on whether the job makes sense, independent of what the organization says it’s about.
You don’t just need to model whether OpenAI is intentionally lying, you also need to model whether they are phrasing things ambiguously, and you need to model whether they are self-deceiving about whether these roles are legitimate alignment work, or valuable enough work to outweigh the risks. And, you need to model that they might just be wrong and incompetent at longterm alignment development (or: “insufficiently competent to outweigh risks and downsides”), even if their hearts are in the right place.
I am very worried that this isn’t already something you have explicit models about.
Thanks. This still seems pretty insufficient to me, but, it’s at least an improvement and I appreciate you making some changes here.
Yeah same. (although, this focuses entirely on their harm as an AI organization, and not on their manipulative practices)
I think it leaves open the question of “what actually is the above-the-fold summary?” (which’d be some kind of short tag).
I think EAs vary wildly. I think most EAs do not have those skills – I think it is a very difficult skill. Merely caring about the world is not enough.
I think most EAs do not, by default, prioritize epistemics that highly, unless they came in through the rationalist scene, and even then, I think holding onto your epistemics while navigating social pressure is a very difficult skill that even rationalists who specialize in it tend to fail at. (Getting into details here is tricky because it involves judgment calls about individuals, in social situations that are selected for being murky and controversial, but, no, I do not think the median EA or even the 99th percentile EA is going to be competent enough at this for it to be worthwhile for them to join OpenAI. I think ~99.5th percentile is the point where it seems even worth talking about, and I don’t think those people get most of their job leads through the job board).
Surely this isn’t the typical EA though?
I think job ads in particular are a filter for “being more typical.”
I expect the people who have a chance of doing a good job to be well connected to previous people who worked at OpenAI, with some experience under their belt navigating organizational social scenes while holding onto their own epistemics. I expect such a person to basically not need to see the job ad.
I do want to acknowledge:
I refer to Jan Leike’s and Daniel Kokotajlo’s comments about why they left, and reference other people leaving the company.
I do think this is important evidence.
I want to acknowledge I wouldn’t actually bet that Jan and Daniel would endorse everyone else leaving OpenAI, and only weakly bet that they’d endorse not leaving up the current 80k-ads as written.
I am grateful to them for having spoken up publicly, but I know that a reason people hesitate to speak publicly about this sort of thing is that it’s easier for soundbite words to get taken and run away with by people arguing for positions stronger than you endorse, and I don’t want them to regret that.
I know at least one person who has less negative (but mixed) feelings who left OpenAI for somewhat different reasons, and another couple people who still work at OpenAI I respect in at least some domains.
(I haven’t chatted with either of them about this recently)
I have slightly complex thoughts about the “is 80k endorsing OpenAI?” question.
I’m generally on the side of “let people make individual statements without treating it as a blanket endorsement.”
In practice, I think the job postings will be read as an endorsement by many (most?) people. But I think the overall policy of “social-pressure people to stop making statements that could be read as endorsements” is net harmful.
I think you should at least be acknowledging the implication-of-endorsement as a cost you are paying.
I’m a bit confused about how to think about it here, because I do think listing people on the job site, with the sorts of phrasing you use, feels more like some sort of standard corporate political move than a purely epistemic move.
I do want to distinguish the question of “how does this job-ad funnel social status around?” from “does this job-ad communicate clearly?”. I think it’s still bad to force people to only speak words that can’t be inaccurately read into, but, I think this is an important enough area to put extra effort in.
An accurate job posting, IMO, would say “OpenAI-in-particular has demonstrated that they do not follow through on safety promises, and we’ve seen people leave due to not feeling effectual.”
I think you maybe both disagree with that object-level fact (if so, I think you are wrong, and this is important), and also, well, that’d be a hell of a weird job ad. Part of why I am arguing here is I think it looks, from the outside, like 80k is playing a slightly confused mix of relating to orgs politically and making epistemic recommendations.
I kind of expect, at this point, that you’ll leave the job ad up, and maybe change the disclaimer slightly in a way that leaves some sort of plausibly-deniable veneer.
I attempted to address this in the “Isn’t it better to have alignment researchers working there, than not? Are you sure you’re not running afoul of misguided purity instincts?” FAQ section.
I think the evidence we have from OpenAI is that it isn’t very helpful to “be a safety conscious person there.” (i.e. combo of people leaving who did not find it tractable to be helpful there, and NDAs making it hard to reason about, and IMO better to default assume bad things rather than good things given the NDAs)
I think it’s especially not helpful if you’re a low-context person, who reads an OpenAI job board posting, and isn’t going in with a specific plan to operate in an adversarial environment.
If the job posting literally said “to be clear, OpenAI has a pretty bad track record and seems to be an actively misleading environment, take this job if you are prepared to deal with that”, that’d be a different story. (But, that’s also a pretty weird job ad, and OpenAI would be rightly skeptical of people coming from that funnel. I think taking jobs at OpenAI that are net helpful to the world requires a mix of a very strong moral and epistemic backbone, and nonetheless still able to make good faith positive sum trades with OpenAI leadership. Most of the people I know who maybe had those skills have left the company)
I expect the object-level impact of a person joining OpenAI to be slightly harmful on net (although realistically close to neutral because of replaceability effects. I expect them to be slightly-harmful on net because OpenAI is good at hiring competent people, and good at funneling them into harmful capabilities work. So, the fact that you got hired is evidence you are slightly better at it than the next person).
I do basically agree we don’t have bargaining power, and that they most likely don’t care about having a good relationship with us.
The reason for the diplomatic “line of retreat” in the OP is more because:
it’s hard to be sure how adversarial a situation you’re in, and it just seems like generally good practice to be clear on what would change your mind (in case you have overestimated the adversarialness)
it’s helpful for showing others, who might not share exactly my worldview, that I’m “playing fairly.”
I’d probably imagine no-one much at OpenAI really losing sleep over my decision either way, so I’d tend to do just whatever seemed best to me in terms of the direct consequences.
I’m not sure about “direct consequences” being quite the right frame. I agree the particular consequence-route of “OpenAI changes their policies because of our pushback” isn’t plausible enough to be worth worrying about, but, I think indirect consequences on our collective epistemics are pretty important.
fwiw I don’t think replacing the OpenAI logo or name makes much sense.
I do think it’s pretty important to actively communicate that even the safety roles shouldn’t be taken at face value.
Nod, thanks for the reply.
I won’t argue more for removing infosec roles at the moment. As noted in the post, I think this is at least a reasonable position to hold. I (weakly) disagree, but for reasons that don’t seem worth getting into here.
The things I’d argue here:
Safetywashing is actually pretty bad, for the world’s epistemics and for EA and AI safety’s collective epistemics. I think it also warps the epistemics of the people taking the job, so while they might be getting some career experience… they’re also likely getting a distorted view of what AI safety is, and becoming worse researchers than they would otherwise.
As previously stated – it’s not that I don’t think anyone should take these jobs, but I think the sort of person who should take them is someone who has a higher degree of context and skill than I expect the 80k job board to filter for.
Even if you disagree with those points, you should have some kind of crux for what would distinguish an “impactful AI safety job” from a fake, safety-washed role. It should be at least possible for OpenAI to make a role so clearly fake that you notice and stop listing it.
If you’re set on continuing to list OpenAI Alignment roles, I think the current disclaimer is really inadequate and misleading. (Partly because of the object-level content in Working at an AI company, which I think wrongly characterizes OpenAI, and partly because what disclaimers you do have are deep in that post. On the top-level job ad, there’s no indication that applicants should be skeptical about OpenAI.)
Re: cruxes for safetywashing
You’d presumably agree that OpenAI couldn’t just call any old job “Alignment Science” and have it automatically count as worth listing on your site.
Companies at least sometimes lie, and they often use obfuscating language to mislead. OpenAI’s track record is such that we know they do lie and mislead. So, IMO, your prior here should be moderately high.
Maybe you only think it’s, like, 10%? (or less? IMO less than 10% feels pretty strained to me). But, at what credence would you stop listing it on the job board? And what evidence would increase your odds?
Taking the current Alignment Science researcher role as an example:
We are seeking Researchers to help design and implement experiments for alignment research. Responsibilities may include:
Writing performant and clean code for ML training
Independently running and analyzing ML experiments to diagnose problems and understand which changes are real improvements
Writing clean non-ML code, for example when building interfaces to let workers interact with our models or pipelines for managing human data
Collaborating closely with a small team to balance the need for flexibility and iteration speed in research with the need for stability and reliability in a complex long-lived project
Understanding our high-level research roadmap to help plan and prioritize future experiments
Designing novel approaches for using LLMs in alignment research
You might thrive in this role if you:
Are excited about OpenAI’s mission of building safe, universally beneficial AGI and are aligned with OpenAI’s charter
Want to use your engineering skills to push the frontiers of what state-of-the-art language models can accomplish
Possess a strong curiosity about aligning and understanding ML models, and are motivated to use your career to address this challenge
Enjoy fast-paced, collaborative, and cutting-edge research environments
Have experience implementing ML algorithms (e.g., PyTorch)
Can develop data visualization or data collection interfaces (e.g., JavaScript, Python)
Want to ensure that powerful AI systems stay under human control.
Is this an alignment research role, or a capabilities role that pays token lip service to alignment? My guess (60%), based on my knowledge of OpenAI, is that it’s more like the latter.
It says “Designing novel approaches for using LLMs in alignment research”, but that’s only useful if you think OpenAI uses the phrase “alignment research” to mean something important. We know they eventually coined the term “superalignment” to distinguish it from most of what they’d been calling “alignment” work (where “superalignment” is closer to what was originally meant by the term).
If OpenAI was creating jobs that weren’t really helpful at all, but labeling them “alignment” anyway, how would you know?
I’m with Ozzie here. I think EA Forum would do better with more technical content even if it’s hard for most people to engage with.
I have written a letter