I don’t regard the norms as being about withholding negative information, but about trying to err towards presenting friendly frames while sharing what’s pertinent, or something?
Honestly I’m not sure how much we really disagree here. I guess we’d have to concretely discuss wording for an org. In the case of OpenAI, I imagine it being appropriate to include some disclaimer like:
OpenAI is a frontier AI company. It has repeatedly expressed an interest in safety and has multiple safety teams. However, some people leaving the company have expressed concern that it is not on track to handle AGI safely, and that it wasn’t giving its safety teams resources they had been promised. Moreover, it has a track record of putting inappropriate pressure on people leaving the company to sign non-disparagement agreements. [With links]
I don’t regard the norms as being about withholding negative information, but about trying to err towards presenting friendly frames while sharing what’s pertinent, or something?
I agree with some definitions of “friendly” here, and disagree with others. I think there is an attractor here towards Orwellian language that is intentionally ambiguous about what it’s trying to say, in order to seem friendly or non-threatening (because in some sense it is), and that kind of “friendly” seems pretty bad to me.
I think the paragraph you have would strike me as somewhat too Orwellian, though it’s not too far off from what I would say. Something closer to what seems appropriate to me:
OpenAI is a frontier AI company, and as such it’s responsible for substantial harm by assisting in the development of dangerous AI systems, which we consider among the biggest risks to humanity’s future. In contrast to most of the jobs on our job board, we consider working at OpenAI more similar to working at a large tobacco company, hoping to reduce the harm that the tobacco company causes, or leveraging this specific tobacco company’s expertise with tobacco to produce more competitive and less harmful variations of tobacco products.
To its credit, it has repeatedly expressed an interest in safety and has multiple safety teams, which are attempting to reduce the likelihood of catastrophic outcomes from AI systems.
However, many people leaving the company have expressed concern that it is not on track to handle AGI safely, that it wasn’t giving its safety teams resources they had been promised, and that the leadership of the company is untrustworthy. Moreover, it has a track record of putting inappropriate pressure on people leaving the company to sign non-disparagement agreements. [With links]
We explicitly recommend against taking any roles not in computer security or safety at OpenAI, and consider those substantially harmful under most circumstances (though exceptions might exist).
I feel like this is currently a bit too “edgy” or something, and I would spend longer massaging some of the sentences, but it captures the more straightforward style that I think would be less likely to cause people to misunderstand the situation.
So it may be that we just have some different object-level views here. I don’t think I could stand behind the first paragraph of what you’ve written there. Here’s a rewrite that would be palatable to me:
OpenAI is a frontier AI company, aiming to develop artificial general intelligence (AGI). We consider poor navigation of the development of AGI to be among the biggest risks to humanity’s future. It is complicated to know how best to respond to this. Many thoughtful people think it would be good to pause AI development; others think that it is good to accelerate progress in the US. We think both of these positions are probably mistaken, although we wouldn’t be shocked to be wrong. Overall we think that if we were able to slow down across the board that would probably be good, and that steps to improve our understanding of the technology relative to absolute progress with the technology are probably good. In contrast to most of the jobs on our job board, therefore, it is not obviously good to help OpenAI with its mission. It may be more appropriate to consider working at OpenAI as more similar to working at a large tobacco company, hoping to reduce the harm that the tobacco company causes, or leveraging this specific tobacco company’s expertise with tobacco to produce more competitive and less harmful variations of tobacco products.
I want to emphasise that this difference is mostly not driven by a desire to be politically acceptable (although the inclusion and wording of the “many thoughtful people …” clauses is partly an attempt to be courteous), but rather by a desire not to give bad advice, nor to be overconfident.
… That paragraph doesn’t distinguish at all between OpenAI and, say, Anthropic. Surely you want to include some details specific to the OpenAI situation? (Or do your object-level views really not distinguish between them?)
I was just disagreeing with Habryka’s first paragraph. I’d definitely want to keep content along the lines of his third paragraph (which is pretty similar to what I initially drafted).
I largely agree with the rating-agency frame.
Yeah, this paragraph seems reasonable (I disagree, but like, that’s fine, it seems like a defensible position).
Yeah, same (although this focuses entirely on their harm as an AI organization, and not on manipulative practices).
I think it leaves open the question of what the above-the-fold summary actually is (which’d be some kind of short tag).