This post contains many claims that you interpret OpenAI to be making. However, unless I’m missing something, I don’t see citations for any of the claims you attribute to them. Moreover, several of the claims feel like they could be described as misinterpretations of what OpenAI is saying, or as merely poorly communicated ideas.
I acknowledge that this post was hastily written, and it’s not necessary to rigorously justify every claim, but your thesis also seems like the type of thing that should be proven, rather than asserted. It would indeed be damning if OpenAI is taking contradictory positions about numerous important issues, but I don’t think you’ve shown that they are in this post. This post would be stronger if you gave concrete examples.
For example, you say that OpenAI is simultaneously claiming:
> OpenAI cares a lot about safety (good for public PR and government regulations).
> OpenAI isn’t making anything dangerous and is unlikely to do so in the future (good for public PR and government regulations).
Is it true that OpenAI has claimed that they aren’t making anything dangerous and aren’t likely to do so in the future? Where have they said this?
I agree that it would be good to have citations. In case neither Ozzie nor anyone else here finds it a good use of their time to do it: I’ve been following OpenAI’s and Sam Altman’s messaging specifically for a while, and Ozzie’s summary of their (conflicting) messaging seems roughly accurate to me. It’s easy to notice the inconsistencies in Sam Altman’s messaging, especially when it comes to safety.
Another commenter (whose name I forgot; I think he was from CLTR) put it nicely: It feels like Altman does not have one consistent set of beliefs (like an ethics/safety researcher would) but tends to say different things that are useful for achieving his goals (like many CEOs do), and he seems to do that more than other AI lab executives at Anthropic or DeepMind.
Thanks for sharing your impressions. But even if many observers have this impression, it still seems like it could be quite valuable to track down exactly what was said, because there’s some gap between someone who:
(a) has nuanced models of the world and will strategically select different facets of those to share on different occasions; and
(b) will strategically select what to say on different occasions without internal validity or consistency.
… but either of these could create an impression of inconsistency in observers. (Not to say that (a) is ideal, but I think that (b) is clearly more egregious.)
I imagine we’re basically all in agreement on this.
The only question is who might want to, or be able to, do much of it. It does seem like it could be a fairly straightforward project, though it would still take some work.
It could be partially crowdsourced. People could add links to interviews to a central location as they come across them, quotes could be taken from news articles, and maybe some others could do AI transcription of other interviews. I think subtitles from YouTube videos can also be downloaded?
Yes, that comment was made by Lukas Gloor here, when I asked what people thought Sam Altman’s beliefs are.
> thesis also seems like the type of thing that should be proven, rather than asserted. It would indeed be damning if OpenAI is taking contradictory positions about numerous important issues, but I don’t think you’ve shown that they are in this post. This post would be stronger if you gave concrete examples.
I agree with this. I’d also like there to be work to track down more of this. Overall, I’ve been surprised by the response my post received; based on previous comments, I had assumed that readers mostly agreed with these claims. I’d like to see more work go into this (I’ll look for some sources, and encourage others to do a better job).
> OpenAI isn’t making anything dangerous and is unlikely to do so in the future (good for public PR and government regulations).
I feel like this is one of the more implicit items listed. It’s true that I don’t remember them saying this explicitly; it comes across more in the manner in which they speak. There’s also a question here of what the bar for “dangerous” is. Also, to be clear, I think OpenAI is stating “We are working on things that could be dangerous if not handled well, but we are handling them well, so the results of our work won’t be dangerous”, not “We are working on things that could never be dangerous.”
Here are some predictions I’d make:
- If someone were to ask Sam Altman, “Do you think that OpenAI’s release of LLMs so far has endangered over 100 lives, or will do so in the next few years?”, he’d say no.
- If someone were to ask Sam Altman, “Do you think that GPT-5 is likely to be a real threat to humanity?”, he’d say something like, “This is still too early. If it’s any threat, it’s the potential for things like misinformation, not an existential threat. We’re competent at handling such threats.”
- If someone were to ask Sam Altman, “Is there a substantial chance that OpenAI will create something that destroys mankind, or kills 1k+ people, in the next 10 years?”, he’d say, “We are very careful, so the chances are very low of anything like that. However, there could be other competitors...”
Their actions really don’t make it seem, to me, like they think it’s very dangerous.
- The main evidence to the contrary is the funding of the alignment team, but that team was just disbanded.
- Removed the main board members who were publicly concerned about risk.
- Very little public discussion of concrete/specific large-scale risks of their products and the corresponding risk-mitigation efforts (outside of things like short-term malicious use by bad API actors, where they are doing better work).
I’d also flag that such a message (it’s not very dangerous / it will be handled well) seems more common from Microsoft, I believe even when asked about OpenAI.
One quote from Sam I came across recently that might be of interest to you: “What I lose the most sleep over is the hypothetical idea that we already have done something really bad by launching ChatGPT. That maybe there was something hard and complicated in there (the system) that we didn’t understand and have now already kicked it off.”
https://timesofindia.indiatimes.com/business/india-business/et-conversations-with-openai-ceo-sam-altman/amp_liveblog/100822923.cms
You wrote:
> [OpenAI do] very little public discussion of concrete/specific large-scale risks of their products and the corresponding risk-mitigation efforts (outside of things like short-term malicious use by bad API actors, where they are doing better work).
This doesn’t match my impression.
For example, Altman signed the CAIS AI Safety Statement, which reads:
> Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
The “Preparedness” page (linked from the top navigation menu on their website) starts:
> The study of frontier AI risks has fallen far short of what is possible and where we need to be. To address this gap and systematize our safety thinking, we are adopting the initial version of our Preparedness Framework. It describes OpenAI’s processes to track, evaluate, forecast, and protect against catastrophic risks posed by increasingly powerful models.
The page mentions “cybersecurity, CBRN (chemical, biological, radiological, nuclear threats), persuasion, and model autonomy”. The framework itself goes into more detail, proposing scorecards for assessing risk in each category. They define “catastrophic risk” as “any risk which could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals—this includes, but is not limited to, existential risk”. The phrase “millions of deaths” appears in one of the scorecards.
Their “Planning for AGI & Beyond” blog post describes the risks as “existential”; I quote the relevant passage in another comment.
On their “Safety & Alignment” blog they highlight recent posts called “Reimagining secure infrastructure for advanced AI” and “Building an early warning system for LLM-aided biological threat creation”.
My sense is that there are many other examples, but I’ll stop here for now.
I agree that there’s a lot of evidence that people at OpenAI have thought that AI could be a major risk, and I think that these are good examples.
I said here, “concrete/specific large-scale risks of their products and the corresponding risk-mitigation efforts (outside of things like short-term malicious use by bad API actors, where they are doing better work).”
Just looking at the examples you posted, most feel pretty high-level and vague, and not very related to their specific products.
> For example, Altman signed the CAIS AI Safety Statement, which reads...
This was a one-sentence statement. It easily sounds to me like saying, “Someone should deal with this, but not exactly us.”
> The framework itself goes into more detail, proposing scorecards for assessing risk in each category.
I think this is a good step, but it seems pretty vague to me. There’s fairly little quantifiable content here, and a lot of terms like “medium risk” and “high risk”.
From what I can tell, the “teeth” in the document are, “changes get brought up to management, and our board”, which doesn’t fill me with confidence.
Relatedly, I’d be quite surprised if they actually followed through with this much in the next 1-3 years, but I’d be happy to be wrong!
This could be a community effort. If you’re reading this and have a spare minute, can you recall any sources for any of Ozzie’s claims and share links to them here? (Or go the extra mile: copy his post into a Google Doc and add sources there?)
Yes. (4) and (11) are also very much “citation needed”. My sense is that they would need to be significantly moderated to fit the facts (e.g. the profit cap is still a thing).
“Is it true that OpenAI has claimed that they aren’t making anything dangerous and aren’t likely to do so in the future? Where have they said this?”
Related: AFAICT they’ve also never said “We’re aiming to make the thing that has a substantial chance of causing the end of humanity”. I think that is a far more important point.
There are two obvious ways to be dishonest: tell a lie or not tell the truth. This falls into the latter category.
OpenAI’s “Planning for AGI & Beyond” blog post includes the following:
> As our systems get closer to AGI, we are becoming increasingly cautious with the creation and deployment of our models. Our decisions will require much more caution than society usually applies to new technologies, and more caution than many users would like. Some people in the AI field think the risks of AGI (and successor systems) are fictitious; we would be delighted if they turn out to be right, but we are going to operate as if these risks are existential.
> At some point, the balance between the upsides and downsides of deployments (such as empowering malicious actors, creating social and economic disruptions, and accelerating an unsafe race) could shift, in which case we would significantly change our plans around continuous deployment.
> [...]
> The first AGI will be just a point along the continuum of intelligence. We think it’s likely that progress will continue from there, possibly sustaining the rate of progress we’ve seen over the past decade for a long period of time. If this is true, the world could become extremely different from how it is today, and the risks could be extraordinary. A misaligned superintelligent AGI could cause grievous harm to the world; an autocratic regime with a decisive superintelligence lead could do that too.
> [...]
> Successfully transitioning to a world with superintelligence is perhaps the most important—and hopeful, and scary—project in human history. Success is far from guaranteed, and the stakes (boundless downside and boundless upside) will hopefully unite all of us.
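Altman signed the CAIS AI Safety Statement, which reads:
> Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.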
In 2015 he wrote a blog post which begins:
I have bad feelings about a lot of this.
The “Planning for AGI & Beyond” doc seems to me to be heavily inspired by a few other people at OpenAI at the time, mainly the safety team, and I’m nervous those people have less influence now.
At the bottom, it says:
Thanks to Brian Chesky, Paul Christiano, Jack Clark, Holden Karnofsky, Tasha McCauley, Nate Soares, Kevin Scott, Brad Smith, Helen Toner, Allan Dafoe, and the OpenAI team for reviewing drafts of this.
Since then, Tasha and Helen have been fired off the board, and I’m guessing relations have soured with others listed.
Fwiw the relationship with Nate seemed mostly that Sam asked for comments, Nate gave some, and there was no back and forth. See Nate’s post: https://www.lesswrong.com/posts/uxnjXBwr79uxLkifG/comments-on-openai-s-planning-for-agi-and-beyond
Sam seemed to oversell the relationship with this acknowledgement, so I don’t think we should read much into the other names except literally “they were asked to review drafts”.
sigh… Part of me wants to spend a bunch of time trying to determine which of the following might apply here:
1. This is what Sam really believes. He wrote it himself. He pinged these people for advice. He continues to believe it.
2. This is something that Sam quickly said because he felt pressured by others. This could be either direct pressure (they asked for this) or indirect pressure (he thought they would like him more if he did this).
3. Someone else wrote this, then Sam put his name on it, and barely noticed it.
But at the same time, given that Sam has what seems to me like a long track record of insincerity anyway, I don’t feel very optimistic about easily being able to judge this.
These are good points!
At the time, I thought that Nate feeling the need to post and clarify what actually happened was a pretty strong indication that Sam was using this opportunity to pretend to be on better terms with these folks. (I think he otherwise never talks to Nate/Eliezer/MIRI, though I could be wrong.)
But yeah it could be that someone who still had influence thought this post was important to run by this set of people. (I consider this less likely.)
I don’t think Sam barely noticed it. It sounds like he was the one who asked for feedback.
In any case this event seems like a minor thing, though imo a helpful part of the gestalt picture.