richard_ngo

Karma: 7,890

Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.blogspot.com

richard_ngo 30 Nov 2025 21:28 UTC
15 points
2 ∶ 2
in reply to: Tristan W’s comment on: My 2025 donations (so far)
I used to not actually believe in heavy-tailed impact. On some level I thought that early rationalists (and to a lesser extent EAs) had “gotten lucky” in being way more right than academic consensus about AI progress. And I thought on some gut level that e.g. Thiel and Musk and so on kept getting lucky, because I didn’t want to picture a world in which they were actually just skillful enough to keep succeeding (due to various psychological blockers).
Now, thanks to dealing with a bunch of those blockers, I have internalized to a much greater extent that you can actually be good not just lucky. This means that I’m no longer interested in strategies that involve recruiting a whole bunch of people and hoping something good comes out of it. Instead I am trying to target outreach precisely to the very best people, without compromising much.
Relatedly, I’ve updated that the very best thinkers in this space are still disproportionately the people who were around very early. The people you need to soften/moderate your message to reach (or who need social proof in order to get involved) are seldom going to be the ones who can think clearly about this stuff. And we are very bottlenecked on high-quality thinking.
(My past self needed a lot of social proof to get involved in AI safety in the first place, but I also “got lucky” in the sense of being exposed to enough world-class people that I was able to update my mental models a lot—e.g. watching the OpenAI board coup close up, various conversations with OpenAI cofounders, etc. This doesn’t seem very replicable—though I’m trying to convey a bunch of the models I’ve gained on my blog, e.g. in this post.)

richard_ngo 1 Apr 2025 18:14 UTC
6 points
0 ∶ 0
in reply to: Cullen 🔸’s comment on: Third-wave AI safety needs sociopolitical thinking
My story is: Elon changing the twitter censorship policies was a big driver of a chunk of Silicon Valley getting behind Trump—separate from Elon himself promoting Trump, and separate from Elon becoming a part of the Trump team.
And I think anyone who bought Twitter could have done that.
If anything being Elon probably made it harder, because he then had to face advertiser boycotts.
Agree/disagree?

richard_ngo 1 Apr 2025 0:24 UTC
8 points
1 ∶ 1
in reply to: Ozzie Gooen’s comment on: Third-wave AI safety needs sociopolitical thinking
To be clear, my example wasn’t “I’m trying to talk to people in the south about racism” It’s more like, “I’m trying to talk to people in the south about animal welfare, and in doing so, I bring up examples around South people being racist.”
Yeah I got that. Let me flesh out an analogy a little more:
Suppose you want to pitch people in the south about animal welfare. And you have a hypothesis for why people in the south don’t care much about animal welfare, which is that they tend to have smaller circles of moral concern than people in the north. Here are two types of example you could give:
1. You could give an example which fits with their existing worldview—like “having small circles of moral concern is what the north is doing when they’re dismissive of the south”. And then they’ll nod along and think to themselves “yeah, fuck the north” and slot what you’re saying into their minds as another piece of ammunition.
2. Or you could give an example that actively clashes with their worldview: “hey, I think you guys are making the same kind of mistake that a bunch of people in the south have historically made by being racist”. And then most people will bounce off, but a couple will be like “oh shit, that’s what it looks like like to have a surprisingly small circle of moral concern and not realize it”.
My claims:
1. Insofar as the people in the latter category have that realization, it will be to a significant extent because you used an example that was controversial to them, rather than one which already made sense to them.
2. People in AI safety are plenty good at saying phrases like “some AI safety interventions are net negative” and “unilateralist’s curse” and so on. But from my perspective there’s a missing step of… trying to deeply understand the forces that make movements net negative by their own values? Trying to synthesize actual lessons from a bunch of past fuck-ups?
  
  I personally spent a long time being like “yeah I guess AI safety might have messed up big-time by leading to the founding of the AGI labs” but then not really doing anything differently. I only snapped out of complacency when I got to observe first-hand a bunch of the drama at OpenAI (which inspired this post). And so I have a hypothesis that it’s really valuable to have some experience where you’re like “oh shit, that’s what it looks like for something that seems really well-intentioned that everyone in my bubble is positive-ish about to make the world much worse”. That’s what I was trying to induce with my environmentalism slide (as best I can reconstruct, though note that the process by which I actually wrote it was much more intuitive and haphazard than the argument I’m making here).
I’m nervous that you and/or others might slide into clearly-incorrect and dangerous MAGA worldviews.
Yeah, that is a reasonable fear to have (which is part of why I’m engaging extensively here about meta-level considerations, so you can see that I’m not just running on reflexive tribalism).
Having said that, there’s something here reminiscent of I can tolerate anything except the outgroup. Intellectual tolerance isn’t for ideas you think are plausible—that’s just normal discussion. It’s for ideas you think are clearly incorrect, e.g. your epistemic outgroup. Of course you want to draw some lines for discussion of aliens or magic or whatever, but in this case it’s a memeplex endorsed (to some extent) by approximately half of America, so clearly within the bounds of things that are worth discussing. (You added “dangerous” too, but this is basically a general-purpose objection to any ideas which violate the existing consensus, so I don’t think that’s a good criterion for judging which speech to discourage.)

In other words, the optimal number of people raising and defending MAGA ideas in EA and AI safety is clearly not zero. Now, I do think that in an ideal world I’d be doing this more carefully. E.g. I flagged in the transcript a factual claim that I later realized was mistaken, and there’s been various pushback on the graphs I’ve used, and the “caused climate change” thing was an overstatement, and so on. Being more cautious would help with your concerns about “epistemic slight-of-hands”. But for better or worse I am temperamentally a big-ideas thinker, and when I feel external pressure to make my work more careful that often kills my motivation to do it (which is true in AI safety too—I try to focus much more on first-principles reasoning than detailed analysis). In general I think people should discount my views somewhat because of this (and I give several related warnings in my talk) but I do think that’s pretty different from the hypothesis you mention that I’m being deliberately deceptive.

richard_ngo 31 Mar 2025 22:12 UTC
3 points
0 ∶ 1
in reply to: Ozzie Gooen’s comment on: Third-wave AI safety needs sociopolitical thinking
a lot of your framing matches incredibly well with what I see as current right-wing talking points
Occam’s razor says that this is because I’m right-wing (in the MAGA sense not just the libertarian sense).
It seems like you’re downweighting this hypothesis primarily because you personally have so much trouble with MAGA thinkers, to the point where you struggle to understand why I’d sincerely hold this position. Would you say that’s a fair summary? If so hopefully some forthcoming writings of mine will help bridge this gap.
It seems like the other reason you’re downweighting that hypothesis is because my framing seems unnecessarily provocative. But consider that I’m not actually optimizing for the average extent to which my audience changes their mind. I’m optimizing for something closer to the peak extent to which audience members change their mind (because I generally think of intellectual productivity as being heavy-tailed). When you’re optimizing for that you may well do things like give a talk to a right-wing audience about racism in the south, because for each person there’s a small chance that this example changes their worldview a lot.
I’m open to the idea that this is an ineffective or counterproductive strategy, which is why I concede above that this one probably went a bit too far. But I don’t think it’s absurd by any means.
Insofar as I’m doing something I don’t reflectively endorse, I think it’s probably just being too contrarian because I enjoy being contrarian. But I am trying to decrease the extent to which I enjoy being contrarian in proportion to how much I decrease my fear of social judgment (because if you only have the latter then you end up too conformist) and that’s a somewhat slow process.

richard_ngo 29 Mar 2025 23:14 UTC
3 points
0 ∶ 2
in reply to: Camille’s comment on: Third-wave AI safety needs sociopolitical thinking
Thanks for the comment.
I think you probably should think of Silicon Valley as “the place” for politics. A bunch of Silicon Valley people just took over the Republican party, and even the leading Democrats these days are Californians (Kamala, Newsom, Pelosi) or tech-adjacent (Yglesias, Klein).
Also I am working on basically the same thing as Jan describes, though I think coalitional agency is a better name for it. (I even have a post on my opposition to bayesianism.)

richard_ngo 29 Mar 2025 23:07 UTC
6 points
0 ∶ 0
in reply to: Howie_Lempel’s comment on: Third-wave AI safety needs sociopolitical thinking
Good questions. I have been pretty impressed with:
- Balaji, who tweets about the dismantling of the American Empire (e.g. here, here) are the best geopolitical analysis I’ve seen of what’s going wrong with the current Trump administration
- NS Lyons, e.g. here.
- Some of Samo Burja’s concepts (e.g. live vs dead players) have proven much more useful than I expected when I heard about them a few years ago.
I think there are probably a bunch of other frameworks that have as much or more explanatory power as my two-factor model (e.g. Henrich’s concept of WEIRDness, Scott Alexander’s various models of culture war + discourse dynamics, etc). It’s less like they’re alternatives though and more like different angles on the same elephant.

richard_ngo 29 Mar 2025 16:00 UTC
4 points
0 ∶ 2
in reply to: JWS 🔸’s comment on: Third-wave AI safety needs sociopolitical thinking
Thanks for the thoughtful comment! Yeah the OpenAI board thing was the single biggest thing that shook me out of complacency and made me start doing sociopolitical thinking. (Elon’s takeover of twitter was probably the second—it’s crazy that you can get that much power for $44 billion.)
I do think I have a pretty clear story now of what happened there, and maybe will write about it explicitly going forward. But for now I’ve written about it implicitly here (and of course in the cooperative AI safety strategies post).

richard_ngo 29 Mar 2025 4:38 UTC
2 points
0 ∶ 0
in reply to: PeterSlattery’s comment on: Third-wave AI safety needs sociopolitical thinking
No central place for all the sources but the one you asked about is: https://www.sebjenseb.net/p/how-profitable-is-embryo-selection

richard_ngo 27 Mar 2025 19:32 UTC
4 points
0 ∶ 0
in reply to: jackva’s comment on: Third-wave AI safety needs sociopolitical thinking
Ah, gotcha. Yepp, that’s a fair point, and worth me being more careful about in the future.
I do think we differ a bit on how disagreeable we think advocacy should be, though. For example, I recently retweeted this criticism of Abundance, which is basically saying that they overly optimized for it to land with those who hear it.
And in general I think it’s worth losing a bunch of listeners in order to convey things more deeply to the ones who remain (because if my own models of movement failure have been informed by environmentalism etc, it’s hard to talk around them).
But in this particular case, yeah, probably a bit of an own goal to include the environmentalism stuff so strongly in an AI talk.

richard_ngo 27 Mar 2025 18:07 UTC
8 points
1 ∶ 0
in reply to: Matrice Jacobine🔸🏳️‍⚧️’s comment on: Third-wave AI safety needs sociopolitical thinking
him being one of the few safetyists on the political right to not have capitulated to accelerationism-because-of-China (as most recently even Elon did).
Thanks for noticing this. I have a blog post coming out soon criticizing this exact capitulation.
every time he tries to talk about object-level politics it feels like going into the bizarro universe and I would flip the polarity of the signs of all of it
I am torn between writing more about politics to clarify, and writing less about politics to focus on other stuff. I think I will compromise by trying to write about political dynamics more timelessly (e.g. as I did in this post, though I got a bit more object-level in the follow-up post).

richard_ngo 27 Mar 2025 17:17 UTC
23 points
3 ∶ 0
in reply to: David Mathers🔸’s comment on: Third-wave AI safety needs sociopolitical thinking
I worry that your bounties are mostly just you paying people to say things you already believe about those topics
This is a fair complaint and roughly the reason I haven’t put out the actual bounties yet—because I’m worried that they’re a bit too skewed. I’m planning to think through this more carefully before I do; okay to DM you some questions?
I think it is extremely easy to imagine the left/Democrat wing of AI safety becoming concerned with AI concentrating power, if it hasn’t already
It is not true that all people with these sort of concerns only care private power and not the state either. Dislike of Palantir’s nat sec ties is a big theme for a lot of these people, and many of them don’t like the nat sec-y bits of the state very much either.
I definitely agree with you with regard to corporate power (and see dislike of Palantir as an extension of that). But a huge part of the conflict driving the last election was “insiders” versus “outsiders”—to the extent that even historically Republican insiders like the Cheneys backed Harris. And it’s hard for insiders to effectively oppose the growth of state power. For instance, the “govt insider” AI governance people I talk to tend to be the ones most strongly on the “AI risk as anarchy” side of the divide, and I take them as indicative of where other insiders will go once they take AI risk seriously.
But I take your point that the future is uncertain and I should be tracking the possibility of change here.
(This is not a defense of the current administration, it is very unclear whether they are actually effectively opposing the growth of state power, or seizing it for themselves, or just flailing.)

richard_ngo 27 Mar 2025 17:06 UTC
19 points
3 ∶ 2
in reply to: jackva’s comment on: Third-wave AI safety needs sociopolitical thinking
Thanks for the feedback!
FWIW a bunch of the polemical elements were deliberate. My sense is something like: “All of these points are kinda well-known, but somehow people don’t… join the dots together? Like, they think of each of them as unfortunate accidents, when they actually demonstrate that the movement itself is deeply broken.”
There’s a kind of viewpoint flip from being like “yeah I keep hearing about individual cases that sure seem bad but probably they’ll do better next time” to “oh man, this is systemic”. And I don’t really know how to induce the viewpoint shift except by being kinda intense about it.
Upon reflection, I actually take this exchange to be an example of what I’m trying to address. Like, I gave a talk that was according to you “so extreme that it is hard to take seriously” and your three criticisms were:
1. An (admittedly embarrassing) terminological slip on NEPA.
2. A strawman of my point (I never said anyone was “single-handedly responsible”, that’s a much higher bar than “caused”—though I do in hindsight think that just saying “caused” without any qualifiers was sloppy of me).
3. A critique of an omission (on water/air pollution).
I imagine you have better criticisms to make, but ultimately (as you mention) we do agree on the core point, and so in some sense the message I’m getting is “yeah, listen, environmentalism has messed up a bunch of stuff really badly, but you’re not allowed to be mad about it”.
And I basically just disagree with that. I do think being mad about it (or maybe “outraged” is a better term) will have some negative effects on my personal epistemics (which I’m trying carefully to manage). But given the scale of the harms caused, this level of criticism seems like an acceptable and proportional discursive move. (Though note that I’d have done things differently if I felt like criticism that severe was already common within the political bubble of my audience—I think outrage is much worse when it bandwagons.)
EDIT: what do you mean by “how to get broad engagement on this”? Like, you don’t see how this could be interesting to a wider audience? You don’t know how to engage with it yourself? Something else?

richard_ngo 21 Jan 2025 10:46 UTC
2 points
0 ∶ 0
in reply to: Will Howard🔹’s comment on: It looks like there are some good funding opportunities in AI safety right now
yeah there was a tender offer, openai does them every year or two

richard_ngo 29 Dec 2024 16:32 UTC
84 points
0 ∶ 0
on: It looks like there are some good funding opportunities in AI safety right now
This post convinced me to sell $200,000 more OpenAI shares than I would otherwise have, in order to have more money available to donate rapidly. Thanks!

richard_ngo 14 Sep 2024 7:39 UTC
16 points
3 ∶ 1
on: Announcing the Meta Coordination Forum 2024
Thanks for sharing this, it does seem good to have transparency into this stuff.
My gut reaction was “huh, I’m surprised about how large a proportion of these people (maybe 30-50%, depending on how you count it) I don’t recall substantially interacting with” (where by “interaction” I include reading their writings).
To be clear, I’m not trying to imply that it should be higher; that any particular mistakes are being made; or that these people should have interacted with me. It just felt surprising (given how long I’ve been floating around EA) and worth noting as a datapoint. (Though one reason to take this with a grain of salt is that I do forget names and faces pretty easily.)

richard_ngo 9 Sep 2024 17:44 UTC
11 points
3 ∶ 1
in reply to: NickLaing’s comment on: Lizka’s Shortform
My point is not that the current EA forum would censor topics that were actually important early EA conversations, because EAs have now been selected for being willing to discuss those topics. My point is that the current forum might censor topics that would be important course-corrections, just as if the rest of society had been moderating early EA conversations, those conversations might have lost important contributions like impartiality between species (controversial: you’re saying human lives don’t matter very much!), the ineffectiveness of development aid (controversial: you’re attacking powerful organizations!), transhumanism (controversial, according to the people who say it’s basically eugenics), etc.
Re “conversations can be had in more sensitive ways”, I mostly disagree, because of the considerations laid out here: the people who are good at discussing topics sensitively are mostly not the ones who are good at coming up with important novel ideas.
For example, it seems plausible that genetic engineering for human intelligence enhancement is an important and highly neglected intervention. But you had to be pretty disagreeable to bring it into the public conversation a few years ago (I think it’s now a bit more mainstream).

richard_ngo 9 Sep 2024 1:55 UTC
14 points
3 ∶ 3
in reply to: huw’s comment on: Lizka’s Shortform
Narrowing in even further on the example you gave, as an illustration: I just had an uncomfortable conversation about age of consent laws literally yesterday with an old friend of mine. Specifically, my friend was advocating that the most important driver of crime is poverty, and I was arguing that it’s cultural acceptance of crime. I pointed to age of consent laws varying widely across different countries as evidence that there are some cultures which accept behavior that most westerners think of as deeply immoral (and indeed criminal).
Picturing some responses you might give to this:
1. That’s not the sort of uncomfortable claim you’re worried about
  1. But many possible continuations of this conversation would in fact have gotten into more controversial territory. E.g. maybe a cultural relativist would defend those other countries having lower age of consent laws. I find cultural relativism kinda crazy (for this and related reasons) but it’s a pretty mainstream position.
2. I could have made the point in more sensitive ways
  1. Maybe? But the whole point of the conversation was about ways in which some cultures are better than others. This is inherently going to be a sensitive claim, and it’s hard to think of examples that are compelling without being controversial.
3. This is not the sort of thing people should be discussing on the forum
  1. But EA as a movement is interested in things like:
    Criminal justice reform (which OpenPhil has spent many tens of millions of dollars on)
    Promoting women’s rights (especially in the context of global health and extreme poverty reduction)
    What factors make what types of foreign aid more or less effective
    More generally, the relationship between the developed and the developing world
    So this sort of debate does seem pretty relevant.
I think EA would’ve broadly survived intact by lightly moderating other kinds of discomfort (or it may have even expanded).
The important point is that we didn’t know in advance which kinds of discomfort were of crucial importance. The relevant baseline here is not early EAs moderating ourselves, it’s something like “the rest of academic philosophy/society at large moderating EA”, which seems much more likely to have stifled early EA’s ability to identify important issues and interventions.
(I also think we’ve ended up at some of the wrong points on some of these issues, but that’s a longer debate.)

richard_ngo 8 Sep 2024 18:25 UTC
18 points
3 ∶ 2
in reply to: NickLaing’s comment on: Lizka’s Shortform
Ty for the reply; a jumble of responses below.
I think there are better places to have these often awkward, fraught conversations.
You are literally talking about the sort of conversations that created EA. If people don’t have these conversations on the forum (the single best way to create common knowledge in the EA commmunity), then it will be much harder to course-correct places where fundamental ideas are mistaken. I think your comment proceeds from the implicit assumption that we’re broadly right about stuff, and mostly just need to keep our heads down and do the work. I personally think that a version of EA that doesn’t have the ability to course-correct in big ways would be net negative for the world. In general it is not possible to e.g. identify ongoing moral catastrophes when you’re optimizing your main venue of conversations for avoiding seeming weird.
I agree with you the quote from the Hamas charter is more dangerous—and think we shouldn’t be publishing or discussing that on the forum either.
If you’re not able to talk about evil people and their ideologies, then you will not be able to account for them in reasoning about how to steer the world. I think EA is already far too naive about how power dynamics work at large scales, given how much influence we’re wielding; this makes it worse.
There’s potential reputational damage for all the people doing great EA work across the spectrum here.
I think there are just a few particular topics which give people more ammunition for public take-downs, and there is wisdom in sometimes avoiding loading balls into your opponents cannons.
Insofar as you’re thinking about this as a question of coalitional politics, I can phrase it in those terms too: the more censorious EA becomes, the more truth-seeking people will disaffiliate from it. Habryka, who was one of the most truth-seeking people involved in EA, has already done so; I wouldn’t say it was directly because of EA not being truth-seeking enough, but I think that was one big issue for him amongst a cluster of related issues. I don’t currently plan to, but I’ve considered the possibility, and the quality of EA’s epistemic norms is one of my major considerations (of course, the forum’s norms are only a small part of that).
However, having said this, I don’t think you should support more open forum norms mostly as a concession to people like me, but rather in order to pursue your own goals more effectively. Movements that aren’t able to challenge foundational assumptions end up like environmentalists: actively harming the causes they’re trying to support.

richard_ngo 7 Sep 2024 18:26 UTC
35 points
7 ∶ 2
in reply to: Neel Nanda’s comment on: Lizka’s Shortform
I appreciate the thought that went into this. I also think that using rate-limits as a tool, instead of bans, is in general a good idea. I continue to strongly disagree with the decisions on a few points:
1. I still think including the “materials that may be easily perceived as such” clause has a chilling effect.
2. I also remember someone’s comment that the things you’re calling “norms” are actually rules, and it’s a little disingenuous to not call them that; I continue to agree with this.
3. The fact that you’re not even willing to quote the parts of the post that were objectionable feels like an indication of a mindset that I really disagree with. It’s like… treating words as inherently dangerous? Not thinking at all about the use-mention distinction? I mean, here’s a quote from the Hamas charter: “There is no solution for the Palestinian question except through Jihad.” Clearly this is way way more of an incitement to violence than any quote of dstudiocode’s, which you’re apparently not willing to quote. (I am deliberately not expressing any opinion about whether the Hamas quote is correct; I’m just quoting them.) What’s the difference?
4. “They see the fact that it is “just” a philosophical question as not changing the assessment.” Okay, let me now quote Singer. “Human babies are not born self-aware, or capable of grasping that they exist over time. They are not persons… the life of a newborn is of less value than the life of a pig, a dog, or a chimpanzee.” Will you warn/ban me from the EA forum for quoting Singer, without endorsing that statement? What if I asked, philosophically, “If Singer were right, would it be morally acceptable to kill a baby to save a dog’s life?” I mean, there are whole subfields of ethics based on asking about who you would kill in order to save whom (which is why I’m pushing on this so strongly: the thing you are banning from the forum is one of the key ways people have had philosophical debates over foundational EA ideas). What if I defended Singer’s argument in a post of my own?
As I say this, I feel some kind of twinge of concern that people will find this and use it to attack me, or that crazy people will act badly inspired by my questions. I hypothesize that the moderators are feeling this kind of twinge more generally. I think this is the sort of twinge that should and must be overridden, because listening to it means that your discourse will forever be at the mercy of whoever is most hostile to you, or whoever is craziest. You can’t figure out true things in that situation.
(On a personal level, I apologize to the moderators for putting them in difficult situations by saying things that are deliberately in the grey areas of their moderation policy. Nevertheless I think it’s important enough that I will continue doing this. EA is not just a group of nerds on the internet any more, it’s a force that shapes the world in a bunch of ways, and so it is crucial that we don’t echo-chamber ourselves into doing crazy stuff (including, or especially, when the crazy stuff matches mainstream consensus). If you would like to warn/ban me, then I would harbor no personal ill-will about it, though of course I will consider that evidence that I and others should be much more wary about the quality of discourse on the forum.)
What links here?
- Will Aldred's comment on Will Aldred’s Quick takes by Will Aldred (10 Jun 2025 17:48 UTC; 14 points)

richard_ngo 28 Aug 2024 20:04 UTC
10 points
3 ∶ 1
in reply to: JP Addison🔸’s comment on: Lizka’s Shortform
This moderation policy seems absurd. The post in question was clearly asking purely hypothetical questions, and wasn’t even advocating for any particular answer to the question. May as well ban users for asking whether it’s moral to push a man off a bridge to stop a trolley, or ban Peter Singer for his thought experiments about infanticide.
Perhaps dstudiocode has misbehaved in other ways, but this announcement focuses on something that should be clearly within the bounds of acceptable discourse. (In particular, the standard of “content that could be interpreted as X” is a very censorious one, since you now need to cater to a wide range of possible interpretations.)