richard_ngo

Karma: 7,659

Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.blogspot.com

richard_ngo Mar 27, 2025, 5:06 PM
2 points
0 ∶ 0
in reply to: jackva’s comment on: Third-wave AI safety needs sociopolitical thinking
Thanks for the feedback!
FWIW a bunch of the polemical elements were deliberate. My sense is something like: “All of these points are kinda well-known, but somehow people don’t… join the dots together? Like, they think of each of them as unfortunate accidents, when they actually demonstrate that the movement itself is deeply broken.”
There’s a kind of viewpoint flip from being like “yeah I keep hearing about individual cases that sure seem bad” to “oh man, this is systemic”. And I don’t really know how to induce the viewpoint shift except by being kinda intense about it.
Upon reflection, I actually take this exchange to be an example of what I’m trying to address. Like, I gave a talk that was according to you “so extreme that it is hard to take seriously” and your three criticisms were:
1. An (admittedly embarrassing) terminological slip on NEPA.
2. A strawman of my point (I never said anyone was “single-handedly responsible”).
3. A critique of an omission (on water/air pollution).
I imagine you have better criticisms to make, but ultimately (as you mention) we do agree on the core point, and so in some sense the message I’m getting is “yeah, listen, environmentalism has messed up a bunch of stuff really badly, but you’re not allowed to be mad about it”.
And I basically just disagree with that. I do think being mad about it (or maybe “outraged” is a better term) will have some negative effects on my personal epistemics (which I’m trying carefully to manage). But given the scale of the harms caused, this level of criticism seems like an acceptable and proportional discursive move. (Though note that I’d have done things differently if I felt like criticism that severe was already common within the political bubble of my audience—I think outrage is much worse when it bandwagons.)

Third-wave AI safety needs sociopolitical thinking

richard_ngo Mar 27, 2025, 12:55 AM

57 points

6 comments · 1 min read · EA link

(www.youtube.com)

richard_ngo Jan 21, 2025, 10:46 AM
2 points
0 ∶ 0
in reply to: Will Howard🔹’s comment on: It looks like there are some good funding opportunities in AI safety right now
yeah there was a tender offer, openai does them every year or two

richard_ngo Dec 29, 2024, 4:32 PM
82 points
0 ∶ 0
on: It looks like there are some good funding opportunities in AI safety right now
This post convinced me to sell $200,000 more OpenAI shares than I would otherwise have, in order to have more money available to donate rapidly. Thanks!

richard_ngo Sep 14, 2024, 7:39 AM
22 points
3 ∶ 0
on: Announcing the Meta Coordination Forum 2024
Thanks for sharing this, it does seem good to have transparency into this stuff.
My gut reaction was “huh, I’m surprised about how large a proportion of these people (maybe 30-50%, depending on how you count it) I don’t recall substantially interacting with” (where by “interaction” I include reading their writings).
To be clear, I’m not trying to imply that it should be higher; that any particular mistakes are being made; or that these people should have interacted with me. It just felt surprising (given how long I’ve been floating around EA) and worth noting as a datapoint. (Though one reason to take this with a grain of salt is that I do forget names and faces pretty easily.)

richard_ngo Sep 9, 2024, 5:44 PM
11 points
3 ∶ 1
in reply to: NickLaing’s comment on: Lizka’s Shortform
My point is not that the current EA forum would censor topics that were actually important early EA conversations, because EAs have now been selected for being willing to discuss those topics. My point is that the current forum might censor topics that would be important course-corrections, just as if the rest of society had been moderating early EA conversations, those conversations might have lost important contributions like impartiality between species (controversial: you’re saying human lives don’t matter very much!), the ineffectiveness of development aid (controversial: you’re attacking powerful organizations!), transhumanism (controversial, according to the people who say it’s basically eugenics), etc.
Re “conversations can be had in more sensitive ways”, I mostly disagree, because of the considerations laid out here: the people who are good at discussing topics sensitively are mostly not the ones who are good at coming up with important novel ideas.
For example, it seems plausible that genetic engineering for human intelligence enhancement is an important and highly neglected intervention. But you had to be pretty disagreeable to bring it into the public conversation a few years ago (I think it’s now a bit more mainstream).

richard_ngo Sep 9, 2024, 1:55 AM
14 points
3 ∶ 3
in reply to: huw’s comment on: Lizka’s Shortform
Narrowing in even further on the example you gave, as an illustration: I just had an uncomfortable conversation about age of consent laws literally yesterday with an old friend of mine. Specifically, my friend was advocating that the most important driver of crime is poverty, and I was arguing that it’s cultural acceptance of crime. I pointed to age of consent laws varying widely across different countries as evidence that there are some cultures which accept behavior that most westerners think of as deeply immoral (and indeed criminal).
Picturing some responses you might give to this:
1. That’s not the sort of uncomfortable claim you’re worried about
  1. But many possible continuations of this conversation would in fact have gotten into more controversial territory. E.g. maybe a cultural relativist would defend those other countries having lower age of consent laws. I find cultural relativism kinda crazy (for this and related reasons) but it’s a pretty mainstream position.
2. I could have made the point in more sensitive ways
  1. Maybe? But the whole point of the conversation was about ways in which some cultures are better than others. This is inherently going to be a sensitive claim, and it’s hard to think of examples that are compelling without being controversial.
3. This is not the sort of thing people should be discussing on the forum
  1. But EA as a movement is interested in things like:
    Criminal justice reform (which OpenPhil has spent many tens of millions of dollars on)
    Promoting women’s rights (especially in the context of global health and extreme poverty reduction)
    What factors make what types of foreign aid more or less effective
    More generally, the relationship between the developed and the developing world
    So this sort of debate does seem pretty relevant.
I think EA would’ve broadly survived intact by lightly moderating other kinds of discomfort (or it may have even expanded).
The important point is that we didn’t know in advance which kinds of discomfort were of crucial importance. The relevant baseline here is not early EAs moderating ourselves, it’s something like “the rest of academic philosophy/society at large moderating EA”, which seems much more likely to have stifled early EA’s ability to identify important issues and interventions.
(I also think we’ve ended up at some of the wrong points on some of these issues, but that’s a longer debate.)

richard_ngo Sep 8, 2024, 6:25 PM
13 points
3 ∶ 2
in reply to: NickLaing’s comment on: Lizka’s Shortform
Ty for the reply; a jumble of responses below.
I think there are better places to have these often awkward, fraught conversations.
You are literally talking about the sort of conversations that created EA. If people don’t have these conversations on the forum (the single best way to create common knowledge in the EA community), then it will be much harder to course-correct places where fundamental ideas are mistaken. I think your comment proceeds from the implicit assumption that we’re broadly right about stuff, and mostly just need to keep our heads down and do the work. I personally think that a version of EA that doesn’t have the ability to course-correct in big ways would be net negative for the world. In general it is not possible to e.g. identify ongoing moral catastrophes when you’re optimizing your main venue of conversations for avoiding seeming weird.
I agree with you the quote from the Hamas charter is more dangerous—and think we shouldn’t be publishing or discussing that on the forum either.
If you’re not able to talk about evil people and their ideologies, then you will not be able to account for them in reasoning about how to steer the world. I think EA is already far too naive about how power dynamics work at large scales, given how much influence we’re wielding; this makes it worse.
There’s potential reputational damage for all the people doing great EA work across the spectrum here.
I think there are just a few particular topics which give people more ammunition for public take-downs, and there is wisdom in sometimes avoiding loading balls into your opponents’ cannons.
Insofar as you’re thinking about this as a question of coalitional politics, I can phrase it in those terms too: the more censorious EA becomes, the more truth-seeking people will disaffiliate from it. Habryka, who was one of the most truth-seeking people involved in EA, has already done so; I wouldn’t say it was directly because of EA not being truth-seeking enough, but I think that was one big issue for him amongst a cluster of related issues. I don’t currently plan to, but I’ve considered the possibility, and the quality of EA’s epistemic norms is one of my major considerations (of course, the forum’s norms are only a small part of that).
However, having said this, I don’t think you should support more open forum norms mostly as a concession to people like me, but rather in order to pursue your own goals more effectively. Movements that aren’t able to challenge foundational assumptions end up like environmentalists: actively harming the causes they’re trying to support.

richard_ngo Sep 7, 2024, 6:26 PM
35 points
7 ∶ 2
in reply to: Neel Nanda’s comment on: Lizka’s Shortform
I appreciate the thought that went into this. I also think that using rate-limits as a tool, instead of bans, is in general a good idea. I continue to strongly disagree with the decisions on a few points:
1. I still think including the “materials that may be easily perceived as such” clause has a chilling effect.
2. I also remember someone’s comment that the things you’re calling “norms” are actually rules, and it’s a little disingenuous to not call them that; I continue to agree with this.
3. The fact that you’re not even willing to quote the parts of the post that were objectionable feels like an indication of a mindset that I really disagree with. It’s like… treating words as inherently dangerous? Not thinking at all about the use-mention distinction? I mean, here’s a quote from the Hamas charter: “There is no solution for the Palestinian question except through Jihad.” Clearly this is way way more of an incitement to violence than any quote of dstudiocode’s, which you’re apparently not willing to quote. (I am deliberately not expressing any opinion about whether the Hamas quote is correct; I’m just quoting them.) What’s the difference?
4. “They see the fact that it is “just” a philosophical question as not changing the assessment.” Okay, let me now quote Singer. “Human babies are not born self-aware, or capable of grasping that they exist over time. They are not persons… the life of a newborn is of less value than the life of a pig, a dog, or a chimpanzee.” Will you warn/ban me from the EA forum for quoting Singer, without endorsing that statement? What if I asked, philosophically, “If Singer were right, would it be morally acceptable to kill a baby to save a dog’s life?” I mean, there are whole subfields of ethics based on asking about who you would kill in order to save whom (which is why I’m pushing on this so strongly: the thing you are banning from the forum is one of the key ways people have had philosophical debates over foundational EA ideas). What if I defended Singer’s argument in a post of my own?
As I say this, I feel some kind of twinge of concern that people will find this and use it to attack me, or that crazy people will act badly inspired by my questions. I hypothesize that the moderators are feeling this kind of twinge more generally. I think this is the sort of twinge that should and must be overridden, because listening to it means that your discourse will forever be at the mercy of whoever is most hostile to you, or whoever is craziest. You can’t figure out true things in that situation.
(On a personal level, I apologize to the moderators for putting them in difficult situations by saying things that are deliberately in the grey areas of their moderation policy. Nevertheless I think it’s important enough that I will continue doing this. EA is not just a group of nerds on the internet any more, it’s a force that shapes the world in a bunch of ways, and so it is crucial that we don’t echo-chamber ourselves into doing crazy stuff (including, or especially, when the crazy stuff matches mainstream consensus). If you would like to warn/ban me, then I would harbor no personal ill-will about it, though of course I will consider that evidence that I and others should be much more wary about the quality of discourse on the forum.)

richard_ngo Aug 28, 2024, 8:04 PM
10 points
3 ∶ 1
in reply to: JP Addison🔸’s comment on: Lizka’s Shortform
This moderation policy seems absurd. The post in question was clearly asking purely hypothetical questions, and wasn’t even advocating for any particular answer to the question. May as well ban users for asking whether it’s moral to push a man off a bridge to stop a trolley, or ban Peter Singer for his thought experiments about infanticide.
Perhaps dstudiocode has misbehaved in other ways, but this announcement focuses on something that should be clearly within the bounds of acceptable discourse. (In particular, the standard of “content that could be interpreted as X” is a very censorious one, since you now need to cater to a wide range of possible interpretations.)

richard_ngo Aug 28, 2024, 6:59 PM
3 points
2 ∶ 5
in reply to: Tobias Häberli’s comment on: The case for contributing to the 2024 US election with your time & money
I accept that I should talk about “Trump and the Republican party”. But conversely, when we talk about the Democratic party, we should also include the institutions over which it has disproportionate influence—including most mainstream media outlets, the FBI (which pushed for censorship of one of the biggest anti-Biden stories in the lead-up to the 2020 election—EDIT: I no longer endorse this phrasing, it seems like the FBI’s conversations with tech companies were fairly vague on this matter), the teams responsible for censorship at most major tech companies, the wide range of agencies that started regulatory harassment of Elon under the Biden administration, etc.
If Trump had anywhere near the level of influence over elite institutions that the Democrats do, then I’d agree that he’d be clearly more dangerous.

richard_ngo Aug 27, 2024, 6:09 PM
−2 points
1 ∶ 9
in reply to: richard_ngo’s comment on: The case for contributing to the 2024 US election with your time & money
One more point: in Scott’s blog post he talks about the “big lie” of Trump: that the election was stolen. I do worry that this is a key point of polarization, where either you fully believe that the election was stolen and the Democrats are evil, or you fully believe that Trump was trying to seize dictatorial power.
But reality is often much more complicated. My current best guess is that there wasn’t any centrally-coordinated plan to steal the election, but that the central Democrat party:
1. Systematically turned a blind eye to thousands of people who shouldn’t have been voting (like illegal immigrants) actually voting (in some cases because Democrat voter registration pushes deliberately didn’t track this distinction).
2. Blocked reasonable election integrity measures that would have prevented this (like voter ID), primarily in a cynical + self-interested way.
On priors I think this probably didn’t swing the election, but given how small the winning margins were in swing states, it wouldn’t be crazy if it did. From this perspective I think it reflects badly on Trump that he tried to do unconstitutional things to stay in power, but not nearly as badly as most Democrats think.
(Some intuitions informing this position: I think if there had been clear smoking guns of centrally-coordinated election fraud, then Trump would have won some of his legal challenges, and we’d have found out about it since then. But it does seem like a bunch of non-citizens are registered to vote in various states (e.g. here, here), and I don’t think this is a coincidence given that it’s so beneficial for Dems + Dems have so consistently blocked voter ID laws. Conversely, I do also expect that red states are being overzealous in removing people from voter rolls for things like changing their address. Basically it all seems like a shitshow, and not one which looks great for Trump, but not disqualifying either IMO, especially because in general I expect to update away from the mainstream media line over time as information they’ve suppressed comes to light.)

richard_ngo Aug 26, 2024, 4:09 PM
−4 points
1 ∶ 7
in reply to: LintzA’s comment on: The case for contributing to the 2024 US election with your time & money
I think this pales in comparison to Trump’s willingness to silence critics (e.g. via hush money and threats).
If you believe that Trump has done a bunch of things wrong, the Democrats have done very little wrong, and the people prosecuting Trump are just following normal process in doing so, then yes these threats are worrying.
But if you believe that the charges against Trump were in fact trumped-up, e.g. because Democrats have done similarly bad things without being charged, then most of Trump’s statements look reasonable. E.g. this testimony about Biden seems pretty concerning—and given that context, saying “appoint a Special Counsel to investigate Joe Biden who hates Biden as much as Jack Smith hates me” seems totally proportional.
Also, assuming the “hush money” thing is a reference to Stormy Daniels, I think that case reflects much worse on the Democrats than it does on Trump—the “crime” involved is marginal or perhaps not even a crime at all. (tl;dr: Paying hush money is totally legal, so the actual accusation they used was “falsifying business records”. But this by itself would only be a misdemeanor, unless it was done to cover up another crime, and even the prosecution wasn’t clear on what the other crime actually was.) Even if it technically stands up, you can imagine the reaction if Clinton was prosecuted on such flimsy grounds while Trump was president.
The Democratic party, like the GOP, is going to act in ways which help get their candidate elected. … There’s nothing illegal about [not hosting a primary] though, parties are private entities and can do whatever they want to select a candidate.
If that includes suing other candidates to get them off the ballots, then I’m happy to call that unusually undemocratic. More generally, democracy is constituted not just by a set of laws, but by a set of traditions and norms. Not hosting a primary, ousting Biden, Kamala refusing interviews, etc, all undermine democratic norms.
Now, I do think Trump undermines a lot of democratic norms too. So it’s really more of a question of who will do more damage. I think that many US institutions (including the media, various three-letter agencies, etc) push back strongly against Trump’s norm-breaking, but overlook or even enable Democrat norm-breaking—for instance, keeping Biden’s mental state secret for several years. Because of this I am roughly equally worried about both.
Scott Aaronson lays out some general concerns well here.
I don’t really see much substance here. E.g. Aaronson says “Trump’s values, such as they are, would seem to be “America First,” protectionism, vengeance, humiliation of enemies, winning at all costs, authoritarianism, the veneration of foreign autocrats, and the veneration of himself.” I think America First is a very reasonable value for an American president to have (and one which is necessary for the “American-led peaceful world order” that Scott wants). Re protectionism, seems probably bad in economic terms, but much less bad than many Democrat policies (e.g. taxing unrealized capital gains, anti-nuclear, etc). Re “vengeance, humiliation of enemies, winning at all costs, authoritarianism”: these are precisely the things I’m concerned about from the Democrats. Re “the veneration of foreign autocrats”: see my comments on Trump’s foreign policy.
I don’t think the link you provided on Reddit censorship demonstrates censorship
Sorry, I’d linked it on memory since I’ve seen a bunch of censorship examples from them, but I’d forgotten that they also post a bunch of other non-censorship stuff. Will dig out some of the specific examples I’m thinking about later.
Re Facebook, here’s Zuckerberg’s admission that the Biden administration “repeatedly pressured our teams for months” to censor covid-related content (he also mentions an FBI warning about Russian disinformation in relation to censorship of the Hunter Biden story, though the specific link is unclear).

richard_ngo Aug 25, 2024, 5:25 PM
7 points
3 ∶ 17
on: The case for contributing to the 2024 US election with your time & money
(This comment focuses on object-level arguments about Trump vs Kamala; I left another comment focused on meta-level considerations.)
Three broad arguments for why it’s plausibly better if Trump wins than if Kamala does:
1. I basically see this election as a choice between a man who’s willing to subvert democracy, and a party that is willing to subvert democracy—e.g. via massively biased media coverage, lawfare against opponents, and coordinated social media censorship (I’ve seen particularly egregious examples on Reddit, but I expect that Facebook and Instagram are just as bad). RFK Jr, a lifelong Democrat (and a Kennedy to boot), has now endorsed Trump because he considers Democrat behavior too undemocratic. Heck, even Jill Stein has made this same critique. It’s reasonable to think that the risk Trump poses outweighs that, but it’s also reasonable to lean the other way, especially if you think (like I do) that the neutrality + independence of many US institutions is at a low point (e.g. see the Biden administration’s regulatory harassment of Musk on some pretty ridiculous grounds).
2. On foreign policy: it seems like Trump was surprisingly prescient about several major geopolitical issues (e.g. his 2016 positions that the US should be more worried about China, and that the US should push European countries to contribute much more to NATO, were heavily criticized at the time, but now are mainstream). The Abraham Accords also seem pretty significant. And I think the fact that the Ukraine war and the Gaza war both broke out under Biden not Trump should make us update in Trump’s favor (though I’m open to arguments on how much we should update).
3. On AI and pandemics: I don’t like his object-level policies but I do think he’ll bring in some very competent people (like Musk and Ramaswamy), and as I argued in this post I think the EA community tends to err towards favoring people who agree with our current beliefs, and should update towards prioritizing competence. (Of course there are also some very competent people on the Democrat side on these issues, but I expect them to be more beholden to the status quo. So if e.g. you think that FDA reform is important for biosecurity, that’s probably easier under Trump than Harris.)

richard_ngo Aug 25, 2024, 4:53 PM
39 points
5 ∶ 13
on: The case for contributing to the 2024 US election with your time & money
(This comment focuses on meta-level issues; I left another comment with object-level disagreements.)
The EA case for Trump was heavily downvoted, with commenters arguing that e.g. “a lot of your arguments are extremely one-sided in that they ignore very obvious counterarguments and fail to make the relevant comparisons on the same issue.”
This post is effectively an EA case for Kamala, but less even-handed—e.g. because it:
1. Is framed not just as a case for Kamala, but as a case for action (which, I think, requires a significantly higher bar than just believing that it’d be better on net if Kamala won).
2. Doesn’t address the biggest concerns with another Democrat administration (some of which I lay out here).
3. Generally feels like it’s primarily talking to an audience who already agrees that Trump is bad, and just needs to be persuaded about how bad he is (e.g. with headings like “A second Trump term would likely be far more damaging for liberal democracy than the last”).
And yet it has been heavily upvoted. Very disappointing lack of consistency here, which suggests that the criticisms of the previous post, while framed as criticisms of the post itself, were actually about the side chosen.
This matters both on epistemic grounds and because one of the most harmful things that can be done for AI safety is to heavily politicize it. By default, we should expect that a lot more people will end up getting on the AI safety train over time; the main blocker to that is if they’re so entrenched in their positions that they fail to update even in the face of overwhelming evidence. We’re already heading towards entrenchment; efforts like this will make it worse. (My impression is that political motivations were also a significant contributor to Good Ventures decoupling itself from the rationalist community—e.g. see this comment about fringe opinion holders. It’s easy to imagine this process spiraling further.)

richard_ngo Aug 25, 2024, 4:35 PM
11 points
0 ∶ 0
in reply to: Dustin Moskovitz’s comment on: [Linkpost] An update from Good Ventures
Anyone know what post Dustin was referring to? EDIT: as per a DM, probably this one.

Defining alignment research

richard_ngo Aug 19, 2024, 10:49 PM

48 points

1 comment · 1 min read · EA link

richard_ngo Aug 13, 2024, 9:24 PM
13 points
4 ∶ 8
on: richard_ngo’s Shortform
I recently had a very interesting conversation about master morality and slave morality, inspired by the recent AstralCodexTen posts.
The position I eventually landed on was:
1. Empirically, it seems like the world is not improved the most by people whose primary motivation is helping others, but rather by people whose primary motivation is achieving something amazing. If this is true, that’s a strong argument against slave morality.
2. The defensibility of morality as the pursuit of greatness depends on how sophisticated our cultural conceptions of greatness are. Unfortunately we may be in a vicious spiral where we’re too entrenched in slave morality to admire great people, which makes it harder to become great, which gives us fewer people to admire, which… By contrast, I picture past generations as being in a constant aspirational dialogue about what counts as greatness—e.g. defining concepts like honor, Aristotelean magnanimity (“greatness of soul”), etc.
3. I think of master morality as a variant of virtue ethics which is particularly well-adapted to domains which have heavy positive tails—entrepreneurship, for example. However, in domains which have heavy negative tails, the pursuit of greatness can easily lead to disaster. In those domains, the appropriate variant of virtue ethics is probably more like Buddhism: searching for equanimity or “green”. In domains which have both (e.g. the world as a whole) the closest thing I’ve found is the pursuit of integrity and attunement to oneself. So maybe that’s the thing that we need a cultural shift towards understanding better.

richard_ngo Aug 6, 2024, 10:27 PM
14 points
7 ∶ 9
on: The EA case for Trump 2024
My take is that most of the points raised here are second-order points, and actually the biggest issue in this election is how democratic the future of America will be. But having said that, it’s not clear which side is overall better on this front:
1. The strongest case for Trump is that the Democrat establishment is systematically deceiving the American people (e.g. via the years-long cover-up of Biden’s mental state, strong partisan bias in mainstream media, and extensive censorship campaigns), engaging in lawfare against political opponents (e.g. against Elon and Trump), and generally growing the power of unaccountable bureaucracies over all aspects of life (including bureaucracies which do a lot of harm, like the FDA, FTC, EPA etc). All of this is highly undemocratic, and implicitly coordinated via preference cascades (e.g. see how during covid the Democrats established strong party lines on masks, lockdowns, lab origin, etc, which occasionally required a 180-degree flip from their previous positions). While I think Democrat appointees are likely to be more competent on average than Republicans, I can imagine similar preference cascades leading to totally crazy AI policies.
2. The strongest case against Trump is how many of his cabinet members and previous close supporters from his last term turned against him—particularly Pence’s account of Trump trying to overturn the 2020 election results. I don’t trust a lot of the coverage about how authoritarian Trump is, since there’s a lot of anti-Trump bias in the media (see for instance the “very fine people” hoax), but those people were selected for being sympathetic to Trump in the first place, and should know the details, so their opposition to him updates me a lot. This is especially worrying given that AGI might provide an opportunity for a US leader to seize centralized power.

richard_ngo Jul 31, 2024, 8:25 PM
7 points
0 ∶ 0
in reply to: defun 🔸’s comment on: Twitter thread on AI safety evals
I remain in favor of people doing work on evals, and in favor of funding talented people to work on evals. The main intervention I’d like to make here is to inform how those people work on evals, so that it’s more productive. I think that should happen not on the level of grants but on the level of how they choose to conduct the research.
