Neel Nanda
I lead the DeepMind mechanistic interpretability team
Ah! Thanks for clarifying—if I understand correctly, you think that it’s reasonable to assert that sentience and preferences are what makes an entity morally meaningful, but that anything more specific is not? I personally just disagree with that premise, but I can see where you’re coming from
But in that case, it’s highly non-obvious to me that AIs will have sentience or preferences in ways that I consider meaningful—this seems like an open philosophical question. Actually defining what sentience and preferences are also seems like an open question to me—does a thermostat have preferences? Does a plant that grows towards the light? Meanwhile, I do feel fairly confident that humans are morally meaningful. Is your argument that even if there’s a good chance AIs are not morally meaningful, their expected moral significance is comparable to that of humans?
Because there is a much higher correlation between the values of the current generation of humans and the next one than there is between the values of humans and arbitrary AI entities
On your broader point about impartiality, I feel like you are continuing to assume some bizarre form of moral realism, and I don’t understand the case for it. Otherwise, why do you not consider rocks to be morally meaningful? Why is a plant not valuable? I can come up with reasons, but they assume specific things about what is and is not morally valuable, in exactly the same way that I do when I say arbitrary AI beings are on average substantially less valuable, because I have specific preferences and values over what matters. I do not understand the philosophical position you are taking here—it feels like you’re saying that the standard position is speciesist and arbitrary, and then drawing an arbitrary distinction slightly further out?
> If AI can accelerate technologies that save and improve the lives of people who exist right now, then slowing it down would cost lives in the near term.
Huh? This argument only goes through if you have a sufficiently low probability of existential risk or an extremely low change in your probability of existential risk, conditioned on things moving slower. I disagree with both of these assumptions. Which part of your post are you referring to?
Are you assuming some kind of moral realism here? That there’s some deep moral truth, which humans may or may not have insight into, so any other intelligent entity is equally likely to have insight into it?
If so, idk, I just reject your premise. I value what I choose to value, which is obviously related to human values, and an arbitrarily sampled entity is not likely to be better on that front
Fascinating, I’ve never heard of this before, thanks! If anyone’s curious, I had Deep Research [take a stab at writing this](https://chatgpt.com/share/67ac150e-ac90-800a-9f49-f02489dee8d0), which I found pretty interesting (but have not fact-checked for accuracy at all)
I think you’re using the word utilitarian in a very non-standard way here. “AI civilization has comparable moral value to human civilization” is a very strong claim that you don’t provide evidence for. You can’t just call this speciesism and shift the burden of proof! At the very least, we should have wide error bars over the ratio of moral value between AIs and humans, and I would argue also over whether AIs have moral value at all.
I personally am happy to bite the bullet and say that I morally value human civilization continuing over an AI civilization that killed all of humanity, and that this is a significant term in my utility function.
Note that the UI is atrocious. You’re not using o1/o3-mini/o1-pro etc. It’s all the same model, a variant of o3, and the model in the bar at the top is completely irrelevant once you click the deep research button. I am very confused why they did it like this: https://openai.com/index/introducing-deep-research/
I guess my issue is that this all seems strictly worse than “pledge to give 10% for the first 1-2 years after graduation, and then decide whether to commit for life”. Or even “you commit for life, but with the option to withdraw 1-2 years after graduation”, i.e. with continuing as the default. Your arguments about not getting used to a full salary apply just as well to those, imo
More broadly, I think it’s bad to justify getting young people without much life experience to make a lifetime pledge, based on a controversial belief (that it should be normal to give 10%), by saying that you personally believe that belief is true. In this specific case I agree with the belief! I took the pledge (shortly after graduating, I think). But there are all kinds of beliefs I disagree with that I would not want used to justify this kind of thing. Lots of young people make choices that they regret later—I’m not saying they should be stopped from making these choices, but it’s bad to encourage them. I agree with Buck, at least to the extent of saying that undergrads who’ve been in EA for less than a year should not be encouraged to sign a lifetime pledge.
(On a meta level, the pledge can obviously be broken if someone really regrets it, it’s not legally binding. But I think arguments shouldn’t rely on the pledge being breakable)
I personally think it’s quite bad to try to get people to sign a lifetime giving pledge before they’ve ever had a real job, and think this is overemphasized in EA.
I think it’s much better to eg make a pledge for the next 1-5 years, or the first year of your career, or something, and re-evaluate at the end of that, which I think mitigates some of your concerns
Member of Technical Staff is often a catchall term for “we don’t want to pigeonhole you into a specific role, you do useful stuff in whatever way seems to add the most value”, I wouldn’t read much into it
Speaking as an IMO medalist who partially got into AI safety because of reading HPMOR 10 years ago, I think this plan is extremely reasonable
I disagree. I think it’s an important principle of EA that it’s socially acceptable to explore the implications of weird ideas, even if they feel uncomfortable, and to try to understand the perspective of those you disagree with. I want this forum to be a place where posts like this can exist.
The EA community still donates far more to global health causes than to animal welfare—I think the discourse makes the meat eater problem seem like a much bigger deal in the community than it actually is. I personally think it’s all kinda silly and significantly prioritise saving human lives
I strong downvoted because the title is unnecessarily provocative and in my opinion gives a misleading impression. I would rather not have this kind of thing on my forum feed
Interesting idea!
I recommend a different name; when I saw this I assumed it was about pledging around left-wing causes
I feel like the spirit of the pledge would be to increase the 10% part with inflation? If you get a pay raise in line with inflation, it seems silly to have to give half of it, since your real take-home pay is unchanged. Even the Further Pledge is inflation-linked
Would value drift be mitigated by donating to a DAF and investing there? Or are you afraid your views on where to donate might also shift?
I feel pretty ok with a very mild and bounded commitment? Especially with an awareness that forcing yourself to be miserable is rarely the way to actually be effective. I think it’s pretty valid for someone’s college-age self to say that impact does matter to them, and they do care about this, and don’t want to totally forget about it even if it becomes inconvenient, so long as they avoid versions of this that are psychologically unhealthy even by the lights of those values
I’ve only upvoted Habryka, to reward good formatting
Ah, gotcha. Yes, I agree that if your expected reduction in p(doom) is less than around 1% per year of pause, and you assign zero value to future lives, then pausing is bad on utilitarian grounds
Note that my post was not about my actual numerical beliefs, but about a lower bound that I considered highly defensible—I personally expect notably higher than 1%/year reduction and was taking that as given, but on reflection I at least agree that that’s a more controversial belief (I also think that a true pause is nigh impossible)
I expect there are better solutions that achieve many of the benefits of pausing while still enabling substantially better biotech research, but that’s nitpicking
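For concreteness, a rough back-of-the-envelope version of the comparison above might look like the sketch below. The only empirical input is the approximate global death rate (~60M deaths per year out of ~8B people); the 1%/year p(doom) reduction and the assumption that faster AI could prevent essentially all of those deaths are illustrative placeholders, not figures from this exchange.

```python
# Back-of-the-envelope sketch of the ~1%/year threshold discussed above.
# All numbers are illustrative assumptions except the rough global death rate.

annual_death_fraction = 60e6 / 8e9          # ~0.75% of currently-alive people die each year
p_doom_reduction_per_pause_year = 0.01      # assumed reduction in p(doom) per year of pause

# If we value only people alive today:
#   cost of a pause-year   ~= deaths that transformative tech could have prevented that year
#   benefit of a pause-year ~= reduced chance that everyone alive today dies
cost = annual_death_fraction                # upper bound: assumes AI would prevent essentially all of these deaths
benefit = p_doom_reduction_per_pause_year   # expected fraction of current lives saved

print(f"cost per pause-year    ~ {cost:.4f} of current lives")
print(f"benefit per pause-year ~ {benefit:.4f} of current lives")
print("pause looks net positive" if benefit > cost else "pause looks net negative")
```

On these (assumed) numbers the two sides are within a factor of ~1.5 of each other, which is why the threshold sits at roughly 1% per pause-year.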
I’m not super sure what you mean by individualistic. I was modelling this as utilitarian but assigning literally zero value to future people. From a purely selfish perspective, I’m in my mid-20s and my chances of dying from natural causes in the next, say, 20 years are pretty damn low, which means that given my background beliefs about doom and timelines, slowing down AI is a great deal from my perspective. Whereas if I expected to die of old age in the next 5 years, I would be a lot more opposed