Ah! Thanks for clarifying—if I understand correctly, you think that it’s reasonable to assert that sentience and preferences are what makes an entity morally meaningful, but that anything more specific is not? I personally just disagree with that premise, but I can see where you’re coming from
But in that case, it’s highly non-obvious to me that AIs will have sentience or preferences in ways that I consider meaningful; this seems like an open philosophical question. Even defining what sentience and preferences are seems like another open question to me: does a thermostat have preferences? Does a plant that grows towards the light? Meanwhile, I do feel fairly confident that humans are morally meaningful. Is your argument that even if there’s a good chance AIs are not morally meaningful, their expected moral significance is comparable to that of humans?
Thanks for clarifying—if I understand correctly, you think that it’s reasonable to assert that sentience and preferences are what makes an entity morally meaningful, but that anything more specific is not?
I don’t think there’s any moral view that’s objectively more “reasonable” than any other moral view (as I’m a moral anti-realist). However, I personally don’t have a significant moral preference for humans beyond the fact that I am partial to my family, friends, and a lot of other people who are currently alive. When I think about potential future generations who don’t exist yet, I tend to adopt a more impartial, utilitarian framework.
In other words, my moral views can be summarized as a combination of personal attachments and broader utilitarian moral concerns. My personal attachments are not impartial: for example, I care about my family more than I care about random strangers. However, beyond my personal attachments, I tend to take an impartial utilitarian approach that doesn’t assign any special value to the human species.
Put another way, to the extent I care about humans specifically, this concern arises merely from the fact that I’m attached to some currently living individuals who happen to be human, rather than because I think the human species is particularly important.
Does that make sense?
But in that case, it’s highly non-obvious to me that AIs will have sentience or preferences in ways that I consider meaningful; this seems like an open philosophical question. Even defining what sentience and preferences are seems like another open question to me: does a thermostat have preferences?
I agree this is an open question, but I think the case that future AIs will have complex and meaningful preferences is much clearer than the case for a thermostat or a plant. I think we can be pretty confident in this prediction given the strong economic pressures that will push AIs towards being person-like and agentic. (Note, however, that I’m not making the strong claim that all future AIs will be moral patients. It’s sufficient for my argument that a large number of them are.)
In fact, a lot of arguments for AI risk rest on the premise that AI agents will exist in the future, and that they’ll have certain preferences (at least in a functional sense). If we were to learn that future AIs won’t have preferences, that would undermine both these arguments for AI risk and many of my moral arguments for valuing AIs. Therefore, to the extent you think AIs will lack the cognitive prerequisites for moral patienthood under my functionalist, preference-utilitarian views, this doesn’t necessarily translate into a stronger case for worrying about AI takeover.
However, I want to note that the view I have just described is broader than the thesis I gave in the post. If you read the post carefully, you’ll see that I hedged quite a bit by acknowledging that there are potential, logically consistent utilitarian arguments that could be made in favor of pausing AI. My thesis was not that no such argument could be given. It was a fairly narrow thesis, and I didn’t make a strong claim that AI-controlled futures would create about as much utilitarian moral value in expectation as human-controlled futures (even though I personally think that claim is plausible).
I think that even the association between functional agency and preferences in a morally valuable sense is an open philosophical question that I am not happy taking as a given.
Regardless, it seems like our underlying crux is that we assign utility to different things. I somewhat object to you saying that your version of this is utilitarianism, while notions of assigning utility that privilege the things humans value are not.
Regardless, it seems like our underlying crux is that we assign utility to different things. I somewhat object to you saying that your version of this is utilitarianism, while notions of assigning utility that privilege the things humans value are not.
I agree that our main point of disagreement seems to be about what we ultimately care about.
For what it’s worth, I didn’t mean to suggest in my post that my moral perspective is inherently superior to others. For example, my argument is fully compatible with someone being a deontologist. My goal was simply to articulate what I took standard impartial utilitarianism to be saying in this context, and to point out that many people’s arguments for an AI pause don’t seem to track what it actually says. However, this only matters insofar as one adheres to that specific moral framework.
As a matter of terminology, I do think that the way I’m using the term “impartial utilitarianism” aligns more closely with common usage in academic philosophy, given the emphasis that many utilitarians have placed on antispeciesist principles. However, even if you think I’m wrong about the terminology, I don’t think this disagreement detracts much from the substance of my post, since I’m simply talking about the implications of a common moral theory (whatever we choose to call it).
Thanks for clarifying. In that case, I think that we broadly agree.