Matthew_Barnett comments on The standard case for delaying AI appears to rest on non-utilitarian assumptions

Matthew_Barnett Feb 13, 2025, 8:01 PM
4 points
1 ∶ 1
I don’t subscribe to moral realism. My own ethical outlook is a blend of personal attachments—my own life, my family, my friends, and other living humans—as well as a broader utilitarian concern for overall well-being. In this post, I focused on impartial utilitarianism because that’s the framework most often used by effective altruists.
However, to the extent that I also have non-utilitarian concerns (like caring about specific people I know), those concerns incline me away from supporting a pause on AI. If AI can accelerate technologies that save and improve the lives of people who exist right now, then slowing it down would cost lives in the near term. A more complete, and more rigorous version of this argument was outlined in the post.
What I find confusing about other EA’s views, including yours, is why we would assign such great importance to “human values” as something specifically tied to the human species as an abstract concept, rather than merely being partial to actual individuals who exist. This perspective is neither utilitarian, nor is it individualistic. It seems to value the concept of the human species over and above the actual individuals that comprise the species, much like how an ideological nationalist might view the survival of their nation as more important than the welfare of all the individuals who actually reside within the nation.
- Neel Nanda Feb 14, 2025, 11:13 AM
  6 points
  1 ∶ 0
  Parent
  For your broader point of impartiality, I feel like you are continuing to assume some bizarre form of moral realism and I don’t understand the case. Otherwise, why do you not consider rocks to be morally meaningful? Why is a plant not valuable? I can come up with reasons, but these are assuming specific things about what is and is not morally valuable in exactly the same way that when I say arbitrary AI beings are on average substantially less valuable because I have specific preferences and values over what matters. I do not understand the philosophical position you are taking here—it feels like you’re saying that the standard position is speciesist and arbitrary and then drawing an arbitrary distinction slightly further out?
  - Matthew_Barnett Feb 14, 2025, 10:31 PM
    2 points
    0 ∶ 0
    Parent
    For your broader point of impartiality, I feel like you are continuing to assume some bizarre form of moral realism and I don’t understand the case. Otherwise, why do you not consider rocks to be morally meaningful? Why is a plant not valuable?
    Traditionally, utilitarianism regards these things (rocks and plants) as lacking moral value because they do not have well-being or preferences. This principle does not clearly apply to AI, though it’s possible that you are making the assumption that future AIs will lack sentience or meaningful preferences. It would be helpful if you clarified how you perceive me to be assuming a form of moral realism (a meta-ethical theory), as I simply view myself as applying a standard utilitarian framework (a normative theory).
    I do not understand the philosophical position you are taking here—it feels like you’re saying that the standard position is speciesist and arbitrary and then drawing an arbitrary distinction slightly further out?
    Standard utilitarianism recognizes both morally relevant and morally irrelevant distinctions in value. According to a long tradition, following Jeremy Bentham and Peter Singer, among others, the species category is considered morally irrelevant, whereas sentience and/or preferences are considered morally relevant. I do not think this philosophy rests on the premise of moral realism: rather, it’s a conceptual framework for understanding morality, whether from a moral realist or anti-realist point of view.
    To be clear, I agree that utilitarianism is itself arbitrary, from a sufficiently neutral point of view. But it’s also a fairly standard ethical framework, not just in EA but in academic philosophy too. I don’t think I’m making very unusual assumptions here.
    - Neel Nanda Feb 17, 2025, 4:33 AM
      2 points
      0 ∶ 0
      Parent
      Ah! Thanks for clarifying—if I understand correctly, you think that it’s reasonable to assert that sentience and preferences are what makes an entity morally meaningful, but that anything more specific is not? I personally just disagree with that premise, but I can see where you’re coming from
      
      But in that case, it’s highly non obvious to me that AIs will have sentience or preferences in ways that I consider meaningful—this seems like an open philosophical question. Defining actually what they are also seems like an open question to me—does a thermostat have preferences? Does a plant that grows towards the light? While I do feel fairly confident humans are morally meaningful. Is your argument that even if there’s a good chance they’re not morally meaningful, the expected amount of moral significance is comparable to humans?
      - Matthew_Barnett Feb 17, 2025, 6:54 AM
        2 points
        0 ∶ 0
        Parent
        Thanks for clarifying—if I understand correctly, you think that it’s reasonable to assert that sentience and preferences are what makes an entity morally meaningful, but that anything more specific is not?
        I don’t think there’s any moral view that’s objectively more “reasonable” than any other moral view (as I’m a moral anti-realist). However, I personally don’t have a significant moral preference for humans beyond the fact that I am partial to my family, friends, and a lot of other people who are currently alive. When I think about potential future generations who don’t exist yet, I tend to adopt a more impartial, utilitarian framework.
        In other words, my moral views can be summarized as a combination of personal attachments and broader utilitarian moral concerns. My personal attachments are not impartial: for example, I care about my family more than I care about random strangers. However, beyond my personal attachments, I tend to take an impartial utilitarian approach that doesn’t assign any special value to the human species.
        In other words, to the extent I care about humans specifically, this concern merely arises from the fact that I’m attached to some currently living individuals who happen to be human—rather than because I think the human species is particularly important.
        Does that make sense?
        But in that case, it’s highly non obvious to me that AIs will have sentience or preferences in ways that I consider meaningful—this seems like an open philosophical question. Defining actually what they are also seems like an open question to me—does a thermostat have preferences?
        I agree this is an open question, but I think it’s much clearer that future AIs will have complex and meaningful preferences compared to a thermostat or a plant. I think we can actually be pretty confident about this prediction given the strong economic pressures that will push AIs towards being person-like and agentic. (Note, however, that I’m not making a strong claim here that all AIs will be moral patients in the future. It’s sufficient for my argument if merely a large number of them are.)
        In fact, a lot of arguments for AI risk rest on the premise that AI agents will exist in the future, and that they’ll have certain preferences (at least in a functional sense). If we were to learn that future AIs won’t have preferences, that would both undermine these arguments for AI risk, and many of my moral arguments for valuing AIs. Therefore, to the extent you think AIs will lack the cognitive prerequisites for moral patienthood—under my functionalist and preference utilitarian views—this doesn’t necessarily translate into a stronger case for worrying about AI takeover.
        However, I want to note that the view I have just described is actually broader than the thesis I gave in the post. If you read my post carefully, you’ll see that I actually hedged quite a bit by saying that there are potential, logically consistent utilitarian arguments that could be made in favor of pausing AI. My thesis in the post was not that such an argument couldn’t be given. It was actually a fairly narrow thesis, and I didn’t make a strong claim that AI-controlled futures would create about as much utilitarian moral value as human-controlled futures in expectation (even though I personally think this claim is plausible).
        Neel Nanda Feb 17, 2025, 7:30 AM
        2 points
        0 ∶ 0
        Parent
        I think that even the association between functional agency and preferences in a morally valuable sense is an open philosophical question that I am not happy taking as a given.
        
        Regardless, it seems like our underlying crux is that we assign utility to different things. I somewhat object to you saying that your version of this is utilitarianism and notions of assigning utility that privilege things humans value are not
        Matthew_Barnett Feb 17, 2025, 7:42 AM
        2 points
        0 ∶ 0
        Parent
        Regardless, it seems like our underlying crux is that we assign utility to different things. I somewhat object to you saying that your version of this is utilitarianism and notions of assigning utility that privilege things humans value are not
        I agree that our main point of disagreement seems to be about what we ultimately care about.
        For what it’s worth, I didn’t mean to suggest in my post that my moral perspective is inherently superior to others. For example, my argument is fully compatible with someone being a deontologist. My goal was simply to articulate what I saw standard impartial utilitarianism as saying in this context, and to point out how many people’s arguments for AI pause don’t seem to track what standard impartial utilitarianism actually says. However, this only matters insofar as one adheres to that specific moral framework.
        As a matter of terminology, I do think that the way I’m using the words “impartial utilitarianism” aligns more strongly with common usage in academic philosophy, given the emphasis that many utilitarians have placed on antispeciesist principles. However, even if you think I’m wrong on the grounds of terminology, I don’t think this disagreement subtracts much from the substance of my post as I’m simply talking about the implications of a common moral theory (regardless of whatever we choose to call it).
        Neel Nanda Feb 17, 2025, 7:44 AM
        2 points
        0 ∶ 0
        Parent
        Thanks for clarifying. In that case I think that we broadly agree
- Neel Nanda Feb 14, 2025, 11:10 AM
  4 points
  1 ∶ 0
  Parent
  
  If AI can accelerate technologies that save and improve the lives of people who exist right now, then slowing it down would cost lives in the near term.
  
  Huh? This argument only goes through if you have a sufficiently low probability of existential risk or an extremely low change in your probability of existential risk, conditioned on things moving slower. I disagree with both of these assumptions. Which part of your post are you referring to?
  - Matthew_Barnett Feb 15, 2025, 12:04 AM
    14 points
    3 ∶ 0
    Parent
    Huh? This argument only goes through if you have a sufficiently low probability of existential risk or an extremely low change in your probability of existential risk, conditioned on things moving slower.
    This claim seems false, though its truth hinges on what exactly you mean by a “sufficiently low probability of existential risk” and “an extremely low change in your probability of existential risk”.
    To illustrate why I think your claim is false, I’ll perform a quick calculation. I don’t know your p(doom), but in a post from three years ago, you stated,
    If you believe the key claims of “there is a >=1% chance of AI causing x-risk and >=0.1% chance of bio causing x-risk in my lifetime” this is enough to justify the core action relevant points of EA.
    Let’s assume that there’s a 2% chance of AI causing existential risk, and that, optimistically, pausing for a decade would cut this risk in half (rather than barely decreasing it, or even increasing it). This would imply that the total risk would diminish from 2% to 1%.
    According to OWID, approximately 63 million people die every year, although this rate is expected to increase, rising to around 74 million in 2035. If we assume that around 68 million people will die per year during the relevant time period, and that they could have been saved by AI-enabled medical progress, then pausing AI for a decade would kill around 680 million people.
    This figure is around 8.3% of the current global population, and would constitute a death count higher than the combined death toll from World War 1, World War 2, the Mongol Conquests, the Taiping rebellion, the Transition from Ming to Qing, and the Three Kingdoms Civil war.
    (Note that, although we are counting deaths from old age in this case, these deaths are comparable to deaths in war from a years of life lost perspective, if you assume that AI-accelerated medical breakthroughs will likely greatly increase human lifespan.)
    From the perspective of an individual human life, a 1% chance of death from AI is significantly lower than a 8.3% chance of death from aging—though obviously in the former case this risk would apply independently of age, and in the latter case, the risk would be concentrated heavily among people who are currently elderly.
    Even a briefer pause lasting just two years, while still cutting risk in half, would not survive this basic cost-benefit test. Of course, it’s true that it’s difficult to directly compare the individual personal costs from AI existential risk to the diseases of old age. For example, AI existential risk has the potential to be briefer and less agonizing, which, all else being equal, should push us to favor it. On the other hand, most people might consider death from old age to be preferable since it’s more natural and allows the human species to continue.
    Nonetheless, despite these nuances, I think the basic picture that I’m presenting holds up here: under typical assumptions (such as the ones you gave three years ago), a purely individualistic framing of the costs and benefits of AI pause do not clearly favor pausing, from the perspective of people who currently exist. This fact was noted in Nick Bostrom’s original essay on Astronomical Waste, and more recently, by Chad Jones in his paper on the tradeoffs involved in stopping AI development.
    - Neel Nanda Feb 17, 2025, 4:41 AM
      2 points
      0 ∶ 0
      Parent
      Ah, gotcha. Yes, I agree that if your expected reduction in p(doom) is less than around 1% per year of pause, and you assign zero value to future lives, then pausing is bad on utilitarian grounds
      
      Note that my post was not about my actual numerical beliefs, but about a lower bound that I considered highly defensible—I personally expect notably higher than 1%/year reduction and was taking that as given, but on reflection I at least agree that that’s a more controversial belief (I also think that a true pause is nigh impossible)
      
      I expect there are better solutions that achieve many of the benefits of pausing while still enabling substantially better biotech research, but that’s nitpicking
      
      I’m not super sure what you mean by individualistic. I was modelling this as utilitarian but assigning literally zero value to future people. From a purely selfish perspective, I’m in my mid-20s and my chances of dying from natural causes in the next say 20 years are pretty damn low, and this means that given my background beliefs about doom and timelines, slowing down AI is great deal from my perspective. While if I expected to die from old age in the next 5 years I would be a lot more opposed
      - Matthew_Barnett Feb 17, 2025, 6:21 AM
        4 points
        0 ∶ 0
        Parent
        I’m not super sure what you mean by individualistic. I was modelling this as utilitarian but assigning literally zero value to future people. From a purely selfish perspective, I’m in my mid-20s and my chances of dying from natural causes in the next say 20 years are pretty damn low, and this means that given my background beliefs about doom and timelines, slowing down AI is great deal from my perspective. While if I expected to die from old age in the next 5 years I would be a lot more opposed
        A typical 25 year old man in the United States has around a 4.3% chance of dying before they turn 45 according to these actuarial statistics from 2019 (the most recent non-pandemic year in the data). I wouldn’t exactly call that “pretty damn low”, though opinions on these things differ. This is comparable to my personal credence that AIs will kill me in the next 20 years. And if AI goes well, it will probably make life really awesome. So from this narrowly selfish point of view I’m still not really convinced pausing is worth it.
        Perhaps more importantly: do you not have any old family members that you care about?
        Neel Nanda Feb 17, 2025, 7:25 AM
        4 points
        0 ∶ 0
        Parent
        4% is higher than I thought! Presumably much of that is people who had pre-existing conditions which I don’t or people who got into eg a car accidents which AI probably somewhat reduces, but this seems a lot more complicated and indirect to me.
        
        But this isn’t really engaging with my cruxes. it seems pretty unlikely to me that we will pause until we have pretty capable and impressive AIs and to me much of the non-doom scenarios comes from uncertainty about when we will get powerful ai and how capable it will be. And I expect this to be much clearer the closer we get to these systems, or at the very least the empirical uncertainty about whether it’ll happen will be a lot clearer. I would be very surprised if there was the political will to do anything about this before we got a fair bit closer to the really scary systems.
        
        And yep, I totally put more than 4% chance that I get killed by AI in the next 20 years. But I can see this is a more controversial belief and one that requires higher standards of evidence to argue for. If I imagine a hypothetical world where I know that in 2 years we could have aligned super intelligent AI with 98% probability and it would kill us all with 2% probability. Or we could pause for 20 years and that would get it from 98 to 99%, then I guess from a selfish perspective I can kind of see your point. But I know I do value humanity not going extinct a fair amount even if I think that total utilitarianism is silly. But I observe that I’m finding this debate kind of slippery and I’m afraid that I’m maybe moving the goalposts here because I disagree on many counts so it’s not clear what exactly my cruxes are, or where I’m just attacking points in what you say that seem off
        
        I do think that the title of your post is broadly reasonable though. I’m an advocate for making AI x-risk cases that are premised on common sense morality like “human extinction would be really really bad”, and utilitarianism in the true philosophical sense is weird and messy and has pathological edge cases and isn’t something that I fully trust in extreme situations
        Matthew_Barnett Feb 17, 2025, 8:17 AM
        2 points
        0 ∶ 0
        Parent
        I think what you’re saying about your own personal tradeoffs makes a lot of sense. Since I think we’re in agreement on a bunch of points here, I’ll just zero in on your last remark, since I think we still might have an important lingering disagreement:
        I do think that the title of your post is broadly reasonable though. I’m an advocate for making AI x-risk cases that are premised on common sense morality like “human extinction would be really really bad”, and utilitarianism in the true philosophical sense is weird and messy and has pathological edge cases and isn’t something that I fully trust in extreme situations
        I’m not confident, but I suspect that your perception of what common sense morality says is probably a bit inaccurate. For example, suppose you gave people the choice between the following scenarios:
        In scenario A, their lifespan, along with the lifespans of everyone currently living, would be extended by 100 years. Everyone in the world would live for 100 years in utopia. At the end of this, however, everyone would peacefully and painlessly die, and then the world would be colonized by a race of sentient aliens.
        In scenario B, everyone would receive just 2 more years to live. During this 2 year interval, life would be hellish and brutal. However, at the end of this, everyone would painfully die and be replaced by a completely distinct set of biological humans, ensuring that the human species is preserved.
        In scenario A, humanity goes extinct, but we have a good time for 100 years. In scenario B, humanity is preserved, but we all die painfully in misery.
        I suspect most people would probably say that scenario A is far preferable to scenario B, despite the fact that in scenario A, humanity goes extinct.
        To be clear, I don’t think this scenario is directly applicable to the situation with AI. However, I think this thought experiment suggests that, while people might have some preference for avoiding human extinction, it’s probably not anywhere near the primary thing that people care about.
        Based on people’s revealed preferences (such as how they spend their time, and who they spend their money on), most people care a lot about themselves and their family, but not much about the human species as an abstract concept that needs to be preserved. In a way, it’s probably the effective altruist crowd that is unusual in this respect by caring so much about human extinction, since most people don’t give the topic much thought at all.
        Neel Nanda Feb 17, 2025, 7:47 AM
        2 points
        0 ∶ 0
        Parent
        This got me curious so I had deep research make me a report on my probability of dying from different causes. It estimates that in the next 20 years I’ve maybe a 1 and ¹⁄₂ to 3% Chance of death, of which 0.5-1% is chronic illness where it’ll probably help a lot. Infectious diseases is less than .1%, Doesn’t really matter. Accidents are .5 to 1%, AI probably helps but kind of unclear. .5 to 1% on other, mostly suicide. Plausibly AI also leads to substantially improved mental health treatments which helps there? So yeah, I buy that having AGI today Vs in twenty years has small but non trivial costs to my chances of being alive when it happens