My read is that you can apply the framework two different ways:
Say you’re worried about any take-over-the-world actions, violent or not—in which case this argument about the advantages of non-violent takeover is of scant comfort;
Say you’re only worried about violent take-over-the-world actions, in which case your argument fits into the framework under “non-takeover satisfaction”: how good the AI feels about its best benign alternative action.
This is reasonable under the premise that you’re worried about any AI takeovers, no matter whether they’re violent or peaceful. But speaking personally, peaceful takeover scenarios where AIs just accumulate power—not by cheating us or by killing us via nanobots—but instead by lawfully beating humans fair and square and accumulating almost all the wealth over time, just seem much better than violent takeovers, and not very bad by themselves.
I admit the moral intuition here is not necessarily obvious. I concede that there are plausible scenarios in which AIs are completely peaceful and act within reasonable legal constraints, and yet the future ends up ~worthless. Perhaps the most obvious scenario is the “Disneyland without children” scenario where the AIs go on to create an intergalactic civilization, but in which no one (except perhaps the irrelevant humans still on Earth) is sentient.
But when I try to visualize the most likely futures, I don’t tend to visualize a sea of unsentient optimizers tiling the galaxies. Instead, I tend to imagine a transition from sentient biological life to sentient artificial life, which continues to be every bit as cognitively rich, vibrant, and sophisticated as our current world—indeed, it could be even more so, given what becomes possible at a higher technological and population level.
Worrying about non-violent takeover scenarios often seems to me to arise simply from discrimination against non-biological forms of life, or perhaps a more general fear of rapid technological change, rather than naturally falling out as a consequence of more robust moral intuitions.
Let me put it another way.
It is often conceded that it was good for humans to take over the world. Speaking broadly, we think this was good because we identify with humans and their aims. We belong to the “human” category of course; but more importantly, we think of ourselves as being part of what might be called the “human tribe”, and therefore we sympathize with the pursuits and aims of the human species as a whole. But equally, we could identify as part of the “sapient tribe”, which would include non-biological life as well as humans, and thus we could sympathize with the pursuits of AIs, whatever those may be. Under this framing, what reason is there to care much about a non-violent, peaceful AI takeover?
I think that an eventual AI-driven ecosystem seems likely desirable. (Although possibly the natural conception of “agent” will be more like supersystems which include both humans and AI systems, at least for a period.)
But my alarm at nonviolent takeover persists, for a couple of reasons:
A feeling that some AI-driven ecosystems may be preferable to others, and we should maybe take responsibility for which we’re creating rather than just shrugging
Some alarm that nonviolent takeover scenarios might still lead to catastrophic outcomes for humans
e.g. “after nonviolently taking over, AI systems decide what to do with humans, this stub part of the ecosystem; they conclude that humans are using too many physical resources, and that it would be better to (via legitimate means!) reduce their rights and then cull their numbers, leaving a small population living in something resembling a nature reserve”
Perhaps my distaste at this outcome is born in part from loyalty to the human tribe? But I do think that some of it is born from more robust moral intuitions.
I think I basically agree with you, and I am definitely not saying we should just shrug. We should instead try to shape the future positively, as best we can. However, I still feel like I’m not quite getting my point across. Here’s one more attempt to explain what I mean.
Imagine if we achieved a technology that enabled us to build physical robots that were functionally identical to humans in every relevant sense, including their observable behavior, and their ability to experience happiness and pain in exactly the same way that ordinary humans do. However, there are just two differences between these humanoid robots and biological humans: they are made of silicon rather than carbon, and they look robotic rather than biological.
In this scenario, it would certainly feel strange to me if someone were to suggest that we should be worried about a peaceful robot takeover, in which the humanoid robots collectively accumulate the vast majority of wealth in the world via lawful means.
By assumption, these humanoid robots are literally functionally identical to ordinary humans. As a result, I think we should have no intrinsic reason to disprefer them receiving a dominant share of the world’s wealth, versus some other subset of human-like beings. This remains true even if the humanoid robots are literally “not human”, and thus their peaceful takeover is equivalent to “human disempowerment” in a technical sense.
The ultimate reason I think one should not worry about a peaceful robot takeover in this specific scenario is that these humanoid robots have essentially the same moral worth and right to choose as ordinary humans, and therefore we should respect their agency and autonomy just as much as we already do for ordinary humans. Since we normally let humans accumulate wealth and become powerful via lawful means, I think we should allow these humanoid robots to do the same. I hope you would agree with me here.
Now, generalizing slightly, I claim that to be rationally worried about a peaceful robot takeover in general, you should usually be able to identify a relevant moral difference between the scenario I have just outlined and the scenario that you’re worried about. Here are some candidate moral differences that I personally don’t find very compelling:
In the humanoid robot scenario, there’s no possible way the humanoid robots would ever end up killing the biological humans, since they are functionally identical to each other. In other words, biological humans aren’t at risk of losing their rights and dying.
My response: this doesn’t seem true. Humans have committed genocide against other subsets of humanity based on arbitrary characteristics before. Therefore, I don’t think we can rule out that the humanoid robots would commit genocide against the biological humans either, although I agree it seems very unlikely.
In the humanoid robot scenario, the humanoid robots are guaranteed to have the same values as the biological humans, since they are functionally identical to biological humans.
My response: this also doesn’t seem guaranteed. Humans frequently have large disagreements in values with other subsets of humanity. For example, China as a group has different values than the United States as a group. This difference in values is even larger if you consider indexical preferences among the members of the group, which generally overlap very little.
Since we normally let humans accumulate wealth and become powerful via lawful means, I think we should allow these humanoid robots to do the same. I hope you would agree with me here.
I agree with this—and also agree with it for various non-humanoid AI systems.
However, I see this as less about rights for systems that may at some point exist, and more about our responsibilities as the creators of those systems.
Not entirely analogous, but: suppose we had a large creche of babies whom we had been told by an oracle would be extremely influential in the world. I think it would be appropriate for us to care more than normal about their upbringing (especially if for the sake of the example we assume that upbringing can meaningfully affect character).