I feel the weakest part of this argument, and the weakest part of the AI Safety space generally, is the part where AI kills everyone (part 2, in this case).
You argue that most paths to some ambitious goal like whole-brain emulation end terribly for humans, because how else could the AI do whole-brain emulation without subjugating, eliminating or atomising everyone?
I don't think that follows. This seems like what the average hunter-gatherer would have thought when made to imagine our modern commercial airlines or microprocessor industries: how could you achieve something requiring so much research, so many resources and so much coordination without enslaving huge swathes of society and killing anyone that gets in the way? And wouldn't the knowledge to do these things cause terrible new dangers?
Luckily the hunter-gatherer is wrong: the path from there to here has led up a slope of gradually increasing quality of life (some disagree).
I think the point is not that it's inconceivable for progress to continue with humans still alive; the point is the game-theoretic dilemma that whatever we humans want to do is unlikely to be exactly what some super-powerful advanced AI wants to do. And because the advanced AI does not need or depend on us, we simply lose and end up as ingredients for whatever that advanced AI is up to.
Your example with humanity fails because humans have always been, and continue to be, a social species that depends on each other. An unaligned advanced AI would not be. A more appropriate example is the relationship between humans and insects: I don't know if you've noticed, but a lot of them are dying out right now because we simply don't care about or depend on them. The point with advanced AI is that it would be potentially even more removed from us than we are from insects, and much more capable of achieving its goals, so the competitive process we all engage in would become much more competitive and much faster once advanced AIs start playing the game.
I don't want to be the bearer of bad news, but I think it is not that easy to reject this analysis… it seems pretty simple and solid. I would love to know if there is some flaw in the reasoning. Would help me sleep better at night!
Your example with humanity fails because humans have always been, and continue to be, a social species that depends on each other.
I would much more say that it fails because humans have human values.
Maybe a hunter-gatherer would have worried that building airplanes would somehow cause a catastrophe? I don't exactly see why; the obvious hunter-gatherer rejoinder could be "we built fire and spears and our lives only improved; why would building wings to fly make anything bad happen?".
Regardless, it doesn't seem like you can get much mileage via an analogy that sticks entirely to humans. Humans are indeed safe, because "safety" is indexed to human values; when we try to reason about non-human optimizers, we tend to anthropomorphize them and implicitly assume that they'll be safe for many of the same reasons. Cf. The Tragedy of Group Selectionism and Anthropomorphic Optimism.
You argue that most paths to some ambitious goal like whole-brain emulation end terribly for humans, because how else could the AI do whole-brain emulation without subjugating, eliminating or atomising everyone?
"Wow, I can't imagine a way to do something so ambitious without causing lots of carnage in the process" is definitely not the argument! On the contrary, I think it's pretty trivial to get good outcomes from humans via a wide variety of different ways we could build WBE ourselves.
The instrumental convergence argument isn't "I can't imagine a way to do this without killing everyone"; it's that sufficiently powerful optimization behaves like maximizing optimization for practical purposes, and maximizing-ish optimization is dangerous if your terminal values aren't included in the objective being maximized.
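To make that concrete, here's a minimal toy sketch (my own illustration, not anything from the post; the names and numbers are made up): an optimizer maximizes a proxy objective over a fixed resource budget, and a value we care about simply isn't a term in that objective, so the optimum spends everything on the objective and leaves nothing for the omitted value.

```python
# Toy sketch, purely illustrative: a "maximizer" splits a fixed resource budget
# between making paperclips and preserving habitat. We care about habitat, but
# habitat never appears in the objective, so the optimum drives it to zero.
# No malice is required; the value just isn't in the thing being maximized.

BUDGET = 100  # arbitrary units of matter/energy

def paperclips(alloc_to_clips: int) -> int:
    """Proxy objective the optimizer actually maximizes."""
    return alloc_to_clips  # more resources -> more paperclips

def habitat(alloc_to_clips: int) -> int:
    """A terminal value *we* hold, invisible to the optimizer."""
    return BUDGET - alloc_to_clips

# Exhaustive search over allocations stands in for "sufficiently powerful
# optimization": any optimizer strong enough to find the optimum lands here.
best = max(range(BUDGET + 1), key=paperclips)

print(f"allocated to paperclips: {best}/{BUDGET}")  # -> 100/100
print(f"habitat left over: {habitat(best)}")        # -> 0
```

The toy is trivial on purpose: the danger doesn't come from the optimizer disliking habitat, only from habitat not appearing in the objective while competing for the same resources.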
If it helps, we could maybe break the disagreement about instrumental convergence into three parts, like:
Would a sufficiently powerful paperclip maximizer kill all humans, given the opportunity?
Would sufficiently powerful inhuman optimization of most goals kill all humans, or are paperclips an exception?
Is "build fast-running human whole-brain emulation" an ambitious enough task to fall under the "sufficiently powerful" criterion above? Or, if so, is there some other reason random policies might be safe if directed at this task, even if they wouldn't be safe for other similarly hard tasks?
The step that's missing for me is the one where the paperclip maximiser gets the opportunity to kill everyone.
Your talk of "plans" and the dangers of executing them seems to assume that the AI has all the power it needs to execute the plans. I don't think the AI crowd has done enough to demonstrate how this could happen.
If you drop a naked human in amongst some wolves, I don't think the human will do very well despite its different goals and enormous intellectual advantage. Similarly, I don't see how a fledgling sentient AGI on OpenAI's servers could take over enough infrastructure to pose a serious threat. I've not seen a convincing theory for how this would happen. Mail-order nanobots seem unrealistic (too hard to simulate the quantum effects in protein chemistry); the AI talking itself out of its box is another suggestion that seems far-fetched (the main evidence seems to be some chat games that Yudkowsky played a few times?); and a gradual takeover via voluntary uptake into more and more of our lives seems slow enough to stop.
Is your question basically how an AGI would gain power in the beginning in order to get to a point where it could execute on a plan to annihilate humans?
I would argue that:
Capitalists would quite readily give the AGI all the power it wants, in order to stay competitive and drive profits.
Some number of people would deliberately help the AGI gain power just to "see what happens" or specifically to hurt humanity. Think ChaosGPT, or consider the story of David Charles Hahn.
Some number of lonely, depressed, or desperate people could be persuaded over social media to carry out actions in the real world.
Considering these channels, I'd say that a sufficiently intelligent AGI with as much access to the real world as ChatGPT has now would have all it needs to increase its power to the point of being able to annihilate humans.