I'm not an AI safety expert, but I think you might be underestimating the imagination a superintelligent AI may have, or what a post-AGI intelligence may be capable of.
With the “killing humans through machines” option, a superintelligent AI would probably be smart enough to kill us all without taking the time to build a robot army, which would definitely raise my suspicions! Maybe it would hack nuclear weapons and blow us all up, invent and release an airborne super-toxin, or build a self-replicating nanobot. We wouldn't see it coming, and it would be over as soon as we realised it wasn't aligned.
And for “making humans kill themselves”, it might destabilise governments with misinformation or break down communication channels, leading to global conflicts where we kill each other. To stay on the nuclear weapons route, maybe it tricks the USA's nuclear detection system into thinking that nuclear weapons have been fired, and the USA retaliates, causing a nuclear war.
I think the fact that I can use my limited human intelligence to imagine multiple scenarios where AI could kill us all makes me very confident that a superintelligent AI would find additional and more effective methods. The question from my perspective isn’t “can AI kill us all?” but “how likely is it that AI can kill us all?”, and my answer is I don’t know but definitely not 0%. So here’s a huge problem that needs to be solved.
Full disclosure, I have no idea how AI alignment research will overcome that problem, but I'm very glad people are working on it.
With the “killing humans through machines” option, a superintelligent AI would probably be smart enough to kill us all without taking the time to build a robot army, which would definitely raise my suspicions! Maybe it would hack nuclear weapons and blow us all up, invent and release an airborne super-toxin, or build a self-replicating nanobot. We wouldn't see it coming, and it would be over as soon as we realised it wasn't aligned.
Drexlerian-style nanotech is not a threat for the foreseeable future. It is not on the horizon in any meaningful sense, and may in fact be impossible. Intelligence, even superintelligence, is not magic; it cannot just reinvent a better design than DNA from scratch, with no testing or development. If Drexlerian nanotech becomes a threat, it will be very obvious.
Also, “hacking nuclear weapons”? Do you understand the actual procedure involved in firing a nuclear weapon?
I accept that I don't know the actual procedure for firing a nuclear weapon. And no one in the West knows what North Korea's nuclear weapons cybersecurity is like, and ChatGPT tells me it's connected to digital networks. So there's definitely some uncertainty, and I wouldn't dismiss outright the possibility that nuclear weapons would be more likely to be hacked if a superintelligence existed. Based on what I know, I'd guess maybe a 10-20% chance that it's possible to hack nuclear weapons.
And I agree that it may be impossible to create Drexlerian-style nanotech. Maybe a 0.5% chance an ASI could do something like that?
But I don’t think the debate here is about any particular scenario that I came up with.
I think if I tried really hard I could come up with about 20 scenarios where an artificial superintelligence might be able to destroy humanity (if you really want me to, I can try to list them). And I'd guess my proposed scenarios would each have an average chance of actually working of 1-2%, so maybe around a 10% chance that at least one of them would work.
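To make the arithmetic behind that guess explicit, here's a minimal sketch (purely illustrative). It assumes the 20 scenarios are independent, which they wouldn't be in reality, so I treat the result as an upper bound and round my combined estimate down to around 10%:

```python
# Back-of-the-envelope check: if I had 20 candidate scenarios, each with a
# 1-2% chance of actually working, what is the chance at least one works?
# Assumes the scenarios are independent, which they are not (plans that fail
# tend to fail for shared reasons), so the real figure would sit lower.

n_scenarios = 20

for p in (0.01, 0.02):  # assumed per-scenario chance of working (1% and 2%)
    p_at_least_one = 1 - (1 - p) ** n_scenarios
    print(f"per-scenario chance {p:.0%} -> at least one works: {p_at_least_one:.0%}")

# Output:
# per-scenario chance 1% -> at least one works: 18%
# per-scenario chance 2% -> at least one works: 33%
```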
But are you saying that the chance of an ASI being able to kill us is 0%? In which case every conceivable scenario (including any plan an ASI could come up with) would have to have a 0% chance of working. I just don't find that plausible; human civilisation isn't that robust. There must be at least a 10% chance that one of those plans could work, right? In which case significant efforts in AI safety to mitigate this risk are warranted.