It’s a long post, and it starts by talking about consciousness.
Does it contain any response to the classic case for AI Risk, e.g. Bostrom’s Superintelligence or Yudkowsky’s List of Lethalities?
I only mentioned human consciousness to help set up an analogy; I hope it wasn't taken as a claim about machine consciousness.
I haven't read Superintelligence, but I expect it covers the standard material: outer and inner alignment, instrumental convergence, and so on. For the sake of easy reading, I lean into instrumental convergence without naming it, and leave the alignment problem implicit as a problem of machines that are "too much" like humans, because:

1. I think AGI builders have enough common sense not to build paperclip maximizers.

2. Misaligned AGIs that seem superficially humanlike but behave in drastically pathological ways when scaled to ASI are harder to describe, so instead I describe (by analogy) something similar: humans outside the usual distribution. I argue that psychopathy is the absence of empathy, and that once AGIs surpass human ability it's far too easy to build a machine like that. (Indeed, I could have added that even normal humans can switch off their empathy, with monstrous results; see the Nazis or Mao's CCP.)
I don't incorporate Yudkowsky's ideas because I found the List of Lethalities annoyingly incomplete and unconvincing, and I'm not aware of anything better (clearer and more complete) that he's written. Let me know if you can point me to anything.