I feel like we shouldn’t expect that we need to be able to express the Values Of Humanity to an AGI in order for it to be safe, in the same way that humans are currently mostly safe towards the rest of humanity despite not being able to articulate those Values Of Humanity themselves. There’s something stopping one person (even a very rich or powerful one) from killing everyone else, and it’s not explicit knowledge.