Eliezer seems to come from the position that utility is more or less equal to “achieving this agent’s goals, whatever those are,” and as such even agents extremely different from humans can have it (his example of a trillion-times-more-powerful AI). This is very different from [my understanding of] what HjalmarWijk says above, where utility seems to be defined in a more-or-less universal way and a specific agent can have goals orthogonal or even opposite to utility, so you could have a trillion agents fully achieving their goals and yet not a single “utilon”.
Re other ethical systems: I’m mostly asking about utilitarianism, because it’s what nearly everyone working on alignment subscribes to, and also because I know even less about other systems. But at first glance, it seems like deontological or virtue ethics could each have ways out of this problem? And for relativism or egoism it’s a non-issue.
The distinctive feature of utilitarianism is not that it thinks happiness/utility matters, but that it thinks nothing else intrinsically matters. Almost all ethical systems assign at least some value to consequences and happiness. And even austere deontologists who didn’t would still face the question of whether AIs could have rights that might be impermissible to violate, etc. Agreed that egoism seems less affected.