Thanks, Adrià. Is your argument similar to (or a more generic version of) what I say in the ‘Optimizing for AI safety might harm AI welfare’ section above?
It’s more or less similar. I don’t focus as much on the moral dubiousness of “happy servants”. Instead, I try to show that standard alignment methods, or preventing near-future AIs with moral patienthood from taking the actions they are trying to take, cause net harm to those AIs according to desire satisfactionism, hedonism, and objective list theories.
I’d love to read your paper. I will reach out.
Perfect!