It’s great to see this topic being discussed. I am currently writing the first (albeit significantly developed) draft of an academic paper on this. I argue that there is a conflict between AI safety and AI welfare concerns. This is basically because (to reduce catastrophic risk) AI safety recommends applying various kinds of control measures to near-future AI systems, measures which are (in expectation) net-harmful for AI systems with moral patienthood according to the three major theories of well-being. I also discuss what we should do in light of this conflict. If anyone is interested in reading or giving comments on the draft when it is finished, send me a message or an e-mail (adriarodriguezmoret@gmail.com).
Thanks, Adrià. Is your argument similar to (or a more generic version of) what I say in the ‘Optimizing for AI safety might harm AI welfare’ section above?
It’s more or less similar. I do not focus that much on the moral dubiousness of “happy servants”. Instead, I try to show that standard alignment methods, as well as preventing near-future AIs with moral patienthood from taking actions they are trying to take, cause net harm to the AIs according to desire satisfactionism, hedonism, and objective list theories.
This quick take seems relevant: https://forum.effectivealtruism.org/posts/auAYMTcwLQxh2jB6Z/zach-stein-perlman-s-quick-takes?commentId=HiZ8GDQBNogbHo8X8
Yes, I saw this, thanks!
I’d love to read your paper. I will reach out.
Perfect!