Scott Alexander discusses this in his post here. I’m skeptical that humans will be able to align AI with morality anytime soon. Humans have been disagreeing about what morality consists of for a few thousand years. It’s unlikely we’ll solve the issue in the next 10.
I don’t think we need to solve ethics in order to work on improving the ethics of models. Ethics may be unsolvable, yet some AI models are and will be instilled with some values, or there will be some system for deciding on the value selection problem. I think more people need to work on that. Just now a great post relating to the value selection problem was published: Beyond Short-Termism: How δ and w Can Realign AI with Our Values
That post on deliberative alignment seems to be just about one method by which we might build aligned AIs, not about the idea of moral alignment in general.
I’m probably less skeptical than you are, because I take as evidence the fact that we align humans to moral value systems all the time. And although we don’t do it perfectly, there are some very virtuous folks out there who take their morals seriously. So I think alignment to some system of morality is certainly possible.
Whether or not we can figure out which moral judgements are “right” is another matter, although perhaps we can at least build AI that is aligned with universally recognized norms like “don’t murder” and “save lives”.