It seems like a fundamental problem is the lack of a moral realist foundation, as “human intentions toward sentient beings” and “what is moral” are different things. Can someone recommend some reading on whether alignment is even a coherent ask, either from a moral realist or moral anti-realist perspective?
And human deeds are very different from human stated values.
I think research to define exactly what a sentient-centric AI would be is one of the first important things to do, and it's possible.