I agree with these two points raised by others:
“we already can’t agree as humans on what is moral”
“Why would they build something that could disobey them and potentially betray them for some greater good that they might not agree with?”
I’m mindful of the risk of confusion: as one commenter mentioned, MA could read as synonymous with social alignment. I think a different term is needed. I personally liked your use of the word ‘sentinel’. Sentinel → sentience. It’s easy to remember what it means in this context: protecting all sentient life (through judicious development of AI). ‘Moral’ is too broad in my view; there are fields of moral consideration that have little to do with non-human sentient life/animals. So, again, I would change the name of the movement to fit more accurately and succinctly what it’s about. Not sure how far along you are with the MA terminology, though!
You’ve said:
If humans agree they want an AI that cares about everyone who feels, or at least that is what we are striving for, then classical alignment is aligned with a sentient-centric AI.
In a world with much more abundance and less scarcity, and fewer conflicts of interest between humans and non-humans, I suspect this view would be very popular, and I think it is already popular to an extent.
I fear it is not yet popular enough to work on the basis that we can skip humanity’s recognition of animal sentience, and go straight to developing AI with that in mind. Unfortunately, the vast majority of humans still don’t rate animal sentience as being a good enough reason to stop killing them en masse, so it’s unlikely that they’re going to care about it when developing AI. I agree with your second part: AI will probably usher in an era where morals come easier because of abundance. But that’s going to happen after AGI, not before. To the extent that it’s possible for non-human animals to be considered now, at this stage of AI development, I think AI for Animals is already making waves there.
So my key question is—what does MA seek to achieve, that isn’t already the focal point of AI for Animals? If I’ve understood correctly, you want MA to be a broader umbrella term for works which AI for Animals contributes to.
What I don’t understand is, what else is under that umbrella?
Of all the possible directions, I think your suggestion of creating an ethical pledge is by far the strongest. That’s something tangible that we can get working on right away.
TLDR: MA seems to be about developing AI with the interests of animals in mind. I have a hard time comprehending what else there is to it (I’m a bit thick though, so if I’m missing the point, please say!). If it is about animals, then I don’t think we need to obscure that behind broader notions of morality; we can be on-the-nose and say: ‘We care about animals. We want everyone to stop harming them. We want AI to avoid harming them, and to be developed with a view to creating conditions whereby nobody is harming them anymore. Sign our pledge today!’
Thanks for the feedback!!
“we already can’t agree as humans on what is moral”
I don’t think the fact that humankind can’t agree on one specific set of morals (though many things are broadly in consensus, at least in the West) prevents AGI or ASI from having a set of values. Labs are already baking morals into these models, so the question is: what will those values be? Already they are not the values of the median person worldwide but more like the values of the median person in San Francisco (e.g. the models are very LGBTQ+ friendly).
“Why would they build something that could disobey them and potentially betray them for some greater good that they might not agree with?”
I am not suggesting they build something that will betray the creators of the models. One of the goals of AI alignment research is to make models corrigible, so that humans can change their set of values and not get stuck with something (What is value lock-in? (YouTube video)). We need to convince the leaders of AI companies and regulators to align models with a Sentientism worldview (because of morality, because of public demand for it, because it is a robust way to keep humans safe, and more).
“I’m mindful of the risk of confusion: as one commenter mentioned, MA could read as synonymous with social alignment. I think a different term is needed.”
That is a great point, and I didn’t make this clear in the post. Moral Alignment is the field focused on the question of what the right values are, the true moral values, that we should align AI to. Within that there can be different views, and I think the stance of most people in our community is to promote the Sentientism view. Moral Alignment differs from technical AI alignment: technical alignment focuses on making AI do what we want, while MA focuses on the question of what we should want.
I would be glad to hear alternative ideas for terms, if you have some. I am going to interview relevant people to get structured feedback on several possible terms; I am not yet set on any of them.
So you would call this ‘Sentient Beings Sentinel’? I like this play on words and have also written something using it. I see sentientist value alignment as sitting inside MA.
“The vast majority of humans still don’t rate animal sentience as being a good enough reason to stop killing them en masse, so it’s unlikely that they’re going to care about it when developing AI.”
I think the majority does care about animals and would want AI to care about them. People’s stated values are much better than their deeds. This movement is not about asking people to go vegan; it is about striving to take on the good stewardship role that humanity has long dreamed of in ancient books and stories.
“what does MA seek to achieve, that isn’t already the focal point of AI for Animals? If I’ve understood correctly, you want MA to be a broader umbrella term for works which AI for Animals contributes to.”
Yes, MA is about animals, humans, future digital minds, and anybody that can feel. It is the space that works on the question of what values we should align AI to, and Sentientism is the worldview that I hope many people will promote.
I think there is a lot of work to be done in this space. Some of it is about bringing in more talent and money; some is about promoting the interests of all these groups together (e.g. how does a sentient-centric AI behave? That is a crucial question that is not being researched); and some is about specific interventions, e.g. we need to convince AI companies to take a clear stance on non-humans, which they currently don’t have.