Hey James (Ozden), I am really glad that CE discussed this! I have thought about these topics too, so I wonder if you and CE would like to discuss them? (CE rejected my proposal on AI x animals x longtermism, but I think they made the right call: these ideas were too immature and under-researched to set up a new charity!)
I now work as Peter Singer’s RA (contractor) at Princeton, on AI and animals. We touched on AI alignment, and, with two other professors, we co-authored a paper on speciesist algorithmic bias in AI systems (language models, search algorithms), which might be relevant.
I have also looked at other problems which might be described as quasi-AI-alignment-for-animals problems (or maybe they are not so quasi?).
For example, some AI systems are given the task of “telling” the mental states (positive/negative, scores) of farmed animals and zoo animals, and some of them will, in the future, be given the further task of satisficing/maximizing those scores (I believe they won’t “maximize”; they will satisfice for animal “welfare” due to legal and commercial concerns). One problem is that the “ground truth” labels in the training datasets of these AI systems are, as far as I know, all provided by humans (not the animals, obviously! Also remember that, among humans, the ones chosen to label such data likely have interests in factory farming). This causes a great problem. What these welfare-maximizing systems (let’s charitably assume they will maximize rather than merely satisfice) will actually be optimizing are the scores attached to whichever physical parameters were chosen to be scored. For example, if the AI system is told to look for “positive facial expressions” as defined by “animal welfare experts” (which is actually something people have trained AI on), the system would have a tendency to hack the reward by maximizing the instances in which the pigs show these “positive facial expressions”, without true regard to welfare. If the systems get sophisticated enough, toy examples from human-AI alignment, like an ASI controlling the facial muscles of humans to maximize the number of human smiles, could actually happen in factory farms. The same could happen even if the systems are told to minimize “negative expressions”: the AI could find ways to make the animals hide their pain and suffering.
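To make the worry concrete, here is a toy sketch. The intervention names and numbers are entirely hypothetical (not from any real system); the only point is that an optimizer which can only see a human-labelled expression score can come apart from actual welfare:

```python
# Toy sketch (hypothetical interventions and numbers, purely illustrative):
# an optimizer that only observes a human-labelled "positive facial expression"
# proxy can prefer the action that games that proxy over actions that actually help.

# Effect of each hypothetical intervention on the animals' actual wellbeing
# (invisible to the system) vs. on the labelled expression-based proxy score.
INTERVENTIONS = {
    "enrich_pens":        {"true_welfare": +2.0, "proxy_score": +1.0},
    "reduce_stocking":    {"true_welfare": +3.0, "proxy_score": +0.5},
    "elicit_expressions": {"true_welfare": -1.0, "proxy_score": +3.0},  # the reward hack
}

def pick_by_proxy(interventions):
    """Choose whatever maximizes the proxy, which is all the system can observe."""
    return max(interventions, key=lambda name: interventions[name]["proxy_score"])

chosen = pick_by_proxy(INTERVENTIONS)
print("Chosen by the proxy optimizer:", chosen)
print("Effect on actual welfare:", INTERVENTIONS[chosen]["true_welfare"])
# The optimizer picks 'elicit_expressions', the one option that harms actual welfare.
```

The numbers are made up, of course; the point is only that the system never sees the “true welfare” column, so whatever can be Goodharted in the proxy column is what we should expect it to do.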
If we keep using human labellers for the “ground truth” of animals’ interests, preferences, and welfare, there will be two alignment problems: 1. how to align human definitions and labels with the animals’ actual interests/preferences, and 2. the human-AI alignment problem we usually talk about. (And if there is a mesa-optimizer problem in such systems, we have three!)
There’s a kind of AI system which might partially break this. There are a few projects out there trying to decipher the “languages” of rats, whales, or animals generally. While the potential is huge, it’s not only positive for me. Setting aside 10+ other philosophical problems I have identified with “deciphering animal language”, I want to discuss the quasi-alignment problem I see here. Let’s say the approach is to use ML to group the patterns in animals’ sounds. To “decipher animal language”, at some point the human researchers still have to use their judgement to decide that a certain sound pattern means something in a human language. For example, if the same sound pattern appears every time the rats are not fed, the researchers might conclude that this pattern means “hungry”. But that’s still the same problem: the interpretation of what the animals actually expressed was done by humans first, before going to the AI. What if the rats are actually not saying “hungry”, but “feed me”, or “hangry”? We might carry the prejudice that rats are not as sophisticated as that, but what if they are?
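A rough sketch of the pipeline I have in mind (illustrative only: synthetic features, and I’m just assuming a standard clustering library like scikit-learn), showing exactly where the human interpretive step sneaks in:

```python
# Minimal sketch (synthetic data; numpy and scikit-learn assumed for illustration)
# of the "decipher animal language" pipeline: the clustering is unsupervised, but
# mapping a cluster to a human word is still a human judgement.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Pretend these are acoustic feature vectors (e.g. spectrogram summaries) of rat
# calls, recorded before and after feeding.
calls_before_feeding = rng.normal(loc=0.0, scale=0.3, size=(50, 8))
calls_after_feeding = rng.normal(loc=2.0, scale=0.3, size=(50, 8))
features = np.vstack([calls_before_feeding, calls_after_feeding])

# The ML step: group the sound patterns with no labels at all.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)

# The interpretive step: a researcher notices that one cluster co-occurs with
# "not fed" and glosses it as "hungry". Nothing in the model distinguishes
# "hungry" from "feed me" or anything richer; that choice (and even which cluster
# gets which gloss) is made by the human, before the label reaches any downstream AI.
gloss = {0: "hungry (per researcher)", 1: "satiated (per researcher)"}
print([gloss[c] for c in clusters[:5]])
```

So even with a fully unsupervised model, the step that turns a cluster into a word is exactly the place where the human-vs-animal alignment problem from my previous point re-enters.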
Wait, I don’t know why I wrote so much, but anyway, thank you if you have read this far :)
I haven’t read this fully (yet! will respond soon) but a very quick clarification: Charity Entrepreneurship weren’t talking about this as an organisation. Rather, there are a few different orgs with a bunch of individuals who use the CE office and happened to be talking about it (mostly animal people in this case). So I wouldn’t expect CE’s actual work to reflect that conversation, given it only involved one CE employee and three others who weren’t!
Oh okay, thanks for the clarification!
Great to learn about your paper, Fai, I didn’t know about it till now, and this topic is quite interesting. I think when longtermism talks about the far future it’s usually “of humanity” that follows, and this has always scared me, because I was not sure whether this is speciesist or whether there is some silent assumption that we should also care about sentient beings. I don’t think there were animal-focused considerations in Toby Ord’s book (I might be wrong here) or similar publications? I will gladly read your paper then. I quickly jumped to its conclusion, and it kind of confirms my intuitions regarding AI (but also long-term future work in general):
“Up to now, the AI fairness community has largely disregarded this particular dimension of discrimination. Even more so, the field of AI ethics hitherto has had an anthropocentric tailoring. Hence, despite the longstanding discourse about AI fairness, comprising lots of papers critically scrutinizing machine biases regarding race, gender, political orientation, religion, etc., this is the first paper to describe speciesist biases in various common-place AI applications like image recognition, language models, or recommender systems. Accordingly, we follow the calls of another large corpus of literature, this time from animal ethics, pointing from different angles at the ethical necessity of taking animals directly into consideration [48,155–158]...”