A lot of the discourse around AI safety uses terms like “human-friendly” or “human interests”. Does MIRI’s conception of friendly AI take the interests of non-human sentient beings into consideration as well? Especially troubling to me is Yudkowsky’s view on animal consciousness, but I’m not sure how representative his views are of MIRI in general.
(I realize that MIRI’s research focuses mainly on alignment theory, not target selection, but I am still concerned about this issue.)
“Human interests” is an unfortunate word choice; Nate talked about this last year too, and we’ve tried to avoid phrasings like that. Unfortunately, most ways of gesturing at the idea of global welfare aren’t very clear or widely understood, or they sound weird, or they borrow arguably speciesist language (“humane,” “humanitarian,” “philanthropy”...).
I’m pretty sure everyone at MIRI thinks we should value all sentient life (and extremely sure at least in the case of Eliezer, Nate, and myself), including sentient non-human animals and any sentient machines we someday develop. Eliezer thinks, as an empirical hypothesis, that relatively few animal species have subjective experience. Other people at MIRI, myself included, think a larger number of animal species have subjective experience. There’s no “consensus MIRI view” on this point, but I think it’s important to separate the empirical question from the strictly moral one, and I’m confident that if we learn more about what “subjective experience” is and how it’s implemented in brains, then people at MIRI will update. It’s also important to keep in mind that a good safety approach should be robust to the fact that the designers don’t have all the answers, and that humanity as a whole hasn’t fully developed scientifically (or morally).
I am not a MIRI employee, and this comment should not be interpreted as a response from MIRI, but I wanted to throw my two cents in about this topic.
I think that creating a friendly AI to specifically advance human values would actually turn out okay for animals. Such a human-friendly AI should optimize for everything humans care about, not just the quality of humans’ subjective experience. Many humans care a significant amount about the welfare of non-human animals. A human-friendly AI would thus care about animal welfare by proxy, through the values of humans. As far as I am aware, there is not a significant number of humans who specifically want animals to suffer. It is extremely common for humans to want things (like food with the taste and texture of bacon) that currently can be produced most efficiently only at significant expense to non-human animals. However, it seems likely that a friendly AI could find an efficient way of producing bacon that does not involve actual pigs.
If many people intrinsically value the proliferation of natural Darwinian ecosystems, and the fact that animals in such ecosystems suffer significantly would not change their minds, then that outcome could indeed happen. If it’s just that many people think it would be better for there to be more such ecosystems because they falsely believe that wild animals experience little suffering, and would prefer otherwise if their empirical beliefs were correct, then a human-friendly AI should not bring many such ecosystems into existence.
So you claim that you have values related to animals that most people don’t have and you want your eccentric values to be overrepresented in the AI?
I’m asking unironically (personally, I also care about wild animal suffering, but I suspect that most people would care about it too if they spent sufficient time thinking about it and looking at the evidence).