Thank you for the detailed response. Some responses to your points:
Our values might get locked in this century through technology or totalitarian politics, in which case we need to rush to reach something tolerable as quickly as possible;
I’m having a hard time thinking of how technology could lock in our values. One possibility is that AGI would be programmed to value what we currently value, with no capacity for moral growth. However, it’s not clear to me why anyone would do this. People, as best as I can tell, value moral growth and thus would want AGI to be able to exhibit it.
There is the possibility that programming AGI to value only what we currently value, with no possibility of moral growth, would be technically easier. I don’t see why this would be the case, though. Implementing people’s CEV, as Eliezer proposed, would allow for moral growth. Narrow value learning, as Paul Christiano proposed, would presumably allow for moral growth if the AGI learns to avoid changing people’s goals. AGI alignment via direct specification may be made easier by prohibiting moral growth, but the general consensus I’ve seen is that alignment via direct specification would be extremely difficult and thus improbable.
There’s the possibility of people creating technology for the express purpose of preventing moral growth, but I don’t know why people would do that.
As for totalitarian politics, it’s not clear to me how they would stop moral growth. If anyone is in charge, I would imagine they would value their personal moral growth and thus be able to realize that animal rights are important. After that, I imagine the leader would be able to spread their values to others. I know little about politics, though, so there may be something huge I’m missing.
I’m also a little concerned that campaigning for animal rights may backfire. Currently, many people seem unaware of just how bad animal suffering is. Many people also love eating meat. If people become informed of the extent of animal suffering, then to minimize cognitive dissonance, I’m concerned people will stop caring about animals rather than stop eating meat.
So, my understanding is that getting a significant proportion of people to stop eating meat might make them more likely to exhibit moral growth by caring about other animals, which would be useful for one alignment strategy that is unlikely to be used. I’m not saying this is the entirety of your reasoning, but I suspect it would be much more efficient to work on AI alignment directly, either through alignment research or by convincing people that such research is important.
Another possibility is to attempt to spread humane values by directly teaching moral philosophy. Does this sound feasible?
Our values might end up on a bad but self-reinforcing track from which we can’t escape, which is a reason to get to something tolerable quickly, in order to make that less likely;
Do you have any situations in mind in which this could occur?
Fixing the problem of discrimination against animals allows us to progress to other moral circle expansions sooner, most notably from a long-termist perspective, recognising the risks of suffering in thinking machines;
I’m wondering what your reasoning behind this is.
Animal advocacy can draw people into relevant moral philosophy, effective altruism and related work on other problems, which arguably increases the value of the long-term future.
I’m concerned this may backfire as well. Perhaps people would, after becoming vegan, figure they have done a sufficiently large amount of good and thus be less likely to pursue other forms of altruism.
This might seem unreasonable: performing one good deed does not seem to increase the costs or decrease the benefits of performing other good deeds by much. However, it does seem to be how people act. As evidence, I heard that despite wealth having steeply diminishing returns to happiness, wealthy individuals give a smaller proportion of their money to charities. Further, some EAs have a policy of donating 10% of their income, even if after donating 10% they still have far more money than necessary for living comfortably.
I think many of your concerns will come down to views on the probabilities assigned to certain possibilities.
I’m having a hard time thinking of how technology could lock in our values. One possibility is that AGI would be programmed to value what we currently value, with no capacity for moral growth. However, it’s not clear to me why anyone would do this. People, as best as I can tell, value moral growth and thus would want AGI to be able to exhibit it.
Even then, the initial values given to the AGIs may have a huge influence, and some of these can be very subjective, e.g. how much extra weight more intense suffering should receive compared to less intense suffering (if any) or other things we care about, and how much suffering we think certain beings experience in given circumstances.
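To make concrete how much such a subjective weighting choice can matter, here is a toy sketch (all intensities, population sizes, and the exponent-based weighting scheme are hypothetical, chosen purely for illustration, not taken from either comment). If suffering of intensity s is counted as s ** k, the choice of exponent k alone can flip which of two outcomes looks worse:

```python
# Toy sketch: how a subjective intensity-weighting exponent k changes
# which outcome is judged worse. All numbers here are hypothetical.

def weighted_suffering(intensities, k):
    """Total disvalue if suffering of intensity s counts as s ** k."""
    return sum(s ** k for s in intensities)

# Outcome A: many beings suffering mildly; outcome B: a few suffering intensely.
outcome_a = [1.0] * 100  # 100 beings at intensity 1
outcome_b = [4.0] * 10   # 10 beings at intensity 4

for k in (1.0, 2.0, 3.5):
    a = weighted_suffering(outcome_a, k)
    b = weighted_suffering(outcome_b, k)
    worse = "A" if a > b else "B"
    print(f"k={k}: A={a:.0f}, B={b:.0f} -> outcome {worse} judged worse")
```

With k = 1 the widespread mild suffering dominates; with any steeper weighting the concentrated intense suffering dominates, so the ranking hinges entirely on a parameter there is no obviously objective way to fix.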
Implementing people’s CEV, as Eliezer proposed, would allow for moral growth.
Besides being sensitive to initial views, CEV faces the problem that people hold contradictory views, so could there not be more than one CEV here? Some will be better or worse than others according to EAs who care about the wellbeing of sentient individuals, and if we reduce the influence of worse views, this could make better solutions more likely.
As for totalitarian politics, it’s not clear to me how they would stop moral growth. If anyone is in charge, I would imagine they would value their personal moral growth and thus be able to realize that animal rights are important. After that, I imagine the leader would be able to spread their values to others. I know little about politics, though, so there may be something huge I’m missing.
It’s of course possible, but is it almost inevitable that these leaders will value their own personal moral growth enough, and how many leaders will we go through before we get one that makes the right decision? Even if they do value personal moral growth, they still need to be exposed to ethical arguments or other reasons that would push them in a given direction. If the rights and welfare of certain groups of sentient beings are not on their radar, what can we expect from these leaders?
Also, these seem to be extremely high expectations of politicians, who are fallible and often very self-interested, especially in the case of totalitarian politics.
I’m also a little concerned that campaigning for animal rights may backfire. Currently, many people seem unaware of just how bad animal suffering is. Many people also love eating meat. If people become informed of the extent of animal suffering, then to minimize cognitive dissonance, I’m concerned people will stop caring about animals rather than stop eating meat.
There is indeed evidence that people react this way. However, I can think of a few reasons why we shouldn’t expect the risks to outweigh the possible benefits:
1. Concern for animal rights and welfare seems to be generally increasing (despite increasing consumption of animal products, which is not driven by changing attitudes toward animals), and I think there is popular support for welfare reform in many places; the success of corporate campaigns, improving welfare legislation, and attitude surveys generally are evidence for this. I think people at Sentience Institute see welfare reforms as building momentum rather than justifying complacency.
2. If this is a significant risk, animal product substitutes (plant-based and cultured) and institutional approaches, which are currently prioritized in EA over individual outreach, should make the choice not to eat meat easier, so fewer people will resolve their cognitive dissonance this way. People who care about sentient beings can play an important role in the development and adoption of such technologies and in the reform of institutions (through campaigning), so it’s better to have more of them.
3. Animal advocates don’t just show people how bad animal suffering is; other arguments and approaches are used.
4. There’s some (weak) evidence that animal advocacy messaging works to get people to reduce their consumption of animal products, and cruelty messaging seemed more effective than environmental and abolitionist/rights/antispeciesist messages. See also other reports and blog posts by Humane League Labs.
So, my understanding is that getting a significant proportion of people to stop eating meat might make them more likely to exhibit moral growth by caring about other animals, which would be useful for one alignment strategy that is unlikely to be used.
How unlikely do you think this is? It’s not just the AI safety community that will influence what safety features go into AIs; policy makers, politicians, voters and corporations may also have influence.
I’m wondering what your reasoning behind this is.
One reason could be just a matter of limited time and resources; advocates can move onto other issues when their higher priorities have been addressed. Another is that comparisons between more similar groups of individuals probably work better in moral arguments in practice, e.g. as mammals and birds receive more protections, it will become easier to advocate for fishes and invertebrates (although this doesn’t stop us from advocating for these now). If more sentient animals have more protections, it will be easier to advocate for the protection of sentient AIs.
I’m concerned this may backfire as well. Perhaps people would, after becoming vegan, figure they have done a sufficiently large amount of good and thus be less likely to pursue other forms of altruism.
It’s possible. This is self-licensing. Some responses:
Anecdotally, the students in the animal rights society at the University of Waterloo are also much more engaged in environmental activism than most students. Social justice advocates are often involved in multiple issues.
Human rights organizations seem to be increasing their support for animal protection (written by an animal and human rights advocate).
Support for protections for the rights and welfare of humans and of animals seems correlated, both in individuals and legally at the state level in the US.
Veg*nism seems inversely related to prejudice, dominance and authoritarianism generally.
There’s evidence that randomly assigning people to participate in protests makes them more likely to participate in future protests.
As evidence, I heard that despite wealth having steeply diminishing returns to happiness, wealthy individuals give a smaller proportion of their money to charities.
Wealthier people might also be less compassionate on average:
https://www.psychologytoday.com/us/blog/the-science-behind-behavior/201711/why-people-who-have-less-give-more
https://www.scientificamerican.com/article/how-wealth-reduces-compassion/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5240617/ (the effect might be small)
Further, some EAs have a policy of donating 10% of their income, even if after donating 10% they still have far more money than necessary for living comfortably.
I would guess that EAs who donate a larger percentage of their income (and people who donate more of their income to EA-aligned charities) are more involved in the movement in other ways on average.