[ETA: I've now posted a more detailed version of this comment as a standalone post.]
(Personal views, rather than those of my employers, as with most of my comments)
Above, I wrote:
If moral circles were one-dimensional, any MCE might seem likely to increase the chances that any other type of entity will ultimately be included in people's moral circles. But since moral circles are multidimensional, if a person primarily cares about increasing moral concern for a particular type of entity (e.g., future people, digital minds, wild animals), it might be important to think about which dimensions a given intervention would expand moral circles along.
Here's perhaps the key example I had in mind when I wrote that:
Some EAs and related organisations (especially but not only the Sentience Institute) seem to base big decisions on something roughly like the following argument:
---
Premise 1: It's plausible that the vast majority of all the suffering and wellbeing that ever occurs will occur more than a hundred years into the future. It's also plausible that the vast majority of that suffering and wellbeing would be experienced by beings towards which humans might, "by default", exhibit little to no moral concern (e.g., artificial sentient beings, or wild animals on planets we terraform (see also)).
Premise 2: If Premise 1 is true, it could be extremely morally important to, either now or in the future, expand moral circles such that they're more likely to include those types of beings.
Premise 3: Such MCE may be urgent, as there could be value lock-in relatively soon, for example due to the development of an artificial general intelligence.
Premise 4: If more people's moral circles expand to include farm animals and/or factory farming is ended, this increases the chances that moral circles will include all sentient beings in future (or at least all the very numerous beings).
Conclusion: It could be extremely morally important, and urgent, to do work that supports the expansion of people's moral circles to include farm animals and/or supports the ending of factory farming (e.g., supporting the development of clean meat).
(See also Why I prioritize moral circle expansion over artificial intelligence alignment.)
---
Personally, I find each of those premises plausible, along with the argument as a whole. But I think the multidimensionality of moral circles pushes somewhat against high confidence in Premise 4. And I think the multidimensionality also helps highlight the value of:
Actually investigating to what extent MCE along one dimension (or to one type of entity) "spills over" to expand moral circles along other dimensions (or to types of entities that weren't the focus).
One could perhaps do this via investigating the strength of correlations between people's moral expansiveness along different dimensions (a minimal sketch of this sort of analysis follows this list)
...or the strength of correlations between changes in moral expansiveness along different dimensions in the past
...or the strength of correlations of that sort when you do a particular intervention.
E.g., if I show you an argument for caring about farm animals, and it does convince you of that, does it also make you more open to caring about wild animals or digital minds?
Prioritising MCE efforts that target the dimensions most relevant to the entities you're ultimately most focused on expanding moral circles to
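As a toy illustration of the correlational approach mentioned above, here's a minimal sketch in Python. The dimension names, effect sizes, and data are entirely made up for illustration; a real analysis would use actual survey responses (e.g., moral expansiveness scale scores).

```python
# Minimal sketch, using simulated data: do moral-expansiveness scores
# along different dimensions correlate across people? High off-diagonal
# correlations would be (weak) evidence that concern "spills over", or
# at least co-varies, across dimensions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500  # hypothetical number of survey respondents

# Simulate a shared "general expansiveness" factor plus
# dimension-specific noise (all parameters are made up).
general = rng.normal(size=n)
scores = pd.DataFrame({
    "farmed_animals": 0.8 * general + rng.normal(scale=0.6, size=n),
    "wild_animals":   0.6 * general + rng.normal(scale=0.8, size=n),
    "digital_minds":  0.3 * general + rng.normal(scale=1.0, size=n),
})

# Pairwise Pearson correlations between the dimensions.
print(scores.corr().round(2))
```

The same shape of analysis would work for the other two variants: correlating changes in expansiveness over time, or comparing treatment and control groups after a specific intervention.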
I think many of the relevant EAs and related orgs are quite aware of this sort of issue. E.g., Jacy Reese from Sentience Institute discusses related matters in this talk (around the 10 minute mark). So I'm not claiming that this is a totally novel idea, just that it's an important one, and that framing moral circles as multidimensional seems a useful way to get at that idea.
(By the way, I've got some additional thoughts on various questions in this general area, and various ways one might go about answering them, so feel free to reach out to me if that's something you're interested in potentially doing.)
You don't need to agree with premise 3 to think that working on MCE is a cost-effective way to reduce s-risks. Below is the text from a conversation I recently had, where the paragraphs alternate between the two participants.
The brief conversation below focuses on the idea of premise 3, but I'd also note that the existing psychological evidence for the "secondary transfer effect" is relevant to premise 4. I think that you could make progress on empirically testing premise 4. I agree that running more tests on this (focused specifically on farmed animals / wild animals / artificial sentience) would be fairly high priority from the perspective of prioritising between different potential "targets" of MCE advocacy, and also perhaps for deciding how important MCE work is within the longtermist portfolio of interventions. I can imagine this research fitting well within Sentience Institute. If you know anyone who could be interested in applying to our current researcher job opening (closing in ~one week's time), to work on this or other questions, please do let them know about the opening.
----
"I don't personally see value lock-in as a prerequisite for farmed animals → artificial sentience far future impact… if the graph of future moral progress is a sine wave, and you increase the whole sine wave by 2 units, then your expected value is still 2*duration-of-everything, even if you don't increase the value at the lock-in point."
"It doesn't seem that likely to me that you would increase the whole sine wave by 2 units, as opposed to just increasing the gradient of one of the upward slopes or something like that."
"Hm, why do you think increasing the gradient is more likely? If you just add an action into the world that wouldn't happen otherwise (e.g. donate $100 to an animal rights org), then it seems the default is an increase in the whole sine wave. For that to be simply an increase in upward slope, you'd need to think there's a fundamental dynamic in society changing the impact of that contribution, such as a limit on short-term progress. But one can also imagine the opposite effect, where progress is easier during certain periods, so you could cause >2 units of increase. There are lots of intuitions that can pull the impact up or down, but overall, a +2 increase in the whole wave seems like the go-to assumption."
"Presumably it depends on the extent to which you think there's something like a secondary transfer effect, or some other mechanism by which successful advocacy for farmed animals enables advocacy for other sentient beings. E.g. imagine that we have 100% certainty that animal farming will end within 1000 years, and we know that all advocates (apart from us) are interested in farmed animal advocacy specifically, rather than MCE advocacy. Then, all that MCE work would be doing is speeding up the time before the end of animal farming. But if we remove those assumptions, then I guess it would have some sort of 'increase' effect, rather than just an effect on the slope. Both those assumptions are unreasonable, but presumably you could get something similar if it was close to 100% and most farmed animal advocacy efforts seemed likely to terminate at the end of animal farming, as opposed to being shifted into other forms of MCE advocacy."
"Yep, that makes sense if you don't think there's some diminishing factor on the flow-through from farmed animal advocacy to moral inclusion of AS, as long as you don't think there are increasing factors that outweigh it."
Thanks for this comment!
I'd also note that the existing psychological evidence for the "secondary transfer effect" is relevant to premise 4
Good point, thanks! Judging only from that paper's abstract, I'd guess that it'd indeed be useful for work on these questions to draw on evidence and theorising about secondary transfer effects.
I can imagine this research fitting well within Sentience Institute. If you know anyone who could be interested in applying to our current researcher job opening (closing in ~one week's time), to work on this or other questions, please do let them know about the opening.
Yes, I'd agree that this kind of work seems to clearly fit the Sentience Institute's mission, and that SI seems like it's probably among the best homes for this kind of work. (Off the top of my head, other candidate "best homes" might be Rethink Priorities or academic psychology. But it's possible that, in the latter, it'd be hard to sell people on focusing a lot of resources on the most relevant questions.)
So I'm glad you stated that explicitly (perhaps I should've too), and mentioned SI's job opening here, so people interested in researching these questions can see it.
(Having written this comment and then re-read your comment, I have a sense that I might be sort-of talking past you or totally misunderstanding you, so let me know if that's the case.)
Responding to your conversation text:
I still find it hard to wrap my head around what the claims or arguments in that conversation would actually mean. Though I might say the same about a lot of other arguments about extremely long-term trajectories, so this comment isn't really meant as a critique.
Some points where I'm confused about what you mean, or about how to think about it:
What, precisely, do we mean by "value lock-in"?
If we mean something as specific as "a superintelligent AI is created with a particular set of values, and then its values never change and the accessible universe is used however the superintelligent AI decided to use it", then I think moral advocacy can clearly have a lasting impact without that sort of value lock-in.
Do we mean that some actors' (e.g., current humans') values are locked in, or that there's a lock-in of what values will determine how the accessible universe is used?
Do we mean that a specific set of values is locked in, or that something like a particular "trajectory" or "range" of values is locked in? E.g., would we count it as "value lock-in" if we lock in a particular recurring pattern of shifts in values? Or if we just lock in disvaluing suffering, but values could still shift along all other dimensions?
"If the graph of future moral progress is a sine wave": do you essentially mean that there's a recurring pattern of values getting "better" and then later getting "worse"?
And do you mean that that pattern lasts indefinitely, i.e., until something like the heat death of the universe?
Do you see it as plausible that that sort of a pattern could last an extremely long time? If so, what sort of things do you think would drive it?
At first glance, it feels to me like that would be extremely unlikely to happen "by chance", and that there's no good reason to believe we're already stuck with this sort of a pattern happening indefinitely. So it feels like it would have to be the case that something in particular happens (which we currently could still prevent) that causes us to be stuck with this recurring pattern.
If so, I think I'd want to say that this is meaningfully similar to a value lock-in; it seems like a lock-in of a particular trajectory has to occur at a particular point, and that what matters is whether that lock-in occurs, and what trajectory we're locked into when it occurs. (Though it could be that the lock-in occurs "gradually", in the sense that it gradually becomes harder and harder to get out of that pattern. I think this is also true for lock-in of a specific set of values.)
I think that thinking about what might cause us to end up with an indefinite pattern of improving and then worsening moral values would help us think about whether moral advocacy work would just speed us along one part of the pattern, shift the whole pattern, change what pattern we're likely to end up with, or change whether we end up with such a pattern at all. (For present purposes, I'd say we could call farm animal welfare work "indirect moral advocacy work", if its ultimate aim is shifting values.)
I also think an argument can be made that, given a few plausible yet uncertain assumptions, there's practically guaranteed to eventually be a lock-in of major aspects of how the accessible universe is used. I've drafted a brief outline of this argument and some counterpoints to it, which I'll hopefully post next month, but could also share on request.
You don't need to agree with premise 3 to think that working on MCE is a cost-effective way to reduce s-risks.
Yeah, I agree.
For one thing, if we just cut Premise 3 from my statement of that argument, all that the conclusion would automatically lose is the claim of "urgency", not the claim of "importance". And it may sometimes be best to work on very important things even if they're not urgent. (For this reason, and because my key focus was really on Premise 4, I'd actually considered leaving Premise 3 out of my comment.)
For another thing, which I think is your focus, it seems conceivable that MCE could be urgent even if value lock-in was pretty much guaranteed not to happen anytime soon. That said, I don't think I've heard an alternative argument for MCE being urgent that I've understood and been convinced by. Tomorrow, with more sleep under my belt, I plan to have another crack at properly thinking through the dialogue you provided (including trying to think about what it would actually mean for the graph of future moral progress to be like a sine wave, and what could cause that to occur indefinitely).
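In the meantime, here's a toy numerical sketch (my own illustration with made-up numbers, not either participant's actual model) of why a permanent "level shift" to a sine-wave trajectory and a mere "speed-up" along it come apart so sharply in expected value:

```python
# Toy sketch: if future moral value follows a sine wave, a permanent +2
# level shift adds value proportional to the whole duration, whereas a
# pure speed-up (reaching the same trajectory 2 time units earlier)
# largely cancels out over a periodic wave. All numbers are made up.
import numpy as np

t = np.linspace(0, 100, 10_001)  # hypothetical time horizon
dt = t[1] - t[0]
baseline = np.sin(t)

# (i) Permanent level shift: gain grows linearly with the horizon.
gain_shift = np.sum((baseline + 2) - baseline) * dt  # ~2 * 100 = 200

# (ii) Pure speed-up: same wave, phase-shifted by 2 time units. The
# gain stays bounded (here roughly -0.7) no matter how long the horizon.
gain_speedup = np.sum(np.sin(t + 2) - baseline) * dt

print(f"level-shift gain: {gain_shift:.1f}")
print(f"speed-up gain:    {gain_speedup:.1f}")
```

Of course, this doesn't settle whether a given intervention is better modelled as a level shift, a speed-up, or a change in the wave's shape; that seems to be the crux of the dialogue.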
Three researchers have now reached out to me in relation to this post. One is doing work related to the above questions, one is at least interested in those questions, and one is interested in the topic of moral circles more broadly. So if you're interested in these topics and reach out to me, I could also give those people your info so you could potentially connect with them too :)
Hey! As you probably know, I'd be keen to connect with them. Thanks!
Great, I'll share your info with them :)