[ETA: I’ve now posted a more detailed version of this comment as a standalone post.]
(Personal views, rather than those of my employers, as with most of my comments)
Above, I wrote:
If moral circles were one-dimensional, any MCE might seem likely to increase the chances that any other type of entity will ultimately be included in people’s moral circles. But since moral circles are multidimensional, if a person primarily cares about increasing moral concern for a particular type of entity (e.g., future people, digital minds, wild animals), it might be important to think about which dimensions a given intervention would expand moral circles along.
Here’s perhaps the key example I had in mind when I wrote that:
Some EAs and related organisations (especially but not only the Sentience Institute) seem to base big decisions on something roughly like the following argument:
---
Premise 1: It’s plausible that the vast majority of all the suffering and wellbeing that ever occurs will occur more than a hundred years into the future. It’s also plausible that the vast majority of that suffering and wellbeing would be experienced by beings towards which humans might, “by default”, exhibit little to no moral concern (e.g., artificial sentient beings, or wild animals on planets we terraform (see also)).
Premise 2: If Premise 1 is true, it could be extremely morally important to, either now or in the future, expand moral circles such that they’re more likely to include those types of beings.
Premise 3: Such MCE may be urgent, as there could be value lock-in relatively soon, for example due to the development of an artificial general intelligence.
Premise 4: If more people’s moral circles expand to include farm animals and/or factory farming is ended, this increases the chances that moral circles will include all sentient beings in future (or at least all the very numerous beings).
Conclusion: It could be extremely morally important, and urgent, to do work that supports the expansion of people’s moral circles to include farm animals and/or supports the ending of factory farming (e.g., supporting the development of clean meat).
(See also Why I prioritize moral circle expansion over artificial intelligence alignment.)
---
Personally, I find each of those premises plausible, along with the argument as a whole. But I think the multidimensionality of moral circles pushes somewhat against high confidence in Premise 4. And I think the multidimensionality also helps highlight the value of:
Actually investigating to what extent MCE along one dimension (or to one type of entity) “spills over” to expand moral circles along other dimensions (or to types of entities that weren’t the focus).
One could perhaps do this via investigating the strength of correlations between people’s moral expansiveness along different dimensions
...or the strength of correlations between changes in moral expansiveness along different dimensions in the past
...or the strength of correlations of that sort when you do a particular intervention.
E.g., if I show you an argument for caring about farm animals, and it does convince you of that, does it also make you more open to caring about wild animals or digital minds?
Prioritising MCE efforts that target the dimensions most relevant to the entities you’re ultimately most focused on expanding moral circles to
I think many of the relevant EAs and related orgs are quite aware of this sort of issue. E.g., Jacy Reese from Sentience Institute discusses related matters in this talk (around the 10 minute mark). So I’m not claiming that this is a totally novel idea, just that it’s an important one, and that framing moral circles as multidimensional seems a useful way to get at that idea.
(By the way, I’ve got some additional thoughts on various questions in this general area, and various ways one might go about answering them, so feel free to reach out to me if that’s something you’re interested in potentially doing.)
You don’t need to agree with premise 3 to think that working on MCE is a cost-effective way to reduce s-risks. Below is the text of a conversation I recently had; the paragraphs alternate between the two participants.
The brief conversation below focuses on the idea of premise 3, but I’d also note that the existing psychological evidence for the “secondary transfer effect” is relevant to premise 4. I think that you could make progress on empirically testing premise 4. I agree that running more such tests (focused specifically on farmed animals / wild animals / artificial sentience) would be a fairly high priority, both for prioritising between different potential “targets” of MCE advocacy and perhaps also for deciding how important MCE work is within the longtermist portfolio of interventions. I can imagine this research fitting well within Sentience Institute. If you know anyone who could be interested in applying to our current researcher job opening (closing in ~one week’s time), to work on this or other questions, please do let them know about the opening.
----
“I don’t personally see value lock-in as a prerequisite for farmed animals → artificial sentience far future impact… if the graph of future moral progress is a sine wave, and you increase the whole sine wave by 2 units, then your expected value is still 2*duration-of-everything, even if you don’t increase the value at the lock-in point.”
“It doesn’t seem that likely to me that you would increase the whole sine wave by 2 units, as opposed to just increasing the gradient of one of the upward slopes or something like that.”
“Hm, why do you think increasing the gradient is more likely? If you just add an action into the world that wouldn’t happen otherwise (e.g. donate $100 to an animal rights org), then it seems the default is an increase in the whole sine wave. For that to be simply an increase in upward slope, you’d need to think there’s a fundamental dynamic in society changing the impact of that contribution, such as a limit on short-term progress. But one can also imagine the opposite effect where progress is easier during certain periods, so you could cause >2 units of increase. There are lots of intuitions that can pull the impact up or down, but overall, a +2 increase in the whole wave seems like the go-to assumption.”
“Presumably it depends on the extent to which you think there’s something like a secondary transfer effect, or some other mechanism by which successful advocacy for farmed animals enables advocacy for other sentient beings. E.g. imagine that we have 100% certainty that animal farming will end within 1000 years, and we know that all advocates (apart from us) are interested in farmed animal advocacy specifically, rather than MCE advocacy. Then, all that MCE work would be doing is bringing forward the end of animal farming. But if we remove those assumptions, then I guess it would have some sort of “increase” effect, rather than just an effect on the slope. Both those assumptions are unreasonable, but presumably you could get something similar if it was close to 100% and most farmed animal advocacy efforts seemed likely to terminate at the end of animal farming, as opposed to being shifted into other forms of MCE advocacy.”
“Yep, that makes sense if you don’t think there’s some diminishing factor on the flow-through from farmed animal advocacy to moral inclusion of AS, as long as you don’t think there are increasing factors that outweigh it.”
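One way to make the “whole wave” versus “gradient” distinction in this exchange concrete is the rough sketch below. The notation is assumed for illustration and was not used by either participant: V(t) is the level of moral progress at time t, A and ω are the wave’s amplitude and angular frequency, c is its baseline, T is the total duration, and τ is how much earlier the same wave is reached. Treating “increasing the gradient” as a pure speed-up is only one possible reading.

```latex
% Assumed model of the "sine wave" of moral progress (hypothetical notation).
V(t) = A\sin(\omega t) + c, \qquad 0 \le t \le T

% Reading 1: the whole wave is raised by 2 units.
\Delta_{\text{shift}} = \int_0^T \big[(V(t) + 2) - V(t)\big]\,dt = 2T

% Reading 2: the same wave is reached a time \tau earlier (a pure speed-up).
\Delta_{\text{speed-up}}
  = \int_0^T A\big[\sin(\omega(t+\tau)) - \sin(\omega t)\big]\,dt
  = \frac{A}{\omega}\big[\cos(\omega\tau) - \cos(\omega(T+\tau)) - 1 + \cos(\omega T)\big],
\qquad |\Delta_{\text{speed-up}}| \le \frac{4A}{\omega}
```

Under these assumptions, the first quantity grows in proportion to T (the “2*duration-of-everything” in the first paragraph), while the second stays bounded however long the wave continues. Which reading is closer to the truth is the question the exchange leaves open.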
Thanks for this comment!
Good point about the secondary transfer effect, thanks! Judging only from that paper’s abstract, I’d guess that it’d indeed be useful for work on these questions to draw on evidence and theorising about secondary transfer effects.
Yes, I’d agree that this kind of work seems to clearly fit the Sentience Institute’s mission, and that SI seems like it’s probably among the best homes for this kind of work. (Off the top of my head, other candidate “best homes” might be Rethink Priorities or academic psychology. But it’s possible that, in the latter, it’d be hard to sell people on focusing a lot of resources on the most relevant questions.)
So I’m glad you stated that explicitly (perhaps I should’ve too), and mentioned SI’s job opening here, so people interested in researching these questions can see it.
(Having written this comment and then re-read your comment, I have a sense that I might be sort-of talking past you or totally misunderstanding you, so let me know if that’s the case.)
Responding to your conversation text:
I still find it hard to wrap my head around what the claims or arguments in that conversation would actually mean. Though I might say the same about a lot of other arguments about extremely long-term trajectories, so this comment isn’t really meant as a critique.
Some points where I’m confused about what you mean, or about how to think about it:
What, precisely, do we mean by “value lock-in”?
If we mean something as specific as “a superintelligent AI is created with a particular set of values, and then its values never change and the accessible universe is used however the superintelligent AI decided to use it”, then I think moral advocacy can clearly have a lasting impact without that sort of value lock-in.
Do we mean that some actors’ (e.g., current humans) values are locked in, or that there’s a lock-in of what values will determine how the accessible universe is used?
Do we mean that a specific set of values is locked in, or that something like a particular “trajectory” or “range” of values is locked in? E.g., would we count it as “value lock-in” if we lock in a particular recurring pattern of shifts in values? Or if we just lock in disvaluing suffering, but values could still shift along all other dimensions?
“if the graph of future moral progress is a sine wave”—do you essentially mean that there’s a recurring pattern of values getting “better” and then later getting “worse”?
And do you mean that that pattern lasts indefinitely—i.e., until something like the heat death of the universe?
Do you see it as plausible that that sort of a pattern could last an extremely long time? If so, what sort of things do you think would drive it?
At first glance, it feels to me like that would be extremely unlikely to happen “by chance”, and that there’s no good reason to believe we’re already stuck with this sort of a pattern happening indefinitely. So it feels like it would have to be the case that something in particular happens (which we currently could still prevent) that causes us to be stuck with this recurring pattern.
If so, I think I’d want to say that this is meaningfully similar to a value lock-in; it seems like a lock-in of a particular trajectory has to occur at a particular point, and that what matters is whether that lock-in occurs, and what trajectory we’re locked into when it occurs. (Though it could be that the lock-in occurs “gradually”, in the sense that it gradually becomes harder and harder to get out of that pattern. I think this is also true for lock-in of a specific set of values.)
I think that thinking about what might cause us to end up with an indefinite pattern of improving and then worsening moral values would help us think about whether moral advocacy work would just speed us along one part of the pattern, shift the whole pattern, change what pattern we’re likely to end up with, or change whether we end up with such a pattern at all. (For present purposes, I’d say we could call farm animal welfare work “indirect moral advocacy work”, if its ultimate aim is shifting values.)
I also think an argument can be made that, given a few plausible yet uncertain assumptions, there’s practically guaranteed to eventually be a lock-in of major aspects of how the accessible universe is used. I’ve drafted a brief outline of this argument and some counterpoints to it, which I’ll hopefully post next month, but could also share on request.
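Yeah, I agree that you don’t need to accept Premise 3 to think that working on MCE is a cost-effective way to reduce s-risks.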
For one thing, if we just cut premise 3 in my statement of that argument, all that the conclusion would automatically lose is the claim of “urgency”, not the claim of “importance”. And it may sometimes be best to work on very important things even if they’re not urgent. (For this reason, and because my key focus was really on Premise 4, I’d actually considered leaving Premise 3 out of my comment.)
For another thing, which I think is your focus, it seems conceivable that MCE could be urgent even if value lock-in was pretty much guaranteed not to happen anytime soon. That said, I don’t think I’ve heard an alternative argument for MCE being urgent that I’ve understood and been convinced by. Tomorrow, with more sleep under my belt, I plan to have another crack at properly thinking through the dialogue you provide (including trying to think about what it would actually mean for the graph of future moral progress to be like a sine wave, and what could cause that to occur indefinitely).
(commented in wrong place, sorry)
Three researchers have now reached out to me in relation to this post. One is doing work related to the above questions, one is at least interested in those questions, and one is interested in the topic of moral circles more broadly. So if you’re interested in these topics and reach out to me, I could also give those people your info so you could potentially connect with them too :)
Hey! As you probably know, I’d be keen to connect with them. Thanks!
Great, I’ll share your info with them :)