Hm, yeah, I don’t think I fully understand you here either, and this seems somewhat different than what we discussed via email.
My concern is with (2) in your list. “[T]hey do not wish to be convinced to expand their moral circle” is extremely ambiguous to me. Presumably you mean that, without MCE advocacy being done, they wouldn’t put wide-MC* values, or values that lead to wide-MC, into an aligned AI. But I think this is being conflated with “they actively oppose it” or “they would answer ‘no’ if asked, ‘Do you think your values are wrong when it comes to which moral beings deserve moral consideration?’”
I think they don’t actively oppose it, they would mostly answer “no” to that question, and it’s very uncertain whether they will put the wide-MC-leading values into an aligned AI. I don’t think CEV or similar reflection processes reliably lead to wide moral circles. I think they can still be heavily influenced by their initial set-up (e.g. what the values of humanity are when reflection begins).
This leads me to think that you only need (2) to be true in a very weak sense for MCE to matter. I think it’s quite plausible that this is the case.
*Wide-MC meaning an extremely wide moral circle, e.g. one that includes insects and small/weird digital minds.
I don’t think CEV or similar reflection processes reliably lead to wide moral circles. I think they can still be heavily influenced by their initial set-up (e.g. what the values of humanity are when reflection begins).
Why do you think this is the case?
Do you think an alternative reflection process could be defined (whether implemented by an AI, by a human society, or by a combination of both) that would reliably lead to wide moral circles? Do you have any thoughts on what it would look like?
If we go through some kind of reflection process to determine our values, I would much rather have one that wasn’t dependent on whether or not MCE occurred beforehand, and I think failing to lead to a wide moral circle should be considered a serious bug in any definition of a reflection process. Working on producing such a process seems like a plausible alternative, or at least a parallel path, to directly performing MCE.
I think that there’s an inevitable tradeoff between wanting a reflection process to have certain properties and worries about this violating goal preservation for at least some people. This blogpost is not about MCE directly, but if you think of “BAAN thought experiment” as “we do moral reflection and the outcome is such a wide circle that most people think it is extremely counterintuitive” then the reasoning in large parts of the blogpost should apply perfectly to the discussion here.
That is not to say that trying to fine-tune reflection processes is pointless: I think it’s very important to think about what our desiderata should be for a CEV-like reflection process. I’m just saying that there will be tradeoffs between certain commonly mentioned desiderata, tradeoffs people don’t realize are there because they think there is such a thing as “genuinely free and open-ended deliberation.”
Thanks for commenting, Lukas. I think Lukas, Brian Tomasik, and others affiliated with FRI have thought more about this, and I basically defer to their views here, especially because I haven’t heard any reasonable people disagree with this particular point. Namely, I agree with Lukas that there seems to be an inevitable tradeoff here.