AI designers, even if speciesist themselves, might nonetheless provide the right apparatus for value learning such that the resulting AI will not propagate the moral mistakes of its creators.
This is something I also struggle with in understanding the post. It seems like we need:
1. AI creators can be convinced to expand their moral circle.
2. Despite (1), they do not wish to be convinced to expand their moral circle.
3. The AI follows this second desire to not be convinced to expand their moral circle.
I imagine this happening with certain religious things; e.g. I could imagine someone saying "I wish to think the Bible is true even if I could be convinced that the Bible is false".
But it seems relatively implausible with regards to MCE?
Particularly given that AI safety talks a lot about things like CEV, it is unclear to me whether there is really a strong trade-off between MCE and AIA.
(Note: Jacy and I discussed this via email and didn't really come to a consensus, so there's a good chance I am just misunderstanding his argument.)
Hm, yeah, I don't think I fully understand you here either, and this seems somewhat different than what we discussed via email.
My concern is with (2) in your list. "[T]hey do not wish to be convinced to expand their moral circle" is extremely ambiguous to me. Presumably you mean that they (without MCE advocacy being done) wouldn't put wide-MC* values, or values that lead to wide-MC, into an aligned AI. But I think it's being conflated with "they actively oppose it" or "they would answer 'no' if asked, 'Do you think your values are wrong when it comes to which moral beings deserve moral consideration?'"
I think they don't actively oppose it, they would mostly answer "no" to that question, and it's very uncertain whether they will put the wide-MC-leading values into an aligned AI. I don't think CEV or similar reflection processes reliably lead to wide moral circles. I think they can still be heavily influenced by their initial set-up (e.g. what the values of humanity are when reflection begins).
This leads me to think that you only need (2) to be true in a very weak sense for MCE to matter. I think it's quite plausible that this is the case.
*Wide-MC meaning an extremely wide moral circle, e.g. one that includes insects and small/weird digital minds.
I don't think CEV or similar reflection processes reliably lead to wide moral circles. I think they can still be heavily influenced by their initial set-up (e.g. what the values of humanity are when reflection begins).
Why do you think this is the case?
Do you think there is an alternative reflection process (whether implemented by an AI, by a human society, or a combination of both) that could be defined that would reliably lead to wide moral circles? Do you have any thoughts on what it would look like?
If we go through some kind of reflection process to determine our values, I would much rather have a reflection process that wasn't dependent on whether or not MCE occurred beforehand, and I think not leading to a wide moral circle should be considered a serious bug in any definition of a reflection process. It seems to me that working on producing this would be a plausible alternative, or at least a parallel path, to directly performing MCE.
I think that there's an inevitable tradeoff between wanting a reflection process to have certain properties and worries about this violating goal preservation for at least some people. This blogpost is not about MCE directly, but if you think of the "BAAN thought experiment" as "we do moral reflection and the outcome is such a wide circle that most people think it is extremely counterintuitive," then the reasoning in large parts of the blogpost should apply perfectly to the discussion here.
That is not to say that trying to fine-tune reflection processes is pointless: I think it's very important to think about what our desiderata should be for a CEV-like reflection process. I'm just saying that there will be tradeoffs between certain commonly mentioned desiderata, tradeoffs that people don't realize are there because they think there is such a thing as "genuinely free and open-ended deliberation."
Thanks for commenting, Lukas. I think Lukas, Brian Tomasik, and others affiliated with FRI have thought more about this, and I basically defer to their views here, especially because I haven't heard any reasonable people disagree with this particular point. Namely, I agree with Lukas that there seems to be an inevitable tradeoff here.
I tend to think of moral values as being pretty contingent and pretty arbitrary, such that what values you start with makes a big difference to what values you end up with even on reflection. People may "imprint" on the values they receive from their culture to a greater or lesser degree.
I'm also skeptical that sophisticated philosophical-type reflection will have significant influence over posthuman values compared with more ordinary political/economic forces. I suppose philosophers have sometimes had big influences on human politics (religions, Marxism, the Enlightenment), though not necessarily in a clean "carefully consider lots of philosophical arguments and pick the best ones" kind of way.
I'd qualify this by adding that the philosophical-type reflection seems to lead in expectation to more moral value (positive or negative, e.g. hedonium or dolorium) than other forces, despite overall having less influence than those other forces.