A lot of these points seem like arguments that it's possible that unaligned AI takeover will go well, e.g. there's no reason not to think that AIs are conscious, or that they will have interesting moral values, and so on.
My stance is that we (more-or-less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good. AIs may be conscious and may have welfare-promoting values, but we don't know that yet. We should try to better understand whether AIs are worthy successors before transitioning power to them.
Probably a core point of disagreement here is whether, presented with a "random" intelligent actor, we should expect it to promote welfare or prevent suffering "by default". My understanding is that some accelerationists believe that we should. I believe that we shouldn't. Moreover, I believe that being substantially uncertain about whether this is or isn't the default is itself enough reason to take a slower and more careful approach.
My stance is that we (more-or-less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good.
I claim there's a weird asymmetry here where you're happy to put trust in humans because they have the "potential" to do good, but you're not willing to say the same for AIs, even though they seem to have the same type of "potential".
Whatever your expectations about AIs, we already know that humans are not blank slates that may or may not be altruistic in the future: we actually have a ton of evidence about the quality and character of human nature, and it doesn't make humans look great. Humans cannot accurately be described as mainly altruistic creatures. I mentioned factory farming in my original comment, but one can examine the way people spend their money (i.e. not mainly on charitable causes), or the history of genocides, war, slavery, and oppression, for additional evidence.
Probably a core point of disagreement here is whether, presented with a "random" intelligent actor, we should expect it to promote welfare or prevent suffering "by default".
I don't expect humans to "promote welfare or prevent suffering" by default either. Look at the current world. Have humans, on net, reduced or increased suffering? Even if you think humans have been good for the world, it's not obvious. Sure, it's easy to dismiss the value of unaligned AIs if you compare against some idealistic baseline; but I'm asking you to compare against a realistic baseline, i.e. actual human nature.
It seems like you're just substantially more pessimistic than I am about humans. I think factory farming will be ended, and though it seems like humans have caused more suffering than happiness so far, I think their default trajectory will be to eventually stop doing that, and to ultimately do enough good to outweigh their ignoble past. I don't think this is certain by any means, but I think it's a reasonable extrapolation. (I maybe don't expect you to find it a reasonable extrapolation.)
Meanwhile, I expect the typical unaligned AI may seize power for some purpose that seems to us entirely trivial, may be uninterested in doing any kind of moral philosophy, and/or may not place any terminal (rather than instrumental) value on attending to other sentient experiences in any capacity. I do think humans, even with their kind of terrible track record, are more promising than that baseline, though I can see why other people might think differently.
Sure, it's easy to dismiss the value of unaligned AIs if you compare against some idealistic baseline; but I'm asking you to compare against a realistic baseline, i.e. actual human nature.
I haven't read your entire post about this, but I understand you believe that if we created aligned AI, it would get essentially "current" human values, rather than e.g. some improved / more enlightened iteration of human values. If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?
If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?
That's right: if I thought human values would improve greatly in the face of enormous wealth and advanced technology, I'd definitely be open to seeing humans as special and extra valuable from a total utilitarian perspective. Note that many routes through which values could improve in the future could apply to unaligned AIs too. So, for example, I'd need to believe that humans would be more likely to reflect, and more likely to do the right type of reflection, relative to the unaligned baseline. In other words, it's not sufficient to argue that humans would reflect a little bit; that wouldn't really persuade me at all.