Wei Dai comments on The argument for near-term human disempowerment through AI

Wei Dai Apr 17, 2024, 9:37 AM
4 points
1 ∶ 0

“If future humans were in the driver’s seat instead, but with slightly more control over the process”

Why only “slightly” more control? It’s surprising to see you say this without giving any reasons or linking to some arguments, as this degree of alignment difficulty seems like a very unusual position that I’ve never seen anyone argue for before.
- Matthew_Barnett Apr 17, 2024, 6:49 PM
  2 points
  0 ∶ 0
  I’m a bit surprised you haven’t seen anyone make this argument before. To be clear, I wrote the comment last night on a mobile device, and it was intended to be a brief summary of my position, which perhaps explains why I didn’t link to anything or elaborate on that specific question. I’m not sure I want to outline my justifications for my view right now, but my general impression is that civilization has never had much central control over cultural values, so it’s unsurprising if this situation persists into the future, including with AI. Even if we align AIs, cultural and evolutionary forces can nonetheless push our values far. Does that brief explanation provide enough of a pointer to what I’m saying for you to be ~satisfied? I know I haven’t said much here, but I kind of doubt my view on this issue is so rare that you’ve literally never seen someone present a case for it.
  - Ryan Greenblatt Apr 17, 2024, 7:01 PM
    1 point
    0 ∶ 0
    The main counterargument is that the groups in power can now be immortal, and digital minds will be possible.
    See also: AGI and Lock-in
    - Matthew_Barnett Apr 17, 2024, 7:32 PM
      2 points
      0 ∶ 0
      I have some objections to the idea that groups will be “immortal” in the future, in the sense of never changing, dying, or rotting, and persisting over time in a roughly unchanged form, exerting consistent levels of power over a very long time period. To be clear, I do think AGI can make some forms of value lock-in more likely, but I want to distinguish a few different claims:
      (1) is a future value lock-in likely to occur at some point, especially not long after human labor has become ~obsolete?
      (2) is lock-in more likely if we perform, say, a century more of technical AI alignment research, before proceeding forward?
      (3) is it good to make lock-in more likely by, say, delaying AI by 100 years to do more technical alignment research, before proceeding forward? (i.e., will it be good or bad to do this type of thing?)
      My quick and loose current answers to these questions are as follows:
      (1) This seems plausible but unlikely to me in a strong form. Some forms of lock-in seem likely; I’m more skeptical of the more radical scenarios people have talked about.
      (2) I suspect lock-in would become more likely in this case, but the marginal effect of more research would likely be pretty small.
      (3) I am pretty uncertain about this question, but I lean towards being against deliberately aiming for this type of lock-in. I am inclined to this view for a number of reasons, but one reason is that this policy seems to make it more likely that we restrict innovation and experience system rot on a large scale, causing the future to be much bleaker than it otherwise could be. See also Robin Hanson’s post on world government rot.