...we have, in my opinion, some pretty compelling reasons to think that it is not solvable even in principle, (1) given the diversity, complexity, and ideological nature of many human values… There is no reason to expect that any AI systems could be ‘aligned’ with the totality of other sentient life on Earth.
One way to decompose the alignment question is into 2 parts:
1. Can we aim ASI at all? (e.g. Nate Soares’ What I mean by “alignment is in large part about making cognition aimable at all”)
2. Can we align it with human values? (the blockquote above is an example of this)
Folks at e.g. MIRI think (1) is the hard problem and (2) isn’t as hard; folks like you think the opposite. Then you all talk past each other. (“You” isn’t aimed at literally you in particular; I’m summarizing what I’ve seen.) I don’t have a clear stance on which is harder; I just wish folks would engage with the best arguments from each side.
Mo, you might be right about what MIRI thinks will be hard. I’m not sure; what they write about these issues is often difficult to follow, since it tends to be very abstract and not well grounded in the specific goals and values that AIs might need to implement. I do think the MIRI-type approach radically underestimates the difficulty of your point number 2.
On the other hand, I’m not at all confident that point number 1 will be easy. My hunch is that both 1 and 2 will prove surprisingly hard, which is a good reason to pause AI research until we make a lot more progress on both issues. (And if we don’t make dramatic progress, the ‘pause’ should remain in place as long as it takes, which could be decades or centuries.)