I think Will MacAskill and Fin Moorhouse’s paper rests on the crucial consideration that aligning ASI is possible (by anyone at all). They haven’t established this (EDIT: by this I mean they don’t cite any supporting arguments for it, rather than that they haven’t personally come up with the arguments themselves. But as far as I know, there aren’t any supporting arguments for the assumption, and in fact there are good arguments on the other side for why aligning ASI is fundamentally impossible).
This seems like a really critical issue, and I’d be very interested in hearing whether this is disputed by @tylermjohn / @William_MacAskill.
I think there is a large minority chance that we will successfully align ASI this century, so I definitely think it is possible.
Why do you think this? What makes you think that it’s possible at all?[1] And what do you mean by “large minority”? Can you give an approximate percentage?
Or to paraphrase Yampolskiy: it’s not possible for a less intelligent species to indefinitely control a more intelligent species.
To respond to Yampolskiy without disagreeing with the fundamental point, I think it’s definitely possible for a less intelligent species to align or even indefinitely control a boundedly and only slightly more intelligent species, especially given greater resources, speed, and/or numbers, and sufficient effort.
The problem is that humans aren’t currently trying to limit the systems or trying much to monitor, much less robustly align or control them.
Fair point. But AI is unlikely to top out at merely “slightly more” intelligent, and it has the potential for a massive speed/numbers advantage too.
Yes, by default self-improving AI goes very poorly, but this is a plausible case where we could have aligned AGI, if not ASI.
To clarify, do you think there’s a large minority chance that it is possible to align an arbitrarily powerful system, or do you think there is a large minority chance that it will happen with the first such arbitrarily powerful system, such that we’re not locked into a different future / killed by a misaligned singleton?