But the difficulty of alignment doesn’t seem to imply much about whether slowing is good or bad, or about its priority relative to other goals.
At the extremes, if alignment-to-“good”-values by default were 100% likely, I presume slowing down would be net-negative and racing ahead would look great. It’s unclear to me where the tipping point is: what kind of distribution over alignment difficulty levels one would need to have to tip from wanting to speed up AI progress to wanting to slow it down.
Seems to me like the more longtermist one is, the more slowing down looks good, even when one is very optimistic about alignment. Then again, there are some considerations that push against this: risk of totalitarianism, risk of a pause that never ends, risk of value-agnostic alignment being solved and the first AGI being aligned to “worse” values than the default outcome.
(I realize I’m using two different definitions of alignment in this comment; I’d like to know if there’s standardized terminology to differentiate between them.)
Naively I would trade a lot of clearly-safe stuff being delayed or temporarily prohibited for even a minor decrease in chance of safe-seeming-but-actually-dangerous stuff going through, which pushes me towards favoring a more expansive scope of regulation.
(In my mind, the potential loss of decades of life improvements currently pales next to the potential non-existence of all lives in the long-term future.)
I don’t know how to think about this once public opinion is accounted for, though. I expect a larger scope will gather more opposition to regulation, which could be detrimental in various ways, the most obvious being a decreased likelihood of such regulation being passed, upheld, or spread to other places.