I agree with your “no lock-in” view in the case of alignment going well: in that world, we’d surely use the aligned superintelligence to help us with things like understanding AI sentience and making sure that sentient AIs aren’t suffering.
In the case of misalignment and humanity losing control of the future, I don’t think I understand the view that there wouldn’t be lock-in. I may well be missing something, but I can’t see why things related to suffering risk wouldn’t be locked in. One example is whether the ASI creates sentient subroutines which help it achieve its goals but which incidentally suffer. Outcomes like this could in theory be steered away from even if we fail at alignment, given that the ASI’s future actions (even if they’re very hard to predict exactly) are decided by how we build it. And we could likely steer away from them more effectively if we better understood AI sentience, because then we’d know more about things like what kinds of subroutines can suffer.