But AIs could value anything. They don’t have to value some metric of importance that lines up with what we care about on reflection. That is, it wouldn’t be a blunder in an epistemic sense. AIs could know their values lack nuance and go against human values, and just not care.
Or maybe you’re just saying that, with the path we’re currently on, it looks like powerful AIs will in fact end up with nuanced values in line with humanity’s. I think this could still constitute a value lock-in, though, just not one that you consider bad. And I expect there would still be value disagreements between humans even if we had perfect information, so I’m skeptical we could ever instill values into AIs that everyone is happy about it.
I’m also not sure AI would cause a value lock-in, but more because powerful AIs may be widely distributed such that no single AGI takes over everything.
Yeah. Though for a utilitarian it could still be instrumentally good to believe the points in this post, at least on an emotional level. But lying to yourself is plausibly a bad norm for other reasons, and in any case, this type of reasoning is the exact thing the post is arguing against.