In brief, I think that the transitional period before ASI will shape the world that ASI is built into: it will indirectly shape ASI's values, the institutions it's embedded in, and who controls it. So the impact of work done now flows through to ASI. An analogy is how the behaviour of humans over the last two decades has flowed through to the character and impacts of today's AI systems.
You might be right that the same alignment methods won't work for superintelligence, but I don't think that undermines the value of this work. If we can agree on what a good character for superintelligence would be, then whatever new alignment methods emerge can aim for that same character. Of course, if alignment is completely hopeless, then we shouldn't work on AI character, but I think there's a good chance that alignment is solvable.
This comment discusses a similar objection.