The world’s first slightly superhuman AI might be only slightly superhuman at AI alignment. Thus, if creating it was a suicidal act by the world’s leading AI researchers, its creating a successor might be suicidal in exactly the same way. On the other hand, if it has a good grasp of alignment, then its creators might also have a good grasp of alignment.
In the first scenario (but not the second!), creating more capable but not fully aligned descendants seems like it must be a stable behaviour of intelligent agents, since by assumption:

- the behaviour of descendants is only weakly controlled by their parents, and
- the parents keep making better descendants until the descendants are strongly superhuman.
I also think Buck is right that the world’s first superhuman AI might have a simpler alignment problem to solve.