I’ve been meaning to write something about ‘revisiting the alignment strategy’. Section 5 here (‘Won’t AGI make post-AGI catastrophes essentially irrelevant?’) makes the point very clearly:
> On this view, a post-AGI world is nearly binary—utopia or extinction—leaving little room for Sisyphean scenarios. But I think this is too optimistic about the speed and completeness of the transition to globally deployed, robustly aligned “guardian” systems.
but without making much of a case for it. I’m interested in Will’s and the reviewers’ sense of the space and the literature here.
I’ve often been frustrated by this assumption over the last 20 years, but don’t remember any good pieces about it.
It may stem partly from Eliezer’s early alignment approach of creating a superintelligent sovereign AI, where, if that goes right, other risks really would be dealt with.