Partially aligned transformative AIs are likely to be stable under reflection
Work on corrigibility has provided a decent outline of how to do this. My response is heavily dependent on weak guesses as to how diligent AI companies will be at incorporating the best ideas.
Work on corrigibility has provided a decent outline of how to do this. My response is heavily dependent on weak guesses as to how diligent AI companies will be at incorporating the best ideas.