Do I understand correctly that “safety in the long run” is unrelated to what you’re currently doing in any negative way—that is, you don’t think you’re advancing AGI-relevant capabilities (and so there is no need to try to align-or-whatever your forever-well-below-AGI system)?
Please feel free to correct me!
No, it’s that our case for alignment doesn’t rest on “the system is only giving advice” as a step. I sketched the actual case in this comment.