Any realistic pause would only be lifted once there is a consensus on a potential solution to x-safety (or at least, say, full solutions to all jailbreaks, mechanistic interpretability and alignment up to the (frozen) frontier). If compute limits are in place during the pause, they can gradually be ratcheted up, with evals performed on models trained at each step, to avoid any such sudden snap back.
Any realistic pause would only be lifted once there is a consensus on a potential solution to x-safety (or at least, say, full solutions to all jailbreaks, mechanistic interpretability and alignment up to the (frozen) frontier). If compute limits are in place during the pause, they can gradually be ratcheted up, with evals performed on models trained at each step, to avoid any such sudden snap back.