Plausible, yes. For one thing, you can run versions of the coordination tech in parallel with old cheap models, then flag and dig into discrepancies. This could make it harder for misalignment to strongly bite.
Of course, if there are big misalignment issues and we’re not seriously tracking that possibility, that’s gonna be a problem.
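To make the parallel-run idea concrete, here is a minimal sketch under stated assumptions: both the newer system and the old cheap model are treated as plain functions from a proposal to a decision, and "discrepancy" just means the two decisions differ. The names `flag_discrepancies`, `old_model`, and `new_model` are hypothetical stand-ins for illustration, not anything from a real coordination system.

```python
from typing import Callable, Dict, Iterable, List

Model = Callable[[str], str]


def flag_discrepancies(
    inputs: Iterable[str],
    new_model: Model,
    old_model: Model,
) -> List[Dict[str, str]]:
    """Run both models on every input and collect the cases where they disagree."""
    flagged = []
    for item in inputs:
        new_out = new_model(item)
        old_out = old_model(item)
        # Hypothetical agreement check: exact match after light normalization.
        if new_out.strip().lower() != old_out.strip().lower():
            flagged.append({"input": item, "new": new_out, "old": old_out})
    return flagged


# Toy stand-ins (assumptions): the old model only accepts even splits,
# while the new model also accepts a proposal the old one would reject.
def old_model(proposal: str) -> str:
    return "accept" if "split evenly" in proposal else "reject"


def new_model(proposal: str) -> str:
    if "split evenly" in proposal or "trust me" in proposal:
        return "accept"
    return "reject"


proposals = [
    "split evenly between A and B",
    "route all funds via C, trust me",
    "no deal",
]

flagged = flag_discrepancies(proposals, new_model, old_model)
for case in flagged:
    # These are the cases a human would dig into.
    print(case)
```

A flagged case only says the two models disagree, not which one is right, so this surfaces things to investigate rather than settling them.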
A separate cluster of threat models worth disentangling: coordination tech could create more surface area for anti-human-user coordination within the economy, particularly if it’s much easier for smart, misaligned AI systems to coordinate with relatively stupid, corrigible AI systems (e.g., Opus 4.7). The arguments for an AI <> AI coordination advantage (over AI <> human) are quite intuitive to me, but I don’t think you actually need an asymmetry here to put society in a more vulnerable state than the current one. I don’t have a great sense of how this washes out, but it feels like a crux for evaluating the net benefit of coordination tech.
By analogy: the move from traditional to digital banking probably created more surface area for exploitation by computer hackers, and it’s probably very good to have primitive computers, rather than more modern ones, touching nukes.