Great piece overall! I’m hoping AI risk assessment and management processes can be improved.
Anthropic found that Claude 3 didn’t trigger AI Safety Level 3 for CBRN, but gave it a 30% chance of doing so in three months
30% chance of crossing the Yellow Line threshold (which requires building harder evals), not ASL-3 threshold
Thanks for the spot! Have fixed
Great piece overall! I’m hoping AI risk assessment and management processes can be improved.
30% chance of crossing the Yellow Line threshold (which requires building harder evals), not ASL-3 threshold
Thanks for the spot! Have fixed