AIxBio Newsletter #3 - At the Nexus
The At the Nexus AIxBio newsletter provides a regular summary of the latest news, articles and reports from the AIxBio space.
News at a glance
Google DeepMind open-sources AlphaFold 3
International Network of AI Safety Institutes launched
U.S. AISI establishes new U.S. government taskforce to collaborate on research and testing of AI models to manage national security capabilities and risks
Article – OpenAI’s CBRN tests seem unclear
In this recently published analysis, researcher Luca Righetti critiques OpenAI’s claim that its latest model can’t meaningfully help novices make chemical and biological weapons.
While OpenAI reported that o1-preview poses a “medium” risk level for CBRN threats—higher than GPT-4o’s “low” rating—Righetti argues that its testing methodology doesn’t clearly establish whether the model could meaningfully assist novices in creating bioweapons. The article highlights that OpenAI’s reliance on multiple-choice tests as proxies for laboratory capabilities may be insufficient to settle this question, especially when the model could be combined with other tools or used interactively by humans.
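To make the proxy concern concrete: a multiple-choice evaluation ultimately reduces to checking a model’s selected letters against an answer key. The minimal sketch below is illustrative only; the questions, stand-in model call, and scoring are placeholders, not OpenAI’s actual benchmark or grading criteria.

```python
# Minimal sketch of a multiple-choice evaluation harness.
# All specifics here are illustrative placeholders.

def ask_model(question: str, choices: list[str]) -> str:
    """Stand-in for a real model API call; returns a choice letter."""
    return "A"  # a real harness would query the model under test

def score_mcq(benchmark: list[dict]) -> float:
    """Fraction of questions where the model picks the keyed answer."""
    correct = 0
    for item in benchmark:
        if ask_model(item["question"], item["choices"]) == item["key"]:
            correct += 1
    return correct / len(benchmark)

benchmark = [
    {"question": "Which technique ...?",
     "choices": ["A", "B", "C", "D"], "key": "A"},
]
print(f"accuracy = {score_mcq(benchmark):.0%}")
# Righetti's point: a high score here measures recall of facts, not
# whether the model can walk a novice through hands-on laboratory work.
```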
While OpenAI should be commended for releasing the data that supported this work, this analysis raises important questions about OpenAI’s CBRN testing for this and future models.
OpenAI’s CBRN tests seem unclear—Planned Obsolescence
Report – Developing Guardrails for AI Biodesign Tools
A new NTI:bio report by Carter, Wheeler, Isaac, and Yassif examines the critical need for safeguards in AI biodesign tools (BDTs). While BDTs offer tremendous potential for therapeutic development and advancement across the bioeconomy, the authors argue that they also pose significant risks of misuse.
Based on interviews with over 20 experts, the authors recommend a number of pilot projects to reduce risks from the misuse of BDTs while still supporting innovation. They break these down into two key areas: built-in guardrails and managed access. Proposed pilot projects include developing screening mechanisms for risky inputs/outputs, implementing cryptographic signing of design metadata (sketched below), and establishing managed access platforms.
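As a concrete illustration of the metadata-signing pilot, here is a minimal sketch using Ed25519 signatures via Python’s `cryptography` library. The field names and workflow are assumptions for illustration, not NTI:bio’s specification.

```python
# Illustrative sketch: a BDT provider signs a record of each generated
# design so downstream parties (e.g., synthesis screeners) can verify
# provenance. Field names and workflow are hypothetical.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
)

# The biodesign tool provider holds a signing key.
provider_key = Ed25519PrivateKey.generate()

# Metadata recorded alongside a generated design (hypothetical fields).
metadata = {
    "tool": "example-bdt",
    "design_id": "d-0001",
    "user_id": "verified-user-42",
    "timestamp": "2024-11-30T12:00:00Z",
}

payload = json.dumps(metadata, sort_keys=True).encode()
signature = provider_key.sign(payload)

# A verifier holding the public key can confirm the metadata is
# authentic and untampered before acting on the design.
provider_key.public_key().verify(signature, payload)  # raises if invalid
print("metadata signature verified")
```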
Developing Guardrails for AI Biodesign Tools—NTI:bio
Article – What will be the impact of AI on the bioweapons treaty?
A recent article by Revill, Rios, and Mazeaud in the Bulletin of the Atomic Scientists examines how AI might affect the Biological Weapons Convention (BWC). The authors highlight that while AI could lower barriers to weapons development and assist in creating novel biological agents, it also offers opportunities to strengthen biological arms control.
The article emphasizes the need for updated national laws and regulations to address new challenges, such as cloud labs and cyber-biosecurity concerns. The authors argue that while the BWC’s intent-based definition should remain effective, implementation and verification mechanisms may need adaptation to address AI-related challenges.
What will be the impact of AI on the bioweapons treaty?—Bulletin of the Atomic Scientists
Article – The Double-Edged Sword: Opportunities and Risks of AI in Biosecurity
Sarah Morgan’s analysis in the Georgetown Security Studies Review examines the dual-use nature of AI in biosecurity. The article emphasizes that while AI’s current impact on biothreats is minimal, gaps in regulation are concerning. Morgan argues for implementing mandatory monitoring and red-teaming in AI development, along with stronger regulation of manufacturing supply chains, particularly at the intersection of AI and biotechnology.
The Double-Edged Sword: Opportunities and Risks of AI in Biosecurity—Georgetown Security Studies Review
Report – US AISI and UK AISI Pre-Deployment Evaluation of Anthropic’s Upgraded Claude 3.5 Sonnet
The UK and US AI Safety Institutes recently conducted a comprehensive pre-deployment evaluation of Anthropic’s upgraded Claude 3.5 Sonnet model. Their testing focused on biological capabilities, cyber capabilities, software development, and safeguard efficacy.
In biological testing, the model performed comparably to reference models but fell significantly below human expert baselines. When augmented with bioinformatic tools, however, its performance rose above that of the unaugmented model and sometimes exceeded human expert levels. The evaluation highlights both the potential and the limitations of AI systems in biological applications.
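The tool-augmentation result reflects a common scaffolding pattern: the model can request external tool calls, and the results are fed back into its context before it answers. A generic minimal sketch follows; the tool names, dispatch format, and stand-in model are assumptions for illustration, not the AISIs’ actual evaluation harness.

```python
# Generic sketch of a model-plus-tools loop. All specifics hypothetical.
from typing import Callable

# Hypothetical external tools the model may invoke.
TOOLS: dict[str, Callable[[str], str]] = {
    "sequence_search": lambda query: f"search results for {query!r}",
    "structure_predict": lambda seq: f"predicted structure for {seq!r}",
}

def model_step(transcript: list[str]) -> str:
    """Stand-in for a model call; returns 'TOOL:<name>:<arg>' or a final answer."""
    if len(transcript) == 1:
        return "TOOL:sequence_search:example query"
    return "final answer"

def run_with_tools(task: str, max_steps: int = 5) -> str:
    transcript = [task]
    for _ in range(max_steps):
        reply = model_step(transcript)
        if reply.startswith("TOOL:"):
            _, name, arg = reply.split(":", 2)
            transcript.append(TOOLS[name](arg))  # feed tool output back
        else:
            return reply
    return "no answer within step budget"

print(run_with_tools("analyze this sequence"))
```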
US AISI and UK AISI Pre-Deployment Evaluation of Anthropic’s Upgraded Claude 3.5 Sonnet—UK AI Safety Institute
If you’d like to suggest a recently published article or report to include in the next edition of the At the Nexus AIxBio newsletter, please feel free to get in touch.