Open Philanthropy is launching a big new Request for Proposals for technical AI safety research, with plans to fund roughly $40M in grants over the next 5 months, and available funding for substantially more depending on application quality.
Applications start with a simple 300-word expression of interest and are open until April 15, 2025.
Apply now
Overview
We’re seeking proposals across 21 different research areas, organized into five broad categories:
Adversarial Machine Learning
* Jailbreaks and unintentional misalignment
* Control evaluations
* Backdoors and other alignment stress tests
* Alternatives to adversarial training
* Robust unlearning
Exploring sophisticated misbehavior of LLMs
* Experiments on alignment faking
* Encoded reasoning in CoT and inter-model communication
* Black-box LLM psychology
* Evaluating whether models can hide dangerous behaviors
* Reward hacking of human oversight
Model transparency
* Applications of white-box techniques
* Activation monitoring
* Finding feature representations
* Toy models for interpretability
* Externalizing reasoning
* Interpretability benchmarks
* More transparent architectures
Trust from first principles
* White-box estimation of rare misbehavior
* Theoretical study of inductive biases
Alternative approaches to mitigating AI risks
* Conceptual clarity about risks from powerful AI
* New moonshots for aligning superintelligence
We’re willing to make a range of types of grants including:
Research expenses (compute, APIs, etc.)
Discrete research projects (typically lasting 6-24 months)
Academic start-up packages
Support for existing nonprofits
Funding to start new research organizations or new teams at existing organizations
The full RFP provides much more detail on each research area, including eligibility criteria, example projects, and nice-to-haves.
Read more
We want the bar to be low for submitting expressions of interest: even if you’re unsure whether your project fits perfectly, we encourage you to submit an EOI. This RFP is partly an experiment to understand the demand for funding in AI safety research.
Please email aisafety@openphilanthropy.org with questions, or just submit an EOI.
Has Open Phil (or anyone else) conducted a comprehensive analysis aimed at both understanding and building the AI safety field?
If yes, could you share some leads to add to my research?
If not, would Open Phil consider funding such work? (either under the above or other funds)
Here is a recent example: Introducing SyDFAIS: A Systemic Design Framework for AI Safety Field-Building
I’m new to applying for AI safety grants, so I have a few basic questions that may have been answered elsewhere:
(1) what are some failure modes that I might need to consider when writing a proposal, specifically for a research project?
(2) will research expenses include stipends for the researchers?
(3) can I apply for a grant to do a research project with my university AI safety group? I’m not sure whether this would be considered a field-building grant or a technical AI safety grant.
Some common failure modes:
Not reading the eligibility criteria
Not clearly distinguishing your project from prior work on the topic you’re interested in
Not demonstrating a good understanding of prior work (it’s worth reading some or all of the papers we link to in the section you’re applying under)
Not demonstrating that you/your team has prior experience doing ML projects. If you don’t have such experience, then it’s good to work with/be mentored by someone who does.
“Research expenses” does not include stipends, but you can apply for a project grant, which does.
If you’re looking for money to spend on ML experiments or to pay people who are spending their time doing ML research, then that may fall within this RFP. If you’re looking for money to do other things (e.g. reading groups, events, etc.), then that may fall under the capacity-building team’s RFPs.