Disclaimer: I joined OP two weeks ago in the Program Associate role on the Technical AI Safety team. I’m leaving some comments describing questions I wanted answered when assessing whether I should take the job (which, obviously, I ended up doing).
Is it way easier for researchers to do AI safety research within AI scaling labs (due to: more capable/diverse AI models, easier access to them (i.e. no rate limits/usage caps), better infra for running experiments, maybe some network effects from the other researchers at those labs, not having to deal with all the logistical hassle that comes from being a professor/independent researcher)?
Does this imply that the research ecosystem OP is funding (which is ~all external to these labs) isn’t that important/cutting-edge for AI safety?
I think this is definitely a real dynamic, but a lot of EAs seem to exaggerate it and inappropriately round the impact of external research down to 0. Here are a few scattered points on this topic:
Third party researchers can influence the research that happens at labs through the normal diffusion process by which all research influences all other research. There’s definitely some barrier to research insight diffusing from academia to companies (and e.g. it’s unfortunately common for an academic project to have no impact on company practice because it just wasn’t developed with the right practical constraints in mind), but it still happens all the time (and some types of research, e.g. benchmarks, are especially easy to port over). If third party research can influence lab practice to a substantial degree, then funding third party research just straightforwardly increases the total amount of useful research happening, since labs can’t hire everyone who could do useful work.
It will increasingly be possible to do good (non-interpretability) research on large models through APIs provided by labs, and Open Phil could help facilitate that and increase the rate at which it happens. We can also help facilitate greater compute budgets and engineering support.
The work of the lab-external safety research community can also impact policy and public opinion; the safety teams at scaling labs are not their only audience. For example, capability evaluations and model organisms work both have the potential to have at least as big an impact on policy as they do on technical safety work happening inside labs.
We can fund nonprofits and companies which directly interface with AI companies in a consulting-like manner (e.g. red-teaming consultants); I expect an increasing fraction of our opportunities to look like this.
Academics and other external safety researchers we fund now can end up joining scaling labs later (as e.g. Ethan Perez and Collin Burns did), to implement ideas that they developed on the outside; I think this is likely to happen more and more.
Some research directions benefit less than others from access to cutting edge models. For example, it seems like there’s a lot of interpretability work that can be done on very small models, whereas scalable oversight work seems harder to do without quite smart models.