Yeah, this seems to me like an important question. I see it as one subquestion of the broader, seemingly important, and seemingly neglected questions “What fraction of importance-adjusted AI safety and governance work will be done or heavily boosted by AIs? What’s needed to enable that? What are the implications of that?”
I previously had a discussion focused on another subquestion of that, which is what the implications are for government funding programs in particular. I wrote notes from that conversation and will copy them below. (Some of this is also relevant to other questions in this vicinity.)
“Key takeaways
Maybe in future most technical AI safety work will be done by AIs.
Maybe that has important implications for whether & how to get government funding for technical AI safety work?
E.g., be less enthusiastic about getting government funding for more human AI safety researchers?
E.g., be more enthusiastic about laying the groundwork for gov funding for AI assistance for top AI safety researchers later?
Such as by more strongly prioritizing having well-scoped research agendas, or ensuring top AI safety researchers (or their orgs) have enough credibility signals to potentially attract major government funding?
This is a subquestion of the broader question “What should we do to prep for a world where most technical AI safety work can be done by AIs?”, which also seems neglected as far as I can tell.
Seems worth someone spending 1-20 hours doing distillation/research/writing on that topic, then sharing that with relevant people.
Additional object-level notes
See [v. A] Introduction & summary – Survey on intermediate goals in AI governance for an indication of how excited AI risk folks are about “Increase US and/or UK government spending on AI reliability, robustness, verification, reward learning, interpretability, and explainability”.
Details of people’s views can be found in [v. B] Ratings & comments on goals related to government spending – Survey on intermediate goals in AI governance
(Feel free to request access, though it may not be granted.)
But there may in future be a huge army of AI safety researchers in the form of AIs, or AI tools/systems that boost AI safety researchers in other ways. What does that imply, esp. for gov funding programs?
Reduced importance of funding for AI safety work, since it’ll be less bottlenecked by labor (which is costly) and more by a handful of good scalable ideas?
Funding for AI safety work is mostly important for giving top AI safety researchers huge compute budgets to run (and train?) all those AI assistants, rather than for funding people themselves or other things?
Perhaps this even increases the importance of funding, since we thought it’d be hard to scale the relevant labor via people but it may be easier to scale via lots of compute and hence AI assistance?
Increased importance of particular forms of “well-scoped” research agendas/questions? Or more specifically, focusing now on whatever work it’s hardest to hand off to AIs but that best sets things up for using AIs?
Make the best AI safety researchers, research agendas, and orgs more credible/legible to gov people so that they can absorb lots of funding to support AI assistants?
What does that require?
Might mean putting some of the best AI safety researchers in new or existing institutions that look credible? E.g. into academic labs, or merging a few safety projects into one org that we ensure has a great brand?
Start pushing the idea (in EA, to gov people, etc.) that gov should now/soon provide increasing amounts of funding for AI safety via compute support for relevant people?
Start pushing the idea that gov should be very choosy about who to support but then support them a lot? Like support just a few of the best AI safety researchers/orgs but provide them with a huge compute budget?
That’s unusual and seems hard to make happen. Maybe that makes it worth actively laying groundwork for this?
Research proposal
I think this seems worth briefly investigating, then explicitly deciding whether or not to spend more time on it.
Ideally this’d be done by someone with decent AI technical knowledge and/or gov funding program knowledge.
If someone isn’t the ideal fit for working on this but has capacity and interest, they could:
Spend 1-10 hours
Aim to point out some somewhat-obvious-once-stated hypotheses, without properly vetting them or fleshing them out
Lean somewhat on conversations with relevant people or on sharing a rough doc with relevant people to elicit their thoughts
Maybe the goals of an initial stab at this would be:
Increase the chance that someone who does have strong technical and/or gov knowledge does further thinking on this
Increase the chance that relevant technical AI safety people, leaders of technical AI safety orgs, and/or people in government bear this in mind and adjust their behavior in relevant ways”