Thank you so much for your insightful and detailed list of ideas for AGI safety careers, Richard! I really appreciate your excellent post.
I would propose explicitly grouping some of your ideas and additional ones under a third category: “identifying and raising public awareness of AGI’s dangers.” In fact, I think this category may plausibly contain some of the most impactful ideas for reducing catastrophic and existential risks, given that alignment seems potentially difficult to achieve in a reasonable period of time (if ever) and the implementation of governance ideas is bottlenecked by public support.
I don’t actually think the implementation of governance ideas is mainly bottlenecked by public support; I think it’s bottlenecked by good concrete proposals. And to the extent that it is bottlenecked by public support, that will change by default as more powerful AI systems are released.
I appreciate Richard stating this explicitly. I think this is (and has been) a pretty big crux in the AI governance space right now.
Some folks (like Richard) believe that we’re mainly bottlenecked by good concrete proposals. Other folks believe that we have concrete proposals, but we need to raise awareness and political support in order to implement them.
I’d like to see more work going into both of these areas. On the margin, though, I’m currently more excited about efforts to raise awareness [well], acquire political support, and channel that support into achieving useful policies.
I think this is largely due to (a) my perception that this work is largely neglected, (b) the fact that a few AI governance professionals I trust have also stated that they see this as the higher priority thing at the moment, and (c) worldview beliefs around what kind of regulation is warranted (e.g., being more sympathetic to proposals that require a lot of political will).
I can see a worldview in which prioritizing raising awareness is more valuable, but I don’t see the case for believing “that we have concrete proposals”. Or at least, I haven’t seen any; could you link them, or explain what you mean by a concrete proposal?
My guess is that you’re underestimating how concrete a proposal needs to be before you can actually muster political will behind it. For example, you don’t just need “let’s force labs to pass evals”, you actually need to have solid descriptions of the evals you want them to pass.
I also think that recent events have been strong evidence in favor of my position: we got a huge amount of political will “for free” from AI capabilities advances, and the best we could do with it was to push a deeply flawed “let’s all just pause for 6 months” proposal.
Clarification: I think we’re bottlenecked by both, and I’d love to see the proposals become more concrete.
Nonetheless, I think proposals like “Get a federal agency to regulate frontier AI labs like the FDA/FAA” or even “push for an international treaty that regulates AI in the way that the IAEA regulates atomic energy” are “concrete enough” to start building political will behind them. Other (more specific) examples include export controls, compute monitoring, licensing for frontier AI models, and some others on Luke’s list.
I don’t think any of these are concrete enough for me to say “here’s exactly how the regulatory process should be operationalized”, and I’m glad we’re trying to get more people to concretize these.
At the same time, I expect that a lot of the concretization happens after you’ve developed political will. If the USG really wanted to figure out how to implement compute monitoring, I’m confident they’d be able to figure it out.
More broadly, my guess is that we might disagree on how concrete a proposal needs to be before you can actually muster political will behind it. Here’s a rough attempt at sketching out four possible “levels of concreteness”. (First attempt; feel free to point out flaws.)
Level 1, No concreteness: You have a goal but no particular ideas for how to get there. (e.g., “we need to make sure we don’t build unaligned AGI”)
Level 2, Low concreteness: You have a goal with some vague ideas for how to get there. (e.g., “we need to make sure we don’t build unaligned AGI, and this should involve evals/compute monitoring, or maybe a domestic ban on AGI projects and a single international project”)
Level 3, Medium concreteness: You have a goal with high-level ideas for how to get there. (e.g., “We would like to see licensing requirements for models trained above a certain threshold. Still ironing out whether or not that threshold should be X FLOP, Y FLOP, or $Z, but we’ve got some initial research and some models for how this would work.”)
Level 4, High concreteness: You have concrete proposals that can be debated. (e.g., “We should require licenses for anything above X FLOP, and we have some drafts of the forms that labs would need to fill out.”)
I get the sense that some people feel like we need to be at “medium concreteness” or “high concreteness” before we can start having conversations about implementation. I don’t think this is true.
Many laws, executive orders, and regulatory procedures have vague language (often at Level 2 or in between Level 2 and Level 3). My (loosely-held, mostly based on talking to experts and reading things) sense is that it’s quite common for regulators to be like “we’re going to establish regulations for X, and we’re not yet exactly sure what they look like. Part of this regulatory agency’s job is going to be to figure out exactly how to operationalize XYZ.”
I also think that recent events have been strong evidence in favor of my position: we got a huge amount of political will “for free” from AI capabilities advances, and the best we could do with it was to push a deeply flawed “let’s all just pause for 6 months” proposal.
I don’t think this is clear evidence in favor of the “we are more bottlenecked by concrete proposals” position. My current sense is that we were bottlenecked both by “not having concrete proposals” and by “not having relationships with relevant stakeholders.”
I also expect that the process of concretizing these proposals will likely involve a lot of back-and-forth with people (outside the EA/LW/AIS community) who have lots of experience crafting policy proposals. Part of the benefit of “building political will” is “finding people who have more experience turning ideas into concrete proposals.”
Richard, I hope you turn out to be correct that public support for AI governance ideas will become less of a bottleneck as more powerful AI systems are released!
But I think it is plausible that we should not leave this to chance. Several of the governance ideas you have listed as promising (e.g., global GPU tracking, data center monitoring) are probably infeasible at the moment, to say the least. It is plausible that these ideas will only become globally implementable once a critical mass of people around the world become highly aware of and concerned about AGI dangers.
This means that timing may be an issue. Will the most detrimental of the AGI dangers manifest before meaningful preventative measures are implemented globally? It is plausible that before the necessary critical mass of public support builds up, a catastrophic or even existential outcome may already have occurred. It would then be too late.
The plausibility of this scenario is why I agree with Akash that identifying and raising public awareness of AGI’s dangers is an underrated approach.
For a similar argument that I found particularly compelling, please check out Greg Colbourn’s recent post: https://forum.effectivealtruism.org/posts/8YXFaM9yHbhiJTPqp/agi-rising-why-we-are-in-a-new-era-of-acute-risk-and