Quick Thoughts on A.I. Governance
My friend Devin Kalish recently convinced me, at least to some extent, to focus on A.I. governance in the short-to-medium term, more than on technical A.I. safety.
The argument that persuaded me was this:
A key point of stress among AI doomsayers like Yudkowsky is that we, first of all, have too little time, and second of all, have no way to implement any of the progress alignment workers do make. Both of these are governance problems, not alignment problems. It is also, arguably, far easier to picture promising interventions for them than for alignment research itself.
To lay out the logic more explicitly...
The case for governance now (Skip if you’re already convinced)
1. AI safety is urgent insofar as different capabilities groups are working towards ever more capable AI.
2. Different capabilities groups are propelled at least partly by “if we don’t do this, another group will anyway”, or the subtly different “we must do this, or another group will first”.
3. (2) is a coordination problem, potentially solvable with community governance.
4. AI safety is less tractable insofar as capabilities groups don’t have ways to implement alignment research into their work.
5. Solutions to (4) will, at some point, require groups/resources made available to capabilities groups.
6. (5) looks kinda like a governance problem in practice.
7. The AI alignment problem is quite hard on the technical level. Governance work, as noted in (3) and (6), is both more tractable and more neglected than technical work. At least, it is right now.
The rest of this essay is less organized, but contains my thoughts on how and why this could work.
Specific stories of how this would really look in the real world, for real
An OpenAI team is getting ready to train a new model, but they’re worried about its self-improvement capabilities getting out of hand. Luckily, they can consult MIRI’s 2025 Reflexivity Standards when reviewing their codebase, and get third-party auditing done by The Actually Pretty Good Auditing Group (founded 2023).
A DeepMind employee has an idea for speeding up agent-training, but is worried about its potential to get out of hand. Worse, she’s afraid she’ll look like a fearmonger if she brings up her concerns at work. Luckily, she can raise those concerns with The Pretty Decent Independent Tip Line, which can then pass them along to her boss anonymously.
OpenAI, DeepMind, and Facebook AI Research are all worried about their ability to control their new systems, but the relevant project managers are resigned to fatalism. Luckily, they can all communicate their progress with each other through The Actually Pretty Good Red Phone Forum, and their bosses can make a treaty through The Actually Pretty Trustworthy AI Governance Group to not train more powerful models until concrete problems X, Y, and Z are solved.
These aren’t necessarily the exact solutions to the above problems. Rather, they’re intuition pumps for what AI governance could look like on the ground.
Find and use existing coordination mechanisms
What happened to the Partnership on AI? Or the Asilomar conference? Can we use existing channels like these and build them out into coordination mechanisms that researchers can actually interact productively with?
If coordination is the bottleneck, a full effort is called for. This means hokey coordination mechanisms borrowed from open-source and academia, groups for peer-reviewing and math-checking and software auditing and standards-writing. Anything other than declaring “coordination is the bottleneck!” on a public forum and then getting nothing done.
Politics vs. the other stuff
Many people in this community are turned off by politics, which may explain some of the shortage of AI governance work. But “politics”, especially in this neglected area, probably isn’t as hard as you think.
There’s a middle ground between “do nothing” and “become President or wage warfare”. Indeed, most effective activism is there.