One thing the AI Pause Debate Week has made salient to me: there appears to be a mismatch between the kind of slowing that on-the-ground AI policy folks talk about and the kind that AI policy researchers and technical alignment people talk about.
My impression from talking to policy folks who are in or close to government—admittedly a sample of only five or so—is that the main[1] coordination problem for reducing AI x-risk is ensuring that the so-called alignment tax gets paid (i.e., ensuring that all the big labs put some time/money/effort into safety, and that none “defect” by skimping on safety to jump ahead on capabilities). This seems to rest on two assumptions: that the alignment tax is a coherent notion, and that technical alignment people are somewhat on track to work out how to pay it.
On the other hand, my impression is that technical alignment people, and AI policy researchers at EA-oriented orgs,[2] are not at all confident in there being a viable level of time/money/effort that will produce safe AGI on the default trajectory. The type of policy action that’s needed, so they seem to say, is much more drastic. For example, something in the vein of global coordination to slow, limit, or outright stop development and deployment of AI capabilities (see, e.g., Larsen’s,[3] Bensinger’s, and Stein-Perlman’s debate week posts), whilst alignment researchers scramble to figure out how on earth to align frontier systems.
I’m concerned by this mismatch. It would appear that the game plans of two adjacent clusters of people working to reduce AI x-risk are at odds. (Clearly, this is an oversimplification and there is a range of takes within both clusters, but my current epistemic status is that this oversimplification gestures at a true and important pattern.)
Am I simply mistaken about there being a mismatch here? If not, is anyone working to remedy it? And does anyone have thoughts on how this mismatch arose, or how to prevent similar ones from arising in the future? (According to the debate week announcement, Scott Alexander will be writing a summary/conclusion post.)
1. ^ In the USA, this main is served with a hearty side order of “Let’s make sure China in particular never races ahead on capabilities.”
2. ^ E.g., Rethink Priorities, AI Impacts.
3. ^ I’m aware that Larsen recently crossed over into writing policy bills, but I’m counting them as a technical person on account of their technical background and their time spent in the Berkeley sphere of technical alignment people. Nonetheless, perhaps crossovers like this are a good omen for policy and technical people getting onto the same page.