I appreciated this post. I also found this Twitter thread arguing for caution about slowing down AI progress (from @Matthew_Barnett) really interesting & helpful for learning about different considerations for why a pause might be harmful, or not as helpful as one might think.
I should flag that I think it relies on a number of assumptions that people disagree on, or at least makes a number of controversial claims; see, for instance, some of the direct pushback it has received. I’m really interested in seeing more discussion on basically everything here, especially corrections where I’m misinterpreting something, and pushback on or corroboration of any of the points from the thread. (This is related to an older thread, which I’ve summarized with GPT in this footnote.[1])
Some assorted highlights from the thread, for myself (including things that I think I disagree with):
The hardware overhang argument: If a ban on large AI training runs is lifted, it could lead to a sudden jump in AI capabilities, which is particularly dangerous (starts here) — see also LessWrong’s Concepts page on Computing Overhang.
“To summarize: if a ban on large training runs is ever lifted, then large actors will be able to immediately train larger runs than ever before. The longer a ban is sustained, the larger the jump will be, which would mean that we would see abrupt AI progress. // It seems that our best hope for making AI go well is ensuring that AI follows a smooth, incremental development curve. If we can do that, then we can test AI safety ideas on incrementally more powerful models, which would be safer than needing everything to work on the first try.
Some have said to me recently that we can simply continue the ban on large training runs indefinitely, or slowly lift the ban, which would ensure that progress is continuous. I agree that this strategy is possible in theory, but it’s a very risky move. // There are many reasons why a ban on large training runs might be lifted suddenly. For example, a global war might break out, and the United States might want to quickly develop powerful AI to win the war. [...]
Right now, incremental progress is likely driven by the fact that companies simply can’t scale their training runs by e.g. 6 OOMs in a short period of time. If we had a major hardware overhang caused by artificial policy constraints, that may no longer be true. // Even worse, a ban on large training runs would probably cause people to keep doubling down and try to extend the ban to prevent rapid progress via a sudden lifting of restrictions. The more that people double down, the more likely we are to get an overhang.”
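To make the overhang mechanism concrete, here’s a toy sketch of my own (not from the thread); the compute figures and the annual growth factor are made-up placeholders, and the only point is the qualitative shape: the longer actual training runs stay frozen while the compute actors could mobilize keeps growing, the bigger the jump when the ban is lifted.

```python
# Toy illustration of the "hardware overhang" argument (my own sketch;
# the base compute figure and annual growth factor are placeholders).
# Assumption: during a ban, the largest run actually trained stays frozen,
# while the compute an actor *could* mobilize keeps growing each year.

def largest_feasible_run(years_elapsed, base_flop=1e25, annual_growth=4.0):
    """Largest training run (in FLOP) an actor could mobilize after
    `years_elapsed` years, given a constant annual growth factor."""
    return base_flop * annual_growth ** years_elapsed

def post_ban_jump(ban_years, base_flop=1e25, annual_growth=4.0):
    """Ratio of the first post-ban run to the last pre-ban run, if the ban
    freezes actual training runs but not the compute that could be mobilized."""
    frozen_frontier = base_flop
    return largest_feasible_run(ban_years, base_flop, annual_growth) / frozen_frontier

if __name__ == "__main__":
    for years in (0.5, 1, 2, 5):
        print(f"ban lasting {years:>3} years -> first post-ban run could be "
              f"~{post_ban_jump(years):,.0f}x the pre-ban frontier")
```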
In a ~section arguing that it’s important to track that AI companies are incentivized[2] to make their models aligned, there was this point:
“under [the theory that AI capabilities are vastly outpacing alignment research], you might expect GPT-2 to be very aligned with users, GPT-3 to be misaligned in subtle ways, and GPT-4 to be completely off-the-rails misaligned. However, if anything, we see the opposite, with GPT-4 exhibiting more alignment than GPT-2. // Of course, this is mostly because OpenAI put much more effort into aligning GPT-4 than they did for GPT-2, but that’s exactly my point. **As AI capabilities advance, we should expect AI companies to care more about safety. It’s not clear to me that we’re on a bad trajectory.**”
Bold mine; that last bit was especially interesting for me.
There are downsides to competition, and a moratorium right now increases competition
“To beat competition, companies have incentives to cut corners in AI safety in a risky gamble to remain profitable. The impacts of these mistakes may be amplified if their AI is very powerful. //
A moratorium on model scaling right now would likely *increase* the incentive for OpenAI to cut corners, since it would cut their lead ahead of competitors. //
More generally, given that algorithmic progress increasingly makes it cheaper to train powerful AI, scaling regulations plausibly have the effect of reducing the barriers to entry for AI firms, increasing competition—ironically the opposite of traditional regulation.”
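To make the barriers-to-entry point concrete, here’s another toy sketch of my own (again not from the thread, with made-up numbers): if training compute is capped at a fixed level while algorithmic efficiency keeps improving, the capability reachable under that cap keeps rising, so frontier-level runs become affordable to more actors over time.

```python
# Toy illustration of the barriers-to-entry point (my own sketch; the cap
# and the efficiency-doubling period are made-up placeholders).
# Assumption: the compute needed to reach a fixed capability level halves
# every `halving_years` years due to algorithmic progress, so a fixed
# regulatory compute cap buys more capability over time.

def effective_compute(physical_flop, years_elapsed, halving_years=1.0):
    """Capability-adjusted compute of a training run, after accounting for
    algorithmic efficiency gains accumulated over `years_elapsed` years."""
    return physical_flop * 2 ** (years_elapsed / halving_years)

if __name__ == "__main__":
    cap = 1e25  # hypothetical cap on training compute (FLOP)
    for year in range(5):
        ratio = effective_compute(cap, year) / cap
        print(f"year {year}: a run at the cap is 'worth' ~{ratio:.0f}x "
              f"a year-0 run at the same cap")
```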
I don’t know if there’s a canonical source for the arguments about the downsides to competition (it’s discussed by Scott Alexander here, for example, in the section “The Race Argument”); I’ve read about it in bits and pieces in different places, and participated in conversations about it. I’d be interested in pointers.
We have a limited budget of delays, and it might be better to use that budget closer to seriously dangerous systems
This is related to the value of earlier vs. later work done on AI safety.
[1] The older thread:
1) AI progress is driven by two factors: model scaling and algorithmic progress.
2) The open letter suggests prohibiting large training runs while allowing algorithmic progress, leading to a “hardware overhang.”
3) Discontinuous AI progress is less safe than continuous progress due to its unpredictability and challenges in coping with sudden powerful AI.
4) Prohibiting large training runs for 6 months would reduce OpenAI’s lead in AI development.
5) A single leading actor in AI is safer than multiple top actors close together, as they can afford to slow down when approaching dangerous AI.
6) Implementing a moratorium on model scaling now may be premature and would weaken OpenAI’s lead.
7) There is no clear rule for ending the proposed moratorium or a detailed plan for proceeding after the 6-month period.
8) A general call to slow down AI could be acceptable if actors shift focus from algorithmic progress to safety, but a government-enforced prohibition on model scaling seems inappropriate.
[2] (Although, not as much as an impartial observer, probably.)
I can see why some people think the publicity effects of the letter might be valuable, but — when it comes to the 6-month pause proposal itself — I think Matthew’s reasoning is right.
I’ve been surprised by how many EA folk are in favour of the actual proposal, especially given that AI governance literature often focuses on the risks of fuelling races. I’d be keen to read people’s counterpoints to Matthew’s thread(s); I don’t think many expect GPT-5 will pose an existential threat, and I’m not yet convinced that ‘practice’ is a good enough reason to pursue a bad policy.
I wrote a shortform on this thread, inspired by Lizka’s sharing of it.