Greg_Colbourn ⏸️
Reposting this from Daniel Kokotajlo:
This is probably the most important single piece of evidence about AGI timelines right now. Well done! I think the trend should be superexponential, e.g. each doubling takes 10% less calendar time on average. Eli Lifland and I did some calculations yesterday suggesting that this would get to AGI in 2028. Will do more serious investigation soon.
Why do I expect the trend to be superexponential? Well, it seems like it sorta has to go superexponential eventually. Imagine: We’ve got to AIs that can with ~100% reliability do tasks that take professional humans 10 years. But somehow they can’t do tasks that take professional humans 160 years? And it’s going to take 4 more doublings to get there? And these 4 doublings are going to take 2 more years to occur? No, at some point you “jump all the way” to AGI, i.e. AI systems that can do any length of task as well as professional humans -- 10 years, 100 years, 1000 years, etc.
Also, zooming in mechanistically on what’s going on, insofar as an AI system can do tasks below length X but not above length X, it’s gotta be for some reason—some skill that the AI lacks, which isn’t important for tasks below length X but which tends to be crucial for tasks above length X. But there are only a finite number of skills that humans have that AIs lack, and if we were to plot them on a horizon-length graph (where the x-axis is log of horizon length, and each skill is plotted on the x-axis where it starts being important, such that it’s not important to have for tasks less than that length) the distribution of skills by horizon length would presumably taper off, with tons of skills necessary for pretty short tasks, a decent amount necessary for medium tasks (but not short), and a long thin tail of skills that are necessary for long tasks (but not medium), a tail that eventually goes to 0, probably around a few years on the x-axis. So assuming AIs learn skills at a constant rate, we should see acceleration rather than a constant exponential. There just aren’t that many skills you need to operate for 10 days that you don’t also need to operate for 1 day, compared to how many skills you need to operate for 1 hour that you don’t also need to operate for 6 minutes.
There are two other factors worth mentioning which aren’t part of the above: One, the projected slowdown in capability advances that’ll come as compute and data scaling falters due to becoming too expensive. And two, pointing in the other direction, the projected speedup in capability advances that’ll come as AI systems start substantially accelerating AI R&D.
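As a rough illustration (not a reconstruction of Kokotajlo and Lifland’s actual calculation), here is a minimal sketch of the “each doubling takes 10% less calendar time” claim above. The starting horizon, first doubling time, and “AGI-level” target horizon are assumptions chosen purely for illustration:

```python
# Minimal sketch: if each successive doubling of the task horizon takes 10%
# less calendar time than the previous one, the total time to reach ANY
# horizon is bounded by first_doubling / 0.1 (a convergent geometric series),
# which is one way to see the "jump all the way" point above.
# All starting numbers below are illustrative assumptions.

def years_to_horizon(start_hours: float, target_hours: float,
                     first_doubling_years: float,
                     shrink_per_doubling: float = 0.10) -> float:
    """Calendar years until the task horizon reaches target_hours."""
    horizon, doubling, total = start_hours, first_doubling_years, 0.0
    while horizon < target_hours:
        total += doubling
        horizon *= 2
        doubling *= 1.0 - shrink_per_doubling
    return total

# Assumed: ~1-hour horizon today, first doubling takes ~0.6 years (~7 months),
# "AGI-level" horizon = one working year (~2,000 hours).
print(years_to_horizon(1.0, 2_000.0, 0.6))   # ~4.1 years
print(0.6 / 0.10)                            # 6.0 years: upper bound for ANY horizon
```

The same goes for the “tapering skill distribution” argument: if skill thresholds thin out along the log-horizon axis and the AI acquires skills at a constant rate, the achievable horizon accelerates rather than doubling at a constant rate. A toy model, with the distribution, skill count, and learning rate all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Skill thresholds in log10(hours): dense at short horizons, with a thin tail
# at long ones (an exponential on the log axis is one tapering shape).
thresholds = np.sort(rng.exponential(scale=1.0, size=10_000))

skills_per_year = 2_000   # constant learning rate (assumption)
for year in range(1, 6):
    learned = min(year * skills_per_year, len(thresholds))
    horizon_hours = 10 ** thresholds[learned - 1]   # longest fully-covered horizon
    print(f"year {year}: horizon ~ {horizon_hours:,.1f} hours")
# The year-on-year multiplier on the horizon keeps growing as the tail of
# remaining skills thins out: acceleration, not a constant doubling time.
```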
This is going viral on X (2.8M views as of posting this comment).
On this view, why not work to increase extinction risk? (It would be odd if doing nothing was the best course of action when the stakes are so high either way.)
Even if it seems net-negative now, we don’t know that it always will be (and we can work to make it net-positive!).
Also, on this view, why not work to increase our chance of extinction?
I think another crucial consideration is how likely, and how near, extinction is. If it is near, with high likelihood (and I think it is, down to misaligned ASI being on the horizon), then it’s unlikely there will be time for trajectory change work to bear fruit.
With intelligence comes reverence for life and increased awareness, altruism.
This isn’t always true: see, in humans, intelligent sociopaths and mass murderers. It’s unlikely to be true of AI either, unless moral realism is true AND the AI discovers the true morality of the universe AND said morality is compatible with human flourishing. See: Orthogonality Thesis.
Typos:
Footnotes 142-144 are out of order (looks like 2 paragraphs have been swapped without their footnote numbers being swapped)
Footnote 151: “‘What Do I Think about Community Notes?’” isn’t referenced (it’s Vitalik’s essay, I guess?)
Footnote 154 truncated mid-sentence? (Also, Footnote 154 is Footnote 152 in the PDF.)
“conditions under which it does not make sense to punt on early preparation.”—should this be “does make sense”?
Some other quotes and comments from my notes (in addition to my main comment):
“If GPT-6 were as capable as a human being”
This is conservative. Why not “GPT-5”? (In which case the 100,000x efficiency gain becomes 10,000,000,000x.)
See the APM section for how misaligned ASI takeover could lead to extinction. Also
“if nuclear fusion grew to produce half as much power as the solar radiation which falls on Earth”
brings to mind Yudkowsky’s “boiling the oceans” scenario.
“Issues around digital rights and welfare interact with other grand challenges, most notably AI takeover. In particular, granting AIs more freedoms might accelerate a ‘gradual disempowerment’ scenario, or make a more coordinated takeover much easier, since AI systems would be starting in a position of greater power. Concerns around AI welfare could potentially limit some methods for AI alignment and control. On the other hand, granting freedoms to digital people (and giving them power to enjoy those freedoms) could reduce their incentive to deceive us and try to seize power, by letting them pursue their goals openly instead and improving their default conditions.”
This is important. Something I need to read and think more about.
“If we can capture more of the wealth that advanced AI would generate before it poses catastrophic risks, then society as a whole would behave more cautiously.”
Why is this likely? Surely we need a Pause to be able to do this?
“Unknown unknowns”
Shouldn’t we expect these to be more likely to cause extinction than to lead to a good future? (Given the Vulnerable World hypothesis.)
“One stark and still-underappreciated challenge is that we accidentally lose control over the future to an AI takeover.”
“Here’s a sceptical response you could make to our argument: many of the challenges we list will arise only after the development of superintelligence. If superintelligence is catastrophically misaligned, then it will take over, and the other challenges won’t be relevant.” [my emphasis in bold]
Yes!
“Ensuring that we get helpful superintelligence earlier in time, that it is useable and in fact used by key decision-makers, and that is accessible to as wide a range of actors as possible without increasing other catastrophic risks.” [my emphasis in bold]
It increases takeover risk(!) given lack of progress on (the needed perfect[1]) alignment and control techniques for ASI.
“6. AGI Preparedness”
This whole section (the whole paper?) assumes that an intelligence explosion is inevitable. There is no mention of “pause” or “moratorium” anywhere in the paper.
“At the moment, the machine learning community has major influence via which companies they choose to work for. They could form a “union of concerned computer scientists” in order to be able to act as a bloc to push development towards more socially desirable outcomes, refusing to work for companies or governments that cross certain red lines. It would be important to do this soon, because most of this influence will be lost once AI has automated machine learning research and development.
Other actors have influence too. Venture capitalists have influence via which private companies they invest in. Consumers have influence through which companies they purchase AI products from. Investigative journalists can have major influence by uncovering bad behaviour from AI companies or politicians, and by highlighting which actors seem to be acting responsibly. Individuals can do similarly by amplifying those messages on social media, and by voting for more responsible political candidates.”
We need much more of this!
“Slowing the intelligence explosion. If we could slow down the intelligence explosion in general, that would give decision-makers and institutions more time to react thoughtfully.”
Yes!
“One route to prevent chaotically fast progress is for the leading power (like the US and allies) to build a strong lead, allowing it to comfortably use stabilising measures over the period of fastest change. Such a lead could even be maintained by agreement, if the leader can credibly commit to sharing power and benefits with the laggards after achieving AGI, rather than using that advantage to dismantle its competition. Because post-superintelligence abundance will be so great, agreements to share power and benefits should strongly be in the leader’s national self-interest: as we noted in the section on abundance, having only 80% of a very large pie is much more desirable than an 80% chance of the whole pie and 20% chance of nothing. Of course, making such commitments credible is very challenging, but this is something that AI itself could help with.”
But could also just lead to Mutually Assured AI Malfunction (MAIM).
“Second, regulations which are sensible on their own terms could also slow peak rates of development. These could include mandatory predeployment testing for alignment and dangerous capabilities, tied to conditions for release; or even welfare-oriented rights for AI systems with a reasonable claim to moral status. That said, regulation along these lines would probably need international agreement in order to be effective, otherwise they could simply advantage whichever countries did not abide by them.”
An international agreement sounds good.
“Third, we could bring forward the start of the intelligence explosion, stretching out the intelligence explosion over time, so that peak rates of change are more manageable. This could give more time to react, and a longer period of time to benefit from excellent AI advice prior to grand challenges. For example, accelerating algorithmic progress now means there would be less available room for improvement in software at the time of the intelligence explosion, and the software feedback loop couldn’t go on for as long before compute constraints kick in.”
This sounds like a terrible and reckless idea! We don’t know exactly where the thresholds are for recursive self-improvement to kick in.
“we think an intelligence explosion is more likely than not this century, and may well begin within a decade.”
Yes, unless we stop it from happening (and we should!).
“If–then commitments”
The problem is that by the time the “if” is verified to have occurred, it could well be too late to do the “then” (e.g. once a proto-ASI has already escaped onto the internet).
“We shouldn’t succumb to the evidence dilemma: if we wait until we have certainty about the likelihood of the intelligence explosion, it will by then be too late to prepare. It’s too late to buy home insurance by the time you see smoke creeping under the kitchen door.”
Exactly! Need a moratorium now, not unworkable “if-then” commitments!
“challenges around space governance, global governance, missile defence, and nuclear weapons are not directly questions about how to design, build, and deploy AI itself. Rather, AI accelerates and reorders the pace at which these challenges arrive, forcing us to confront them in a world changing at disorienting speed.”
This is assuming ASI is alignable! (The whole “Not just misalignment” section is.)
“And, often, the most important thing to do is to ensure that superintelligence is in fact used in beneficial ways, and as soon as possible.”
This has not been justified in the paper.
[1] We need at least 13 9s of safety for ASI, and the best current alignment techniques aren’t even getting 3 9s...
The paper is an interesting read, but I think that it unfortunately isn’t of much practical value down to the omission of a crucial consideration:
The paper rests on the assumption that alignment/control of artificial superintelligence (ASI) is possible. This has not been theoretically established, let alone assessed to be practically likely in the time we have before an intelligence explosion. As far as I know, there aren’t any sound supporting arguments for the assumption (and you don’t reference any), and in fact there are good arguments on the other side for why aligning or controlling ASI is fundamentally impossible.

AI Takeover is listed first in the Grand Challenges section, but it trumps all the others because it is the default outcome. You even say “we should expect AIs that can outsmart humans”, and “There are reasonable arguments for expecting misalignment, and subsequent takeover, as the ‘default’ outcome (without concerted efforts to prevent it)”, and “There is currently no widely agreed-upon solution to the problems of aligning and controlling advanced AI systems, and so leading experts currently see the risk of AI takeover as substantial.” I still don’t understand where the ~10% estimates are coming from, though [fn 93: “just over 50% of respondents assigned a subjective probability of 10% or more to the possibility that ‘human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species’” (Grace et al., ‘Thousands of AI Authors on the Future of AI’)]. They seem logically unfounded. What is happening in the other ~90%? I didn’t get any satisfactory answers when asking here a while back.
You say “In this paper, we won’t discuss AI takeover risk in depth, but that’s because it is already well-discussed elsewhere.” It’s fine that you want to talk about other stuff in the paper, but that doesn’t make it any less of a crucial consideration that overwhelms concern for all of the other issues!
You conclude by saying that “Many are admirably focused on preparing for a single challenge, like misaligned AI takeover… But focusing on one challenge is not the same as ignoring all others: if you are a single-issue voter on AI, you are probably making a mistake.” I disagree, because alignment of ASI hasn’t been shown to even be solvable in principle! It is the single most important issue by far. The others don’t materialise as concerns, because they assume humans will remain in control of ASI for the most part (which is very unlikely to happen). The only practical solution (which also dissolves nearly all the other issues identified in the paper) is to prevent ASI from being built[1]. We need a well-enforced global moratorium on ASI as soon as possible.
[1] At least until either it can be built safely, or the world collectively decides to take whatever risk remains after a consensus on an alignment/control solution is reached. At which point the other issues identified in the paper become relevant.
I think that without knowing people’s assessment of extinction risk (e.g. chance of extinction over the next 5, 10, 20, 50, 100 years)[1], the answers here don’t provide much information value.
I think a lot of people on the disagree side would change their mind if they believed (as I do) that there is a >50% chance of extinction in the next 5 years (absent further intervention).
It would be good if there were a short survey to establish such background assumptions behind people’s votes.
[1] And their assessment of the chance that AI successors will be morally valuable, as per footnote 2 of the statement.
AI Safety work is starting to make real progress for avoiding the worst outcomes
What makes you think this? Every technique we have is statistical in nature (due to the nature of the deep learning paradigm); none is even approaching 3 9s of safety, and we need something like 13 9s if we are going to survive more than a few years of ASI.
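For intuition on where a “13 9s” style requirement could come from: if each consequential action has an independent chance of catastrophic failure, the survival probability over N such actions is roughly exp(−p·N). The action count below is an assumption picked for illustration, not a figure from this thread or the paper:

```python
import math

def survival(per_action_failure: float, num_actions: float) -> float:
    """P(no catastrophic failure) over num_actions independent high-stakes
    actions, each failing with probability per_action_failure.
    For small p this is ~ (1 - p) ** num_actions."""
    return math.exp(-per_action_failure * num_actions)

N = 1e12   # assumed number of consequential ASI actions over a few years
for nines in (3, 12, 13):
    p = 10.0 ** -nines
    print(f"{nines} nines of per-action safety -> survival ~ {survival(p, N):.2f}")
# 3 nines  -> ~ 0.00
# 12 nines -> ~ 0.37
# 13 nines -> ~ 0.90
```

Whether 10^12 is the right order of magnitude is itself an assumption; the point is only that per-action reliability requirements scale with how many chances the system gets to fail.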
AI capabilities advancement seems to be on a relatively good path (less foomy)
I also don’t see how it’s less foomy. SWE-bench and ML researcher automation are still improving; what happens when the models are drop-in replacements for top researchers?
Yet gradual disempowerment risks seem extremely hard to mitigate
What is the eventual end result after total disempowerment? Extinction, right?
Wait, how does your 93% disagree tie in with your support for PauseAI?
Does this change if our chance of extinction in the next few years is high? (Which I think it is, from AI).
Vinding says:
There is a key point on which I agree strongly with advocates for an AI pause: there is a massive moral urgency in ensuring that we do not end up with horrific AI-controlled outcomes. Too few people appreciate this insight, and even fewer seem to be deeply moved by it.
At the same time, I think there is a similarly massive urgency in ensuring that we do not end up with horrific human-controlled outcomes. And humanity’s current trajectory is unfortunately not all that reassuring with respect to either of these broad classes of risks …
The upshot for me is that there is a roughly equal moral urgency in avoiding each of these categories of worst-case risks.

But he does not justify this equality. It seems highly likely to me that ASI-induced s-risks are on a much larger scale than human-induced ones (down to ASI being much more powerful than humanity), creating a (massive) asymmetry in favour of preventing ASI.
Agree. But I’m sceptical that we could robustly align or control a large population of such AIs (and how would we cap the population?), especially considering the speed advantage they are likely to have.
Yeah, I think a lot of the overall debate (including what is most ethical to focus on!) depends on AI trajectories and control.
What level of intelligence are you imagining such a system being at? Some percentile on the scale of top-performing humans? Somewhat above the most intelligent humans?
Ok, so in the spirit of
[about p(doom|AGI)], and
[is lacking], I ask if you have seriously considered whether
is even possible? (Let alone at all likely from where we stand.)
You (we all) should be devoting a significant fraction of resources toward slowing down/pausing/stopping AGI (e.g. pushing for a well-enforced global non-proliferation treaty on AGI/ASI), if we want there to be a future at all.