Greg_Colbourn ⏸️
Global moratorium on AGI, now (Twitter). Founder of CEEALAR (née the EA Hotel; ceealar.org)
does it really make sense to prioritize AI over problems like poverty, malnutrition, or lack of healthcare?
It really depends on how long you think we have left before AI threatens our extinction (i.e. causes the death of every human and animal, and all biological life, on the planet). I think it could be as little as a year, and it’s quite (>50%) likely to be within the next 5 years.
AGI will affect everyone on the planet, whether they believe the “hype” or not (most likely kill them all, once recursive self-improvement kicks in, before 2030 at this rate).
Thecompendium.ai is a good reference. Please read it. Feel free to ask any questions you have about it. (Also, cryonics isn’t a sham; it’s still alive and well, just without many adopters yet, but that’s another topic.)
We shouldn’t be working on making the synthetic brain, we should be working on stopping further development!
Do you disagree or were we just understanding the claim differently?
I disagree, assuming we are operating under the assumption that GPT-5 means “increase above GPT-4 relative to the increase GPT-4 was above GPT-3” (which I think is what you are getting at in the paper?), rather than whatever the thing that will actually be called GPT-5 turns out to be like. And that it has an “o-series style” reasoning model built on top of it, plus whatever other scaffolding is needed to make it agentic (computer use etc).
“a notably incompetent or poorly-prepared society learns lots of new unknown unknowns all at once”
I think that is, unfortunately, where we are heading!
“It [ensuring that we get helpful superintelligence earlier in time] increases takeover risk(!)”
Emphasis here on the “helpful”
I think the problem is the word “ensuring”, when there’s no way we can ensure it. The result is increased risk, as people take this as a green light to go faster and bring forward the time when we take the (most likely fatal) gamble on ASI.
“We need at least 13 9s of safety for ASI, and the best current alignment techniques aren’t even getting 3 9s...”
Can you elaborate on this? How are we measuring the reliability of current alignment techniques here?
I’m going by published results where various techniques are reported to show things like an 80% reduction in harmful outputs, a 90% reduction in deception, a 99% reduction in jailbreaks, etc.
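To make the “nines” framing concrete, here is a minimal sketch of the arithmetic, assuming “N nines of safety” just means a residual failure rate below 10^-N (the example rates are illustrative, not taken from any specific paper):

```python
import math

def nines(failure_rate: float) -> float:
    """Number of 'nines' of reliability implied by a residual failure rate,
    e.g. a 0.001 failure rate (99.9% success) is 3 nines."""
    return -math.log10(failure_rate)

# Illustrative only: a "99% reduction in jailbreaks" still leaves a ~1%
# residual failure rate, i.e. only about 2 nines.
print(nines(0.01))    # ~2.0 nines
print(nines(0.001))   # ~3.0 nines
print(nines(1e-13))   # ~13.0 nines: the level argued to be needed for ASI
```

On this reading, even the best of those reported results is around 11 orders of magnitude short of a 13-nines target.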
Is this good or bad, on your view? Seems more stabilising than a regime which favours AI malfunction “first strikes”?
Yeah. Although an international non-proliferation treaty would be far better. Perhaps MAIM might prompt this though?
but perhaps we should have emphasised more that pausing is an option.
Yes!
But most “if-then” policies I am imagining are not squarely focused on avoiding AI takeover
They should be! We need strict red lines in the evals program[1].
I currently think it’s more likely than not that we avoid full-blown AI takeover, which makes me think it’s worth considering downstream issues.
See replies in the other thread. Thanks again for engaging!
- ^
That are short of things like “found in the wild escaped from the lab”(!)
Thanks for the reply.
By (stupid) analogy, all the preparations for a wedding would be undermined if the couple got into a traffic accident on the way to the ceremony; this does not justify spending ~all the wedding budget on car safety.
This is a stupid analogy! (Traffic accidents aren’t very likely.) A better analogy would be: “all the preparations for a wedding would be undermined if the couple weren’t able to be together because one was stranded on Mars with no hope of escape. This justifies spending all the wedding budget on trying to rescue them.” Or perhaps even better: “all the preparations for a wedding would be undermined if the couple probably won’t be able to be together, because one is taking part in a mission to Mars that half the engineers and scientists on the guest list are convinced will be a death trap (for detailed technical reasons). This justifies spending all the wedding budget on trying to stop the mission from going ahead.”
see e.g. Katja Grace’s post here
I think Wei Dai’s reply articulates my position well:
Suppose you went through the following exercise. For each scenario described under “What it might look like if this gap matters”, ask:
1. Is this an existentially secure state of affairs?
2. If not, what are the main obstacles to reaching existential security from here?
and collected the obstacles, you might assemble a list like this one, which might update you toward AI x-risk being “overwhelmingly likely”. (Personally, if I had to put a number on it, I’d say 80%.)
Your next point seems somewhat of a straw man?
If I tell someone the world will be run by dolphins in the year 2050, and they disagree, I can reply, “oh yeah, well you tell me what the world looks like in 2050”
No, the correct reply is that dolphins won’t run the world because they can’t develop technology, due to their physical form (no opposable thumbs etc), and they won’t be able to evolve their physical form in such a short time (even with help from human collaborators)[1]. I.e. an object-level rebuttal.
The opponents of these arguments were not able to describe the ways that the world could avoid these dire fates in detail
No, but they had sound theoretical arguments. I’m saying these are lacking when it comes to why it’s possible to align/control/not go extinct from ASI.
Altogether, I think you’re coming from a reasonable but different position, that takeover risk from ASI is very high (sounds like 60–99% given ASI?)
I’d say ~90% (and the remaining 10% is mostly exotic factors beyond our control [footnote 10 of linked post]).
I do think this axis of disagreement might not be as sharp as it seems, though — suppose person A has [9]0% p(takeover) and person B is on 1%. Assuming the same marginal tractability and neglectedness between takeover and non-takeover work, person A thinks that takeover-focused work is [9]0× more important; but non-takeover work is 10/99≈0.[1] times as important, compared to person B.
But it’s worse than this, because the only viable solution to avoid takeover is to stop building ASI, in which case the non-takeover work is redundant (we can mostly just hope to luck out with one of the exotic factors).
- ^
And they won’t be able to be helped by ASIs either, because the control/alignment problem will remain unsolved (and probably unsolvable, for reasons x, y, z...)
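To make the arithmetic in the quoted comparison explicit, here is a minimal sketch using the 90% and 1% figures from the quote, and assuming (as the quote does) equal marginal tractability and neglectedness:

```python
# Relative importance of takeover-focused vs non-takeover work for two people
# who differ only in p(takeover). Figures are the 90%/1% from the quote.
p_takeover_a = 0.90  # person A
p_takeover_b = 0.01  # person B

takeover_work_ratio = p_takeover_a / p_takeover_b                   # A vs B on takeover work
non_takeover_work_ratio = (1 - p_takeover_a) / (1 - p_takeover_b)   # A vs B on non-takeover work

print(takeover_work_ratio)      # ~90 -> takeover work is ~90x more important to A
print(non_takeover_work_ratio)  # ~0.101 (10/99) -> non-takeover work is ~0.1x as important to A
```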
Shouldn’t this mean you agree with the statement?
Thanks for the explanation.
Whilst zdgroff’s comment “acknowledges the value of x-risk reduction in general from a non-longtermist perspective”, it downplays it quite heavily imo (and the OP comment does so even more, using the pejorative “fanatical”).
I don’t think the linked post makes the point very persuasively. Looking at the table, at best there is an equivalence. I think a rough estimate of the cost effectiveness of pushing for a Pause is orders of magnitude higher.
I’m not sure if GiveWell top charities do? Preventing extinction is a lot of QALYs, and it might not cost more than a few $B per year of extra time bought in terms of funding Pause efforts (~$1/QALY!?)
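For what it’s worth, here is the kind of back-of-the-envelope calculation behind a figure like ~$1/QALY; the population, quality-weight, and cost numbers are illustrative assumptions, and it is conditional on the funded Pause effort actually buying that extra year:

```python
# Back-of-the-envelope cost per QALY of buying one extra year before extinction.
# All numbers are illustrative assumptions.
world_population = 8e9          # roughly everyone alive today
quality_weight = 1.0            # treat each extra life-year as ~1 QALY
funding_per_year_bought = 4e9   # assumed "few $B" of Pause funding per year of delay

qalys_gained = world_population * quality_weight   # one extra year for everyone
cost_per_qaly = funding_per_year_bought / qalys_gained
print(cost_per_qaly)  # 0.5 -> on the order of ~$1/QALY
```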
I’m not that surprised that the above comment has been downvoted to −4 without any replies (and this one will probably be buried by an even bigger avalanche of downvotes!), but it still makes me sad. EA will be ivory-tower-ing until the bitter end, it seems. It’s a form of avoidance. These things aren’t nice to think about. But it’s close now, so it’s reasonable for it to feel viscerally real. I guess it won’t be EA that saves us (from the mess it helped accelerate), if we do end up saved.
Having the superpowers on board is the main thing. If others opt out, then enforcement against them can be effective.
No, but it’s far better than what we have now.
My model looks something like this:
There are a bunch of increasingly hard questions on the Alignment Test. We need to get enough of the core questions right to avoid the ASI → everyone quickly dies scenario. This is the ‘passing grade’. There are some bonus/extra credit questions that we need to also get right to get an A (a flourishing future).

I think the bonus/extra credit questions are part of the main test: if you don’t get them right, everyone still dies, but maybe a bit more slowly.
All the doom flows through the cracks of imperfect alignment/control. And we can asymptote toward, but never reach, existential safety[1].
- ^
Of course this applies to all other x-risks too. It’s just that ASI x-risk is very near term and acute (in absolute terms, and relative to all the others), and we aren’t even starting in earnest with the asymptoting yet (and likely won’t if we don’t get a Pause).
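As a toy illustration of this point (made-up numbers, purely to show the shape of the compounding): with a constant residual failure rate, doom becomes near-certain over time, whereas a failure rate that keeps shrinking only asymptotes toward, and never reaches, full safety:

```python
# Toy model: cumulative survival under imperfect alignment. Numbers are assumptions.
constant_risk = 0.02    # scenario (a): a constant 2%/year chance of catastrophe
shrinking_risk = 0.02   # scenario (b): starts at 2%/year...
shrink_factor = 0.9     # ...but shrinks by 10% each year as safety improves

survival_constant = 1.0
survival_shrinking = 1.0
for _ in range(100):
    survival_constant *= (1 - constant_risk)
    survival_shrinking *= (1 - shrinking_risk)
    shrinking_risk *= shrink_factor

print(round(survival_constant, 2))   # ~0.13 after a century, heading toward 0
print(round(survival_shrinking, 2))  # ~0.82: bounded risk, but never full safety
```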
Hi Niel, what I’d like to see is an argument for the tractability of successfully “navigating the transition to a world with AGI” without a global catastrophe (or extinction) (i.e. an explanation for why your p(doom|AGI) is lower). I think this is much less tractable than getting a (really effective) Pause! (Even if a Pause itself is somewhat unlikely at this point.)
I think most people in EA have relatively low (but still macroscopic) p(doom)s (e.g. 1-20%), and have the view that “by default, everything turns out fine”. And I don’t think this has ever been sufficiently justified. The common view is that alignment will just somehow be solved enough to keep us alive, and maybe even thrive (if we just keep directing more talent and funding to research). But then the extrapolation to the ultimate implications of such imperfect alignment (e.g. gradual disempowerment → existential catastrophe) never happens.
Ok, so in the spirit of
EA’s focus on collaborativeness and truthseeking has meant that people encouraged us to interrogate whether our previous plans were in line with our beliefs
[about p(doom|AGI)], and
we aim to be prepared to change our minds and plans if the evidence
[is lacking], I ask if you have seriously considered whether
safely navigating the transition to a world with AGI
is even possible? (Let alone at all likely from where we stand.)
You (we all) should be devoting a significant fraction of resources toward slowing down/pausing/stopping AGI (e.g. pushing for a well enforced global non-proliferation treaty on AGI/ASI), if we want there to be a future at all.
Reposting this from Daniel Kokotajlo:
This is probably the most important single piece of evidence about AGI timelines right now. Well done! I think the trend should be superexponential, e.g. each doubling takes 10% less calendar time on average. Eli Lifland and I did some calculations yesterday suggesting that this would get to AGI in 2028. Will do more serious investigation soon.
Why do I expect the trend to be superexponential? Well, it seems like it sorta has to go superexponential eventually. Imagine: We’ve got to AIs that can with ~100% reliability do tasks that take professional humans 10 years. But somehow they can’t do tasks that take professional humans 160 years? And it’s going to take 4 more doublings to get there? And these 4 doublings are going to take 2 more years to occur? No, at some point you “jump all the way” to AGI, i.e. AI systems that can do any length of task as well as professional humans -- 10 years, 100 years, 1000 years, etc.
Also, zooming in mechanistically on what’s going on, insofar as an AI system can do tasks below length X but not above length X, it’s gotta be for some reason—some skill that the AI lacks, which isn’t important for tasks below length X but which tends to be crucial for tasks above length X. But there are only a finite number of skills that humans have that AIs lack, and if we were to plot them on a horizon-length graph (where the x-axis is log of horizon length, and each skill is plotted on the x-axis where it starts being important, such that it’s not important to have for tasks less than that length) the distribution of skills by horizon length would presumably taper off, with tons of skills necessary for pretty short tasks, a decent amount necessary for medium tasks (but not short), and a long thin tail of skills that are necessary for long tasks (but not medium), a tail that eventually goes to 0, probably around a few years on the x-axis. So assuming AIs learn skills at a constant rate, we should see acceleration rather than a constant exponential. There just aren’t that many skills you need to operate for 10 days that you don’t also need to operate for 1 day, compared to how many skills you need to operate for 1 hour that you don’t also need to operate for 6 minutes.
There are two other factors worth mentioning which aren’t part of the above: One, the projected slowdown in capability advances that’ll come as compute and data scaling falters due to becoming too expensive. And two, pointing in the other direction, the projected speedup in capability advances that’ll come as AI systems start substantially accelerating AI R&D.
This is going viral on X (2.8M views as of posting this comment).
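For readers who want to see what “each doubling takes 10% less calendar time” implies, here is a minimal sketch; the starting horizon, initial doubling time, and AGI threshold are illustrative placeholders, not the parameters Kokotajlo and Lifland actually used (their calculation landed on 2028):

```python
# Superexponential task-horizon growth: each doubling takes 10% less calendar time.
# Starting values are placeholders, not fitted to the METR data.
horizon_hours = 1.0            # assumed current AI task horizon (~1 hour)
doubling_time_years = 7 / 12   # assumed initial doubling time (~7 months)
shrink = 0.9                   # each doubling takes 10% less time than the last
year = 2025.0

agi_threshold_hours = 2000 * 10   # ~10 work-years of professional human effort

while horizon_hours < agi_threshold_hours:
    year += doubling_time_years
    horizon_hours *= 2
    doubling_time_years *= shrink

print(round(year, 1))  # ~2029.6 with these placeholder numbers
```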
On this view, why not work to increase extinction risk? (It would be odd if doing nothing was the best course of action when the stakes are so high either way.)
Even if it seems net-negative now, we don’t know that it always will be (and we can work to make it net-positive!).
Also, on this view, why not work to increase our chance of extinction?
I think another crucial consideration is how likely, and near, extinction is. If it is near, with high likelihood (and I think it is, due to misaligned ASI being on the horizon), then it’s unlikely there will be time for trajectory change work to bear fruit.
Whilst this works for saving individual lives (de Sousa Mendes, starfish), it unfortunately doesn’t work for AI x-risk. Whether or not AI kills everyone is pretty binary. And we probably haven’t got long left. Some donations (e.g. those to orgs pushing for a global moratorium on further AGI development) might incrementally reduce x-risk[1], but I think most won’t (AI Safety research without a moratorium first[2]). And failing at preventing extinction is not “ok”! We need to be putting much more effort into it.
- ^
And at least kick the can down the road a few years, if successful.
- ^
I guess you are much more optimistic about AI Safety research paying off, if your p(doom) is “only” 10%. But I think the default outcome is doom (p(doom|AGI) ~90%), and we are nowhere near solving alignment/control of ASI (the deep learning paradigm is statistical, and all the doom flows through the cracks of imperfect alignment).