New article in Time Ideas by Eliezer Yudkowsky.
Here are some selected quotes.
In reference to the letter that just came out (discussion here):
We are not going to bridge that gap in six months.
It took more than 60 years between when the notion of Artificial Intelligence was first proposed and studied, and for us to reach today’s capabilities. Solving safety of superhuman intelligence—not perfect safety, safety in the sense of “not killing literally everyone”—could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we’ve overcome in our history, because we are all gone.
…
Some of my friends have recently reported to me that when people outside the AI industry hear about extinction risk from Artificial General Intelligence for the first time, their reaction is “maybe we should not build AGI, then.”
Hearing this gave me a tiny flash of hope, because it’s a simpler, more sensible, and frankly saner reaction than I’ve been hearing over the last 20 years of trying to get anyone in the industry to take things seriously. Anyone talking that sanely deserves to hear how bad the situation actually is, and not be told that a six-month moratorium is going to fix it.
Here’s what would actually need to be done:
The moratorium on new large training runs needs to be indefinite and worldwide. There can be no exceptions, including for governments or militaries. If the policy starts with the U.S., then China needs to see that the U.S. is not seeking an advantage but rather trying to prevent a horrifically dangerous technology which can have no true owner and which will kill everyone in the U.S. and in China and on Earth. If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.
Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for anyone, including governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.
That’s the kind of policy change that would cause my partner and I to hold each other, and say to each other that a miracle happened, and now there’s a chance that maybe Nina will live. The sane people hearing about this for the first time and sensibly saying “maybe we should not” deserve to hear, honestly, what it would take to have that happen. And when your policy ask is that large, the only way it goes through is if policymakers realize that if they conduct business as usual, and do what’s politically easy, that means their own kids are going to die too.
Shut it all down.
We are not ready. We are not on track to be significantly readier in the foreseeable future. If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.
Shut it down.
In light of this discussion about whether people would find this article alienating, I sent it to four very smart/reasonable friends who aren’t involved in EA, don’t work on AI, and don’t live in the Bay Area (definitely not representative TIME readers, but maybe representative of the kind of people EAs want to reach). Given that I don’t work on AI and have only ever discussed AI risk with one of them, I don’t think social desirability bias played much of a role. I also ran this comment by them after we discussed. Here’s a summary of their reactions:
Friend 1: Says it’s hard for them to understand why AI would want to kill everyone, but acknowledges that experts know much more about this than they do and takes seriously that experts believe this is a real possibility. Given this, they think it makes sense to err on the side of caution and drastically slow down AI development to get the right safety measures in place.
Friend 2: Says it’s intuitive that AI being super powerful, not well understood, and rapidly developing is a dangerous combination. Given this, they think it makes sense to implement safeguards. But they found the article overwrought, especially given missing links in the argument (e.g., they think it’s unclear whether/why AI would want our atoms, given immense uncertainty about what AI would want; compared their initial reaction to this argument to their initial reaction to Descartes’ ontological argument).
Friend 3: Says they find this article hard to argue with, especially because they recognize how little they know on the topic relative to EY; compared themselves disagreeing with it to anti-vaxxers arguing with virologists. Given the uncertainty about risks, they think it’s pretty obvious we ought to slow down.
Friend 4: Says EY knows vastly more about this issue than they do, but finds the tone of the article a little over the top, given missing links. Remains optimistic AI will make the world better, but recognizes possible optimism bias. Generally agrees there should be more safeguards in place, especially given there are ~none.
Anyways, I would encourage others to add their own anecdata to the mix, so we can get a bit more grounded on how people interpret articles like this one, since this seems important to understand and we can do better than just speculate.
This comment was fantastic! Thanks for taking the time to do this.
In a world where the most prominent online discussants tend to be weird in a bunch of ways, we don’t hear enough reactions from “normal” people who are in a mindset of “responding thoughtfully to a friend”. I should probably be doing more friend-scanning myself.
I strongly disagree with sharing this outside rationalist/EA circles, especially if people don’t know much about AI safety or x risk. I think this could drastically shift someone’s opinion on Effective Altruism if they’re new to the idea.
This article was published in TIME, which has a print readership of 1.6 million
The article doesn’t even use the words “effective altruism”
These non-EAs were open to the ideas raised by the article
Value of information seems to exceed the potential damage done at these sample sizes for me.
Hi Wil,
Thanks for sharing your thoughts. I am slightly confused that your comment is overall downvoted (-8 total karma now). I upvoted it, but disagreed.
Given the typical correlation between upvotes and agreevotes, this is actually much more upvoted than you would expect (holding constant the disagreevotes).
I didn’t actually downvote, but I did consider it, because I dislike PR-criticism of people for disclosing true, widely available information in the process of performing a useful service.
Thanks for the feedback, Larks.
Fair point.
I think it makes sense to downvote if one thinks:
The comment should be less visible.
It would have been better for the comment not to have been published.
Thanks! It’s okay. This is a very touchy subject and I wrote a strongly opinionated piece so I’m not surprised. I appreciate it.
Thanks for reporting back! I’m sharing it with my friends as well (none of whom are in tech, and most of whom live in fairly rural parts of Canada) to see their reaction.
It made it to the White House Press Briefing. This clip is like something straight out of the film Don’t Look Up. Really hope that the ending is better (i.e. the warning is actually heeded).
Things seem to be going at least a bit better than Yudkowsky was thinking just a month ago.
Sounds right!
When a very prominent member of the community is calling for governments to pre-commit to pre-emptive military strikes against countries allowing the construction of powerful AI in the relatively near-term, including against nuclear powers*, it’s really time for people to actually take seriously the stuff about rejecting naive utilitarianism, where you do crazy-sounding things whenever a quick expected value calculation makes them look maximizing.
*At least I assume that’s what he means by being prepared to risk a higher chance of nuclear war.
Clarification for anyone who’s reading this comment outside of having read the article – the article calls for governments to adopt clear policies involving potential preemptive military strikes in certain circumstances (specifically, against a hypothetical “rogue datacenter”, as these datacenters could be used to build AGI), but it is not calling for any specific military strike right now.
Have edited. Does that help?
Yeah, I think that’s better
Agreed! I think the policy proposal is a good one that makes a lot of sense, and I also think this is a good time to remind people that “international treaties with teeth are plausibly necessary here” doesn’t mean it’s open season on terrible naively consequentialist ideas that sound “similarly extreme”. See the Death With Dignity FAQ.
This goes considerably beyond ‘international treaties with teeth are plausibly necessary here’:
‘If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.’
Eliezer is proposing attacks on any countries that are building AI-above-a-certain-level, whether or not they sign up to the treaty. That is not a treaty enforcement mechanism. I also think “with teeth” kind of obscures by abstraction here (since it doesn’t necessarily sound like it means war/violence, but that’s what’s being proposed).
Is this actually inconsistent? If a country doesn’t sign up for the Biological Weapons Convention, and then acts in flagrant disregard of it, would they not be expected to be faced with retaliatory action from signatories, including, depending on specifics, plausibly up to military force? My sense was that people who pushed for the introduction and enforcement of the BWC would have imagined such a response as plausibly within bounds.
I don’t think what Eliezer is proposing would necessarily mean war/violence either – conditional on a world actually getting to the point where major countries are agreeing to such a treaty, I find it plausible that smaller countries would simply acquiesce in shutting down rogue datacenters. If they didn’t, before military force was used, diplomacy would be used. Then probably economic sanctions would be used. Eliezer is saying that governments should be willing to escalate to using military force if necessary, but I don’t think it’s obvious that in such a world military force would be necessary.
Yep, I +1 this response. I don’t think Eliezer is proposing anything unusual (given the belief that AGI is more dangerous than nukes, which is a very common belief in EA, though not universally shared). I think the unusual aspect is mostly just that Eliezer is being frank and honest about what treating AGI development and proliferation like nuclear proliferation looks like in the real world.
He explained his reasons for doing that here:
https://twitter.com/ESYudkowsky/status/1641452620081668098
https://www.lesswrong.com/posts/Lz64L3yJEtYGkzMzu/rationality-and-the-english-language
I’m not sure I have too much to add, and I think that I do have concerns about how Eliezer wrote some of this letter given the predictable pushback it’s seen, though maybe breaking the Overton Window is a price worth paying? I’m not sure there.
In any case, I just wanted to note that we have at least 2 historical examples of nations carrying out airstrikes on bases in other countries without that leading to war, though admittedly the nation attacked was not nuclear:
Operation Opera—where Israeli jets destroyed an unfinished nuclear reactor in Iraq in 1981.
Operation Orchard—where the Israeli airforce (again) destroyed a suspected covert nuclear facility in Syria in 2007.
Both of these cases were a nation taking action somewhat unilaterally against another, destroying the other nation’s capability with an airstrike, and what followed was not war but sabre-rattling and proxy conflict (note: That’s my takeaway as a lay non-expert, I may be wrong about the consequences of these strikes! The consequences of Opera especially seem to be a matter of some historical debate).
I’m sure that there are other historical examples that could be found which shed light on what Eliezer’s foreign policy would mean, though I do accept that with nuclear-armed states, all bets are off. Also worth considering: China has (as far as I know) an unconditional No First Use policy for nuclear weapons, though that doesn’t preclude retaliation for non-nuclear airstrikes on Chinese soil, such as disrupting trade, mass cyberattacks, or invading Taiwan in response, or reversing that policy once actually under attack.
In both cases, that’s a nuclear power attacking a non-nuclear one. Contrast how Putin is being dealt with for doing Putin things—no one is suggesting bombing Russia.
Yeah, haven’t we learned anything from the last 6 months?
Looking forward to seeing the CEA, Toby Ord, and Will MacAskill statements condemning EY for calling for state-sponsored terrorism in a national magazine
Very hard hitting and emotional. I’m feeling increasingly like I did in February/March 2020, pre-lockdown. Full on broke down to tears after reading this. Shut it all down.
Strong agree, hope this gets into the print version (if it hasn’t already).
Here’s a comment I shared on my LessWrong shortform.
——
I’m still thinking this through, but I am deeply concerned about Eliezer’s new article for a combination of reasons:
I don’t think it will work.
Given that it won’t work, I expect we lose credibility and it now becomes much harder to work with people who were sympathetic to alignment, but still wanted to use AI to improve the world.
I am not as convinced as he is about doom, and I am not as cynical about the main orgs as he is.
In the end, I expect this will just alienate people. And stuff like this concerns me.
I think it’s possible that the most memetically powerful approach will be to accelerate alignment rather than suggesting long-term bans or effectively antagonizing all AI use.
A couple of things make me inclined to disagree with you about whether this will alienate people, including:
1) The reaction on Twitter seems okay so far
2) Over the past few months, I’ve noticed a qualitative shift among non-EA friends/family regarding their concerns about AI; people seem worried
3) Some of the signatories of the FLI letter didn’t seem to be the usual suspects; I have heard one prominent signatory openly criticize EA, so that feels like a shift, too
4) I think smart, reasonable people who have been exposed to ChatGPT but don’t know much about AI—i.e., many TIME readers—intuitively get that “powerful thing we don’t really understand + very rapid progress + lack of regulation/coordination/good policy” is a very dangerous mix
I’d actually be eager to hear more EAs talk about how they became concerned about AI safety, because I was persuaded over the course of one long conversation that this was something we should be paying close attention to, and it would take less convincing today. Maybe we should send this article to a few non-EA friends/family members and see what their reaction is?
So, things have blown up way more than I expected and things are chaotic. Still not sure what will happen or if a treaty is actually in the cards, but I’m beginning to see a potential path to a lot more investment in alignment. One example: Jeff Bezos just followed Eliezer on Twitter, and I think this may catch the attention of pretty powerful and rich people who want to see AI go well. We are so off-distribution; this could go in any direction.
Wow, Bezos has indeed just followed Eliezer:
https://twitter.com/BigTechAlert/status/1641659849539833856
Related: “Amazon partners with startup Hugging Face for ChatGPT rival” (Los Angeles Times, Feb 21st 2023)
In case we have very different feeds, here’s a set of tweets critical about the article:
https://twitter.com/mattparlmer/status/1641230149663203330?s=61&t=ryK3X96D_TkGJtvu2rm0uw (lots of quote-tweets on this one)
https://twitter.com/jachiam0/status/1641271197316055041?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/finbarrtimbers/status/1641266526014803968?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/plinz/status/1641256720864530432?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/perrymetzger/status/1641280544007675904?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/post_alchemist/status/1641274166966996992?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/keerthanpg/status/1641268756071718913?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/levi7hart/status/1641261194903445504?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/luke_metro/status/1641232090036600832?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/gfodor/status/1641236230611562496?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/luke_metro/status/1641263301169680386?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/perrymetzger/status/1641259371568005120?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/elaifresh/status/1641252322230808577?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/markovmagnifico/status/1641249417088098304?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/interpretantion/status/1641274843692691463?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/lan_dao_/status/1641248437139300352?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/lan_dao_/status/1641249458053861377?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/growing_daniel/status/1641246902363766784?s=61&t=ryK3X96D_TkGJtvu2rm0uw
https://twitter.com/alexandrosm/status/1641259179955601408?s=61&t=ryK3X96D_TkGJtvu2rm0uw
Yeah, I’m definitely not disputing that some people will be alienated by this. My basic reaction is just: AI safety people are already familiar with EY’s takes; I suspect people like my parents will read this and be like “whoa, this makes some sense and is kind of scary.” (With regard to differing feeds, I just put the link to the article into the Twitter search bar and sorted by latest. I still think the negative responses are a minority.)
Worth noting that Matt Parlmer has said:
I’m not particularly well informed about current EA discourse on AI alignment, but I imagine that two possible strategies are:
accelerating alignment research and staying friendly with the big AI companies
getting governments to slow AI development in a worldwide-coordinated way, even if this angers people at AI companies.
Yudkowsky’s article helps push on the latter approach. Making the public and governments more worried about AI risk does seem to me the most plausible way of slowing it down. If more people in the national-security community worry about AI risks, there could be a lot more attention to these issues, as well as the possibility of policies like limiting total computing power for AI training that only governments could pull off.
I expect a lot of AI developers would be angry about getting the public and governments more alarmed, but if the effort to raise alarm works well enough, then the AI developers will have to comply. OTOH, there’s also a possible “boy who cried wolf” situation in which AI progress continues, nothing that bad happens for a few years, and then people assume the doomsayers were overreacting—making it harder to ring alarm bells the next time.
In any social policy battle (climate change, racial justice, animal rights) there will be people who believe that extreme actions are necessary. It’s perhaps unusual on the AI front that one of the highest profile experts is on that extreme, but it’s still not an unusual situation. A couple of points in favour of this message having a net positive effect:
I don’t buy the argument that extreme arguments alienate people from the cause in general. This is a common assumption, but the little evidence we have suggests that extreme actions or talk might actually both increase the visibility of the cause and increase support for more moderate groups. Anecdotally, on the AI front @lilly seems to be seeing something similar too.
On a rational front, if he is this sure of doom, his practical solution seems to make the most sense. It shows intellectual integrity. We can’t expect someone to have a pdoom of 99% given the status quo, then just suggest better alignment strategies. From a scout mindset perspective, we need to put ourselves in the 99% doom shoes before dismissing this opinion as irrational, even if we strongly disagree with his pdoom.
(Related to 1), I feel like AI risk is still perhaps at the “Any publicity is good publicity” stage as many people are still completely unaware of it. Anything a bit wild like this which attracts more attention and debate is likely to be good. Within a few months/years this may change though as AI risk becomes truly mainstream. Outside tech bubbles it certainly isn’t yet.
People who know that they are outliers amongst experts in how likely they think X is (as I think being 99% sure of doom is, particularly combined with short-ish timelines) should be cautious about taking extreme actions on the basis of an outlying view, even if they think they have performed a personal adjustment to down-weight their confidence to take account of the fact that other experts disagree, and still ended up north of 99%. Otherwise you get the problem that extreme actions are taken even when most experts think they will be bad. In that sense, integrity of the kind you’re praising is actually potentially very bad and dangerous, even if there are some readings of “rational” on which it counts as rational.
Of course, what Eliezer is doing is not taking extreme actions, but recommending governments do so in certain circumstances, and that is much less obviously a bad thing to do, since govs will also hear from experts who are closer to the median expert.
Archive link
(More of a meta point somewhat responding to some other comments.)
It currently seems unlikely there will be a unified AI risk public communication strategy. AI risk is an issue that affects everyone, and many people are going to weigh in on it. That includes both people who are regulars on this forum and people who have never heard of it.
I imagine many people will not be moved by Yudkowsky’s op ed, and others will be. People who think AI x-risk is an important issue but who still disagree with Yudkowsky will have their own public writing that may be partially contradictory. Of course people should continue to talk to each other about their views, in public and in private, but I don’t expect that to produce “message discipline” (nor should it).
The number of people concerned about AI x-risk is going to get large enough (and arguably already is) that credibility will become highly unevenly distributed among those concerned about AI risk. Some people may think that Yudkowsky lacks credibility, or that his op ed damages it, but that needn’t damage the credibility of everyone who is concerned about the risks. Back when there were only a few major news articles on the subject, that might have been more true, but it’s not anymore. Now everyone from Geoffrey Hinton to Gary Marcus (somehow) to Elon Musk to Yuval Noah Harari is talking about the risks. While it’s possible everyone could be lumped together as “the AI x-risk people,” at this point, I think that’s a diminishing possibility.
I hope that this article sends the signal that pausing the development of the largest AI models is good, that informing society about AGI x-risk is good, and that we should find a coordination method (regulation) to make sure we can effectively stop training models that are too capable.
What I think we should do now is:
1) Write good hardware regulation policy proposals that could reliably pause the development towards AGI.
2) Campaign publicly to get the best proposal implemented, first in the US and then internationally.
This could be a path to victory.
I appreciate Eliezer’s honesty and consistency in what he is calling for. This approach makes sense if you believe, as Eliezer does, that p(doom | business as usual) > 99%. Then it is worth massively increasing the risk of a nuclear war. If you believe, as I do and as most AI experts do, that p(doom | business as usual) < 20%, this plan is absolutely insane.
This line of thinking is becoming more and more common in EA. It is going to get us all killed if it has any traction. No, the U.S. should not be willing to bomb Chinese data centers and risk a global nuclear war. No, repeatedly bombing China for pursuing something that is a central goal of the CCP, and whose dangers are completely illegible to 90% of the population, is not a small, incremental risk of nuclear war on the scale of aiding Ukraine, as some other commenters are suggesting. This is insane.
By all means, I support efforts for international treaties. Bombing Chinese data centers is suicidal and we all know it.
I say this all as someone who is genuinely frightened of AGI. It might well kill us, but not as quickly or surely as implementing this strategy will.
Edited to reflect that upon further thought, I probably do not support bombing the data centers of less powerful countries either.
Why do you assume that China/the CCP (or any other less powerful countries) won’t wake up to the risk of AGI? They aren’t suicidal. The way I interpret “rogue data centre” is a cavalier non-state actor (or even a would-be world-ending terrorist cell).
Unless the non-state actor is operating off a ship in international waters, it’s operating within a nation-state’s boundaries, and bombing it would be a serious incursion on that nation-state’s territorial sovereignty. There’s a reason such incursions against a nuclear state have been off the table except in the most dire of circumstances.
The possibility of some actor having the financial and intellectual resources necessary to develop AGI without the acquiescence of the nation within which it operates seems rather remote. And the reference elsewhere to nuclear options probably colors this statement—why discuss that if the threat model is some random terrorist cell?
Ok, I guess it depends on what level the “acquiescence” is. I’d hope that diplomacy would be successful in nearly all cases here, with the nuclear state reining in the rogue data centre inside its borders without any outside coalition resorting to an airstrike.
You mention “except in the most dire of circumstances”—these would be the absolute most dire circumstances imaginable. More dire, in fact, than anything in recorded history—the literal end of the world at stake.
I hope they do wake up to the danger, and I am all for trying to negotiate treaties!
It’s possible I am misinterpreting what EY means by “rogue data centers.” To clarify, the specific thing I am calling insane is the idea that the U.S. or NATO should under (almost) any circumstance bomb data centers inside other nuclear powers.
I don’t think this is insane, and I think <20% is probably too low a threshold to carry the case—a 15% risk of extinction from AGI would mean we should be drastically, drastically more scared of AGI than of nuclear war.
What percentage chance would you estimate of a large-scale nuclear war conditional on the U.S. bombing a Chinese data center? What percentage of the risk from AGI do you think this strategy reduces?
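To make that comparison concrete, here is a minimal back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the expected_deaths helper, the casualty figures, and the probabilities are placeholders rather than anyone’s actual estimates; the point is only to show how the two numbers asked about above would combine.

```python
# Back-of-the-envelope comparison of the tradeoff discussed above.
# All numbers are placeholders for illustration, not real estimates.

WORLD_POPULATION = 8e9       # rough current world population
NUCLEAR_WAR_DEATHS = 1e9     # assumed deaths in a large-scale nuclear war
EXTINCTION_DEATHS = WORLD_POPULATION

def expected_deaths(p_nuclear_war: float, p_ai_doom: float) -> float:
    """Expected deaths given a probability of nuclear war and of AI-caused extinction."""
    return p_nuclear_war * NUCLEAR_WAR_DEATHS + p_ai_doom * EXTINCTION_DEATHS

# Status quo: low chance of nuclear war, some baseline AI extinction risk.
baseline = expected_deaths(p_nuclear_war=0.01, p_ai_doom=0.15)

# Strike policy: higher chance of nuclear war, somewhat lower AI extinction risk.
with_strike = expected_deaths(p_nuclear_war=0.10, p_ai_doom=0.10)

print(f"Baseline expected deaths:    {baseline:.2e}")
print(f"With-strike expected deaths: {with_strike:.2e}")
# Whether the strike looks "worth it" flips entirely depending on which
# placeholder numbers you plug in, which is exactly why the two questions
# above matter.
```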
We can still have this ending (cf. Terminator analogies):
Yudkowsky’s suggestions seem entirely appropriate if you truly believe, like him, that AI x-risk is probability ~100%.
However, that belief is absurdly high, based on unproven and unlikely assumptions, like that an AI could build nanofactories by ordering proteins to be mixed over email.
In the actual world, where the probability of extinction is significantly less than 100%, are these proposals valuable? It seems like they will just get everyone else labelled luddites and fearmongers, especially if years and decades go by with no apocalypse in sight.
Many things about this comment seem wrong to me.
These proposals would plausibly be correct (to within an order of magnitude) in terms of the appropriate degree of response with much lower probabilities of doom (i.e. 10-20%). I think you need to actually run the math to say that this doesn’t make sense.
This is a deeply distorted understanding of Eliezer’s threat model, which is not any specific story that he can tell, but the brute fact that something smarter than you (and him, and everyone else) will come up with something better than that.
I do not think it is ever particularly useful to ask “is someone else’s conclusion valid given my premises, which are importantly different from theirs”, if you are attempting to argue against someone’s premises. Obviously “A ⇒ B” & “C” does not imply “B”, and it especially does not imply “~A”.
This is an empirical claim about PR, which:
does not seem obviously correct to me
has little to say about the object-level arguments
falls into a pattern of suggesting that people should optimize for how others perceive us, rather than optimizing for communicating our true beliefs about the world.
I strongly disagree with this. This article includes the following passage:
This advocates for risking nuclear war for the sake of preventing mere “AI training runs”. I find it highly unlikely that this risk-reward tradeoff makes sense at a 10% x-risk estimate.
You do make a fair point about the structure of my comment, though. In truth I don’t have a problem with Yudkowsky writing this article, given his incorrect beliefs about AI risk. I have a problem with the proposals themselves, because his beliefs are incorrect.
I really don’t like the rhetorical move you’re making here. You (as well as many people on this forum) think his beliefs are incorrect; others on this forum think they are correct. Insofar as there’s no real consensus for which side is correct, I’d strongly prefer people (on both sides) use language like “given his, in my opinion, incorrect beliefs” as opposed to just stating as a matter of fact that he’s incorrect.
Are you expecting me to preface every statement I make with “in my opinion”? Obviously this is my opinion and people are free to disagree. I don’t need to state that explicitly every single time.
Sorry, after rereading my comment, it comes off as more hostile than I was intending (currently sleep-deprived, which sometimes has that effect on me). The intended tone of my comment was more like “this move feels like it could lead to epistemics slipping or to newcomers being confused” and not like “this move violates some important norm of good behavior”.
Regarding your specific question – no, I’m obviously not expecting you to preface every statement with “in my opinion”. Most writing doesn’t include “in my opinion” at the beginning of every statement, yet also most writing doesn’t lead to a flag going off in my head for “huh, this statement is stated as a fact but is actually a matter up for debate” which I did notice here.
All else equal, this depends on what increase in risk of nuclear war you’re trading off against what decrease in x-risk from AI. We may have “increased” risk of nuclear war by providing aid to Ukraine in its war against Russia, but if it was indeed an increase it was probably small and worth the trade-off[1] against our other goals (such as disincentivizing the beginning of wars which might lead to nuclear escalation in the first place). I think approximately the only unusual part of Eliezer’s argument is the fact that he doesn’t beat around the bush in spelling out the implications.
Asserted for the sake of argument; I haven’t actually demonstrated that this is true but my point is more that there are many situations where we behave as if it is obviously a worthwhile trade-off to marginally increase the risk of nuclear war.
He’s not talking about a “marginal increase” in risk of nuclear war. What Eliezer is proposing is nuclear blackmail.
If China, today, told us that “you have 3 months to disband OpenAI or we will nuke you”, what are the chances that the US would comply? I guarantee you they are almost zero, because if the US gives in, then china can demand something else, and then something else, and so on. Instead, the US would probably try to talk them out of their ultimatum, or failing that, do a preemptive strike.
If the deadline does come, China can either launch the nukes and start armageddon, or reveal an empty threat and not be taken seriously, in which case the whole exercise is worthless.
He proposes instituting an international treaty, which seems to be aiming for the reference class of existing treaties around the proliferation of nuclear and biological weapons. He is not proposing that the United States issue unilateral threats of nuclear first strikes.
I do not believe this interpretation is correct. Here is the passage again, including the previous paragraph for added context:
He advocates for bombing datacentres and being prepared to start shooting conflicts to destroy GPU clusters, and then advocates for “running some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs”. I cannot see any interpretation other than “threaten to bomb nuclear-armed countries that train AIs”.
To be fair, upon reading it again it’s more likely he means “threaten to conventionally bomb datacentres”. But this is still nuclear brinkmanship: bombing Russia or China is an act of war, carrying a high chance of nuclear exchange.
Your post begins with,
And ends with,
If in the writing of a comment you realize that you were wrong, you can just say that.
I don’t think the crux here is about nanofactories – I’d imagine that if Eliezer considered a world identical to ours but where nanofactories were impossible, he’d place (almost) as high probability on doom (though he’d presumably expect doom to be somewhat more drawn out).
This proposal seems to have become extremely polarizing, more so and for different reasons than I would have expected after first reading this. I am more on the “this is pretty fine” side of the spectrum, and think some of the reasons it has been controversial are sort of superficial. Given this though, I want to steelman the other side (I know Yudkowsky doesn’t like steelmanning, too bad, I do), with a few things that are plausibly bad about it that I don’t think are superficial or misreadings, as well as some start of my reasons for worrying less about them:
“While it’s true that this isn’t ‘the same as’ calling for outright violence, if we are at least a little bit on Orwell’s side on political violence, surely the position that significantly risking nuclear war is worse than a few terrorists bombing a GPU center seems quite silly. If he supports the former but not the latter, that is quite an extreme position!”:
I’m sympathetic to this, in no small part because I lean Orwell on state violence in many cases, but I think it misunderstands Yudkowsky’s problem with the terrorists. It’s not that the fact that this is terrorism adds enough intrinsic badness to outweigh a greater chance of literal nuclear war; it’s that legitimate state authority credibly being willing to go to nuclear war is likely to actually work, while terrorism is just a naïve tactic which will likely backfire. In fact nothing even rests on the idea that preventing AI is worth a nuclear war (though in a quite bad and now-deleted tweet Yudkowsky does argue for this, and given that he expects survivors of nuclear war but not of AI misalignment, nothing about this judgement rests on his cringe “reaching the stars” aside). If a NATO country is invaded, letting it be invaded is surely not as bad as global nuclear war, but supporters of Article 5 tacitly accept the cost of risking this outcome, because non-naïve consequentialism cares about credibly backing certain important norms, even when, in isolation, the cost of going through with them doesn’t look worth it.
“While there are international resolutions that involve credibly risking nuclear war, like Article 5, and there are international resolutions that involve punishing rogue states, like ones governing the development of weapons of mass destruction, the combination of these two is in practice not really there, so pointing to each in isolation fails to recognize the way this proposal pushes a difference in degrees all the way to basically a difference in kind”:
I am again sympathetic to this. What Yudkowsky is proposing here is kind of a big deal, and it involves a stricter international order than we have ever seen before. This is very troubling! It isn’t clear that there is a single difference in kind (except perhaps democracy) between Stalinism and a state that I would be mostly fine with. It’s largely about pushing state powers and state flaws that are tolerable at some degree to a point where they are no longer tolerable. I think I’m just not certain if his proposal reaches this crucial level for me. One reason is that I’m just not sure what level of international control really crosses that line for me, and risking war to prevent x-risk seems like a candidate okay thing for countries to apply a unique level of force to. Certainly if you believe the things Yudkowsky does. The second reason, however, is that his actual proposal is ambiguous in crucial ways that I will cover in point 3, so I would probably be okay with some but not other versions of it.
“Yes there is a difference between state force and random acts of violence, but it isn’t clear what general heuristic we can use to distinguish the two other than ‘one is carried out by a legitimate state authority and the other isn’t’. We know what this looks like at the country level because the relevant state authority is usually pretty clear, but on the international stage this is just not something where the difference between legitimate and illegitimate governance is super obvious! How many countries have to sign up? What portion of the world population do they have to represent? How powerful on the international stage do they already have to be? What agreement mechanism/mediating body is needed to grant this authority? Surely there is a difference in the types of ways Yudkowsky cares about between violence carried out by the mafia versus the local government, but on the international stage this sort of difference is very foggy. Given the huge difference he places on illegitimate versus legitimate force, Yudkowsky should have been specific about what it would take for such a governing agreement to be legitimate. Otherwise people can fill in the details however they think one should answer this question, and Yudkowsky specifically is shielded from the most relevant sort of criticism he could face for his proposal”:
This is the objection I am most sympathetic to, and the place I wish critics would focus most of their attention. If NATO agrees to this treaty, does that give them legitimate authority to threaten China with drone strikes that isn’t just like the mafia writing threatening letters to AI developers? What if China joins in with NATO, does this grant the authority to threaten Russia? Probably not for both, but while the difference is probably at an ambiguous threshold of which countries sign up, it’s pretty clear when a country-wide law becomes legitimate, because there’s an agreed upon legitimate process for passing it. These are questions deeply tied to any proposal like this and it does bug me how little Yudkowsky has spelled this out. That said, I think this is sort of a problem for everyone? As I’ve said, and Yudkowsky has said, basically everyone distinguishes between state enforcement and random acts of civilian violence, and aside from this, basically everyone seems confused about how to apply this at the international scale on ambiguous margins. Insofar as you want to apply something like this to the international scale sometimes, you have to live with this tension, and probably just remain a bit confused.
Yudkowsky claims that AI developers are plunging headlong into our research in spite of believing we are about to kill all of humanity. He says each of us continues this work because we believe the herd will just outrun us if any one of us were to stop.
The truth is nothing like this. The truth is that we do not subscribe to Yudkowsky’s doomsday predictions. We work on artificial intelligence because we believe it will have great benefits for humanity and we want to do good for humankind.
We are not the monsters that Yudkowsky makes us out to be.
I believe you that you’re honestly speaking for your own views, and for the views of lots of other people in ML. From experience, I know that there are also lots of people in ML who do think AGI is likely to kill us all, and choose to work on advancing capabilities anyway. (With the justification Eliezer highlighted, and in many cases with other justifications, though I don’t think these are adequate.)
I’d be interested to hear your views about this, and why you don’t think superintelligence risk is a reason to pause scaling today. I can imagine a variety of reasons someone might think this, but I have no idea what your reason is, and I think conversation about this is often quite productive.
What experiences tell you there are also lots of people in ML who do think AGI is likely to kill us all, and choose to work on advancing capabilities anyway?
It’s hard to have strong confidence in these numbers, but surveys of AI developers who publish at prestigious conferences, when asked about the probability of AI “causing human extinction or similarly permanent and severe disempowerment of the human species”, often get you numbers in the single-digit percentage points.
This is a meaningfully different claim than “likely to kill us all” which is implicitly >50%, but not that different in moral terms. The optimal level of extinction risk that humanity should be willing to incur is not 0, but it should be quite low.
@Linch Have you ever met any of these engineers who work on advancing AI in spite of thinking that the “most likely result … is that literally everyone on Earth will die.”
I have never met anyone so thoroughly depraved.
Mr. Yudkowsky and @RobBensinger think our field has many such people.
I wonder if there is a disconnect in the polls. I wonder if people at MIRI have actually talked to AI engineers who admit to this abomination. What do you even say to someone so contemptible? Perhaps there are no such people.
I think it is much more likely that these MIRI folks have worked themselves into a corner of an echo chamber than it is that our field has attracted so many low-lifes who would sooner kill every last human than walk away from a job.
I don’t think I’ve met people working on AGI who have P(doom) >50%. I think I fairly often talk to people at e.g. OpenAI or DeepMind who believe it’s 0.1%-10%, however. And again, I don’t find the moral difference between probabilistically killing people at 5% vs 50% that significant.
I don’t know how useful it is to conceptualize AI engineers who actively believe >50% P(doom) as evil or “low-lifes”, while giving a pass to people who have lower probabilities of doom. My guess is that it isn’t, and it would be better if we had an honest perspective overall. Relatedly, it’s better for people to be able to honestly admit “many people will see my work as evil but I’m doing it for xyz reasons anyway” rather than delude themselves otherwise and come up with increasingly implausible analogies, or refuse to engage at all.
I agree this is a confusing situation. My guess is most people compartmentalize and/or don’t think of what they’re doing as that critical to advancing the doomsday machine, and/or they think other people will get there first and/or they think AGI is so far away that current efforts don’t matter, etc.
I would bet that most people who work in petroleum companies[1] (and for that matter, consumers) don’t think regularly about their consequences on climate change, marketers for tobacco companies don’t think about their impacts on lung cancer, Google engineers at Project Maven don’t think too hard about how their work accelerates drone warfare, etc. I quite like the book Thank You for Smoking for some of this mentality.
Of course probabilistically “killing all of humanity” is axiologically worse in scope than causing lung cancer or civilian casualties of drones or arguably marginal effects on climate change. But scope neglect is a well-known problem with human psychology, and we shouldn’t be too surprised that people’s psychology is not extremely sensitive to magnitude.
For the record, I’m not pure here and I in fact do fly.
I agree with @Linch. People find ways of rationalising that what they do is OK, whether it is working for oil companies, or tobacco companies, or arms dealers, or AI capabilities. I don’t consider anyone a “low-life” really, but perhaps they aren’t acting rationally if we assume doing net good for the world is an important goal for them (which it isn’t for a lot of people anyway).
I also agree that I don’t see a huge difference between the practical outworking of a 5% and a 50% probability of doom. Both probabilities should cause anyone who even thinks there is a small possibility that their work could contribute to that disaster to immediately stop and do something else. Given we are talking about existential risk, if those OpenAI or DeepMind people even believe the probability is 0.1% then they should probably lay down their tools and reconsider.
But we are human, and have specific skills, and pride, and families to feed so we justify to ourselves doing things which are bad all the time. This doesn’t make us “low-lifes”, just flawed humans.
I do not believe @RobBensinger ’s and Yudkowsky’s claim that “there are also lots of people in ML who do think AGI is likely to kill us all, and choose to work on advancing capabilities anyway.”
I’m going to go against the grain here, and explain how I truly feel about this sort of AI safety messaging.
As others have pointed out, fearmongering on this scale is absolutely insane to those who don’t have a high probability of doom. Worse, Eliezer is calling for literal nuclear strikes and great power war to stop a threat that isn’t even provably real! Most AI researchers do not share his views, and neither do I.
I want to publicly state that pushing this maximalist narrative about AI x-risk will lead to terrorist actions against GPU clusters or individuals involved in AI. These types of acts follow from the intense beliefs of those who agree with Eliezer and have a doomsday-cult style of thought.
Not only will that sort of behavior discredit AI safety and potentially EA entirely, it could hand the future to other actors or cause governments to lock down AI for themselves, making outcomes far worse.
He’s calling for a policy that would be backed by whatever level of response was necessary to enforce it, including, if it escalated to that level, military response (plausibly including nuclear). This is different from, right now, literally calling for nuclear strikes. The distinction may be somewhat subtle, but I think it’s important to keep this distinction in mind during this discussion.
This statement strikes me as overconfident. While the narrative presumably does at least somewhat increase the personal security concerns of individuals involved in AI, I think we need to be able to have serious discussions on the topic, and public policy shouldn’t be held hostage to worries that discussions about problems will somewhat increase the security concerns of those involved in those problems (e.g., certain leftist discourse presumably somewhat increases the personal security concerns of rich people, but I don’t think that fact is a good argument against leftism or in favor of silencing leftists).
I don’t see where Eliezer has said “plausibly including nuclear”. The point of mentioning nuclear was to highlight the scale of the risk on Eliezer’s model (‘this is bad enough that even a nuclear confrontation would be preferable’), not to predict nuclear confrontation.
You’re right. I wasn’t trying to say that Eliezer explicitly said that the response should plausibly include nuclear use – I was saying that he was saying that force should be used if needed, and it was plausible to me that he was imagining that in certain circumstances the level of force needed may be nuclear (hardened data centers?). But he has more recently explicitly stated that he was not imagining any response would include nuclear use, so I hereby retract that part of my statement.
The full text of the TIME piece is now available on the EA Forum here, with two clarifying notes by Eliezer added at the end.
What if actors with bad intentions don’t stop (and we won’t be able to know about that), and they create a more powerful AI than what we have now?
Extremely cringe article.
The argument that AI will inevitably kill us has never been well-formed and he doesn’t propose a good argument for it here. No-one has proposed a reasonable scenario by which immediate, unpreventable AI doom will happen (the protein nanofactories-by-mail idea underestimates the difficulty of simulating quantum effects on protein behaviour).
A human dropped into a den of lions won’t immediately become its leader just because the human is more intelligent.