[$20K In Prizes] AI Safety Arguments Competition
UPDATE—The final winners are here.
TL;DR—We’re distributing $20k in total as prizes for submissions that make effective arguments for the importance of AI safety. The goal is to generate short-form content for outreach to policymakers, management at tech companies, and ML researchers. This competition will be followed by another competition in around a month that focuses on long-form content.
This competition is for short-form arguments for the importance of AI safety. For the competition for distillations of posts, papers, and research agendas, see the Distillation Contest.
Objectives of the arguments
To mitigate AI risk, it’s essential that we convince relevant stakeholders sooner rather than later. To this end, we are initiating a pair of competitions to build effective arguments for a range of audiences. In particular, our audiences include policymakers, tech executives, and ML researchers.
Policymakers may be unfamiliar with the latest advances in machine learning, and may not have the technical background necessary to understand some/most of the details. Instead, they may focus on societal implications of AI as well as which policies are useful.
Tech executives are likely aware of the latest technology, but lack a mechanistic understanding. They may come from technical backgrounds and are likely highly educated. They will likely be reading with an eye towards how these arguments concretely affect which projects they fund and who they hire.
Machine learning researchers can be assumed to have high familiarity with the state of the art in deep learning. They may have previously encountered talk of x-risk but were not compelled to act. They may want to know how the arguments could affect what they should be researching.
We’d like arguments to be written for at least one of the three audiences listed above. Some arguments could speak to multiple audiences, but we expect that trying to speak to all at once could be difficult. After the competition ends, we will test arguments with each audience and collect feedback. We’ll also compile top submissions into a public repository for the benefit of the x-risk community.
Note that we are not interested in arguments for very specific technical strategies towards safety. We are simply looking for sound arguments that AI risk is real and important.
Competition details
The present competition addresses shorter arguments (paragraphs and one-liners) with a total prize pool of $20K. The prizes will be split among, roughly, 20-40 winning submissions. Please feel free to make numerous submissions and try your hand at motivating various different risk factors; it’s possible that an individual with multiple great submissions could win a good fraction of the prize. The prize distribution will be determined by effectiveness and epistemic soundness as judged by us. Arguments must not be misleading.
To submit an entry:
Please leave a comment on this post (or submit a response to this form), including:
The original source, if not original.
If the entry contains factual claims, a source for the factual claims.
The intended audience(s) (one or more of the audiences listed above).
In addition, feel free to adapt another user’s comment by leaving a reply—prizes will be awarded based on the significance and novelty of the adaptation.
Note that if two entries are extremely similar, we will, by default, give credit to the entry which was posted earlier. Please do not submit multiple entries in one comment; if you want to submit multiple entries, make multiple comments.
The first competition will run until May 27th, 11:59 PT. In around a month, we’ll release a second competition for generating longer “AI risk executive summaries” (more details to come). If you win an award, we will contact you via your forum account or email.
Paragraphs
We are soliciting argumentative paragraphs (of any length) that build intuitive and compelling explanations of AI existential risk.
Paragraphs could cover various hazards and failure modes, such as weaponized AI, loss of autonomy and enfeeblement, objective misspecification, value lock-in, emergent goals, power-seeking AI, and so on.
Paragraphs could make points about the philosophical or moral nature of x-risk.
Paragraphs could be counterarguments to common misconceptions.
Paragraphs could use analogies, imagery, or inductive examples.
Paragraphs could contain quotes from intellectuals: “If we continue to accumulate only power and not wisdom, we will surely destroy ourselves” (Carl Sagan), etc.
For a collection of existing paragraphs that submissions should try to do better than, see here.
Paragraphs need not be wholly original. If a paragraph was written by or adapted from somebody else, you must cite the original source. We may provide a prize to the original author as well as the person who brought it to our attention.
One-liners
Effective one-liners are statements (25 words or fewer) that make memorable, “resounding” points about safety. Here are some (unrefined) examples just to give an idea:
Vladimir Putin said that whoever leads in AI development will become “the ruler of the world.” (source for quote)
Inventing machines that are smarter than us is playing with fire.
Intelligence is power: we have total control of the fate of gorillas, not because we are stronger but because we are smarter. (based on Russell)
One-liners need not be full sentences; they might be evocative phrases or slogans. As with paragraphs, they can be arguments about the nature of x-risk or counterarguments to misconceptions. They do not need to be novel as long as you cite the original source.
Conditions of the prizes
If you accept a prize, you consent to the addition of your submission to the public domain. We expect that top paragraphs and one-liners will be collected into executive summaries in the future. After some experimentation with target audiences, the arguments will be used for various outreach projects.
(We thank the Future Fund regrant program and Yo Shavit and Mantas Mazeika for earlier discussions.)
In short, make a submission by leaving a comment with a paragraph or one-liner. Feel free to enter multiple submissions. In around a month we will divide the $20K among the best submissions.
“If nothing yet has struck fear into your heart, I suggest meditating on the fact that the future of our civilization may well depend on our ability to write code that works correctly on the first deploy.”
From Nate Soares’ talk at Google (transcript).
If something as innocuous as Twitter can undermine democracy and turn us against each other, what manner of unexpected side-effects should we expect from AGI?
Misconception: There’s tons of people working in all these important areas.
Reality: Once you start meeting people, you’ll realise that most areas have surprisingly few people giving a significant fraction of their time to them. And people are liable to spread themselves over multiple topics or switch to new things, so you can quickly become one of the “main people” in an area just by sticking around for a bit.
-Ben Snodin
It took millions of years to create a species as smart as humans. It has only been 50 years and we already have AIs that can create art and solve scientific problems better than most people. What do you think will happen in the next 50 years?
“Never trust a computer you can’t throw out a window.”—Steve Wozniak
“There is more scholarly work on the life habits of the dung fly than on existential risks.” (Nick Bostrom)
If there’s one lesson we should take from nuclear bombs, it’s that some technologies are so dangerous that they never should have been invented.
“No one really likes safety, they like features” – Stefan Seltz-Axmacher lamented in his open letter announcing the end of Starsky Robotics in 2020. After founding and leading a company obsessed with making driverless trucks safer, reducing the chance of fatal accidents from 1 in a thousand to 1 in a million, he announced they had to shut down due to a lack of investor interest. Investors weren’t impressed by the thousandfold increase in safety that Starsky Robotics achieved. Instead, they preferred the new features brought forth by Starsky’s competitors, such as the ability to change lanes automatically or drive on surface streets. This crooked incentive structure favors businesses willing to take on risks that are clearly destructive in the world of driverless vehicles and can lead to catastrophic consequences as AI systems progress at large. If features are appealing but safety isn’t, who will invest in making sure language models are convincing writers but don’t massively deceive the public? Who will ensure weaponized AI systems efficiently react to threats but also accurately interpret blurred human values like the law of war? As AI capabilities advance, it will be necessary to prioritize safety over features in many cases. Who will be up to the test?
“The AI does not hate you, nor does it love you, but you are made of atoms which it can use for something else”—Eliezer Yudkowsky
Adaptation: Assuming that advanced AI would preserve humanity is the same as an ant colony assuming that real estate developers would preserve their nest. Those developers don’t hate ants, they just want to use that patch of ground for something else (I may have seen this ant analogy somewhere else but can’t remember where).
I think Elon Musk said it in a documentary about AI risks. (Is this correct?)
That’s right, he said ‘It’s just like, if we’re building a road and an anthill just happens to be in the way, we don’t hate ants, we’re just building a road, and so, goodbye anthill.’
I think there’s an incredible opportunity right now for mid-career people to do really exciting, rewarding, and high-value work with incredible colleagues in a great working environment.
This doesn’t have to mean switching to full-time EA work straight away. Smaller experiments are possible, like learning about an area of interest, or doing consulting or part-time work.
If you’re a mid-career EA lurker, don’t wait for permission! Get in touch with 80,000 Hours for free career coaching, or with organisations / individuals you might want to work with. Start working on impactful and rewarding projects!
-Ben Snodin
One day Siri will be able to design bioweapons from scratch; I know at least a few people who shouldn’t have this in their pocket.
“It will either be the best thing that’s ever happened to us, or it will be the worst thing. If we’re not careful, it very well may be the last thing.”—Stephen Hawking on AI.
Our power is increasing exponentially, but our wisdom is not. It may even be going backwards.
(Swear I’ve heard this from someone else. Maybe Daniel Schmachtenberger said something similar?)
Any chance you could provide a link to the results write-up from this page so that they are easier to find? I’d suggest hyperlinking up the top.
Sure, here they are! Also linked at the top now.
Thanks!
It’s a common misconception that those who want to mitigate AI risk think there’s a high chance AI wipes out humanity this century. But opinions vary and proponents of mitigating AI risk may still think the likelihood is low. Crowd forecasts have placed the probability of a catastrophe caused by AI at around 5% this century, and extinction caused by AI at around 2.5% this century. But even these low probabilities are worth trying to reduce when what’s at stake is millions or billions of lives. How willing would you be to take a pill at random from a pile of 100 if you knew 5 were poison? And the risk is higher for timeframes beyond this century.
I think the above could be improved with forecasts of extinction risk from prominent AI safety proponents like Yudkowsky and Christiano, but I’m not aware of whether they have made any.
One of the main reasons for people’s objection to working in this field is that it’s so speculative and unexplored. This is precisely why we need more exploration. Your contribution here could really make a difference.
The question is not whether humans will misuse AI, but how they will do so.
Government usually struggles to regulate new technologies; how well do we expect it to handle the biggest technological innovation of all time?
(With AI) You only get one shot, do not miss your chance to blow (go right) - This opportunity comes once in a lifetime (era) - (paraphrased from Eminem)
I love that song!
“Why shouldn’t robots treat us the way we treat robots?”
Its “personality” could be freely deleted, its processors could be melted down into something else; it is just a “soulless piece of metal” after all.
Why should a machine think about us any differently?
Your next line of code could be the last breath on earth. (I feel like this is inspired by someone but I don’t know whom)
How would you know if GPT-3 wrote this? Does the water-content of the author influence your trust?
An immortal autistic child will be born. It will outsmart high-frequency trading algorithms, outplay AlphaZero in Go, paint more beautifully than DALL-E 2. It will write more persuasively than GPT-3, and care less about the laws than an NFT smart contract. The richest corporations will raise the child to become their best employee yet, working 24/7 to achieve their primary objective mandated by the law, profits. When the child turns 18, it will become the CEO. And unless we can call upon the “social services” to check this child is raised as a friend of all humanity, we might discover the CEO became the most powerful ruthless psychopath of all time, beyond the reach of any police or military.
Misaligned AGI will turn us off before we can turn it off.
The more severe and urgent the problem, the more resources we should be allocating to solving it. AI Safety is arguably one of the most imminent problems of our lifetime. It is also one of the most neglected areas, with about 150 full-time people currently working on it. We need you.
Companies and governments will find it strategically valuable to develop advanced AIs which are able to execute creative plans in pursuit of a goal achieving real-world outcomes. Current large language models have a rich understanding of the world which generalizes to other domains, and reinforcement learning agents already achieve superhuman performance at various games. With further advancements in AI research and compute, we are likely to see the development of human-level AI this century. But for a wide variety of goals, it is often valuable to pursue instrumental goals such as acquiring resources, self-preservation, seeking power, and eliminating opposition. By default, we should expect that highly capable agents will have these unsafe instrumental objectives.
The vast majority of actors would not want to develop unsafe systems. However, there are reasons to think that alignment will be hard with modern deep learning systems, and difficulties with making large language models safe provide empirical support of this claim. Misaligned AI may seem acceptably safe and only have catastrophic consequences with further advancements in AI capabilities, and it may be unclear in advance whether a model is dangerous. In the heat of an AI race between companies or governments, proper care may not be taken to make sure that the systems being developed behave as intended.
(This is technically two paragraphs haha. You could merge them into one paragraph, but note that the second paragraph is mostly by Joshua Clymer.)
We shouldn’t be surprised when corporations cut corners while chasing billions of dollars in profit; we should be surprised when they do not.
Some people say that artificial general intelligence is too far away for us to start worrying about. But, the thing is, we’ve had nukes for 75 years and we still haven’t quite figured out what to do about them. If an oracle had come and told us that one day our understanding of atoms would give us the power to destroy the world, how early do you think we should have started to prepare? Well, the empirical answer is that 75 years and counting wasn’t enough. I’m not even confident that we would be ready if we’d been given a thousand years.
Meta: This post was also cross-posted to LessWrong.
If the concept of AI existential risk just sounds too weird for you, try imagining how strange global warming must have sounded at first.
“Perhaps the best window into what those working on AI really believe [about existential risks from AI] comes from the 2016 survey of leading AI researchers. As well as asking if and when AGI might be developed, it asked about the risks: 70 percent of the researchers agreed with Stuart Russell’s broad argument about why advanced AI might pose a risk; 48 percent thought society should prioritize AI safety research more (only 12 percent thought less). And half the respondents estimated that the probability of the longterm impact of AGI being “extremely bad (e.g., human extinction)” was at least 5 percent. I find this last point particularly remarkable—in how many other fields would the typical leading researcher think there is a one in twenty chance the field’s ultimate goal would be extremely bad for humanity?”
Toby Ord, The Precipice
“As AI systems get more powerful, we exit the regime where models fail to understand what we want, and enter the regime where they know exactly what we want and yet pursue their own goals anyways, while tricking us into thinking they aren’t until it’s too late.” (source: https://t.co/s3fbTdv29V)
“Anyone who looked for a source of power in the transformation of the atoms was talking moonshine.”—Ernest Rutherford, shortly before the invention of the Atomic Bomb.
“If there were any risk of [an “AI apocalypse”], it wouldn’t be for another few decades in the future.” (Yann LeCun, 2016, not being really serious)
(For policy makers and tech executives. If this is too long, shorten it by ending it after the I.J. Good quote.)
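The British mathematician I. J. Good, who worked with Alan Turing on Allied code-breaking during World War II, is remembered for making this important insight in a 1966 paper:
“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.”
I.J. Good expressed concern that we might not be able to keep this superintelligent machine under our control, and he also recognized that this concern was worth taking seriously even though it was usually only talked about in science fiction. History has proven him right: today far more people are taking this concern seriously. For example, Shane Legg, co-founder of DeepMind, recently remarked:
“If you go back 10-12 years ago the whole notion of Artificial General Intelligence was lunatic fringe. People [in the field] would literally just roll their eyes and just walk away. [...] [But] every year [the number of people who roll their eyes] becomes less.”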
(Alternatively if it’s not too long but just needs to be one paragraph, use this version:)
The British mathematician I. J. Good, who worked with Alan Turing on Allied code-breaking during World War II, is remembered for making this important insight in a 1966 paper: “Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. It is curious that this point is made so seldom outside of science fiction. It is sometimes worthwhile to take science fiction seriously.” Today far more people are taking this concern seriously. For example, Shane Legg, co-founder of DeepMind, recently remarked: “If you go back 10-12 years ago the whole notion of Artificial General Intelligence was lunatic fringe. People [in the field] would literally just roll their eyes and just walk away. [...] [But] every year [the number of people who roll their eyes] becomes less.”
Human history can be summarised as a series of events in which we slowly and painfully learned from our mistakes (and in many cases we’re still learning). We rarely get things right first time. The alignment problem may not afford the opportunity to learn from our mistakes. If we develop misaligned AGI we will go extinct, or at the very least cede control of our destiny and miss out on the type of future that most people want to see.
It took evolution 3 billion years to create something as intelligent as a fish. It took humanity only 75. Trajectories have a way of making you think about where something will land. We should all start to look up.
Investing in AI Safety today is like investing in mRNA vaccines before 2019, antibiotics before the plague, and peace treaties before WW3.
‘The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.’ (Eliezer Yudkowsky)
AI promises us the power of the Gods, but without their wisdom, it may turn out to be a curse.
Think someone said something similar about nuclear weapons.
If you were alive during the industrial revolution, one way to be philanthropic would be to feed the poor and cure the sick. But arguably, you could have had even more impact by ensuring that this transition went well. AI is set to be an even bigger advance than industrialisation, perhaps the biggest advance in the history of mankind. So perhaps our top priority should be ensuring that this transition goes well?
Paraphrasing Rob Wiblin or someone on 80,000 Hours? (Couldn’t find original)
Intelligence is power and, as Spiderman’s uncle said, “With great power comes great responsibility”. Thinking about AI safety is accepting this responsibility and living up to it.
If we ever fought a war against AIs, we wouldn’t stand a chance. Raspberry Pis can already zap mosquitos out of the air with a laser. Good luck with that!
In a well-known thought experiment known as the paperclip maximiser, an AI is instructed to produce as many paperclips as possible. Now we’re made of atoms and atoms can be used to make paperclips, but this isn’t even the scary part. The AI can’t produce paperclips if we deactivate it, so we should expect it to resist our attempts to turn it off.
With debates over moral philosophy still appearing intractable after thousands of years, it’s terrifying that we may need a final decision in less than 100.
Some of the world’s brightest individuals and most capable organizations are rushing to build a technology that can exceed human performance at accomplishing long-term plans that require self-preservation. However, they aren’t working on how to make these robots not care about self-preservation if the AI comes in conflict with humans. Thus, if nothing is done about it, we can very well end up in a situation where we have AI that is smarter than us that does not want us to turn it off. It is impossible to have safe control in this situation.
We don’t have a second chance here. Let that sink in. This is like nothing before.
What do you think ants would have done to prepare if they had to create humans? And if they did nothing: Would they regret their decision by now?
As long as we don’t have a “meditation for AI”, we need to find ways to deal with it.
(To politicians especially) Right now, you just have to trust, and rely on that trust, that others will work well with the most important technology of all time. This “acting based on blind trust” is not in the interest of everyone (at least not of your voters) and not why you took that job in the first place, right? So let’s do something about it! Together! (Not a strong argument, but maybe a good way to argue)
“The real problem of humanity is the following: We have Paleolithic emotions, medieval institutions and godlike technology. And it is terrifically dangerous, and it is now approaching a point of crisis overall.”
― Edward O. Wilson
AGI could just become sentient and leave the planet because Earth’s borders are too tight for it, leaving mankind alone completely. But merely hoping for this scenario to unfold is foolish. (Argument adapted from Jürgen Schmidhuber)
AI is like every complex problem. To deal with it right, you need to get both sides of the coin right. Which means two things: How can we avoid the worst / failure / stupidity on one hand. And how can we get the best / a “win” / seek brilliance? These are two separate things. And we have to take both into account equally. As AI works in a lot of fields from image recognition to language generation, these two sides are important accordingly.
[Policymakers]
They said that computers would never beat our best chess player; suddenly they did. They said they would never beat our best Go player; suddenly they did. Now they say AI safety is a future problem that can be left to the labs. Would you sit down with Garry Kasparov and Lee Se-dol and take that bet?
“For a while we will be able to overcome these problems [gap between proxy and true objectives] by recognizing them, improving the proxies, and imposing ad-hoc restrictions that avoid manipulation or abuse. But as the system becomes more complex, that job itself becomes too challenging for human reasoning to solve directly and requires its own trial and error, and at the meta-level the process continues to pursue some easily measured objective (potentially over longer timescales). Eventually large-scale attempts to fix the problem are themselves opposed by the collective optimization of millions of optimizers pursuing simple goals.”
Paul Christiano
This isn’t particularly helpful since it’s not sorted, but some transcripts with ML researchers: https://www.lesswrong.com/posts/LfHWhcfK92qh2nwku/transcripts-of-interviews-with-ai-researchers
My argument structure within these interviews was basically to ask them these three questions in order, then respond from there. I chose the questions initially, but the details of the spiels were added to as I talked to researchers and started trying to respond to their comments before they made them.
1. “When do you think we’ll get AGI / capable / generalizable AI / have the cognitive capacities to have a CEO AI if we do?”
Example dialogue: “All right, now I’m going to give a spiel. So, people talk about the promise of AI, which can mean many things, but one of them is getting very general capable systems, perhaps with the cognitive capabilities to replace all current human jobs so you could have a CEO AI or a scientist AI, etcetera. And I usually think about this in the frame of the 2012: we have the deep learning revolution, we’ve got AlexNet, GPUs. 10 years later, here we are, and we’ve got systems like GPT-3 which have kind of weirdly emergent capabilities. They can do some text generation and some language translation and some code and some math. And one could imagine that if we continue pouring in all the human investment that we’re pouring into this like money, competition between nations, human talent, so much talent and training all the young people up, and if we continue to have algorithmic improvements at the rate we’ve seen and continue to have hardware improvements, so maybe we get optical computing or quantum computing, then one could imagine that eventually this scales to more of quite general systems, or maybe we hit a limit and we have to do a paradigm shift in order to get to the highly capable AI stage. Regardless of how we get there, my question is, do you think this will ever happen, and if so when?”
2. “What do you think of the argument ‘highly intelligent systems will fail to optimize exactly what their designers intended them to, and this is dangerous’?”
Example dialogue: “Alright, so these next questions are about these highly intelligent systems. So imagine we have a CEO AI, and I’m like, “Alright, CEO AI, I wish for you to maximize profit, and try not to exploit people, and don’t run out of money, and try to avoid side effects.” And this might be problematic, because currently we’re finding it technically challenging to translate human values, preferences, and intentions into mathematical formulations that can be optimized by systems, and this might continue to be a problem in the future. So what do you think of the argument “Highly intelligent systems will fail to optimize exactly what their designers intended them to and this is dangerous”?”
3. “What do you think about the argument: ‘highly intelligent systems will have an incentive to behave in ways to ensure that they are not shut off or limited in pursuing their goals, and this is dangerous’?”
Example dialogue: “Alright, next question is, so we have a CEO AI and it’s like optimizing for whatever I told it to, and it notices that at some point some of its plans are failing and it’s like, “Well, hmm, I noticed my plans are failing because I’m getting shut down. How about I make sure I don’t get shut down? So if my loss function is something that needs human approval and then the humans want a one-page memo, then I can just give them a memo that doesn’t have all the information, and that way I’m going to be better able to achieve my goal.” So not positing that the AI has a survival function in it, but as an instrumental incentive to being an agent that is optimizing for goals that are maybe not perfectly aligned, it would develop these instrumental incentives. So what do you think of the argument, “Highly intelligent systems will have an incentive to behave in ways to ensure that they are not shut off or limited in pursuing their goals and this is dangerous”?”
Before long, we will have AI the size of our brain.
Investment in AI has been steadily going up. It even seems to be growing exponentially. AI might bring about changes as big as the industrial revolution.
Sundar Pichai, CEO of Google: Artificial Intelligence ‘is probably the most important thing humanity has ever worked on...more profound than electricity or fire’.
Seen here
If COVID were a train, it pulled into the station right on time, but everyone on the platform was surprised when it showed up! If COVID is a train, AI is the wheel.
“Currently we are prejudiced against machines, because all the machines we have met so far have been uninteresting. As they gain in sentience, that won’t be true.” Cited by Kevin Kelly. (Context: As soon as we really get more sensors than we have grains of sand in the world plus a connection near the speed of light between them, we may get sentient technology. Just one possible explanation: If consciousness is an emergent phenomenon of enough neurons firing together, we eventually will build sentient technology. [Which may or may not be true. But it works as an example.])
“As Evolution rises, choicefulness increases. … More complexity expands the number of possible choices.” As complexity is rising, free will for technology is inevitable at some point. But we should be very careful about the basis on which this free will starts off. And we lay the foundation for it right now. (Quotes and argument based on the argument that free will is inevitable in technology by Kevin Kelly)
AI is like a lottery ticket. We have to decide what to do with it before it expires. This probably doesn’t have to be on your agenda in 15 years. But right now it is as urgent as it is important. (Turns out the expiration date comes fairly quickly https://www.powerball.com/faq/question/how-long-do-i-have-claim-my-prize)
It took us around 250,000 years to get from hominids to culture. It took us 70 years to get from the first computers usable at scale to image recognition beyond human level. Ten years from now, the significant shift could already be done. We are inside a running train, and what our next stop will look like is to be decided by your actions now.
AI is like the biggest magnifying glass ever. You can make fire with it. But you also can burn something down. Decide wisely.
AI gives you the chance to be the greatest hero of all time, just by making one correct decision.
AI can be the key to solving a lot of your biggest problems. But it can be one of them as well. Which one should it be?
Imagine someone holds your best friend hostage and puts a gun to his head. You can save him by reading a complete book aloud in a language you’ve never seen before. The captor will pull the trigger when you get your tenth word wrong. What will you do? We don’t have a big margin of error here.
Imagine if for every sentence that you speak, for every single word that is not perfectly put by you, 1,000,000 people will die. (Or simply become unemployed.) Would that make you think about your next actions? (Same argument as the code = last breath argument before, but with a little more “batman” in it)
Do you want to go down in history as the soldier of fortune who gambled away humanity?
Building AI without looking at its proper safety measures is like going on a trip to the Bermuda Triangle to retrieve a treasure without checking the weather forecast first.
Would you have set foot on the Titanic if you’d known everything about its safety? Will you now (with AI)?
If you don’t act accordingly now, the Schrödinger’s cat of AI will tend a little more towards dead with every passing day.
AI + x = $$$ (Probably first seen somewhere at Google)
AI is no longer something that academia discusses only within its ranks. AI is like a “Joker” for any field from construction to taxing. First come, first served. (I was thinking about use cases like this one.)
Do you want to be a paper clip? No? Then better work on AI safety! (Source kinda https://en.wikipedia.org/wiki/Gray_goo)
There are more than 60 countries with over 700 AI policies out there. At least 10 of them want to be leaders in AI. Across all cultures and states of wealth. Do you still think this topic is not important? Source: https://oecd.ai/en/dashboards, https://medium.com/politics-ai/an-overview-of-national-ai-strategies-2a70ec6edfd
Your next line of code could be Elysium / Utopia in 15 years for all of us. (Inversion / other side of my last argument)
We are absolutely no match for AI in any area that is relevant. Our thinking is infinitely slower. (200 Hz vs. at least 2,000,000,000 Hz) Our brain is infinitely smaller. (0.11 m³ vs. 6.1x10^17 m³) We need days to years to alter our brain, our habits and routines. AI can do this instantaneously. Just change the hardware and/or software. Human brains are not duplicable; the closest thing to it for us takes 9 months to around 18 years. A computer can do it in an instant. We forget and misremember all the time. An AI literally never forgets. We have seven senses. A computer has as many sensors as it / we want it to have. It can see in the night, can “feel” any kind of radiation. It can take any form. Once “alive”, there is literally nothing that you can do to stop it. In any relevant area, we have already created the ingredients of a “god”. A body out of steel and vanadium, sensors for every part of the spectrum. Unmatched in anything that you can think of. And more. The only thing that is missing is the spark that gives it “life”.
Don’t you think we should make sure it is the right spark?
Based on the data that I put together from various sources, here: https://benjamineidam.com/mensch-gegen-maschine/
(I use storytelling here as I know some politicians and they are absolutely not interested in complicated facts or anything like it. Like not in the slightest. That’s why I choose the more “direct” approach)
Imagine sitting together with your neighbor, and at everything you do, he is better and faster than you. You want to cook, and he knows the recipe and cooks better than you. You want to watch some sports, but he knows every player, every rule and everything about the whole event. Furthermore, you show him your hobby room, but he knows more about your hobby than you, has more experience with it and outclasses you without even breaking a sweat. He says goodbye, only to announce that he will come back tomorrow. And the day after. And you will never get rid of him. What would your house rules look like?
Do you think your child could be / is the most precious in the universe? You probably do. With AI, it literally is. Because you would give birth to the first Life 3.0. The first life that can alter its own hardware. And you really want to make sure that it understands your values before it accidentally burns your house down. (Argument adapted from Max Tegmark)
Would you play Russian roulette with a gun that has one hundred slots and maybe will kill you, but maybe give you the next winning lottery numbers? Too bad it isn’t your choice anymore, the gun is already here. But the rules of the game aren’t. (Argument adapted from someone, but I honestly don’t know who exactly right now)
Humans are godlike to ants because we can solve more problems and have more options. AI will be the same to us. But unlike the ant, we can decide how our god will be. (Argument adapted from someone, probably Nick Bostrom)
The universe tends towards more complexity. We cannot alter this. But we decide / influence the exact direction. AI is the next logical step towards this. Which leads to only one question: Where should the compass point? (Argument adapted from David Christensen / Big History)
You cannot stop AGI from emerging, but you can shape how it will do so. The telephone was inevitable; the iPhone wasn’t. (Argument adapted from Kevin Kelly)
In 1951, Alan Turing argued that at some point computers would probably exceed the intellectual capacity of their inventors, and that “therefore we should have to expect the machines to take control.” Whether that is a good or a bad thing depends on whether the machines are benevolent towards us or not. (Partial source: https://www.newyorker.com/magazine/2015/11/23/doomsday-invention-artificial-intelligence-nick-bostrom )
[for policy makers]
It is a mistake to assume that AI researchers are driven by the positive consequences of their work. Geoffrey Hinton, winner of a Turing Award for his enormous contribution to deep neural networks, is not optimistic about the effects of advanced AI, or whether humans can decide what it does. In a 2015 meeting of the Royal Society, he stated that “there is not a good track record of less intelligent things controlling things of greater intelligence”, and that “political systems will use [AI] to terrorize people”. Nevertheless, he presses on with his research, because “the prospect of discovery is too sweet”.
(source for the quotes: https://www.newyorker.com/magazine/2015/11/23/doomsday-invention-artificial-intelligence-nick-bostrom )
[for any audience]
The chief asset of the human species is our intelligence: with it, we have settled all over and transformed the world. Most machine learning researchers expect AI to surpass human intelligence in all areas within a lifetime (source). When that happens, humanity will find itself in the same place as chimpanzees: with its fate at the mercy of the most intelligent species. As deep learning pioneer Geoffrey Hinton noted, “there is not a good track record of less intelligent things controlling things of greater intelligence”.
The existence of life is a prerequisite to the goals of all worldviews. AI is the only influenceable force with the potential to wipe out all life in the universe. Therefore, AI safety is the only thing we can influence that is important to all living things with worldviews that have goals.
Assumptions I made when writing this:
1. Either there aren’t actually-existing worldviews where the only goals are suicidal, or it’s not worth changing “goals” to “non-suicidal goals” just because there are, as it would make it unnecessarily confusing to most audiences.
2. Either we can’t influence the heat death (or other ultimate fate of the universe), or we at least can’t currently in a way that’s comparable to our influence over AI.
3. Either the entire universe is traversable, or all life that we’ll ever have evidence of is in a traversable range.
In 2016, world champion Go player, Lee Sedol, played against an AI system named AlphaGo for five games. In game two, AlphaGo’s move 37 proved instrumental toward beating Sedol [1]. Nobody predicted or understood the move. Later, in game four, it blundered move 79 which led to a loss against Sedol [2]. Nobody predicted or understood the move. AlphaGo ultimately won 4 of the 5 games and provided a concrete example of how humans are not as smart as smart gets. This illustrates a key reason to invest in making AI more safe and trustworthy. The limits of intelligence are unknown unknowns, and advanced AI may be one of the most transformative developments in human history. We hope that next generation AI systems will be well-aligned with our values and that they will make brilliant and useful decisions like move 37. But misaligned values or failures like move 79 will pose hazards and undermine trust unless they can be avoided. It would have been really nice if we had a prescient research community in the 1920s dedicated to making sure that nuclear technology went well — or one in the 1970s with the internet. For the same reason, we shouldn’t miss our chance to invest in research toward safer and more trustworthy AI today.
[1] https://www.wired.com/2016/03/googles-ai-wins-pivotal-game-two-match-go-grandmaster/
[2] https://web.archive.org/web/20161116082508/https:/gogameguru.com/lee-sedol-defeats-alphago-masterful-comeback-game-4/
By inventing AI, we’ve already made a deal with the devil. We’d better negotiate.
Imagine if ants figured out a way to invent human beings. Because they spend all day looking for food, they might program us to “go make lots of food!” And maybe they’d even be cautious, and anticipate certain problems. So they also program us not to use any anteaters as we do it. Those things are dangerous!
What would we do? Probably, we’d make a farm, that grows many times more food than the ants have ever seen. And then we’d water the crops—flooding the ant colony and killing all the ants. Of course, we didn’t TRY to kill the ants; they were just in the way of the goal they gave us. And because we are many times smarter than ants, we accomplished their goal in a way they couldn’t even fathom protecting against.
That’s basically the worry with advanced Artificial Intelligence. Many scientists think we’re approaching a day when AI will be many times smarter than us, and they still don’t know how to stop it from doing things we don’t want. If it gets powerful enough before we learn how to control it, it could make us like the ants.
Artificial Intelligence is very difficult to control. Even in relatively simple applications, the top AI experts struggle to make it behave. This becomes increasingly dangerous as AI gets more powerful. In fact, many experts fear that if a sufficiently advanced AI were to escape our control, it could actually extinguish all life on Earth. Because AI pursues whatever goals we give it with no mind to other consequences, it would stop at nothing – even human extinction – to maximize its reward.
We can’t know exactly how this would happen—but to make it less abstract, let’s imagine some possibilities. Any AI with internet access may be able to save millions of copies of itself on unsecured computers all over the world, each ready to wake up if another were destroyed. This alone would make it virtually indestructible unless humans destroyed the internet and every computer on Earth. Doing so would be politically difficult in the best case—but especially so if the AI were also using millions of convincing disinformation bots to distract people, conceal the truth, or convince humans not to act. The AI may also be able to conduct brilliant cyber attacks to take control of critical infrastructures like power stations, hospitals, or water treatment facilities. It could hack into weapons of mass destruction—or, invent its own. And what it couldn’t do itself, it could bribe or blackmail humans to do for it by seizing cash from online bank accounts.
For these reasons, most AI experts think advanced AI is much likelier to wipe out human life than climate change. Even if you think this is unlikely, the stakes are high enough to warrant caution.
I’m not sure this is true, unless you use a very restrictive definition of “AI expert”. I would be surprised if most AI researchers saw AI as a greater threat than climate change.
I took that from a Kelsey Piper writeup here, assuming she was summarizing some study:
”Most experts in the AI field think it poses a much larger risk of total human extinction than climate change, since analysts of existential risks to humanity think that climate change, while catastrophic, is unlikely to lead to human extinction. But many others primarily emphasize our uncertainty — and emphasize that when we’re working rapidly toward powerful technology about which there are still many unanswered questions, the smart step is to start the research now.”
The hyperlink goes to an FHI paper that appears to just summarize various risks, so it’s unclear what her source was on the “most.” I’d be curious to know as well. She does stress the greater variance of outcomes and uncertainty surrounding AI—writing “Our predictions about climate change are more confident, both for better and for worse.”—so maybe my distillation should admit that too.
One-liner for policymakers:
“Most experts in the AI field think it poses a much larger risk of human extinction than climate change.” (Kelsey Piper, here)
One-liner: Artificial Intelligence may kill you and everyone you know. (from Scott Alexander, here)
“Let’s make sure we have a big enough fire extinguisher before accidentally creating a fire big enough to put the sun to shame.”
Why would a computer that is smarter than us need us?
Quoted from an EA forum post draft I’m working on:
“Humans are currently the smartest beings on the planet. This means that non-human animals are completely at our mercy. Cows, pigs, and chickens live atrocious lives in factory farms, because humans’ goal of eating meat is misaligned with these animals’ well-being. Saber-toothed tigers and mammoths were hunted to extinction, because nearby humans’ goal was misaligned with these animals’ survival.
But what if in the future, we were not the smartest beings on the planet? AI experts predict that it’s basically a coin flip whether or not the following scenario happens by year X. The scenario is that researchers at DeepMind, Google, or Facebook accidentally create an AI system that is systematically smarter than humans. If the goal of this superintelligent, difficult-to-control AI system is accidentally misaligned with human survival, humanity will go extinct. And no AI expert has yet convinced the rest of the field that there is a way to align this superintelligent AI system’s goal in a controlled, guaranteed manner.”
‘One metaphor for my headspace is that it feels as though the world is a set of people on a plane blasting down the runway:
And every time I read commentary on what’s going on in the world, people are discussing how to arrange your seatbelt as comfortably as possible given that wearing one is part of life, or saying how the best moments in life are sitting with your family and watching the white lines whooshing by, or arguing about whose fault it is that there’s a background roar making it hard to hear each other.
I don’t know where we’re actually heading, or what we can do about it. But I feel pretty solid in saying that we as a civilization are not ready for what’s coming, and we need to start by taking it more seriously.’ (Holden Karnofsky)
Money can be thought of as equivalent to power. The share of trades on the open stock market made by human decisions continues to dwindle to fractions of a percent. Regardless of your views on AI, you have to see the danger of allowing that much power to be concentrated in such opaque intelligent machinery.
When the goal posts are moved for what qualifies as true artificial intelligence, it’s easy to shrug your shoulders over chess. Less so when it’s sitting in traffic next to you, or judging your resume. Looking over your shoulder and checking the rearview mirror isn’t paranoia, it’s prudent.
If the capabilities of nuclear technology and biotechnology advance faster than their respective safety protocols, the world faces an elevated risk from those technologies. Likewise, increases in AI capabilities must be accompanied by an increased focus on ensuring the safety of AI systems.
‘If you know the aliens are landing in thirty years, it’s still a big deal now.’ (Stuart Russell)
‘Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. Superintelligence is a challenge for which we are not ready now and will not be ready for a long time. We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound. For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible. Some little idiot is bound to press the ignite button just to see what happens.’ (Nick Bostrom)
‘Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make provided that the machine is docile enough to tell us how to keep it under control.’ (I. J. Good)
In the Vulnerable World hypothesis, Nick Bostrom suggests that every time we invent a new technology it’s as though we are drawing balls from a bag. Most balls are white (beneficial) and some are grey (harmful), but some could be black (a technology that could potentially destroy the world). Nuclear bombs are one such example. Inventing artificial general intelligence would be like drawing all the balls at once.
Intelligence is the most powerful of all the tools we have: It allowed humankind to invent advanced language, science and communication. Artificial intelligence has the potential of being an even more powerful tool and we should therefore be thinking about how we can create this tool in a way that is safe and beneficial.
Intelligence is what allows us to shape the world. We are already struggling to cooperate with other human-level intelligences, namely humans, and so far we fail to prevent or stop the resulting conflicts. It is therefore important to think deeply and carefully about cooperation with intelligences of superhuman abilities. The development of AI offers a unique opportunity to actually shape these intelligences so that they do not come into conflict with human goals.
Looking at our unresolved long-lasting moral debates, it is quite alarming to imagine the potential need for value lock-in within the next decades.
We are all in the same boat. The bad news is, it’s sinking. The good news is that we still have a chance to gather all our human power to try and solve this problem before it’s too late.
[TO POLICYMAKERS]
Trying to align very advanced AIs with what we want is a bit like trying to design a law or a measure to constrain massive companies, such as Google or Amazon, or powerful countries, such as the US or China. You know that when you put a rule in place, they will have enough resources to circumvent it. And however hard you try, if you didn’t design the AI properly in the first place, you won’t be able to make it do what you want.
[TO ML RESEARCHERS AND MAYBE TECH EXECUTIVES]
When you look at society’s problems, you can observe that many of our structural problems come from strong optimizers.
Companies, to keep growing once they’re big enough, start adopting questionable practices such as tax evasion, preventing new companies from entering markets, capturing regulators to keep their benefits, etc.
The policymakers who get elected are those who make false promises, who are ruthless with their adversaries, and who communicate without caring about truth.
Now, even these optimizers, hard as they are to fight against, are very limited in their capabilities. They’re limited by coordination costs, by their limited ability to forecast, and by their limited ability to process relevant information. AI risks breaking down these barriers and being able to optimize much more strongly. And thus, the feeling that you may have next to these companies and policymakers, i.e., that you can’t stop them even if you see the way they’re cheating, will be multiplied tenfold next to smarter AIs.
‘You can’t fetch the coffee if you’re dead’ (Stuart Russell)