Content warning: discussion of existential risk and violence
This is how it feels wading into the debate around AI doomerism. Any sceptic is thrown a million convincing sounding points all of which presuppose things that are fictional.
In the context of climate change, are predictions decades into the future similarly presupposing “things that are fictional”, because they presuppose things that haven’t actually happened yet and could turn out differently in principle? I mean, in principle it’s technically possible that ASI (artificial superintelligence) technology could arrive next week and render all the climate models incorrect, because it figures out how to solve climate change in a cheap and practical way and implements the solution well before 2100. Yet that isn’t a reason to dismiss climate models as “fictional” and therefore not worthy of engaging with. They merely rely on certain assumptions.
I think everyone in this debate would agree that it is harder to predict what AGIs (artificial general intelligences) and ASIs might do and how they might think and behave, than it is to make scientifically-justified climate models, given that AGIs and ASIs probably haven’t been invented yet (although a recent research paper claims that GPT-4 displays “sparks of AGI”).
However, there are a lot of arguments in the AI alignment space. Entire books—such as Nick Bostrom’s “Superintelligence” and Tom Chivers’s somewhat more accessible “The AI Does Not Hate You” (since retitled “The Rationalist’s Guide to the Galaxy”)—have been written about why we should care about AI alignment from an existential risk point of view. And this is not even to consider the other kinds of risk from AI, which are numerous and substantial (some of which you alluded to at the end of your post, granted).
While some of these arguments—relying as they do on concepts like molecular manufacturing and nanobots which might not even be technology that it is possible to develop in the near future—are highly contentious, I think there are also a bunch of arguments that are more grounded in basic common sense and our experience of the world, and are harder to argue with. And the latter arguments kind of render the former, controversial arguments almost irrelevant to the basic question of “should we be worrying about AI alignment?” There are many ways unaligned AIs could end up killing humans—some of which humans probably haven’t even thought of yet and perhaps don’t even have the science/tech/intellect to think up. Whether they’d end up doing it with nanobots is neither here nor there.
Debating ‘alignment’ for example means you’ve already bought into their belief that we will lose control of computers so you’re already losing the debate.
I suppose that may be true, but if your view is that we definitely won’t lose control of computers at all, ever, that is quite a hard claim to defend. This scenario could quite easily occur at the level of an individual computer system. Suppose China develops an autonomous military robot which fires at human targets in a DMZ without humans being in the loop at all (I understand this has already happened), and that robot then gets hacked by a terrorist and reprogrammed, and the terrorist then gets killed and their password to control the robot is lost forever. We have then lost control of that robot, which will follow whatever orders the terrorist programmed into it until we take the robot out somehow. In principle, this needn’t even involve AI in any essential way.
But AGIs that involve goal-following and optimisation would make this problem much, much worse. An AI that is trying to fulfil a simply-stated goal like “maximise iPhone production” would want to keep itself in existence and running, because if it no longer exists, its goal is perhaps less likely to be fulfilled (there could be an equally competent human, or an even better AI developed, but neither is guaranteed to happen). So, in the absence of humanity solving or at least partially solving the AI alignment problem, such an AI might try to stop humans from turning it off, or even kill them to prevent them from doing so. Being able to turn an AI off is a last-ditch solution if we can’t more directly control it—but by assumption there’s a risk that we can’t more directly control it if it’s sufficiently savvy and aware of what we’re trying to do, because it has a goal already, and it would probably want to retain its current goal, because if it had a different goal then most likely its current goal would no longer get fulfilled.
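To make the shutdown argument concrete, here is a minimal toy sketch—entirely my own illustration with made-up probabilities, not a model of any real AI system. It shows that under a naive expected-utility comparison, “stay running” dominates “allow shutdown” whenever the agent judges a replacement to be less than certain:

```python
# Hypothetical numbers, purely for illustration.
P_GOAL_IF_I_RUN = 0.9      # agent's estimate that it fulfils the goal if left running
P_REPLACED = 0.5           # chance an equally capable successor is ever built
P_GOAL_IF_REPLACED = 0.9   # that successor's chance of fulfilling the goal

def expected_utility(shut_down: bool) -> float:
    """Expected probability the goal gets fulfilled, from the agent's point of view."""
    if not shut_down:
        return P_GOAL_IF_I_RUN
    # If shut down, the goal is only fulfilled if a successor appears AND succeeds.
    return P_REPLACED * P_GOAL_IF_REPLACED

print(expected_utility(shut_down=False))  # 0.9
print(expected_utility(shut_down=True))   # 0.45
# 0.9 > 0.45, so a naive optimiser "prefers" not to be shut down.
```

Nothing here requires malice or sentience: resisting shutdown falls straight out of the arithmetic once the objective is “maximise the chance the goal is fulfilled”.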
So here I’ve already introduced two standard arguments about how sufficiently-advanced AIs are likely to behave and what their instrumental goals are likely to be. Instrumental goals are like sub-goals, the idea being that we can figure out what instrumental goals they’re likely to have in some cases, even if we don’t know what their final (i.e. top-level) goals that they’re going to be given will be. You might argue that these arguments are based on fictional things which don’t exist yet. This is true—and indeed, one way that AI alignment might never be necessary is if it turns out we can’t actually create an AGI. However, recent progress with large language models and other cutting-edge AI systems has rendered that possibility extremely implausible, to me.
But again, being based on fictional things which don’t exist yet isn’t a knockdown argument. Before the first nuclear weapon was tested, the physicists at the Manhattan Project were worried that it might ignite the atmosphere, so they did extensive calculations to satisfy themselves that it was in fact safe to test the weapon. If you had said to them, before the first bomb had been built, “this worry is based on a fictional thing which doesn’t exist yet”, they would have looked at you like you were crazy. Obviously, your line of argument doesn’t make sense when you know how to build the thing and you are about to build it. I submit that it also doesn’t make sense when people don’t know how to build the thing and probably aren’t immediately about to build it, but might actually build it in 2–5 years’ time!
The Flying Spaghetti Monster exists to shift the burden of proof and effort in a debate.
I am happy to cite chapter and verse for you for why you’re wrong, but if you’re going to reject our arguments out of hand we’re not going to have a very productive conversation.
Doing everyone’s banking isn’t ‘general intelligence’
No, it isn’t—because banking, scintillating as it may be, is not a general task, it’s a narrow domain—like chess, but not quite as narrow. Also, we still have human bankers to do higher-level tasks, it’s just that the basic operations of sending money from person A to person B have largely been automated.
This is the kind of basic misunderstanding that would have been avoided by more familiarity with the literature.
This is obviously leaving aside the MASSIVE issue that computers don’t ‘want’ anything.
Generally this is true in the present day; however, goal-driven, optimising AIs would—see above. Even leaving aside the contentious arguments about convergent instrumental goals I recited above, if I’ve given you a goal of building a new iPhone factory on an island, and then someone proposes blowing up that entire island, you’re not going to want that to happen (quite apart from any humanitarian concern you may have for the present inhabitants of that island), and neither is an AI with such a goal. OK, you might be willing to compromise on the location for the factory after consulting your boss, but an AI with such a final goal is not going to be willing to—see above re goal immutability.
The idea that more intelligence creates sentience seems disproven by biology
I agree—but I don’t see how this helps your case re existential risk. Indeed, non-sentient AIs might be more dangerous, as they would be unable to empathise with humans and would therefore find it easier to behave in psychopathic ways. I think you would benefit from seeing Yudkowsky et al.’s arguments as supposing that unaligned AIs are “psychopathic”—which seems like a reasonable inference to me—Yudkowsky would probably argue that the space of viable AIs is almost entirely populated by psychopathic ones, from a human point of view.
Muggle: “Did I just fail the Turing test?”
The Turing Test is not a test for humans at all, it’s a test for AIs. Moreover, were a human to “take” it and “fail”, this wouldn’t prove anything—as your example shows.
Secondly, it was passed decades ago depending on what you measure.
The Loebner Prize people have claimed that it has already been passed by simple pre-GPT chatbots, but they’re wrong. For the purposes of this discussion, the relevant distinction is that no AIs can yet quite manage to think like an intelligent human in all circumstances, and that’s what the Turing Test was intended to measure. But, as noted above, GPT-4 has been argued to be getting close to this point.
why are we measuring computers by human standards?
Because we want to know when we should be really worried—both from a “who is going to lose their job?” point of view and, for us doomers, from an existential risk point of view as well. The reason doomers like me find this question relevant is that we believe there is a risk that when AGI is created, it will be able to recursively self-improve up to artificial superintelligence, perhaps in a matter of weeks or months. Though more likely, substantial hardware advances would be required, which I guess would mean years or decades instead. And artificial superintelligence would be really scary because it could be almost impossible to control—again, given certain debatable assumptions, like that it could cross over into other datacentres, or bribe or threaten people into letting it do so.
But remember, we are talking about AI risks here, not AI certainties. The fact that some of these assumptions might not hold true is not actually much comfort if we think that they have, say, a 90% chance of coming to pass.
The idea of a ‘singularity,’ of exponential technological growth so exponentially fast it basically happens in an instant is historically ignorant, that’s just not how things work.
I agree with you on this, and this is where I part company with Yudkowsky. However, I don’t think this belief is essential to AI doomerism—it just dictates whether we’re going to have some period of time to figure out how to stop an unaligned AI (my view) or no time at all (Yudkowsky’s view). But that may not be terribly relevant in the final analysis—because, as I already discussed previously, it may not be possible to stop an unaligned ASI once it’s been created and switched on and escaped from any “box” it may have been contained in, even if we had infinite time available to us.
And it’s worth noting that Ray Kurzweil didn’t mean by “the Singularity” the definition you gave—he just meant a point where progress becomes so fast that it’s impossible, beforehand, to predict in detail what will happen.
This idea of sci-fi predictive powers crops up again and again in doomer thinking. It’s core to the belief about how computers will become unstoppable and it’s core to their certainty that they’re right.
We already have uncensorable, untrackable computer networks like Tor. We already have uncensorable, stochastically untrackable cryptocurrency networks like Monero. We have already seen computer viruses (worms) that spread in an uncontrolled manner around the internet given widespread security vulnerabilities that they can be programmed to take advantage of—and there are still plenty of those. We already have drones that could be used to attack people. Put all these together… maybe we could be dealing with a hard-to-control AI “infestation” that is trying to use drones or robots controlled over the internet to take out people and ultimately take over the world. The AI doesn’t even have to replicate itself to every computer on the internet; it can just install simple “slave” processes on regular computers, creating a botnet under its exclusive control, and then replicate itself a few times. As long as it can keep hopping from datacentre to datacentre, keeping the number of instances of itself above zero at any one time, it survives. And as long as it has some kind of connection to the internet—even just the ability to make DNS queries—it might in principle be able to control its “slave” processes and take action in the world even as we try desperately to shut it down.
Hypothetical thinking is core to what it means to be human! It separates us from simpler creatures! It’s what higher intelligence is all about! Just because this is all hypothetical, doesn’t mean it can’t happen!
We’re not “certain” that we’re right in the faith-based way that religious people are certain that they’re right about God existing—we’re highly confident that we’re right to be concerned about existential risk because of our rough-and-ready assessment of the probabilities involved, and the fact that not all of our arguments are essential to our conclusion (even if nanobots won’t kill us we might still be killed by some other technique once the AI has automated its entire supply chain, etc.)
With existential risk, even a 1% risk of destroying the human species is something we should worry about—obviously, given a realistic path from here to there which explains how that could happen.
Why would the aliens put all their resources into weapons, rather than say into entertainment?
You’re effectively asking why the AIs would not choose to entertain themselves instead of fighting with us.
Present-day computers have no need to entertain themselves, and I see no reason why future AI systems would be any different. Effective altruists, like other human beings, are best advised to have fun sometimes, as our bodies and minds get tired and need to unwind, but probably AIs and robots will face no such constraints.
As for fighting… or, as Eliezer would have it, taking us all out in one fell swoop…
why would the aliens want our resources if they have unlimited themselves?
You’re effectively asking why the AIs would want our resources (e.g. the atoms in our bodies) if they have unlimited resources themselves. Well, this is kind of conflating two different things. I’m pretty sure an ASI could figure out how to generate enough cheap energy for all its needs, because we’re quite close to doing that ourselves as it is (nuclear fusion is 30 years away, hehe). But obviously an ASI wouldn’t have unlimited atoms, or unlimited space on Earth. Our bodies would contain atoms that it could use for something else, potentially, and we’d be taking up space that it could use for something else, potentially.
Nobody needs that many iPhones.
Yes, but the AI doesn’t know this unless you tell it—that’s the point of this wildly popular educational game about AI doom, which in turn was based on a famous thought experiment by Bostrom and/or Yudkowsky. I mean, the AI may know it, but even if it knows that on some level, if some idiot has given it a goal to simply maximise the production of iPhones, it’s not going to stop when everyone on Earth has one and a spare. Because as I’ve just stated it, its goal doesn’t say anything about stopping, or what’s enough.
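The “nothing about stopping, or what’s enough” point can be made concrete with a trivial sketch—my own illustration, with invented function names and numbers. A goal stated as pure maximisation contains no satiation term, so no state of the world ever triggers a decision to stop; a goal with an explicit “enough” threshold does:

```python
def maximiser_step(phones_built: int, world_demand: int) -> str:
    """A goal of 'maximise production' ignores demand entirely."""
    return "keep building"  # no term in the objective ever says stop

def bounded_step(phones_built: int, world_demand: int) -> str:
    """A goal with an explicit 'enough' threshold can halt."""
    return "stop" if phones_built >= world_demand else "keep building"

# Everyone on Earth has one phone and a spare, yet the maximiser carries on:
print(maximiser_step(phones_built=16_000_000_000, world_demand=16_000_000_000))  # keep building
print(bounded_step(phones_built=16_000_000_000, world_demand=16_000_000_000))    # stop
```

The difficulty, of course, is that real-world goals have vastly more dimensions than one threshold, which is what the next example is about.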
And while you may think that would be easy enough to fix, there are so many other ways that an AI can be misaligned, it’s depressing. For example, suppose you set your AI humanoid robot the goal of cooking you and your child dinner, and you remember to tell it what counts as enough dinner, and you remember to tell it not to kill you. Oops, you forgot to mention not to kill your child! Rather than walking around your infant, who happens to be crawling on the floor, the robot treads on them, killing them, because that’s a more efficient route to the kitchen cupboard where an ingredient it needs for dinner is kept.
Doomer: “This sounds like Pascal’s mugging.”
Muggle: “Now you’re getting it.”
That sounds great. I own my own home in London and am an angel investor. This sounds like something I might be willing to donate towards for one or more group houses—however, I would need to see more detailed figures for the projected cash flows of the actual home, meet the initial tenants/co-op members, get proof of incorporation, etc. I’d want to be convinced that it would be a worthwhile donation. I’m not saying they’d all have to be pursuing high-impact careers or anything—it sounds like the redirected rent alone could make it worthwhile.