I think this is a really fun short story, and a really bad analogy for AI risk.
In the story, the humans have an entire universe's worth of computation available to them, including the ability to run physical experiments with real quantum physics. In contrast, an AI cluster only has access to whatever scraps we give it. Humans combined will tend to outclass the AI in computational resources until it has actually achieved some partial takeover of the world, but that partial takeover is a large part of the difficulty here. This means the core analogy of the AI having “thousands of years” to run experiments is fundamentally misleading.
Another flaw is that this paragraph is ridiculous:
A thousand years is long enough, though, for us to work out paradigms of biology and evolution in five-dimensional space, trying to infer how aliens like these could develop. The most likely theory is that they evolved asexually, occasionally exchanging genetic material and brain content. We estimate that their brightest minds are roughly on par with our average college students, but over millions of years they’ve had time to just keep grinding forward and developing new technology.
You cannot, in fact, deduce how a creature two dimensions above you reproduces from looking at a video of them touching a fucking rock. This is a classic case of ignoring unknown information and computational complexity: there are just too many alternate ways in which “touching rocks” can happen. For example, imagine trying to deduce the atmosphere of the planet they live on: except wait, they don’t follow our periodic table, they follow a five-dimensional alternative that we know nothing about.
There is also the problem of multiple AIs: in this scenario, it’s as if our world is the very first one encountered by the tentacle beings, and they have no prior experience. But in actual AI development, each AI will be preceded by a shitload of less intelligent AIs, and a ton of other AIs independent of it will also exist. This adds a lot of dynamics, in particular making warning shots more likely.
The analogy here is that instead of the first message we receive being “rock”, our first message would be “Alright, listen here pipsqueaks: the last people we contacted tried to fuck with our internet and got a bunch of people killed. We’re monitoring your every move, and if you even think of messing with us your entire universe is headed to the recycle bin, kapish?”
There’s value in talking about the non-parallels, but I don’t think that justifies dismissing the analogy as bad. What makes an analogy good or bad?
I don’t think there are any analogies so strong that we can lean on them for reasoning-by-analogy, because reasoning by analogy isn’t real reasoning, and generally shouldn’t be done. Real reasoning is when you carry a model with you that has been honed against the stories you have heard, but the model continues to make pretty good predictions even when you’re facing a situation that’s pretty different from any of those stories. Analogical reasoning is when all you carry is a little bag of stories, and then when you need to make a decision, you fish out the story that most resembles the present, and decide as if that story is (somehow) happening exactly all over again.
There really are a lot of people in the real world who reason analogically. It’s possible that Eliezer was partially writing for them (someone has to), but I don’t think he wanted the LessWrong audience (who are ostensibly supposed to be studying good reasoning) to process it in that way.
If it were just Eliezer writing a fanciful story about one possible way things might go, that would be reasonable. But when the story appears to reflect his very strongly held belief that AI will unfold approximately like this (zero warning shots; extremely fast takeoff; near-omnipotence relative to us; automatic malevolence; etc.), and when he elsewhere implies that we should be willing to cause nuclear war to enforce his priorities, it starts to sound more sinister.
I don’t think this really engages with what I said, or that it should be a reply to my comment.
he elsewhere implies that we should be willing to cause nuclear war to enforce his priorities
Ah, reading that, yeah this wouldn’t be obvious to everyone.
But here’s my view, which I’m fairly sure is also Eliezer’s view: if you do something that I credibly consider to be even more threatening than nuclear war (even if you don’t think it is; gain-of-function research is another example), and you refuse to negotiate towards a compromise where you can do the thing in a non-threatening way, so I try to destroy the part of your infrastructure that you’re using to do it, and you respond by escalating to a nuclear exchange, then it is not accurate to say that I was the one who caused the nuclear war.
Now, if you think I have a disingenuous reason to treat your activity as threatening even though I know it actually isn’t (an accusation people often throw at OpenAI, and it might be true in OpenAI’s case), that you tried to negotiate a safer alternative but I refused it, and that I was really just demanding that you cede power, then you could go ahead and escalate to a nuclear exchange and it would be my fault.
But I’ve never seen anyone even claim, let alone argue competently, that Eliezer believes those things for disingenuous power-seeking reasons. (I think I’ve seen some tweets implying it’s a grift to fund his institute; I honestly don’t know how a person believes that, but even if it were the case, I don’t think Eliezer would consider funding MIRI to be worth a nuclear war.)
I agree that it is a poor analogy for AI risk. However, I do think it is a semi-reasonable intuition pump for why AIs that are very superhuman would be an existential problem if misaligned (and without other serious countermeasures).