This post deserves deep reflection and response from those that believe that AGI needs tons of money to spend on alignment problems. I sadly suspect that that won’t really happen in the EA or LW communities.
One of the things that makes me an outlier in today’s technology scene is that I cannot even begin to understand or empathize with mindsets capable of being scared of AI and robots in a special way. I sincerely believe their fears are strictly nonsensical in a philosophical sense, in the same sense that I consider fear of ghosts or going to hell in an “afterlife” to be strictly nonsensical. Since I often state this position (and not always politely, I’m afraid) and walk away from many AI conversations, I decided to document at least a skeleton version of my argument. One that grants the AI-fear view the most generous and substantial interpretation I can manage.
Let me state upfront that I share in normal sorts of fears about AI-based technologies that apply to all kinds of technologies. Bridges can collapse, nuclear weapons can end the world, and chemical pollution can destabilize ecosystems. In that category of fears, we can include: killer robots and drones can kill more efficiently than guns and Hellfire missiles, disembodied AIs might misdiagnose X-rays, over-the-air autopilot updates might brick entire fleets of Teslas causing big pile-ups, and trading algorithms might cause billions in losses through poorly judged trades. Criticisms of the effects of existing social media platforms that rely on AI algorithms fall within these boundaries as well.
These are normal engineering risks that are addressable through normal sorts of engineering risk-management. Hard problems but not categorically novel.
I am talking about “special” fears of AI and robots. In particular, ones that arise from our tendency to indulge in what I will call hyperanthropomorphic projection, which I define as attributing to AI technologies “super” versions of traits that we perceive in ourselves in poorly theorized, often ill-posed ways. These include:
“Sentience”
“Consciousness”
“Intentionality”
“Self-awareness”
“General intelligence”
I will call the terms in this list pseudo-traits. I’m calling them that, and putting scare quotes around all of them, because I think they are all, without exception, examples of what the philosopher Gilbert Ryle referred to as “philosophical nonsense.” It’s not that they don’t point to (or at least gesture at) real phenomenology, but that they do so in a way that is so ill-posed and not-even-wrong that anything you might say about that phenomenology using those terms is essentially nonsensical. But this can be hard to see because sentences and arguments written using these terms can be read coherently. Linguistic intelligibility does not imply meaningfulness (sentences like “colorless green ideas sleeping furiously” or “water is triangular” proposed by Chomsky/Pinker are examples of intelligible philosophical nonsense).
To think about AI, I myself use a super-pseudo-trait mental model, which I described briefly in Superhistory, not Superintelligence. But my mental model rests on a non-anthropomorphic pseudo-trait: time, and doesn’t lead to any categorically unusual fears or call for categorically novel engineering risk-management behaviors. I will grant that my model might also be philosophical nonsense, but if so, it is a different variety of it, with its nonsensical aspects rooted in our poor understanding of time.
Read the entire thing.
Upvoted because I think the linked post raises an actually valid objection, even though it does not seem devastating to me and it is kind of obscured by a lot of philosophy that also seems not that relevant to me.
There was a linkpost for this in LessWrong a few days ago, I think the discussion in the comments is good.
The top voted comment in LW says: “(I kinda skimmed, sorry to everyone if I’m misreading / mischaracterizing!)”
All the comments there just seem to assert that VGR’s core argument isn’t really valid. It’s not really an actual engagement.
Well, I also think that the core argument is not really valid. Engagement does not require conceding that the other person is right.
The way I understand it, the core of the argument is that AI fears are based on taking a pseudo-trait like “intelligence” and extrapolating it to a “super” regime. The author claims that this is philosophical nonsense and thus there’s nothing to worry about. I reject the premise that AI fears are based on those pseudo-traits.
AI risk is not in principle about intelligence or agency. A sufficient amount of brute-force search is enough to be catastrophic. An example of this is the “Outcome Pump”. But if you want a less exotic example, consider evolution. Evolution is not sentient, not intelligent, and not an agent (unless your definition of those is very broad). And yet, evolution from time to time makes human civilization stumble by coming up with deadly, contagious viruses.
Now, viruses evolve to make more copies of themselves, so it is quite unlikely that an evolved virus will kill 100% of the population. But if virus evolution didn’t have that life-preserving property, and if it happened 1000 times faster, then we would all die within months.
The analogy with AI is: suppose we spend 10^100000 FLOPs on a brute-force search for industrial robot designs. We simulate the effects of different designs on the current world and pick the one whose effects are closest to our target goal. The final designs will be exceedingly good at whatever the target of the search is, including at convincing us that we should actually build the robots. Basically, the moment someone sees those designs, humanity will have lost some control over its future, in the same way that, once SARS-CoV-2 entered a single human body, the future of humanity suddenly became much more dependent on our pandemic response.
In practice we don’t have that much computational power. That’s why intelligence becomes a necessary component of this, because intelligence vastly reduces the search space. Note that this is not some “pseudo-trait” built on human psychology. This is intelligence in the sense of compression: how many bits of evidence you need to complete a search. It is a well-defined concept with clear properties.
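To make the "bits of evidence" framing concrete, here is a minimal arithmetic sketch. It is only one illustrative reading of the idea, not a formal definition from the linked post or from anyone else: it assumes a finite search space with a single target, and it assumes each bit of evidence halves the remaining candidates. The function names are made up for this example.

```python
import math


def bits_to_locate(space_size: int) -> float:
    """Bits of evidence needed to single out one element of the space."""
    return math.log2(space_size)


def brute_force_evaluations(space_size: int) -> int:
    """Worst-case candidates an uninformed search has to evaluate."""
    return space_size


def remaining_candidates(space_size: int, bits_known: int) -> int:
    """Candidates left after `bits_known` bits, assuming each bit halves the space."""
    return max(1, space_size >> bits_known)


# Hypothetical space of 2**40 designs.
space = 2 ** 40
print(bits_to_locate(space))            # 40.0
print(brute_force_evaluations(space))   # 1099511627776
print(remaining_candidates(space, 30))  # 1024
```

Under these assumptions, a searcher that already holds 30 of the 40 bits only has to check about a thousand candidates instead of about a trillion, which is the sense in which intelligence "vastly reduces the search space".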
Current AIs are not very intelligent by this measure. Maybe they will be someday; maybe it would take some paradigm different from Deep Learning to achieve this level of intelligence. That is an empirical question that we’ll need to answer. But at no point does SIILTBness play any role in this.
Sufficiently powerful search is dangerous even if there’s nothing it is like to be a search process. And ‘powerful’ here is a measure of how many states you visit and how efficiently you do it. Evolution itself is a testament to the power of search. It is not philosophical nonsense; it has been the most powerful force on Earth for billions of years.
(Note: the version of AI risk I have explored here is a particularly ‘hard’ version, associated with the people who are most pessimistic about AI, notably MIRI. There are other versions that do rest on something like agency or intelligence.)
The objection that I thought was valid is that current generative AIs might not be that dangerous. But the author himself acknowledges that training situated and embodied AIs could be dangerous, and it seems clear that the economic incentives to build that kind of AI are strong enough that it will happen eventually. (And we are already training AIs in virtual environments such as Minecraft. Is that situated and embodied enough?)