MichaelStJules comments on My highly personal skepticism braindump on existential risk from artificial intelligence.

MichaelStJules 30 Jan 2023 20:53 UTC
48 points
14 ∶ 0
One of my main high-level hesitations with AI doom and futility arguments is something like this, from Katja Grace:
My weak guess is that there’s a kind of bias at play in AI risk thinking in general, where any force that isn’t zero is taken to be arbitrarily intense. Like, if there is pressure for agents to exist, there will arbitrarily quickly be arbitrarily agentic things. If there is a feedback loop, it will be arbitrarily strong. Here, if stalling AI can’t be forever, then it’s essentially zero time. If a regulation won’t obstruct every dangerous project, then it is worthless. Any finite economic disincentive for dangerous AI is nothing in the face of the omnipotent economic incentives for AI. I think this is a bad mental habit: things in the real world often come down to actual finite quantities. This is very possibly an unfair diagnosis. (I’m not going to discuss this later; this is pretty much what I have to say.)
“Omnipotent” is the impression I get from a lot of the characterization of AGI.
Another recent specific example here.
Similarly, I’ve had the impression that specific AI takeover scenarios don’t engage enough with the ways they could fail for the AI. Some are based primarily on nanotech or engineered pathogens, but from what I remember of the presentations and discussions I saw, they don’t typically address enough of the practical challenges an AI would face in actually pulling them off, e.g. access to the materials and to a sufficiently sophisticated lab/facility in which to produce these things, little or poor verification of the designs before running them through the lab/facility (if done by humans), attempts by humans to defend ourselves (e.g. the military) or hide, ways humans can disrupt power supplies and electronics, and so on. Even if AI takeover scenarios are disjunctive, so are the ways humans can defend ourselves and the ways such takeover attempts could fail, and we have a huge advantage through access to and control over stuff in the outside world, including whatever the AI would “live” on and what powers it. Some of the reasons an AI could fail could be common to a significant share of otherwise promising takeover plans, potentially placing a limit on how far an AI can get by considering or trying more and more such plans, or more complex ones.
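To make the point about shared failure modes concrete, here is a toy sketch. Everything in it is an illustrative assumption rather than an estimate: the function name, the numbers, and the simplification that a single shared obstacle (say, no access to a suitable lab) blocks every plan at once, while plans otherwise succeed independently.

```python
def p_any_success(n_plans: int, p_each: float, p_shared_failure: float) -> float:
    """Toy model (assumed): chance that at least one of n plans works.

    With probability p_shared_failure, one common obstacle blocks all plans.
    Otherwise, each plan succeeds independently with probability p_each.
    """
    p_any_if_unblocked = 1 - (1 - p_each) ** n_plans
    return (1 - p_shared_failure) * p_any_if_unblocked


# Hypothetical numbers: more plans help, but only up to the ceiling
# set by the shared failure mode (here 1 - 0.4 = 0.6).
for n in (1, 10, 100, 1000):
    print(n, round(p_any_success(n, p_each=0.1, p_shared_failure=0.4), 4))
```

In this sketch, piling on more plans pushes the overall chance toward `1 - p_shared_failure` but never past it, which is the sense in which common failure modes could cap the benefit of a disjunctive strategy.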
I’ve seen it argued that it would be futile to try to make the AI more risk-averse (e.g. by giving it sharply decreasing marginal returns), but this argument didn’t engage with how risks for the AI from human detection and possible shutdown, threats by humans, or the opportunity to cooperate/trade with humans would increasingly disincentivize such an AI from taking extreme action the more risk-averse it is.
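As a rough illustration of the risk-aversion point, here is a toy expected-utility sketch. The model is an assumption for illustration only: the AI's utility is `x ** a` with `0 < a <= 1` (smaller `a` means more sharply decreasing marginal returns), cooperating yields a certain payoff, and a takeover attempt yields a larger payoff with some probability and nothing otherwise (e.g. because it is detected and shut down). All names and numbers are hypothetical.

```python
def min_success_prob(safe_payoff: float, takeover_payoff: float, a: float) -> float:
    """Toy model (assumed): utility is u(x) = x ** a, with 0 < a <= 1.

    The attempt beats cooperating only if p * takeover_payoff ** a exceeds
    safe_payoff ** a, i.e. only if p exceeds the threshold returned here.
    """
    return (safe_payoff / takeover_payoff) ** a


# Hypothetical payoffs: cooperating is worth 1, a successful takeover 1000.
for a in (1.0, 0.5, 0.1):
    print(a, round(min_success_prob(1.0, 1000.0, a), 4))
```

Under these assumptions, the success probability the AI needs before an attempt looks worthwhile rises as `a` shrinks, so any fixed chance of detection and shutdown weighs more heavily the more risk-averse the AI is.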
I’ve also heard an argument (in private, and not by anyone working at an AI org or otherwise well-known in the community) that AI could take over personal computers and use them, but distributing computations that way seems extremely impractical for computations that run very deep, so there could be important limits on what an AI could do this way.
That being said, I also haven’t personally engaged deeply with these arguments or read a lot on the topic, so I may have missed where these issues are addressed. This is in part because I haven’t been impressed by what I have read (among other reasons, like concerns about backfire risks, suffering-focused views and very low probabilities of the typical EA or me in particular making any difference at all).