One of my main high-level hesitations with AI doom and futility arguments is something like this, from Katja Grace:
My weak guess is that thereās a kind of bias at play in AI risk thinking in general, where any force that isnāt zero is taken to be arbitrarily intense. Like, if there is pressure for agents to exist, there will arbitrarily quickly be arbitrarily agentic things. If there is a feedback loop, it will be arbitrarily strong. Here, if stalling AI canāt be forever, then itās essentially zero time. If a regulation wonāt obstruct every dangerous project, then is worthless. Any finite economic disincentive for dangerous AI is nothing in the face of the omnipotent economic incentives for AI. I think this is a bad mental habit: things in the real world often come down to actual finite quantities. This is very possibly an unfair diagnosis. (Iām not going to discuss this later; this is pretty much what I have to say.)
āOmnipotentā is the impression I get from a lot of the characterization of AGI.
Similarly, Iāve had the impression that specific AI takeover scenarios donāt engage enough with the ways they could fail for the AI. Some are based primarily on nanotech or engineered pathogens, but from what I remember of the presentations and discussions I saw, they donāt typically directly address enough of the practical challenges for an AI to actually pull them off, e.g. access to the materials and a sufficiently sophisticated lab/āfacility with which to produce these things, little or poor verification of the designs before running them through the lab/āfacility (if done by humans), attempts by humans to defend ourselves (e.g. the military) or hide, ways humans can disrupt power supplies and electronics, and so on. Even if AI takeover scenarios are disjunctive, so are the ways humans can defend ourselves and the ways such takeover attempts could fail, and we have a huge advantage through access to and control over stuff in the outside world, including whatever the AI would āliveā on and what powers it. Some of the reasons AI could fail across takeover plans could be common across significant shares of otherwise promising takeover plans, potentially placing a limit on how far an AI can get by considering or trying more and more such plans or more complex plans.
Iāve seen it argued that it would be futile to try to make the AI more risk-averse (e.g. sharply decreasing marginal returns), but this argument didnāt engage with how risks for the AI from human detection and possible shutdown, threats by humans or the opportunity to cooperate/ātrade with humans would increasingly disincentivize such an AI from taking extreme action the more risk-averse it is.
Iāve also heard an argument (in private, and not by anyone working at an AI org or otherwise well-known in the community) that AI could take over personal computers and use them, but distributing computations that way seems extremely impractical for computations that run very deep, so there could be important limits on what an AI could do this way.
That being said, I also havenāt personally engaged deeply with these arguments or read a lot on the topic, so I may have missed where these issues are addressed, but this is in part because I havenāt been impressed by what I have read (among other reasons, like concerns about backfire risks, suffering-focused views and very low probabilities of the typical EA or me in particular making any difference at all).
One of my main high-level hesitations with AI doom and futility arguments is something like this, from Katja Grace:
āOmnipotentā is the impression I get from a lot of the characterization of AGI.
Another recent specific example here.
Similarly, Iāve had the impression that specific AI takeover scenarios donāt engage enough with the ways they could fail for the AI. Some are based primarily on nanotech or engineered pathogens, but from what I remember of the presentations and discussions I saw, they donāt typically directly address enough of the practical challenges for an AI to actually pull them off, e.g. access to the materials and a sufficiently sophisticated lab/āfacility with which to produce these things, little or poor verification of the designs before running them through the lab/āfacility (if done by humans), attempts by humans to defend ourselves (e.g. the military) or hide, ways humans can disrupt power supplies and electronics, and so on. Even if AI takeover scenarios are disjunctive, so are the ways humans can defend ourselves and the ways such takeover attempts could fail, and we have a huge advantage through access to and control over stuff in the outside world, including whatever the AI would āliveā on and what powers it. Some of the reasons AI could fail across takeover plans could be common across significant shares of otherwise promising takeover plans, potentially placing a limit on how far an AI can get by considering or trying more and more such plans or more complex plans.
Iāve seen it argued that it would be futile to try to make the AI more risk-averse (e.g. sharply decreasing marginal returns), but this argument didnāt engage with how risks for the AI from human detection and possible shutdown, threats by humans or the opportunity to cooperate/ātrade with humans would increasingly disincentivize such an AI from taking extreme action the more risk-averse it is.
Iāve also heard an argument (in private, and not by anyone working at an AI org or otherwise well-known in the community) that AI could take over personal computers and use them, but distributing computations that way seems extremely impractical for computations that run very deep, so there could be important limits on what an AI could do this way.
That being said, I also havenāt personally engaged deeply with these arguments or read a lot on the topic, so I may have missed where these issues are addressed, but this is in part because I havenāt been impressed by what I have read (among other reasons, like concerns about backfire risks, suffering-focused views and very low probabilities of the typical EA or me in particular making any difference at all).