I can think of a few scenarios where AGI doesn’t kill us.
AGI does not act as a rational agent. The predicted doom scenarios rely on the AGI acting as a rational agent that maximises a utility function at all costs. That behaviour has not been observed in nature: every intelligence we know of (natural or artificial) shows some degree of laziness, which makes it less destructive. If the orthogonality thesis is true, this is unlikely to change.
The AGI sees humans as more useful alive than dead, probably because its utility function involves humans somehow. This covers a lot of scenarios, from horrible dystopias where the AGI tortures us constantly to see how we react, all the way to us actually getting alignment right on the first try. It keeps us alive for the same reason we keep our pets alive.
The first A”G”Is are actually just a bunch of narrow AIs in a trenchcoat, and no single one of them is able to overthrow humanity. Many recent advances in AI (including GPT-4) have been propelled by a move away from generality and towards “mixture of experts” architectures, where complex tasks are split into simpler ones (a toy sketch of the idea follows below). If this trend continues, one could expect more advanced systems to still not be general enough to act autonomously in a way that overpowers humanity.
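For readers unfamiliar with the term, here is a minimal toy sketch of what “mixture of experts” routing looks like: a gate picks a few narrow experts per input and mixes their outputs, so no single component has to be general. All names and numbers here are illustrative, not a claim about how GPT-4 is actually built.

import numpy as np

rng = np.random.default_rng(0)

class Expert:
    """Stand-in for a narrow model specialised on one kind of task."""
    def __init__(self, dim):
        self.w = rng.normal(size=(dim, dim))
    def forward(self, x):
        return np.tanh(x @ self.w)

class MixtureOfExperts:
    """Routes each input to the top-k experts and mixes their outputs."""
    def __init__(self, num_experts, dim, top_k=2):
        self.experts = [Expert(dim) for _ in range(num_experts)]
        self.gate = rng.normal(size=(dim, num_experts))
        self.top_k = top_k
    def forward(self, x):
        logits = x @ self.gate                  # gating score for each expert
        top = np.argsort(logits)[-self.top_k:]  # indices of the chosen experts
        weights = np.exp(logits[top])
        weights /= weights.sum()                # softmax over the chosen few
        return sum(w * self.experts[i].forward(x) for w, i in zip(weights, top))

moe = MixtureOfExperts(num_experts=8, dim=16)
print(moe.forward(rng.normal(size=16)).shape)   # -> (16,)

The point of the sketch is that each Expert only ever sees the slice of work the gate sends it, which is the sense in which the overall system can be capable without any one piece being general.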
AGI can’t self-improve because it runs face-first into the alignment problem! If we can see that creating an intelligence greater than ourselves raises the alignment problem, so can the AGI. An AGI that fears creating something more powerful than itself will not do so, which leaves it stuck at roughly human level. Such an AGI would not be strong enough to defeat all of humanity combined, and it would be smart enough not to try.