Matt—thanks for the quick and helpful reply.
I think the main benefit of explicitly modeling ASI as being a ‘new player’ in the geopolitical game is that it highlights precisely the idea that the ASI will NOT just automatically be a tool used by China or the US—but rather that it will have its own distinctive payoffs, interests, strategies, and agendas. That’s the key issue that many current political leaders (e.g. AI Czar David Sacks) do not seem to understand—if America builds an ASI, it won’t be ‘America’s ASI’; it will be the ASI’s ASI, so to speak.
ASI being unaligned doesn’t necessarily mean that it will kill all humans quickly—there are many, many possible outcomes other than immediate extinction that might be in the ASI’s interests.
The more seriously we model the possible divergences of ASI interests from the interests of current nation-states, the more persuasively we can make the argument that any nation building an ASI is not just flipping a coin between ‘geopolitical dominance forever’ and ‘human extinction forever’—rather, it’s introducing a whole new set of ASI interests that need to be taken into account.
Having given this a bit more thought, I think the starting point for something like this might be to generalize and assume the ASI simply has “different” interests (we don’t know what those interests are right now, both because we don’t know how ASI will be developed and because we haven’t solved alignment yet), and also to assume that the ASI has just enough power to make it interesting to model (not because that assumption is realistic, but because if the ASI were too weak or too strong relative to humans, the modeling exercise would be uninformative).
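Just to make that starting point concrete, here is a toy sketch in Python of what the framing could look like. Everything in it is a placeholder assumption on my part (the action labels, the payoff numbers, and the simplification that the ASI just plays a best response against randomly sampled preferences), so it only illustrates the “ASI as a third player with unknown interests” idea rather than being a serious model:

```python
# Toy sketch of the "ASI as a third player" framing. Purely illustrative:
# the ASI's preferences are unknown (so we sample them), it has enough power
# that its choice decides the outcome, and its action space is deliberately
# richer than "cooperate or exterminate". All labels and payoff numbers are
# placeholder assumptions.

import random
from collections import Counter

# The ASI's options once it exists.
ASI_ACTIONS = ["cooperate", "bargain", "sideline_humans", "eliminate_humans"]

# Placeholder payoffs to the nation that built the ASI, for each ASI action.
BUILDER_PAYOFF = {
    "cooperate": 3.0,          # the "geopolitical dominance" story
    "bargain": 1.0,            # ASI extracts concessions but coexists
    "sideline_humans": -2.0,   # humans survive but lose control
    "eliminate_humans": -10.0,
}
RESTRAIN_PAYOFF = 1.0          # status quo payoff if no ASI is built

def sample_asi_preferences(rng):
    # The "different interests" assumption: we don't know what the ASI wants,
    # so give each of its actions an independently drawn utility.
    return {a: rng.uniform(0.0, 1.0) for a in ASI_ACTIONS}

def simulate(trials=10_000, seed=0):
    rng = random.Random(seed)
    chosen = Counter()
    total_builder_payoff = 0.0
    for _ in range(trials):
        prefs = sample_asi_preferences(rng)
        asi_action = max(ASI_ACTIONS, key=prefs.get)  # ASI plays its own best response
        chosen[asi_action] += 1
        total_builder_payoff += BUILDER_PAYOFF[asi_action]
    print("How often the ASI picks each action:")
    for action, n in chosen.most_common():
        print(f"  {action:16s} {n / trials:.0%}")
    print(f"Expected payoff of building an ASI: {total_builder_payoff / trials:+.2f}")
    print(f"Payoff of restraint:                {RESTRAIN_PAYOFF:+.2f}")

if __name__ == "__main__":
    simulate()
```

The point of the exercise is just that once the ASI’s action space is richer than ‘cooperate or exterminate’ and its preferences are unknown, the expected value of building one is no longer a coin flip between dominance and extinction; it depends on the whole distribution of possible ASI interests.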
I don’t know where to go from here, however. Maybe Buterin’s def/acc world that I linked in my earlier comment would be a good scenario to start with.