Oh, I too think multipolar scenarios are plausible; I just tend to think unipolar scenarios are more plausible due to my opinions about takeoff speed and homogeneity.
In that case, the people in India started out at a disadvantage, whereas humans currently have the upper hand relative to AIs. But there have also been cases in history where the side that seemed to be weaker ended up gaining strength quickly and winning.
As far as I can tell the British were the side that seemed to be weaker initially.
Interesting. :) What do you mean by “homogeneity”?
Even in the case of a fast takeoff, don’t you think people would create multiple AGIs of roughly comparable ability at the same time? So wouldn’t that already create a bit of a multipolar situation, even if it all occurred in the DeepMind labs or something? Maybe if the AGIs all have roughly the same values it would still effectively be a unipolar situation.
I guess if you think it’s game over the moment that a more advanced AGI is turned on, then there might be only one such AGI. If the developers were training multiple random copies of the AGI in parallel in order to average the results across them or see how they differed, there would already be multiple slightly different AGIs. But I don’t know how these things are done. Maybe if the model was really expensive to train, the developers would only train one of them to start with.
If the AGIs are deployed to any degree (even on an experimental / beta testing basis), I would expect there to be multiple instances (though maybe they would just be clones of a single trained model and therefore would have roughly the same values).
Sorry, should have linked to it when I introduced the term.
Mostly, my claim is that AIs will probably cooperate well enough with each other that humans won’t be able to pit AIs against each other in ways that benefit humans enough to let humans retain control of the future. However, I’m also making the stronger claim that a unipolar takeoff is likely; this is because I think there’s a >50% chance (though <90%) that one AI or copy-clan of AIs will be sufficiently ahead of the others during the relevant period, or at least that the relevant set of AIs will have similar enough values and worldviews that serious cooperation failure isn’t on the table. I’m less confident in this stronger claim.
Thanks for the link. :) It’s very relevant to this discussion.
“AIs will probably cooperate well enough with each other”
Maybe, but what if trying to coordinate in that way is prohibited? It would be similar to how, if a group of people tries to organize a coup against a dictator, other people may rat them out.
“in ways that benefit humans enough to let humans retain control of the future”
I agree that these anti-coup measures alone are unlikely to let humans retain control forever, or even for very long. Dictatorships tend to experience coups or revolutions eventually.
“at least that the relevant set of AIs will have similar enough values and worldviews that serious cooperation failure isn’t on the table”
I see. :) I’d define “multipolar” as just meaning that there are different agents with nontrivially different values, rather than that a serious bargaining failure occurs (unless you’re thinking that the multipolar AIs would cooperate to unify into a homogeneous compromise agent, which would make the situation unipolar).
I think even tiny differences in training data and randomization can make nontrivial differences in the values of an agent. Most humans are almost clones of one another. We use the same algorithms and have pretty similar training data for determining our values. Yet the differences in values between people can be pretty significant.
I guess the distinction between unipolar and multipolar sort of depends on the level of abstraction at which something is viewed. For example, the USA is normally thought of as a single actor, but it’s composed of 330 million individual human agents, each with different values, which is a highly multipolar situation. Likewise, I suppose you could have lots of AIs with somewhat different values, but if they coordinated on an overarching governance system, that governance system itself could be considered unipolar.
Even a single person can be seen as sort of multipolar if you look at the different, sometimes conflicting emotions, intuitions, and reasoning within that person’s brain.
I was thinking the reason we care about the multipolar vs. unipolar distinction is that we are worried about conflict/cooperation-failure/etc. and trying to understand what kinds of scenarios might lead to it. So, I’m thinking we can define the distinction in terms of whether conflict/etc. is a significant possibility.
I agree that if we define it your way, multipolar takeoff is more likely than not.
Ok, cool. :) And as I noted, even if we define it my way, there’s ambiguity regarding whether a collection of agents should count as one entity or many. We’d be more inclined to say that there are many entities in cases where conflict between them is a significant possibility, which gets us back to your definition.