The flaws and bugs most relevant to an AI’s performance in its domain of focus will be weeded out, but flaws outside its relevant domain will not be. Bobby Fischer’s insane conspiracism had no effect on his chess-playing ability. The same principle applies to Stockfish. “Idiot savant” AIs are entirely plausible, even likely.
[...]
For these reasons, I expect AGI to be flawed, and especially flawed when doing things it was not originally meant to do, like conquer the entire planet.
We might actually expect an AGI to be trained to conquer the entire planet, or rather to be trained in many of the abilities needed to do so. For example, we may train it to be good at things like:
Strategic planning
Getting humans to do what it wants effectively
Controlling physical systems
Cybersecurity
Researching new, powerful technologies
Engineering
Running large organizations
Communicating with humans and other AIs
Put differently, I think “taking control over humans” and “running a multinational corporation” (which seems like the sort of thing people will want AIs to be able to do) have a lot more overlap than “playing chess” and “having true beliefs about subjects of conspiracies”. I’d be curious to hear your thoughts on which specific abilities you expect an AGI would need in order to take control over humanity, but that it’s unlikely to actually possess.