My basic reason for thinking “early rogue [AGI] will inevitably succeed in defeating us” is:
I think human intelligence is crap. E.g.:
Human STEM ability is an accidental side effect: our brains underwent zero selection for the ability to do STEM in the EEA (environment of evolutionary adaptedness), and barely any selection to optimize this skill since the Scientific Revolution. We should expect that much more is possible when humans are deliberately optimizing brains to be good at STEM.
There are many glaring, embarrassingly obvious flaws in human reasoning.
One especially obvious example is “ability to think mathematically at all”. This seems in many respects like a reasoning ability that’s both relatively simple (it doesn’t require grappling with the complexity of the physical world) and relatively core. Yet the average human can’t do even machine-trivial tasks like ‘multiply two eight-digit numbers together in your head in under a second’. This gap on its own seems sufficient for AGI to blow humans out of the water.
(E.g., I expect there are innumerable scientific fields, subfields, technologies, etc. that are easy to find when you can hold a hundred complex mathematical structures in your 200 slots of working memory simultaneously and perceive connections between those structures. Many things that are hard to do across a network of separate brains, calculators, etc. are far easier to do within a single brain that can hold everything in view at once, understand the big picture, consciously think about many relationships at once, etc.)
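(To make the ‘multiply two eight-digit numbers’ point above concrete: the following is a minimal sketch, assuming nothing beyond stock Python on commodity hardware, and the specific numbers are arbitrary placeholders rather than anything from the argument itself.)

```python
import time

# Two arbitrary eight-digit numbers (illustrative placeholders).
a, b = 73_481_209, 58_204_317

start = time.perf_counter()
product = a * b  # the task no unaided human can do in their head in under a second
elapsed = time.perf_counter() - start

print(f"{a} x {b} = {product}")
# On typical hardware this prints a figure well under a microsecond.
print(f"computed in roughly {elapsed * 1e9:.0f} nanoseconds")
```

(The exact timing of a single operation is noisy, but the point survives any plausible margin of error: the machine is many orders of magnitude past the human ceiling on this narrow, ‘core’ reasoning task.)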
Example: AlphaGo Zero. There was a one-year gap between ‘the first time AI ever defeated a human professional’ and ‘the last time a human professional ever beat a SotA AI’. AlphaGo Zero in particular showed that 2500 years of human reasoning about Go was crap compared to what was pretty easy to do with 2017 hardware and techniques and ~72 hours of self-play. This isn’t a proof that human intelligence is similarly crap in physical-data-dependent STEM work, or in other formal settings, but it seems like a strong hint.
I’d guess we already have a hardware overhang for running AGI. (Consider, e.g., that we don’t need to recapitulate everything the human brain is doing in order to achieve AGI; indeed, I’d expect we only need to capture a small fraction of what the human brain is doing in order to produce superhuman STEM reasoning. I expect that AGI will be invented in the future (i.e., we don’t already have it), and that by then we’ll have more than enough compute to run it.)
I’d be curious to know (1) whether you disagree with these points, and (2) whether you disagree that these points are sufficient to predict that at least one early AGI system will be capable enough to defeat humans, if we don’t succeed on the alignment problem.
(I usually think of “early AGI systems” as ‘AGI systems built within five years of when humanity first starts building a system that could be deployed to do all the work human engineers can do in at least one hard science, if the developers were aiming at that goal’.)