This is to some extent captured in the “headroom” point, but when I examine my own reasons for being less worried about AI risk than the community, it’s primarily about computational constraints on both the hardware and algorithmic side.
Hardware side: We have strong reasons to believe that naive extrapolations of past (i.e. the last 50 years of) progress on compute will substantially overestimate future progress. In particular, we face Dennard scaling failing, Moore's Law certainly slowing down and likely failing, and Amdahl's Law making both of these worse by limiting the returns to parallelism (which, in a world with no Dennard or Moore, is effectively what increased economic resources get you). It's also easy to be implicitly very optimistic here while forecasting e.g. bio anchors timelines, because something that looks moderate like "Moore's Law, but 50% slower" is an exponentially more optimistic assumption than outcomes that are at least plausible, like "Moore's Law breaks completely". This also makes forecasts very sensitive to technical questions about computer architecture.
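To make both of those points concrete, here is a minimal sketch (the 95% parallel fraction and the 2-year baseline doubling time are purely illustrative assumptions, not forecasts):

```python
# Minimal sketch of the two claims above; all numbers are illustrative assumptions.

def amdahl_speedup(parallel_fraction, n_processors):
    """Amdahl's Law: overall speedup when only part of the workload parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

# Even with 95% of the work parallelizable, adding hardware saturates quickly:
for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} processors -> {amdahl_speedup(0.95, n):5.2f}x speedup (ceiling: 20x)")

# "Moore's Law, but 50% slower" vs. "Moore's Law breaks completely", over 30 years,
# assuming a 2-year baseline doubling time:
years = 30
half_speed_moore = 2 ** (years / 4)   # doubling every 4 years instead of every 2
broken_moore = 1.0                    # no further gains at all
print(half_speed_moore / broken_moore)  # ~181x gap between two "pessimistic-sounding" scenarios
```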
Algorithmic side (warning, this is more speculative): Many problems in computer science that can be stated precisely (giving us the benefit of quantifying "how much" progress there is per year) go through periods of rapid advances, where we can say quantitatively that we have moved e.g. 30% closer to the perfect matrix multiplication algorithm, or even found exponentially faster algorithms for breaking all encryption. But we also have reasons to believe that this progress "cannot" get us past certain points, whether that is the belief that no SAT solver will ever run in 1.999^n time, or specific barrier results covering all currently known ways to multiply matrices. Because of these results and beliefs, current progress (exciting as it is) causes experts in the field to update very little toward thinking we are close to fundamental breakthroughs. The absence of such barrier results in AI could well be due to the lack of precise formulations of the problem, rather than the actual absence of the barriers.
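To make the SAT example concrete (a toy calculation, with the instance size n chosen arbitrarily): shaving the base of the exponent from 2 to 1.999 looks tiny, but the gap it opens is itself exponential, which is why barrier beliefs are phrased in terms of the base.

```python
# Toy calculation for the SAT point: the barrier belief is about the base of the exponent.
n = 100_000                    # arbitrary illustrative instance size
ratio = (2.0 / 1.999) ** n     # how much a 1.999^n algorithm would beat brute-force 2^n
print(f"{ratio:.1e}")          # ~5e21 at this size -- a "small" change in the base is enormous
```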
Regarding parallelism and Amdahl's Law: I don't think this is a particular issue for AI progress. Biological brains are themselves extremely parallel, far more so than any processors we use today, and we still have general intelligence in brains but not in computers. If anything, the fact that computers are more serial than brains gives the former an advantage, since algorithms which run well in parallel can easily be "serialized". It is only the other direction which is potentially very inefficient, since some (many?) algorithms are very slow when run in parallel. In the case of neural networks, parallelism only has an advantage in terms of energy requirements, and AI does not seem substantially energy-bottlenecked, in contrast to biological organisms.
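A minimal sketch of the asymmetry this argument relies on (purely illustrative; the functions stand in for arbitrary per-element work and an arbitrary sequential update):

```python
# Sketch of the asymmetry above: parallel work serializes trivially, but a serial
# dependency chain has no obvious parallel decomposition.

def run_parallel_step_serially(inputs, f):
    # A "parallel" step (apply f to every element independently) can always be run
    # one element at a time on a serial machine; the cost is just the total work.
    return [f(x) for x in inputs]

def dependency_chain(x0, f, steps):
    # The other direction: each step needs the previous result, so extra processors
    # don't obviously help. This is exactly the serial fraction in Amdahl's Law.
    x = x0
    for _ in range(steps):
        x = f(x)
    return x

print(run_parallel_step_serially([1, 2, 3], lambda x: x * x))   # [1, 4, 9]
print(dependency_chain(1, lambda x: 2 * x + 1, 5))              # 63
```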
There are certainly algorithms for which it would be a severe issue (many RL approaches, for instance), but I'm not categorically saying that all intelligence (or all approaches to it) requires a certain minimum depth. It's just that unless you already have a strong prior that easy-to-parallelize approaches will get us to AGI, the existence of Amdahl's Law implies that Moore's Law slowing down is very important.
I think the brains example is somewhat misleading, for two reasons:
1: For biological anchors, people occasionally talk about brain emulation in a “simulate the interactions on a cellular level” sense (I’m not saying you do this), and this is practically the most serial task I could come up with.*
2: The brain is the inference stage of current intelligence, not the training stage. The way we got to brains was very serial.
*(For all we know, it could be possible to parallelize every single algorithm. CS theory is weird!)
Well, I don’t know how serial RL algorithms are, but even highly parallel animals can be interpreted as doing some sort of RL—“operant conditioning” is the term from psychology.
I agree that brain emulation is unlikely to happen. The analogy with the brain does not mean we have to emulate it very closely. Artificial neural networks are already highly successful without a close correspondence to actual neural networks.
Inference stage: aren't we obviously at both the inference and training stages at the same time, unlike current ML models? We can clearly learn things every day, and we only use our very parallel wetware. The way we got brains, through natural selection, is indeed a different matter, but I would not necessarily label it the training stage. Clearly some information is hardwired by the evolutionary process, but this is only a small fraction of what a human brain does in fact learn.
And okay, so NC≠P has not been proven, but it is clearly well-supported by the available evidence.
Certainly agree that we are learning right now (I hope :)).
“this is only a small fraction of what a human brain does in fact learn”
Disagree here. The description size of my brain (in the CS analogy, the size of the circuit) seems much, much larger than the total amount of information I have ever learned or ever will learn. (One argument: I have fewer bits of knowledge than Wikipedia, and describing my brain in the size of Wikipedia would be a huge advance in neuroscience.) Even worse, the description size of the circuit doesn't (unless P=NP) provide any nontrivial bound on the amount of computation we need to invest to find it.
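A back-of-envelope version of this comparison (every figure here is a rough order-of-magnitude assumption, used only to compare scales):

```python
# Back-of-envelope for the "description size" claim; all figures are rough
# order-of-magnitude assumptions.
synapses = 1e14                  # commonly cited order of magnitude for the human brain
bits_per_synapse = 5             # assume a few bits each for connectivity/strength
brain_description_bits = synapses * bits_per_synapse        # ~5e14 bits

wikipedia_text_gb = 20           # rough size of English Wikipedia's raw text
wikipedia_bits = wikipedia_text_gb * 8e9                     # ~1.6e11 bits

print(brain_description_bits / wikipedia_bits)   # ~3000x: the "circuit" dwarfs the encyclopedia
```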
Surely the information transferred from natural selection to the brain can be at most a fraction of the information in the genome, which is much less:
https://en.m.wikipedia.org/wiki/Human_genome#Information_content
The organism, including the brain, seems to be roughly a decompressed genome. And the environment can actually provide a lot of information through the senses. We can't memorize Wikipedia, but that may be because we are not optimized for storing plain text efficiently; we can still recall quite a bit of visual and auditory information.
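And the same kind of order-of-magnitude check for the genome bound mentioned above (again, rough figures only):

```python
# Order-of-magnitude check on the genome bound referenced above; rough figures only.
base_pairs = 3.1e9                 # approximate length of the human genome
genome_bits = base_pairs * 2       # 2 bits per base pair, an uncompressed upper bound
print(genome_bits / 8 / 1e6)       # ~775 MB -- far below the synapse-level description
                                   # size estimated a few comments up
```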