Regarding parallelism and Amdahl’s Law: I don’t think this is a particular issue for AI progress. Biological brains are themselves extremely parallel, far more so than any processors we use today, and yet we have general intelligence in brains but not in computers. If anything, the fact that computers are more serial than brains gives computers an advantage, since algorithms which run well in parallel can easily be “serialized”. It is only the other direction that is potentially very inefficient, since some (many?) algorithms are very slow in parallel. In the case of neural networks, parallelism only has an advantage in terms of energy requirements. But AI does not seem to be substantially energy-bottlenecked, in contrast to biological organisms.
There are certainly algorithms where it would be a severe issue (many RL approaches, for instance), but I’m not categorically saying that all intelligence (or approaches to such) requires a certain minimum depth. It’s just that unless you already have a strong prior that easy-to-parallelize approaches will get us to AGI, the existence of Amdahl’s law implies that a slowdown of Moore’s law is very important.
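To make the Amdahl’s law point concrete, here is a minimal sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that parallelizes and n is the number of workers. The function names and the 95% figure are illustrative assumptions, not numbers for any real AI workload.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a fixed-size workload."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction: float) -> float:
    """Upper bound on the speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    p = 0.95  # hypothetical: 95% of the work parallelizes
    for n in (1, 10, 100, 1000):
        print(f"{n:>5} workers: {amdahl_speedup(p, n):.2f}x")
    print(f"limit: {speedup_limit(p):.2f}x")
```

Under that assumed 95% figure the speedup can never exceed about 20x, however many workers are added. Past that point only faster serial execution helps, which is why the serial part matters so much once single-thread speed stops improving.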
I think the brains example is somewhat misleading, for two reasons:
1: For biological anchors, people occasionally talk about brain emulation in a “simulate the interactions on a cellular level” sense (I’m not saying you do this), and this is practically the most serial task I could come up with.*
2: The brain is the inference stage of current intelligence, not the training stage. The way we got to brains was very serial.
*(For all we know, it could be possible to parallelize every single algorithm. CS theory is weird!)
Well, I don’t know how serial RL algorithms are, but even highly parallel animals can be interpreted as doing some sort of RL—“operant conditioning” is the term from psychology.
I agree that brain emulation is unlikely to happen. The analogy with the brain does not mean we have to emulate it very closely. Artificial neural networks are already highly successful without a close correspondence to actual neural networks.
Inference stage: aren’t we obviously at both the inference and the training stage at the same time, unlike current ML models? We can clearly learn things every day, and we only use our very parallel wetware. The way we got brains, through natural selection, is indeed a different matter, but I would not necessarily label this the training stage. Clearly some information is hardwired from the evolutionary process, but this is only a small fraction of what a human brain does in fact learn.
And okay, so NC≠P has not been proven, but it is clearly well-supported by the available evidence.
Certainly agree that we are learning right now (I hope :)).
“this is only a small fraction of what a human brain does in fact learn”
Disagree here. The description size of my brain (in CS analogy, the size of the circuit) seems much, much larger than the total amount of information I have ever learned or ever will learn (one argument: I have fewer bits of knowledge than Wikipedia, yet describing my brain in the size of Wikipedia would be a huge advance in neuroscience). Even worse, the description size of the circuit doesn’t (unless P=NP) provide any nontrivial bound on the amount of computation we need to invest to find it.
Surely the information transferred from natural selection to the brain must be a fraction of the information in the genome, which is much less:
https://en.m.wikipedia.org/wiki/Human_genome#Information_content
The organism, including the brain, seems to be roughly a decompressed genome. And actually the environment can provide a lot of information through the senses. We can’t memorize Wikipedia, but that may be because we are not optimized for storing plain text efficiently. We can still recall quite a bit of visual and auditory information.
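For a rough sense of scale on the genome bound, here is a back-of-the-envelope sketch. It assumes a haploid genome of roughly 3.1 billion base pairs and 2 bits per base (four possible nucleotides), with no compression; both figures and the function name are assumptions for illustration, so treat the result as an order-of-magnitude upper bound rather than a measurement.

```python
def genome_size_megabytes(base_pairs: float, bits_per_base: int = 2) -> float:
    """Uncompressed size of a genome in megabytes (1 MB = 10**6 bytes)."""
    total_bits = base_pairs * bits_per_base
    return total_bits / 8 / 1e6


if __name__ == "__main__":
    ASSUMED_BASE_PAIRS = 3.1e9  # assumption: approximate haploid human genome
    size_mb = genome_size_megabytes(ASSUMED_BASE_PAIRS)
    print(f"about {size_mb:.0f} MB uncompressed")
```

Under these assumptions the whole genome comes to well under a gigabyte, and whatever part of it specifies the brain can only be smaller.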