I am skeptical of the FOOM idea too, but I don’t think most of this post argues effectively against it. Some responses here:
1.0/1.1 - This seems nonobvious to me. Do you have examples of these superlinear decays? If true, this would be the strongest argument in the entire piece, and I’d love to see this specific point fleshed out.
2 - An Elo rating of 0 is not the floor of capability. Elo is a relative ranking of competitors, not an objective measure of chess skill. The scale doesn’t start at “you don’t know how to play” at 0; instead, 1200 is roughly defined as the rating of an average chess player who has played enough rated games to be stably ranked, and ratings go up and down from there based on wins and losses.
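To make the relative nature of the scale concrete, here is a minimal sketch of the standard Elo update rule (the K-factor of 32 and the 1200 starting point are conventional illustrative values, not something from the post). Ratings only ever move relative to an opponent’s rating, so 0 has no special meaning as “cannot play at all”:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the Elo model."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(rating_a: float, rating_b: float, score_a: float, k: float = 32):
    """Return new ratings after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss)."""
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1 - score_a) - (1 - expected_a))
    return new_a, new_b

# Two players entering the pool at the conventional 1200: the winner gains
# exactly what the loser drops, which is why the scale has no absolute zero.
print(update(1200, 1200, 1))  # (1216.0, 1184.0)
```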
You also have to compare potential, not just actual skill. A beginner might be close to a chimpanzee in chess ability, but give me a median human who has never played chess and a chimpanzee, give me a few hours or a few days to train them both, and I predict the human’s capabilities will jump very swiftly relative to the chimp’s. Similarly, I bet there was a point in its training process where I was a superior Go player to AlphaGo. That didn’t last long.
As for the general relativity example: I think you’re looking at the wrong measurement here. Large language models often have “emergent” properties where they suddenly become able to do something (e.g., 3-digit multiplication) that they couldn’t do before, but if you measured their general maths ability (say, via a 2-digit addition task), you would find it was increasing with scale long before the model reached the level of mathematics required to do 3-digit multiplication correctly even occasionally. “Ability to invent general relativity” is impossible for both a rock and the median human, and yet the median human would still outperform a rock on a cognitive test of scientific acumen. If two agents are equally incapable (or capable) of performing a specific task, that does not make them equal in the underlying domain of that task.
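As a toy sketch of that measurement point (the numbers here are made up purely for illustration, not taken from any real model), a smoothly improving latent skill shows up as gradual gains on a graded task but as a sudden jump on a pass/fail task:

```python
# Hypothetical numbers only: a latent "maths skill" that grows smoothly with
# scale looks flat-then-sudden if you only score an all-or-nothing task.
scales = [1, 2, 4, 8, 16, 32]                 # arbitrary model sizes
latent_skill = [0.1 * s for s in scales]      # smooth underlying improvement

for scale, skill in zip(scales, latent_skill):
    two_digit_score = min(1.0, skill)                # graded task: improves gradually
    three_digit_score = 1.0 if skill > 2.0 else 0.0  # thresholded task: "emerges" abruptly
    print(f"scale={scale:>2}  2-digit={two_digit_score:.2f}  3-digit={three_digit_score:.0f}")
```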
In addition, both chess and Go are bounded domains. I wouldn’t be surprised if Magnus Carlsen were closer in Elo to an unbounded superintelligence than a median human is, because once you’ve solved chess you’ve hit an Elo ceiling, and adding more intelligence doesn’t help. It is possible that an Elo of 4500+ is legitimately impossible. This probably does not generalise to real-world domains.
3 - I don’t see how this acts as an argument against recursive self-improvement. It doesn’t matter if an AGI would lose to a specialised chess bot of the same level; that does not stop it from FOOMing. If recursive self-improvement is possible, the AI just has to reach a sufficient capability level in each relevant domain, and it can get there eventually even if a specialised machine could reach any one of those thresholds faster.
4 - This is not an argument against FOOM; it is just a description of what the world might look like if FOOM does not happen and is not possible.
5 - I agree with you that superintelligence requires a very high level of strongly superhuman cognitive abilities. In a world where FOOM-level improvement is possible, these levels will be reached anyway. This is, again, not an argument against FOOM; it reads more like “if FOOM is not possible, we won’t get a singleton through ordinary capabilities improvements”.
In short—point 1 is a reasonable crux and I agree that if 1.0 and 1.1 were true, this would mean a recursively self-improving AI would quickly hit a ceiling of diminishing returns. I’d like to see a more detailed argument for why this would be expected.
Point 2 appears to rest on several misconceptions that are fairly fundamental to what capabilities are and how they scale.
Points 3-5 do not, to me, seem to argue against the possibility of FOOM at all. They are interesting points, in that they argue against the concept of a singleton in the event that FOOM is not possible, but they don’t present an argument in favor of that being the case.
Thanks for the detailed reply.
I’ll try and address these objections later.