[A rebuttal of one argument for hard AI takeoff due to recursive self-improvement which I’m not sure anyone was ever making.
Wrote this as a comment in a Google doc, but thought I might want to link to it sometimes.]
I’m worried that naive appeals to self-improvement are sometimes due to a fallacious inference from current AIs / computer programs to advanced AIs. I.e. the implicit argument is something like:
1. It’s much easier to modify code than to do brain surgery.
2. Therefore, advanced AI (self-)improvement will be much easier than human (self-)improvement.
But my worry is that the actual reason why “modifying code” seems more feasible to us is the large complexity difference between current computer programs and human cognition. Indeed, at a physicalist level, it’s not at all clear that moving around the required matter on a hard disk or SSD or in RAM or whatever is easier than moving around the required matter in a brain. The difference instead is that we have developed intermediate levels of abstraction (from assembly to high-level programming languages) that massively facilitate the editing process—they bridge the gap between the hardware level and our native cognition in just the right way. But especially to someone with functionalist or otherwise substrate-neutral inclinations, it may seem likely that the key feature that enabled us to construct the intermediate-level abstractions was precisely the small complexity of the “target cognition” compared to our native cognition.
To evaluate its editability, we can compare AI code to ordinary code, and to the human brain, along various dimensions: storage size, understandability, copyability, etc. (I.e., let’s decompose “complexity” into “storage size” and “understandability” to ensure conceptual clarity.)
For size, AI code seems more similar to the human brain. AI models are already pretty big, so they may be around human-sized by the time a hypothetical advanced AI is created.
For understandability, I would expect it to be more like code than like a human brain. After all, it’s created with a known design and an objective that was built in intentionally. Even if the learned model has a complex architecture, we should be able to understand its relatively simpler training procedure and incentives.
And then, AI code will, like ordinary code—and unlike the human brain—be copyable and have a digital storage medium, both of which are potentially critical factors for editing.
Size (i.e. storage complexity) doesn’t seem like a very significant factor here.
I’d guess the editability of AI code would resemble the editability of code more than that of a human brain. But even if you don’t agree, I think this points at a better way to analyse the question.
Agree that looking at different dimensions is more fruitful.
I also agree that size isn’t important in itself, but it might correlate with understandability.
I may overall agree with AI code understandability being closer to code than the human brain. But I think you’re maybe a bit quick here: yes, we’ll have a known design and intentional objective on some level. But this level may be quite far removed from “live” cognition. E.g. we may know a lot about developmental psychology or the effects of genes and education, but not a lot about how to modify an adult human brain in order to make specific changes. The situation could be similar from an AI system’s perspective when trying to improve itself.
Copyability does seem like a key difference that’s unlikely to change as AI systems become more advanced. However, I’m not sure it points to rapid takeoffs as opposed to orthogonal properties. (Though it does if we’re interested in how quickly the total capacity of all AI systems grows, and assume hardware overhang plus sufficiently additive capabilities between systems.) To the extent it does, the mechanism seems relevantly different from recursive self-improvement—more like a “sudden population explosion”.
Well, I guess copyability would help with recursive self-improvement as follows: it allows running many experiments in parallel, which can be used to test the effects of marginal changes.
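To make this concrete, here is a minimal sketch of the idea (all names and the toy scoring function are illustrative assumptions, not anything from the discussion): because copies are cheap, many candidate modifications can be evaluated concurrently and the best one kept.

```python
# Hypothetical sketch: copyability lets you score many marginal
# changes to a system in parallel, instead of one at a time.
from concurrent.futures import ThreadPoolExecutor

def evaluate(change):
    """Toy stand-in for training/benchmarking a copy with `change` applied.
    Returns (change, score); the toy objective peaks at change == 0.3."""
    return change, -(change - 0.3) ** 2

# Marginal variations to test on independent copies.
candidates = [i / 10 for i in range(10)]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(evaluate, candidates))

# Keep the modification whose copy scored best.
best_change, best_score = max(results, key=lambda r: r[1])
print(best_change)  # -> 0.3
```

The analogous experiment on humans would require one subject (and years of feedback) per candidate change, which is the asymmetry the comment is pointing at.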
I would expect advanced AI systems to still be improvable in a way that humans are not. You might lose all ability to see inside the AI’s thinking process, but you could still make hyperparameter tweaks. You can make the analogue of hyperparameter tweaks to humans too, but unless you think AIs will take 20 years to train, tweaking an AI still seems easier than comparable human improvement.
Fair point. It seems that the central property of AI systems this argument rests on is their speed, or the time until you get feedback. I agree it seems likely that AI training time (followed by evaluating performance on withheld test data or similar) will be shorter in wall-clock time than feedback loops for humans (e.g. education reforms, genetic engineering, …).
However, some ways in which this could fail to enable rapid self-improvement:
The speed advantage could be offset by other differences, e.g. even less interpretable “thinking processes”.
Performance at certain tasks may be bottlenecked by feedback from slow real-world interactions. (If sim2real transfer doesn’t work well for some tasks.)