It seems hard (but not impossible) to build something that’s better than humans at designing AI systems and has access to its own software and new hardware, yet does not self-improve rapidly. Scenarios where rapid self-improvement doesn’t occur include (a) scenarios where the top AI systems are strongly hardware-limited; (b) scenarios where all operators of all AI systems successfully remove all incentives to self-improve; or (c) scenarios where the first AI system is strong enough to prevent all intelligence explosions but is itself constructed such that it does not self-improve.
Couldn’t it be that the returns on intelligence tend to not be very high for a self-improving agent near the human level? Like, it could be that modifying yourself when you’re human-level intelligent isn’t very useful, but that things really take off at 20x the human level. That would seem to suggest a possible (d): the first superhuman AI system self-improves for some time and then peters out. More broadly, the suggestion is that since the machine is presumably not yet superintelligent, there might be relevant constraints other than incentives and hardware. Plausible or not?
Seems unlikely to me, given my experience as an agent at roughly the human level of intelligence. If you gave me a human-readable version of my source code, the ability to use money to speed up my cognition, and the ability to spawn many copies of myself (both to parallelize effort and to perform experiments with), then I think I’d be “superintelligent” pretty quickly. (In order for the self-improvement landscape to be shallow around the human level, you’d need systems to be very hardware-limited, and hardware currently doesn’t look like the bottleneck.)
(I’m also not convinced it’s meaningful to talk about “the human level” except in a very broad sense of “having that super-powerful domain generality that humans seem to possess”, so I’m fairly uncomfortable with terminology such as “20x the human level.”)
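One way to make the “returns on intelligence” question concrete is a toy growth model: suppose each round of self-improvement adds capability proportional to current capability raised to some returns exponent. This is only a sketch under that (strong, hypothetical) assumption; the function and parameters below (`self_improvement_trajectory`, `alpha`, `c`, the `cap` threshold) are illustrative choices, not claims about real systems.

```python
# Toy model of recursive self-improvement (illustrative assumptions only):
# each round adds capability proportional to capability ** alpha, where
# alpha encodes how strongly returns to intelligence compound.

def self_improvement_trajectory(alpha, c=0.1, start=1.0, rounds=60, cap=1e12):
    """Simulate `rounds` rounds of self-improvement; stop early if capability
    passes `cap`, an arbitrary stand-in for 'has clearly exploded'."""
    capability = start
    trajectory = [capability]
    for _ in range(rounds):
        capability += c * capability ** alpha  # assumed returns curve
        trajectory.append(capability)
        if capability > cap:
            break
    return trajectory

if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        traj = self_improvement_trajectory(alpha)
        print(f"alpha={alpha}: {len(traj) - 1} rounds simulated, "
              f"final capability ~{traj[-1]:.3g}")
    # alpha < 1: growth continues but the relative gain shrinks each round
    #            (one crude reading of scenario (d): improvement "peters out"
    #            rather than compounding into an explosion).
    # alpha = 1: steady exponential growth.
    # alpha > 1: faster-than-exponential growth -- the classic
    #            "intelligence explosion" picture.
```

On this toy picture, scenario (d) corresponds to a returns exponent below 1: self-improvement continues but the relative gains shrink rather than compounding. The answer above amounts to a claim that the human-level region does not sit in that regime, given access to source code, cognitive speedups, and copying.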