Thanks for the heads up about Hinton’s GLOM, Numenta’s Sparse Representations and Google’s Pathways. The latter in particular seems especially worrying, given Google’s resources.

I don’t think your arguments regarding Sharpness and Hardness are particularly reassuring, though. If an AGI can be made that runs at “real time”, what’s to stop someone throwing 10x, 100x, or 1,000x more compute at it to make it run correspondingly faster? Will they really have spent all the money they have at their disposal on the first prototype? And even if they did, others with more money could quickly up the ante. (And then, obviously, once the AGI is running much faster than a human, it can be applied to making itself smarter/faster still, etc. → FOOM)

And as for banning AGI—if only this were as easily done as said. How exactly would we go about banning AGI? Especially in such a way that narrow AI was allowed to continue (so, e.g., banning large GPU/TPU clusters wouldn’t be an option)?
Oh, my apologies for not linking to GLOM and such! Hinton’s work toward equivariance is particularly interesting because it allows an object to be recognized under myriad permutations and configurations; the recent use of his style of NN in “Neural Descriptor Fields” is promising—their robot learns to grasp from only ten examples, AND it can grasp even when the pose is well outside the training data—it generalizes!
I strongly suspect that we are already seeing the “FOOM,” entirely powered by narrow AI. AGI isn’t really a pre-requisite to self-improvement: Google used a narrow AI to lay out the floorplan of its AI-specialized chips. My hunch is that these narrow AIs will be plenty, yet progress will still lurch. Each new improvement is a harder-fought victory, for a diminishing return. Algorithms can’t become infinitely better, and AI has already made 1,000x leaps on various problem-sets… so I don’t expect many more such leaps ahead.
And, in regards to the ‘100x faster brain’… Suppose that an AGI we’d find useful starts at 100 trillion synapses, and, for simplicity, we’ll call that the ‘processing speed’ of running the brain in real time: “100 trillion synapse-seconds per second”. So, if we wanted a brain which was equally competent yet also ran 100x faster, then we would need 100x the computing power, running in parallel to speed up operations. That would be 100x more expensive, and if you posit that you had such power on hand today, then there must have been an earlier date when the amount of compute was only “100 trillion synapse-seconds per second”, enough for a real-time brain only. You can’t jump past that earlier date, when only a real-time brain was feasible. You wouldn’t wait until you had 100x the compute; your first AGI will be real-time, if not slower. GPT-3 and DALL-E are not ‘instantaneous’, with inference requiring many seconds, so I expect the same from the first AGI.
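To make that back-of-the-envelope concrete, here’s a tiny sketch of the arithmetic in Python; the synapse count and the linear cost scaling are just the illustrative assumptions above, not estimates:

```python
# Toy arithmetic for "a brain running N-times faster needs N-times the compute".
# All numbers are illustrative assumptions carried over from the prose above.

SYNAPSES = 100e12                 # assumed synapse count for a "useful" AGI
REALTIME_RATE = SYNAPSES          # synapse-seconds simulated per wall-clock second at 1x

def compute_needed(speedup: float) -> float:
    """Synapse-seconds per second required to run the brain at `speedup` times real time."""
    return REALTIME_RATE * speedup

for speedup in (1, 10, 100, 1000):
    print(f"{speedup:>5}x real time -> {compute_needed(speedup):.1e} synapse-seconds/sec "
          f"(~{speedup}x the cost of the real-time prototype)")
```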
More importantly, to that concept of “faster AGI is worth it”: an AGI that requires 100x more brain than a narrow AI (with both running at the same speed, whatever that is) would need to be more than 100x as valuable. I doubt that is what we will find; the AGI won’t have magical super-insight compared to narrow AI given the same total compute. And you could have an AGI that is 1/10th the size in order to run it 10x faster, but that’s unlikely to be useful anywhere except a smartphone. For any given quantity of compute, you’d prefer the half-second-response super-sized brain over the microsecond-response chimp brain. And at each of those quantities of compute, you’ll be able to run multiple narrow AIs at similar levels of performance to the singular AGI, so those narrow AIs are probably worth more.
As for banning AGI—I have no clue! Hardware isn’t really the problem; we’re still far from tech which could cheaply supply human-brain-scale AI to the nefarious individual. It would really be nations doing AGI, and I only see some stiff sanctions and inspections-type arrangement, a la nuclear, as ever really happening. Deployment would be difficult to verify, especially if narrow AI is good enough at most things that we can’t tell the two apart. Suppose nations formed a kind of “NATO-for-AGI”, publicly declaring that they would attack any AGI project? Only the existing winners would want to play on the side of reducing options for advancement like that, it seems. What do you think?
Thanks for these links. Incredible (and scary) progress!
cheaply supply human-brain-scale AI to the nefarious individual
I think we’re coming at this from different worldviews. I’m coming from much more of a Yudkowsky/Bostrom perspective, where the thing I worry about is misaligned superintelligent AGI: an existential risk by default. For a ban on AGI to be effective against this, it has to stop every single project from reaching AGI. There won’t be a stage that lasts any appreciable length of time (say, more than a few weeks) where there are AGIs that can be destroyed/stopped before reaching a point of no return.
then there must have been an earlier date when the amount of compute was only “100 trillion synapse-seconds per second”, enough for a real-time brain only.
Yes, but my point above was that the very first prototype isn’t going to use all the compute available. Available compute is a function of money spent, so there will very likely be room to significantly speed up the first prototype AGI as soon as it’s deployed. We may very well be at a point now where, if all the best algorithms were combined and $10T were spent on compute, we could have something approximating an AGI. But that’s unlikely to happen, as there are only maybe two entities that could spend that amount of money (the US government and the Chinese government), and they aren’t close to doing so. However, if it gets to only needing $100M in compute, then that would be within reach of many players, who could quickly ramp that up to $1B or $10B.
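As a rough sketch of that ramp, assuming achievable speed scales roughly linearly with compute spend and taking the $100M real-time prototype as the (purely illustrative) baseline:

```python
# Illustrative only: assume a ~real-time prototype AGI needs $100M of compute,
# and that achievable speed scales roughly linearly with money spent on compute.
BASELINE_COST = 100e6   # dollars for the assumed real-time prototype

for spend in (100e6, 1e9, 10e9, 10e12):
    speedup = spend / BASELINE_COST
    print(f"${spend:>16,.0f} of compute -> roughly {speedup:,.0f}x the prototype's speed")
```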
Each new improvement is a harder-fought victory, for a diminishing return.
Do you think this is true even in the limit of AGI designing AGI? Do you think human level is close to the maximum possible level of intelligence? When I mentioned “FOOM” I meant it in the classic Yudkowskian fast takeoff to superintelligence sense.
Oh, and my apologies for letting questions dangle—I think human intelligence is very limited, in the sense that it is built hyper-redundantly against injury, so its architecture must be much larger to achieve the same task. The latest upgrade to language models, DeepMind’s RETRO architecture, achieves the same performance as GPT-3 (which is to say, it can write convincing poetry) while using only 1/25th of the network. GPT-3 was only 1% of a human brain’s connectivity, so RETRO is literally 1/2,500th of a human brain, with human-level performance. I think narrow super-intelligences will dominate, being more efficient than AGI or us.
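Spelling out that fraction with the rough figures above:

```python
retro_vs_gpt3 = 1 / 25    # RETRO's size relative to GPT-3 (rough figure from above)
gpt3_vs_brain = 1 / 100   # GPT-3's connectivity relative to a human brain (rough figure)
print(f"RETRO vs. human brain: ~1/{1 / (retro_vs_gpt3 * gpt3_vs_brain):,.0f}")  # ~1/2,500
```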
In regards to overall algorithmic efficiency—in only five years we’ve seen multiple improvements to training and architecture, where what once took a million examples now needs ten, or even generalizes to unseen data. Meanwhile, Lottery Ticket pruning can make a network 10x smaller while boosting performance. There was even a supercomputer simulation which neural networks sped up 2 BILLION-fold… which is insane. I expect more jumps in the math ahead, but I don’t think we have many of those leaps left before our intelligence-algorithms are just “as good as it gets”. Do you see a FOOM-event capable of 10x, 100x, or larger gains left to be found? I would bet there is another 100x waiting, but it might become tricky and take successively more resources, asymptotically...
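(For the Lottery Ticket point, here is a minimal sketch of the iterative magnitude-pruning-and-rewinding loop, with a placeholder train() and a single NumPy weight matrix standing in for a real network; it only illustrates the mechanism, not the reported results.)

```python
import numpy as np

rng = np.random.default_rng(0)

def train(weights, mask):
    # Placeholder for real training: just nudges the unmasked weights randomly
    # so the sketch runs end to end.
    return weights + 0.01 * rng.standard_normal(weights.shape) * mask

# A single weight matrix stands in for the whole network.
init_weights = rng.standard_normal((256, 256))
mask = np.ones_like(init_weights)
prune_per_round = 0.2   # prune 20% of the surviving weights each round

for round_ in range(10):
    # "Rewind": every round restarts from the original initialization, masked.
    trained = train(init_weights * mask, mask)
    # Prune the smallest-magnitude surviving weights.
    threshold = np.quantile(np.abs(trained[mask == 1]), prune_per_round)
    mask = np.where((np.abs(trained) > threshold) & (mask == 1), 1.0, 0.0)
    print(f"round {round_ + 1}: {mask.mean():.1%} of weights remain")

# Ten rounds of 20% pruning leave ~0.8**10 ≈ 11% of the weights: roughly a 10x smaller network.
```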
I think AGI would easily be capable of FOOM-ing 100x+ across the board. And as for AGI being developed, it seems like we are getting ever closer with each new breakthrough in ML (and there doesn’t seem to be anything fundamentally required that can be said to be “decades away” with high conviction).
Thank you for diving into this with me :) We might be closer on the meat of the issues than it seems—I sit in the “alignment is exceptionally hard and worthy of consideration” camp, AND I see a nascent FOOM occurring already… yet, I point to narrow superintelligence as the likely mode for profit and success. It seems that narrow AI is already enough to improve itself. (And, the idea that this progress will be lumpy, with diminishing returns sometime soon, is merely my vague forecast informed by general trends of development.) AGI may be attainable at any point X, yet narrow superintelligences may be a better use of those same total resources.
More importantly, if narrow AI could do most of the things we want, that tilts my emphasis toward “try our best to stop AGI until we have a long, sober conversation, having seen what tasks are left undone by narrow AI.” This is all predicated on my assumption that “narrow AI can self-iterate and fulfill most tasks competently, at lower risk than AGI, and with fewer resources.” You could call me a “narrow-minded FOOMist”? :)
Maybe your view is closer to Eric Drexler’s CAIS? That would be a good outcome, but it doesn’t seem very likely to be a stable state to me, given that the narrow AIs could be used to speed AGI development. I don’t think the world will coordinate around the idea of narrow AIs / CAIS being enough, without a lot of effort around getting people to recognise the dangers of AGI.
Oh, thank you for showing me his work! As far as I can tell, yes, Comprehensive AI Services seems to be what we are entering already—with GPT-3’s Codex writing functioning code a decent percentage of the time, for example! And I agree that limiting AGI would be difficult; I only suppose that it wouldn’t hurt us to restrict AGI, assuming that narrow AI does most tasks well. If narrow AI is comparable in performance (given equal compute), then we wouldn’t be missing out on much, and a competitor who pursues AGI wouldn’t see an overwhelming advantage. Playing it safe might be safe. :)
And, that would be my argument nudging others to avoid AGI, more than a plea founded on the risks by themselves: “Look how good narrow AI is, already—we probably wouldn’t see significant increases in performance from AGI, while AGI would put everyone at risk.” If AGI seems ‘delicious’, then it is more likely to be sought. Yet, if narrow AI is darn-good, AGI becomes less tantalizing.
And, for the FOOMing you mentioned in the other thread of replies, one source of algorithmic efficiency is a conversion to a symbolic formalism that accurately models the system. Once the over-arching laws are found, modeling can become orders of magnitude faster. [e.g. the distribution of tree sizes in undisturbed forests always follows a power law; measuring a pair of points on that curve lets you accurately predict all the rest!]
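A quick sketch of that “two points pin down the whole curve” idea; the tree-size numbers here are made up purely for illustration:

```python
import math

# A power law y = c * x**k is a straight line in log-log space,
# so two measured points determine both the exponent k and the constant c.
# These two (size, count) points are made-up illustrative data.
(x1, y1), (x2, y2) = (10.0, 5000.0), (100.0, 50.0)

k = math.log(y2 / y1) / math.log(x2 / x1)   # exponent from the two points
c = y1 / x1**k                              # constant from either point

for x in (20.0, 50.0, 200.0):
    print(f"predicted count at size {x:g}: {c * x**k:.1f}")
```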
Yet such a reduction to symbolic form also seems to make the AI’s operations much more interpretable and verifiable, and the symbols we observe within its neurons could not be spoofed. So I see developments toward that DNN-to-symbolic bridge as key to BOTH a narrow-AI-powered FOOM AND the symbolic rigor and verification that would protect us. Narrow AI might be used to uncover the equations we would rather rely upon?