Max, on LessWrong you estimated that a single GPU (I think you named a 4070) could host an AI with human-level reasoning.
Would your views on AI escape be different if, just for the sake of argument, you were only concerned with ASI-level reasoning? As in, a machine that is both general across most human capabilities and also significantly better, where “significant” means the machine can generate action sequences with at least 10 percent more expected value on most human tasks than the best living human. (I am trying to narrow in on a mathematical definition of ASI; one possible formalization is sketched after the next paragraph.)
And suppose the minimum hardware to host an ASI were 10,000 H100s, for the most optimal model that can be developed in 99.9 percent of future timelines. (The assumption behind the 10,000-H100 figure is that doing “10 percent better” than the best humans requires a very broad policy search; the “99.9 percent of timelines” clause is there because searching for a more efficient algorithm is an NP-complete problem, and, as in cryptography, there are rare timelines where you guess the 1024-bit private key on the first try.)
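To make the definition concrete, here is one possible formalization (just a sketch; the notation is mine and not standard). Let $\mathcal{T}$ be a distribution over human tasks, let $V_t(\pi)$ be the expected value achieved by policy $\pi$ on task $t$, and let $\pi^*_t$ be the policy of the best living human on $t$. Then a machine $M$ counts as an ASI if

$$\Pr_{t \sim \mathcal{T}}\left[\, V_t(\pi_M) \geq 1.1 \cdot V_t(\pi^*_t) \,\right] > 0.5,$$

where the probability over tasks captures “most human tasks” and the factor 1.1 captures “at least 10 percent more expected value.”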
Just for the sake of argument, wouldn’t the “escape landscape” then be a worthless desert of inhospitable computers, separated by network links too slow to matter, so that restricting an ASI would be feasible? Like a prison on the Moon.
Note that the next argument you will bring up, that a botnet of 1 million consumer GPUs could be the same as 10,000 H100s, is false. Yes, the raw compute is there; no, it won’t work. The reason is that each GPU just sits idle waiting on tensors to be transferred over network links.
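To make the idle-GPU point concrete, here is a rough back-of-envelope sketch. Every number in it (parameters per shard, usable FLOPs, activation width, residential bandwidth and latency) is an illustrative assumption rather than a measurement, and it ignores pipelining and compression tricks:

```python
# Back-of-envelope: why a botnet of consumer GPUs stalls on the network.
# Every number here is an illustrative assumption, not a measurement.

params_per_shard = 1e9                    # assume ~1B parameters of the big model live on each consumer GPU
flops_per_token  = 2 * params_per_shard   # rough rule of thumb: ~2 FLOPs per parameter per token
gpu_flops        = 20e12                  # ~20 TFLOP/s of usable FP16 throughput on a consumer card (assumed)

hidden_dim       = 16_384                 # assumed width of the activation handed to the next shard
activation_bytes = hidden_dim * 2         # fp16: 2 bytes per value

wan_bandwidth = 100e6 / 8                 # 100 Mbit/s residential uplink, in bytes/s (assumed)
wan_latency   = 0.04                      # ~40 ms one-way latency between random residential peers (assumed)

compute_time = flops_per_token / gpu_flops                       # time the GPU actually spends computing
network_time = wan_latency + activation_bytes / wan_bandwidth    # time spent shipping activations per hop

print(f"compute per token per shard: {compute_time * 1e3:.3f} ms")
print(f"network per hop:             {network_time * 1e3:.1f} ms")
print(f"GPU utilization:             {compute_time / (compute_time + network_time):.2%}")
# With these assumptions each card is busy well under 1% of the time,
# which is the "sits idle waiting on tensors" claim above.
```

Pipelining many requests at once can hide some of the latency, but under these assumptions the starting utilization gap is more than two orders of magnitude.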
But I am not asking you to accept either proposition as factual, just to reason within the counterfactual. Wouldn’t this change everything?
Note also that the above is based on what we currently know. (10k H100s may be a low estimate; a true ASI may actually need more OOMs of compute over an AGI than that. It’s difficult to do much better than the best existing methods; see the Netflix Prize for an early example of this, or the margins on Kaggle challenges.)
We could be wrong, but it bothers me that the whole argument for ASI/AGI ruin essentially rests on optimizations that may not be possible.
Sure, escape in that counterfactual would be a lot harder.
But note that the minimum hardware needed to run a human-level intelligence is well known: in humans, it fits in a space of about 1000 cubic centimeters and draws something like 10 to 20 W at runtime. And it would be pretty surprising if getting an extra 10% performance boost took OOMs more energy or space, or if the carbon → silicon penalty were extremely large, even if H100s specifically, and the current ML algorithms that run on them, aren’t as efficient as the human brain and human cognition.
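To put a rough number on how large that penalty would be under the counterfactual premise (assuming a nominal ~700 W per H100 and the commonly cited ~20 W for the brain; both figures are approximations):

```python
# Rough scale of the carbon -> silicon penalty implied by the 10,000-H100 premise.
h100_count  = 10_000
h100_watts  = 700      # nominal board power of one H100 (approximate)
brain_watts = 20       # commonly cited human brain power draw (approximate)

penalty = (h100_count * h100_watts) / brain_watts
print(f"~{penalty:,.0f}x the brain's runtime power budget")   # ~350,000x
```

So the counterfactual premise builds in a runtime energy penalty of five to six orders of magnitude relative to the brain.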
(Of course, the training process for developing humans is a lot more expensive than their runtime energy and compute requirements, but that’s an argument for human-level AGI not being feasible to create at all, rather than for it being expensive to run once it already exists.)
I agree, and I think you agree, that we could eventually build hardware that efficient, and that theoretically it could be sold openly and distributed everywhere with insecure software.
But that’s a long time away: about 30 years if Moore’s law continues (rough doubling arithmetic below). And it may not continue; there may be a gap between now, when we can keep stacking silicon with slowing gains (stacking silicon runs below the Moore’s law trend because it’s expensive), and the arrival of some form of 3D chip fabrication.
There could be a period in which no true 3D fabrication method is commercially available and improvement in chip costs is slow.
(A true 3D method would be something like building cubical subunits that can be stacked and soldered into place through convergent assembly; you could do this with nanotechnology. Every method we have now ultimately comes down to projecting light through a mask for 2D manufacturing.)
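On the “about 30 years” figure above, here is the rough doubling arithmetic (assuming the counterfactual’s 10,000-H100 requirement and a ~2-year doubling cadence; both are assumptions, and the cadence may well not hold, as just argued):

```python
import math

# How many price-performance doublings until a 10,000-H100 workload fits on one consumer card?
shrink_needed      = 10_000   # counterfactual premise: 10,000 H100s -> 1 chip
years_per_doubling = 2.0      # rough Moore's-law-style cadence (assumed)

doublings = math.log2(shrink_needed)   # ~13.3 doublings
print(f"{doublings:.1f} doublings, about {doublings * years_per_doubling:.0f} years")   # ~27 years
```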
I think this means we should build AGI and ASI, but centralize the hardware hosting them in known locations, with on-file plans for all the power sources, network links, and so on. Research labs dealing with models above a certain scale need to use air gaps and hardware limits to make escape more difficult. That’s how to do it.
And we can’t live in fear that a model might optimize itself to be 10,000 times as efficient or more when we have no evidence this is possible. Otherwise, how could you do anything? How did we know our prior small-scale AI experiments weren’t going to go out of control? We didn’t actually “know” this; it just seemed unlikely, because none of this shit worked until a certain level of scale was reached.
As for the above proposal (centralization, hardware limiters): even in an era where AI does occasionally escape, as long as most hardware remains under human control it’s still not doomsday. If an escaped model is only marginally more efficient than the “tame” models humans have, and the human-controlled models have a vast advantage in compute and physical resource access, then this is a stable situation. Escaped models act up, they get hunted down, and most exist in a sort of grey market of fugitive models offering services.