Yeah, I agree—I’d rather have a blackbird than AlphaZero. For one thing, it’d make our current level of progress in AI much clearer. But on your second and third points, I think of ML training as somewhat analogous to evolution, and the trained agent as analogous to an animal. Both the training process and evolution are basically blind but goal-directed processes with a ton of iterations (I’m bullish on evolution’s ability to transmit information through generations) that result in well-adapted agents.
If that’s the right analogy, then we can compare AlphaZero’s superhuman board game abilities with a blackbird’s subhuman-but-general performance. If we’re not meaningfully compute-constrained, then the question is: what kinds of problems will we soon be able to train AI systems to solve? AI research might be one such problem. There are a lot of different training techniques out in the wild, and many of the more impressive recent developments have come from combining multiple techniques in novel ways (with lots of compute). That strikes me as the kind of search space that an AI system might be able to explore much faster than human teams.
I remember looking into communication speed, but unfortunately I can’t find the sources I found last time! As I recall, when I checked, the communication figures weren’t meaningfully different from the processing speed figures.
Edit: found it! AI Impacts on TEPS (traversed edges per second): https://aiimpacts.org/brain-performance-in-teps/
Yeah, basically computers are closer to a human brain in communication speed than they are in processing speed. That makes intuitive sense: they can transfer information at close to the speed of light, while brains are stuck sending chemical signals in many (all?) cases.
2nd edit: On your earlier point about training time vs. total engineering time... “Most honest” isn’t really the issue. It depends on what you care about. Training time illustrates that, once an AI system is built, its capabilities can quickly surpass human-level performance. Then the AI will keep improving, leaving us in the dust (although the applicability of current algorithms to more complex tasks is unclear). Total engineering time would show that these are massive projects which take time to develop, which is also true.
Thanks! Yeah, it might have been a bad idea to take general chip cost decreases as super relevant for specialized AI chips’ cost efficiency. I read Carey’s estimates for cost decreases as applying to AI chips, when upon closer inspection he was referring to general chips. Probably we’ll see faster gains in AI chips’ cost efficiency for a while as the low-hanging fruit is picked.
My point was something like, “Development costs to make AI chips will largely be borne by leading AI companies. If this is right, then they won’t be able to take advantage of cheaper, better chips in the same way that consumers have with Moore’s Law—i.e. passively benefiting from the results without investing their own capital into R&D”. I didn’t mean for it to sound like I was focusing on chip production capacity—I think cost efficiency is the key metric.
But I don’t have a sense of how much money will be spent on development costs for a certain increase in chips’ cost efficiency. It might be that early on, unit costs swamp development costs.
Frankly, I’m starting to think that my ideas about development costs may not be accurate. It looks like traditional chip companies are entering the AI chip business in force, although they could be 10% of the market or 90% for all I know. That could change things from the perspective of how much compute leading AI firms could afford to buy. This coupled with the aforementioned difference in cost efficiency rates between general chips and AI chips means I may have underestimated future increases in the cost efficiency of AI chips.
I claim that this is not how I think about AI capabilities, and it is not how many AI researchers think about AI capabilities. For a particularly extreme example, the Go-Explore paper out of Uber had a nominally very impressive result on Montezuma’s Revenge, but much of the AI community didn’t find it compelling because of the assumptions that their algorithm relied on.
Sorry, I meant the results in light of which methods were used, implications for other research, etc. The sentence would better read, “My understanding (and I think everyone else’s) of AI capabilities is largely shaped by how impressive major papers seem.”
Tbc, I definitely did not intend for that to be an actual metric.
Yeah, totally got that—I just think that making a relevant metric would be hard, and we’d have to know a lot that we don’t know now, including whether current ML techniques can ever lead to AGI.
I would say that I have a set of intuitions and impressions that function as a very weak prediction of what AI will look like in the future, along the lines of that sort of metric. I trust timelines based on extrapolation of progress using these intuitions more than timelines based solely on compute.
Interesting. Yeah, I don’t much trust my own intuitions on our current progress. I’d love to have a better understanding of how to evaluate the implications of new developments, but I really can’t do much better than, “GPT-2 impressed me a lot more than AlphaStar.” And just to be 100% clear—I tend to think that the necessary amount of compute is somewhere in the 18-to-300-year range. After we reach it, I’m stuck using my intuition to guess when we’ll have the right algorithms to create AGI.
Thanks for the comment! In order:
I think that its performance at test time is one of the more relevant measures—I take grandmasters’ considering fewer moves during a game as evidence that they’ve learned something more of the ‘essence’ of chess than AlphaZero, and I think AlphaZero’s learning was similarly superior to Stockfish’s relatively blind approach. Training time is also an important measure—but that’s why Carey brings up the 300-year AlphaGo Zero milestone.
Indeed we are. And it’s not clear to me that we’re much better optimized for general cognition. We’re extremely bad at doing math that pocket calculators have no problem with, yet it took us a while to build good chess- and Go-playing AIs. I worry we have very little idea how hard different cognitive tasks will be for something with a brain-equivalent amount of compute.
I’m focusing on compute partly because it’s the easiest to measure. My understanding (and I think everyone else’s) of AI capabilities is largely shaped by how impressive the results of major papers intuitively seem. And when AI can use something like the amount of compute a human brain has, we should eventually get a similar level of capability, so I think compute is a good yardstick.
I’m not sure I fully understand how the metric would work. For the Atari example, it seems clear to me that we could easily reach it without making a generalizable AI system, or vice versa. I’m not sure what metric could be appropriate—I think we’d have to know a lot more about intelligence. And I don’t know if we’ll need a completely different computing paradigm from ML to learn in a more general way. There might not be a relevant capability level for ML systems that would correspond to human-level AI.
But let’s say that we could come up with a relevant metric. Then I’d agree with Garfinkel, as long as people in the community had known roughly the current state of AI in relation to it and the rate of advance toward it before the release of “AI and Compute”.