Another nice post! I think it massively overstates the case for Bio Anchors being too aggressive:
More broadly, Bio Anchors could be too aggressive due to its assumption that “computing power is the bottleneck”:
It assumes that if one could pay for all the computing power to do the brute-force “training” described above for the key tasks (e.g., automating scientific work), this would be enough to develop transformative AI.
But in fact, training an AI model doesn’t just require purchasing computing power. It requires hiring researchers, running experiments, and perhaps most importantly, finding a way to set up the “trial and error” process so that the AI can get a huge number of “tries” at the key task. It may turn out that doing so is prohibitively difficult.
The assumption of the Bio Anchors framework is that compute is the bottleneck, not that compute is all you need (but your phrasing gives the opposite impression).
In the Bio Anchors framework we pretty quickly (by 2025, I think?) get to a regime where people are willing to spend a billion-plus dollars on the compute for a single training run. It’s pretty darn plausible that by the time you are spending billions of dollars on compute, you’ll also be able to afford the associated data collection, researcher salaries, etc. for some transformative task (see the rough budget sketch below).
(There are many candidate transformative tasks; some would be directly transformative, and others would be indirectly transformative by rapidly leading to the creation of AIs that can do the directly transformative things. So for compute not to be the bottleneck, it has to be that we are data-collection-limited, or researcher-salary-limited, or whatever, for all of these tasks.)
(Also, I don’t think “transformative AI” should be our milestone anyway. AI-induced point of no return is, and that probably comes earlier.)
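To put rough numbers on the budget claim above: below is a back-of-envelope sketch in Python. Every figure in it (the compute budget, the hardware price per FLOP, the team size, salaries, and data-collection costs) is an illustrative assumption chosen to make the comparison concrete, not a number from the Bio Anchors report or from this thread.

```python
# Back-of-envelope sketch of the "compute dominates the budget" argument.
# All numbers are illustrative assumptions, not figures from Bio Anchors.

compute_budget_usd = 1e9           # assumed: a $1B training run
price_per_flop_usd = 1e-17         # assumed hardware price (~$10 per 1e18 FLOP)

flop_purchased = compute_budget_usd / price_per_flop_usd
print(f"Compute purchased: {flop_purchased:.0e} FLOP")  # 1e+26 under these assumptions

# Assumed non-compute inputs to the same project, per year:
num_researchers = 200              # assumed team size
cost_per_researcher_usd = 1e6      # assumed fully loaded annual cost
data_collection_usd = 5e7          # assumed data / environment / labeling budget

other_costs_usd = num_researchers * cost_per_researcher_usd + data_collection_usd
total_usd = compute_budget_usd + other_costs_usd
print(f"Non-compute costs: ${other_costs_usd:.1e} ({other_costs_usd / total_usd:.0%} of total)")
# Under these assumptions the non-compute inputs add roughly 20% on top of
# the compute bill, i.e. they are affordable for anyone already paying $1B for compute.
```

Under different assumptions the split changes, but the qualitative point in the comment above is just that the non-compute line items scale far more slowly than the compute line item.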
Thanks for this; I can see how that could be confusing language. I’ve changed “this would be enough to develop transformative AI” to “transformative AI would (likely) follow” and cut “But in fact” from the next bullet point. (I’ve only made these changes on the Cold Takes version; editing this version can cause bugs.)
I agree directionally with the points you make about “many transformative tasks” and “point of no return,” but I still think AI systems would have to be a great deal more capable than today’s (likely with a pretty high degree of generality, or at least far more sample-efficient learning than we see today) to get us to that point.
Update: I thought about it a bit more & asked this question & got some useful feedback, especially from tin482 and vladimir_nesov. I now am confused about what people mean when they say current AI systems are much less sample-efficient than humans. On some interpretations, GPT-3 is already about as sample-efficient as humans. My guess is it’s something like: “Sure, GPT-3 can see a name or fact once in its dataset and then remember it later & integrate it with the rest of its knowledge. But that’s because it’s part of the general skill/task of predicting text. For new skills/tasks, GPT-3 would need huge amounts of fine-tuning data to perform acceptably.”
The sample-efficient learning thing is an interesting crux. I tentatively agree with you that it seems hard for AIs as sample-inefficient as today’s to be dangerous. However… interrogating that is on my to-do list. In my “median future” story, for example, we have chatbots that are talking to millions of people every day and online-learning from those interactions. Maybe they can make up in quantity what they lack in quality, so to speak: maybe they can keep up with world affairs and react to recent developments by seeing millions of data points about them, rather than by seeing one data point and being sample-efficient. Idk.
Surely a big part of the resolution is that GPT-3 is sample-inefficient in total, but sample-efficient on the margin?
Excellent, thanks!
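To make the total-versus-marginal distinction concrete, here is a tiny illustrative calculation. The pretraining figure is the roughly 300 billion tokens reported for GPT-3; the per-fact figure is an assumed example, not a measurement.

```python
# Two senses of "sample efficiency" for a GPT-3-style model.
# The pretraining count is roughly what was reported for GPT-3;
# the per-fact count is an assumed illustration, not a measurement.

total_pretraining_tokens = 3e11    # ~300B tokens to acquire the general text-prediction skill
tokens_for_one_new_fact = 30       # assumed: a name or fact stated once in context

ratio = total_pretraining_tokens / tokens_for_one_new_fact
print(f"Acquiring the general skill took ~{ratio:.0e}x more data than absorbing one fact on the margin")
# Sample-inefficient in total (learning to predict text at all), but
# sample-efficient on the margin (picking up a fact it has seen once).
```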