A 10^29 FLOP training run is an x-risk itself in terms of takeover risk from inner misalignment during training, fine-tuning or evals (lab leak risk).
I’m not convinced that AI lab leaks are significant sources of x-risks, but I can understand your frustration with my predictions if you disagree with that. In the post, I mentioned that I disagree with hard takeoff models of AI, which might explain our disagreement.
Regulation—hopefully this will slow things down, but for the sake of argument (i.e. in order to argue for regulation) it’s best to not incorporate it into this analysis.
I’m not sure about that. It seems like you might still want to factor the effects of regulation into your analysis even if you’re arguing for regulation. But even so, I’m not trying to make an argument for regulation in this post. I’m just trying to predict the future.
I think this result is very interesting, but my impression is that the result is generally in line with the slow progress we’ve seen over the last 10 years. I’ll be more impressed when I start seeing results that work well in a diverse array of settings, across multiple different types of tasks, with no guarantees about the environment, at speeds comparable to human workers, and with high reliability. I currently don’t expect results like that until the end of the 2020s.
Epoch says on your trends page: “4.2x/year growth rate in the training compute of milestone systems”. Do you expect this trend to break in less than 4 years?
I wouldn’t be very surprised if the 4.2x/year trend continued for another 4 years, although I expect it to slow down some time before 2030, especially for the largest training run. If it became obvious that the trend was not slowing down, I would very likely update towards shorter timelines. Indeed, I believe the trend from 2010-2015 was a bit faster than from 2015-2022, and from 2017-2023 the largest training run (according to Epoch data) went from ~3*10^23 FLOP to ~2*10^25 FLOP, which was an increase of only about 0.3 OOMs per year, compared with roughly 0.6 OOMs per year for a 4.2x/year trend.
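For anyone who wants to check the growth-rate arithmetic, here is a minimal sketch. The endpoints (~3*10^23 FLOP in 2017, ~2*10^25 FLOP in 2023) and the 4.2x/year figure are the ones quoted in this thread. The helper name `ooms_per_year` is just something made up for illustration, and treating 2017-2023 as exactly 6 years is an assumption.

```python
import math

def ooms_per_year(start_flop: float, end_flop: float, years: float) -> float:
    """Average growth in orders of magnitude (base 10) per year."""
    return math.log10(end_flop / start_flop) / years

# Largest training run, using the endpoints quoted above.
# Assumption: the 2017-2023 window is treated as exactly 6 years.
largest_run_rate = ooms_per_year(3e23, 2e25, 2023 - 2017)

# The 4.2x/year milestone-system trend, converted to OOMs per year.
milestone_rate = math.log10(4.2)

print(f"largest run:      {largest_run_rate:.2f} OOMs/year")
print(f"4.2x/year trend:  {milestone_rate:.2f} OOMs/year")
```

On these inputs the largest-run rate comes out at roughly half the milestone-system rate, which is the gap the paragraph above is pointing at.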
1 OOM per year (as per Epoch trends, inc. algorithmic improvement) is 10^29 in 2027.
But I was talking about physical FLOP in the comment above. My median for the amount of FLOP required to train TAI is closer to 10^32 FLOP using 2023 algorithms (a quantity I defined in a specific way in the post). Given this median, I agree there is a small chance (perhaps 15%) that TAI can be trained at 10^29 2023 FLOP, which means I think there’s a non-negligible chance that TAI could be trained in 2027. However, I expect the actual explosive growth part to happen at least a year later, for reasons I outlined above.
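To make the physical vs. 2023-equivalent FLOP distinction concrete, here is a rough sketch under stated assumptions. It takes the ~2*10^25 FLOP largest run in 2023 as the starting point and assumes physical compute keeps growing at the full 4.2x/year through 2027, then asks how much algorithmic improvement would be needed to close the gap to 10^29 2023 FLOP. The function name `project_flop` is hypothetical, and none of this is a forecast.

```python
import math

def project_flop(start_flop: float, factor_per_year: float, years: int) -> float:
    """Compound a yearly growth factor over a number of years."""
    return start_flop * factor_per_year ** years

# Assumptions: start from the ~2e25 FLOP largest run in 2023 and hold
# the 4.2x/year trend fixed for 4 years (2023 -> 2027).
physical_2027 = project_flop(2e25, 4.2, 4)

# Algorithmic-efficiency multiplier needed for that physical compute
# to count as 1e29 FLOP in 2023-algorithm terms.
target_2023_flop = 1e29
required_algo_gain = target_2023_flop / physical_2027

print(f"physical FLOP in 2027:     {physical_2027:.2e}")
print(f"algorithmic gain required: {required_algo_gain:.1f}x "
      f"({math.log10(required_algo_gain):.1f} OOMs)")
```

Under these assumptions, physical compute alone falls a bit more than one OOM short of 10^29 in 2027, so the "1 OOM per year" figure only works if algorithmic progress makes up the difference.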
I’ll be more impressed when I start seeing results that work well in a diverse array of settings, across multiple different types of tasks, with no guarantees about the environment, at speeds comparable to human workers, and with high reliability. I currently don’t expect results like that until the end of the 2020s.
Let’s see what happens with the Gemini release.