I really appreciated the extension of “AI and Compute”. Do you have a sense of the extent to which the difference between your doubling-time estimate and that of “AI and Compute” stems from differences in selection criteria vs. new data since its publication in 2018? Have you done an analysis of what the trend looks like if you only include data points that fulfil their inclusion criteria?
For reference, it seems like their criteria are “… results that are relatively well known, used a lot of compute for their time, and gave enough information to estimate the compute used,” whereas yours are “important publication within the field of AI OR lots of citations OR performance record on a common benchmark”. “… used a lot of compute for their time” would probably do a whole lot of work to select data points that show a faster doubling time.
I have been wondering the same. However, given that OpenAI’s “AI and Compute” inclusion criteria are also a bit vague, I’m having a hard time telling which of our data points would fulfill their criteria.
In general, I would describe our dataset as matching the same criteria because:
- “relatively well known” corresponds to our “lots of citations”.
- “used a lot of compute for their time” corresponds to our dataset if we exclude the outliers from efficient ML models. (There’s a recent trend of efficient ML models that achieve similar performance using less compute for inference and training; those models are then used for, e.g., deployment on embedded systems or smartphones.)
- “gave enough information to estimate the compute”: we also rely on estimates from us or the community based on the information available in the paper (roughly along the lines of the sketch below). For the source of each estimate, see the note on the corresponding cell in our dataset.
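For intuition, here is a minimal sketch of the two estimation routes described in the appendix of “AI and Compute”: counting operations in the architecture, and counting GPU-time at an assumed utilization. The function names and all concrete numbers below are hypothetical placeholders, not our actual estimation code.

```python
# Sketch of the two training-compute estimation methods described in
# the appendix of OpenAI's "AI and Compute". All numbers are hypothetical.

def compute_from_architecture(flops_per_forward_pass: float,
                              num_training_examples: float,
                              num_epochs: float) -> float:
    """Method 1: count operations in the model.

    A backward pass costs roughly twice a forward pass, so one
    training example costs ~3x the forward-pass FLOPs.
    """
    return 3 * flops_per_forward_pass * num_training_examples * num_epochs

def compute_from_gpu_time(num_gpus: int,
                          training_days: float,
                          peak_flops_per_gpu: float,
                          utilization: float = 0.33) -> float:
    """Method 2: count GPU-time.

    "AI and Compute" assumes roughly one-third utilization of peak
    FLOP/s when no better number is reported in the paper.
    """
    seconds = training_days * 24 * 3600
    return num_gpus * seconds * peak_flops_per_gpu * utilization

# Hypothetical example: 8 GPUs at 100 TFLOP/s peak, trained for 5 days.
print(f"{compute_from_gpu_time(8, 5, 100e12):.2e} FLOPs")
```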
We’re working on gathering more compute data by directly asking researchers (next target: n=100).
I’d be interested in discussing more precise inclusion criteria. As I say in the post:
Also, it is unclear which models we should base this trend on. The piece “AI and Compute” also briefly discusses this in the appendix. Given the recent trend of efficient ML models due to emerging fields such as Machine Learning on the Edge, I think it might be worthwhile to discuss how to integrate and interpret such models in analyses like this; ignoring them cannot be the answer.
Thanks! What happens to your doubling times if you exclude the outliers from efficient ML models?

The described doubling time of 6.2 months is the result when the outliers are excluded. If one includes all our models, the doubling time is around 7 months. However, the number of efficient ML models in the dataset was only one or two.
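For concreteness, here is a minimal sketch of how a doubling time like this falls out of a log-linear fit, and of how excluding a single low-compute outlier shifts it. All data points below are made up for illustration; the real analysis uses the dataset referenced above.

```python
import numpy as np

# Hypothetical (year, training FLOPs) points. The 2020.0 point is a
# made-up "efficient ML" model sitting far below the overall trend.
years = np.array([2013.0, 2014.5, 2016.0, 2017.5, 2019.0, 2020.0, 2020.5])
flops = np.array([1e17,   9e17,   6e18,   4e19,   5e20,   1e18,   3e21])

def doubling_time_months(years, flops):
    # Fit log2(compute) linearly against time; the slope is doublings
    # per year, so 12 / slope is the doubling time in months.
    slope, _ = np.polyfit(years, np.log2(flops), 1)
    return 12.0 / slope

print(f"all points:       {doubling_time_months(years, flops):.1f} months")

mask = np.arange(len(years)) != 5  # drop the efficient-ML outlier
print(f"outlier excluded: {doubling_time_months(years[mask], flops[mask]):.1f} months")
```

Including the low-compute outlier flattens the fitted slope and lengthens the doubling time, which is the direction of the 6.2 vs. ~7 months difference described above.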