(comment originally posted on Twitter, Cheryl’s response here)
I’ll flag that estimating firm-level training compute with [Epoch AI’s] notable models dataset will produce big underestimates. E.g., with your methodology, OpenAI spent ~4e25 FLOP on training and 1.3e25 FLOP on research in 2023 and 2024. The latter would cost ~$30 million, but we know OpenAI spent at least $1 billion on research in 2024! (Also note they reported $1 billion on research compute after amortizing this cost on an undisclosed schedule.)
But I don’t have a great sense of how sensitive your results are to this issue.
(this raises other questions: what did OpenAI spend $3 billion in training compute on in 2024? that’s enough for 50 GPT-4 sized models. Maybe my cost accounting is quite different from OpenAI’s. A lot of that “training” compute might really be more experimental)
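The “enough for 50 GPT-4 sized models” figure can be sanity-checked with a quick back-of-the-envelope sketch. The ~2e25 FLOP figure for GPT-4’s training run is an assumption (roughly Epoch AI’s public estimate), and the price of compute is inferred from the comment’s own numbers ($30 million for 1.3e25 FLOP); neither is an official figure.

```python
# Back-of-the-envelope check, assuming GPT-4's training run was ~2e25 FLOP
# and pricing compute at the rate implied above: $30M buys 1.3e25 FLOP.
dollars_per_flop = 30e6 / 1.3e25          # implied price of compute
gpt4_flop = 2e25                          # assumed GPT-4 training compute
gpt4_cost = gpt4_flop * dollars_per_flop  # ~$46M per GPT-4-sized run
runs = 3e9 / gpt4_cost                    # how many such runs $3B buys
print(round(runs))                        # on the order of 50-70 runs
```

The exact count depends on the assumed GPT-4 FLOP figure, but any reasonable choice lands in the same ballpark as the comment’s “50.”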
Here is a fleshed-out version of Cheryl’s response. Let’s suppose actual research capital is $qK$, but we used $K$ in our estimation equation.
Then the true estimation equation is
$$\ln\frac{qK}{L} = \sigma \ln\frac{\gamma}{1-\gamma} + \sigma \ln\frac{w}{r}$$
Rearranging, we get
$$\ln\frac{K}{L} = \sigma \ln\frac{\gamma}{1-\gamma} - \ln q + \sigma \ln\frac{w}{r}$$
So if we regress $\ln(K/L)$ on a constant and $\ln(w/r)$, the coefficient on $\ln(w/r)$ is still $\sigma$, as long as $q$ is independent of $w/r$; the mismeasurement only shifts the intercept by $-\ln q$.
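This can be checked with a quick simulation. The parameter values and the data-generating process below are made up purely for illustration: we generate data from the true relation with mismeasured capital and confirm that the regression slope still recovers $\sigma$.

```python
# Minimal simulation of the point above: if true research capital is q*K
# but we regress ln(K/L) on ln(w/r), the slope still recovers sigma as
# long as q is independent of w/r; only the intercept absorbs -ln(q).
import numpy as np

rng = np.random.default_rng(0)
sigma, gamma, q = 2.0, 0.6, 5.0       # made-up "true" parameters
ln_wr = rng.normal(0.0, 1.0, 1000)    # ln(w/r), drawn independently of q
# True relation: ln(qK/L) = sigma*ln(gamma/(1-gamma)) + sigma*ln(w/r) + noise
ln_qKL = (sigma * np.log(gamma / (1 - gamma)) + sigma * ln_wr
          + rng.normal(0.0, 0.1, 1000))
ln_KL = ln_qKL - np.log(q)            # what we actually observe
slope, intercept = np.polyfit(ln_wr, ln_KL, 1)
print(slope)      # ~2.0: sigma is recovered despite the mismeasurement
print(intercept)  # ~ sigma*ln(gamma/(1-gamma)) - ln(q)
```

If $q$ were instead correlated with $w/r$, the omitted $-\ln q$ term would contaminate the slope, which is exactly the independence condition flagged above.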
Nevertheless, I think this should increase the uncertainty around our estimates, because there is clearly a lot going on behind the scenes that we might not fully understand, like how research vs. training compute is measured.