I agree that these models assume something like “large discontinuous algorithmic breakthroughs aren’t needed to reach AGI”.
(But incremental advances which are ultimately quite large in aggregate, and which broadly follow long-running trends, are consistent with these models.)
However, I interpreted “current paradigm + scale” in the original post as “the current paradigm of scaling up LLMs and semi-supervised pretraining”. (E.g., this interpretation doesn’t account for totally new RL schemes or wildly different architectures trained with different learning algorithms, which I think are accounted for in this model.)