To be clear, I also have high error bars on whether traversing 5 OOMs of algorithmic efficiency in the next five years is possible, but that’s because of a) high error bars on diminishing returns to algorithmic gains, and b) a tentative model that most algorithmic gains in the past were driven by compute gains rather than being exogenous to them. Algorithmic improvements in ML seem much more driven by the “f-ck around and find out” paradigm than by deep theoretical or conceptual breakthroughs; if we model experimentation gains as a function of quality-adjusted researchers multiplied by compute multiplied by time, it’s obvious that the compute term is the one that’s growing the fastest (and thus the thing that drives the most algorithmic progress).
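A rough way to make the multiplicative model concrete: if experimentation throughput per unit time is roughly researchers times compute, then the log of its growth splits additively across the two factors, so each factor's share of the growth can be read off directly. The sketch below assumes exactly that; the function name and the growth multipliers are hypothetical placeholders chosen only to illustrate the arithmetic, not estimates of real trends.

```python
import math


def log_growth_shares(growth_factors):
    """Split the log-growth of a product of inputs into per-input shares.

    growth_factors maps an input name to its growth multiplier over some
    period (e.g. 2.0 means the input doubled). Because the inputs multiply,
    their log-growths add, so each share is log(g_i) / sum_j log(g_j).
    """
    logs = {name: math.log(g) for name, g in growth_factors.items()}
    total = sum(logs.values())
    if total == 0:
        raise ValueError("no net growth to attribute")
    return {name: value / total for name, value in logs.items()}


# Hypothetical per-year multipliers, for illustration only (not data).
illustrative = {"quality_adjusted_researchers": 1.2, "compute": 3.0}
shares = log_growth_shares(illustrative)

for name, share in shares.items():
    print(f"{name}: {share:.0%} of log-growth")
```

Under these made-up inputs the compute term accounts for most of the growth, which is all the argument needs; with different multipliers the split changes accordingly, and the sketch says nothing about diminishing returns.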