Current scaling “laws” are not laws of nature. And there are already worrying signs that techniques like dataset optimization/pruning, curriculum learning, and synthetic data might well break them.
Interesting—can you provide some citations?