Presumably a lot of these are optimised for the current gen-AI paradigm, though. But we’re talking about what happens if the current paradigm fails. I’m sure some of it would carry over to a different AI paradigm, but it’s also pretty likely there would be other bottlenecks we’d have to tune to get things working.
Yup, some of it will be useful and some won’t. The subset of useful stuff will make future researchers’ lives easier and let them work faster. For example, here are people using JAX for lots of computations that are not deep learning at all.
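As a rough illustration of what that kind of non-deep-learning use can look like (a sketch of my own, not taken from any particular project), here is JAX’s jit and value_and_grad applied to a plain physics-style calculation, with no neural network anywhere in sight:

```python
import jax
import jax.numpy as jnp

# Potential energy of a small chain of springs: an ordinary numerical
# computation, not a deep learning workload.
def spring_energy(positions, k=1.0, rest_length=1.0):
    deltas = positions[1:] - positions[:-1]       # vectors between neighbours
    lengths = jnp.linalg.norm(deltas, axis=1)     # current spring lengths
    return 0.5 * k * jnp.sum((lengths - rest_length) ** 2)

# jit compiles the computation; value_and_grad returns the energy plus its
# gradient with respect to the positions (the negative of the forces).
energy_and_grad = jax.jit(jax.value_and_grad(spring_energy))

positions = jnp.array([[0.0, 0.0], [1.5, 0.0], [2.0, 1.0]])
energy, grad = energy_and_grad(positions)
print(energy, grad)
```

The point is just that the compiler, autodiff, and accelerator support built for deep learning transfer directly to this kind of generic numerical work.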
I feel like what you’re saying is equivalent to pointing out, in 2020, that so many optimisations and computing resources had gone into, say, Google search, and then using that as evidence that surely processing the big data that goes into LLMs should be instantaneous as well.
In like 2010–2015, “big data” and “the cloud” were still pretty hot new things, and people developed a bunch of storage formats, software tools, etc. for distributed data, distributed computing, parallelization, and cloud computing. And yes I do think that stuff turned out to be useful when deep learning started blowing up (and then LLMs after that), in the sense that ML researchers would have made slower progress (on the margin) if not for all that development. I think Docker and Kubernetes are good examples here. I’m not sure exactly how different the counterfactual would have been, but I do think it made more than zero difference.
Things like Docker containers or cloud VMs, which can in principle be applied to any sort of software or computation, could be helpful for all sorts of applications we can’t anticipate. They’re very general-purpose. That makes sense to me.
The extent to which things designed for deep learning, such as PyTorch, could be applied to ideas outside deep learning seems much more dubious.
And if we’re thinking about ideas that fall within deep learning, but outside what is currently mainstream and popular, then I simply don’t know.