Right now we have lots of resources that did not exist in 2018: dramatically more compute, better tooling and frameworks like PyTorch and JAX, armies of experts on parallelization, and on and on. The lack of these was a bottleneck in 2018; without it, we presumably would have gotten the LLMs of today years earlier.
I fear this may be pointless nitpicking, but if I'm getting the timeline right, PyTorch's initial alpha release was in September 2016, its initial proper public release was in January 2017, and PyTorch version 1.0 was released in October 2018. I'm much less familiar with JAX, but apparently it was released in December 2018. Maybe you simply intended to say that PyTorch and JAX are better today than they were in 2018. I don't know. This just stuck out to me as I was re-reading your comment just now.
For context, OpenAI published a paper about GPT-1 (or just GPT) in 2018, released GPT-2 in 2019, and released GPT-3 in 2020. (I'm going off the dates on the Wikipedia pages for each model.) GPT-1 apparently used TensorFlow, which was initially released in 2015, the same year OpenAI was founded. TensorFlow had a version 1.0 release in 2017, the year before the GPT-1 paper. (In 2020, OpenAI said in a blog post they would be switching to using PyTorch exclusively.)
Maybe you simply intended to say that PyTorch and JAX are better today than they were in 2018.
Yup! E.g. torch.compile "makes code run up to 2x faster" and came out in PyTorch 2.0 in 2023.
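To make that concrete, here's a minimal sketch of what using it looks like (my own illustration, not from the comment or the PyTorch docs); the actual speedup depends heavily on the model and hardware, so "up to 2x" is a headline figure rather than a guarantee:

```python
# Minimal sketch: torch.compile, added in PyTorch 2.0, wraps a model and
# JIT-compiles it the first time it runs. Model and sizes here are arbitrary.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
compiled_model = torch.compile(model)  # compilation happens lazily, on the first forward pass

x = torch.randn(32, 128)
out = compiled_model(x)  # subsequent calls reuse the compiled kernels
print(out.shape)  # torch.Size([32, 10])
```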
More broadly, what I had in mind was: open-source software for everything to do with large-scale ML training (containerization, distributed training, storing checkpoints, hyperparameter tuning, training data and training environments, orchestration and pipelines, dashboards for monitoring training runs, and on and on) is much more developed now compared to 2018, and even compared to 2022, if I understand correctly (I'm not a practitioner). Sorry for poor wording. :)
Presumably a lot of these are optimised for the current gen-AI paradigm, though. But we're talking about what happens if the current paradigm fails. I'm sure some of it would carry over to a different AI paradigm, but it's also pretty likely there would be other bottlenecks we would have to sort out to get things working.
I feel like what you're saying is the equivalent of pointing out, in 2020, how many optimisations and computing resources had gone into, say, Google search, and then using that as evidence that the big-data processing behind LLMs should surely be instantaneous as well.
Presumably a lot of these are optimised for the current gen-AI paradigm, though. But we're talking about what happens if the current paradigm fails. I'm sure some of it would carry over to a different AI paradigm, but it's also pretty likely there would be other bottlenecks we would have to sort out to get things working.
Yup, some stuff will be useful and other stuff won't. The subset of useful stuff will make future researchers' lives easier and allow them to work faster. For example, here are people using JAX for lots of computations that are not deep learning at all.
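As a toy sketch of the kind of thing I mean (my own example, not one of the linked ones), here's JAX used for a plain numerical computation with no neural network anywhere:

```python
# Toy sketch: a jitted Monte Carlo estimate of pi, i.e., JAX as a fast
# general-purpose array library rather than a deep-learning framework.
import jax
import jax.numpy as jnp

@jax.jit
def estimate_pi(key):
    # Sample a million points in the unit square and count how many land
    # inside the quarter circle of radius 1.
    pts = jax.random.uniform(key, shape=(1_000_000, 2))
    inside = jnp.sum(jnp.sum(pts ** 2, axis=1) <= 1.0)
    return 4.0 * inside / 1_000_000

print(estimate_pi(jax.random.PRNGKey(0)))  # ~3.14
```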
I feel like what you're saying is the equivalent of pointing out, in 2020, how many optimisations and computing resources had gone into, say, Google search, and then using that as evidence that the big-data processing behind LLMs should surely be instantaneous as well.
In like 2010–2015, "big data" and "the cloud" were still pretty hot new things, and people developed a bunch of storage formats, software tools, etc. for distributed data, distributed computing, parallelization, and cloud computing. And yes, I do think that stuff turned out to be useful when deep learning started blowing up (and then LLMs after that), in the sense that ML researchers would have made slower progress (on the margin) if not for all that development. I think Docker and Kubernetes are good examples here. I'm not sure exactly how different the counterfactual would have been, but I do think it made more than zero difference.
Things like Docker containers or cloud VMs, which can in principle be applied to any sort of software or computation, could be helpful for all sorts of applications we can't anticipate. They are very general-purpose. That makes sense to me.
The extent to which things designed for deep learning, such as PyTorch, could be applied to ideas outside deep learning seems much more dubious.
And if we're thinking about ideas that fall within deep learning, but outside what is currently mainstream and popular, then I simply don't know.