I think a big crux here is whether ‘PASTA’ is possible at all, or at least whether it can be used as a way to bootstrap everything else.
Do you mean “possible at all using LLM technology” or do you mean “possible at all using any possible AI algorithm that will ever be invented”?
As for the latter, I think (or at least, I hope!) that there’s wide consensus that whatever human brains do (individually and collectively), it is possible in principle for algorithms-running-on-chips to do those same things too. Brains are not magic, right?
> As for the latter, I think (or at least, I hope!) that there’s wide consensus that whatever human brains do (individually and collectively), it is possible in principle for algorithms-running-on-chips to do those same things too. Brains are not magic, right?
I think this is probably true, but I wouldn’t be 100% certain about it. Brains may not be magic, but they are also very different physical entities to silicon chips, so there is no guarantee that the function of one could be efficiently emulated by the other. There could be some crucial aspect of the mind relying on a physical process which would be computationally infeasible to simulate using binary silicon transistors.
If there are any neuroscientists who have investigated this I would be interested!
OK yeah, “AGI is possible on chips but only if you have 1e100 of them or whatever” is certainly a conceivable possibility. :) For example, here’s me responding to someone arguing along those lines.
> If there are any neuroscientists who have investigated this I would be interested!

There is never a neuroscience consensus, but fwiw I fancy myself a neuroscientist and have some thoughts at: Thoughts on hardware / compute requirements for AGI.

One of various points I bring up is that:
(1) if you look at how human brains, say, go to the moon, or invent quantum mechanics, and you think about what algorithms could underlie that, then you would start talking about algorithms that entail building generative models, and editing them, and querying them, and searching through them, and composing them, blah blah.
(2) if you look at a biological brain’s low-level affordances, it’s a bunch of things related to somatic spikes and dendritic spikes and protein cascades and releasing and detecting neuropeptides etc.
(3) if you look at a silicon chip’s low-level affordances, it’s a bunch of things related to switching transistors and currents going down wires and charging up capacitors and so on.
My view is: implementing (1) via (3) would involve a lot of inefficient bottlenecks where there’s no low-level affordance that’s a good match to the algorithmic operation we want … but the same is true of implementing (1) via (2). Indeed, I think the human brain does what it does via some atrociously inefficient workarounds to the limitations of biological neurons, limitations which would not be applicable to silicon chips.
By contrast, many people thinking about this problem are often thinking about “how hard is it to use (3) to precisely emulate (2)?”, rather than “what’s the comparison between (1)←(3) versus (1)←(2)?”. (If you’re still not following, see my discussion here—search for “transistor-by-transistor simulation of a pocket calculator microcontroller chip”.)
Another thing is that, if you look at what a single consumer GPU can do when it runs an LLM or diffusion model… well it’s not doing human-level AGI, but it’s sure doing something, and I think it’s a sound intuition (albeit hard to formalize) to say “well it kinda seems implausible that the brain is doing something that’s >1000× harder to calculate than that”.
Thanks for those links, this is an interesting topic I may look into more in the future.
> Another thing is that, if you look at what a single consumer GPU can do when it runs an LLM or diffusion model… well it’s not doing human-level AGI, but it’s sure doing something, and I think it’s a sound intuition (albeit hard to formalize) to say “well it kinda seems implausible that the brain is doing something that’s >1000× harder to calculate than that”.
It doesn’t seem that implausible to me. In general I find the computational power required for different tasks (such as what I do in computational physics) frequently varies by many orders of magnitude. LLMs get to their level of performance by sifting through all the data on the internet, something we can’t do, and yet they still perform worse than a regular human on many tasks, so clearly there’s a lot of extra something going on here. It actually seems kind of likely to me that what the brain is doing is more than 3 orders of magnitude more difficult.
I don’t know enough to be confident on any of this, but if AGI turns out to be impossible on silicon chips with Earth’s resources, I would be surprised but not totally shocked.
Yeah, I definitely don’t mean ‘brains are magic’; humans are generally intelligent by any meaningful definition of the words, so we have an existence proof that general intelligence can be instantiated in some form.
I’m more sceptical of thinking science can be ‘automated’ though—I think progressing scientific understanding of the world is in many ways quite a creative and open-ended endeavour. It requires forming beliefs about the world, updating them due to evidence, and sometimes making radical new shifts. It’s essentially the epistemological frame problem, and I think we’re way off a solution there.
I think I have a similar big crux with Aschenbrenner when he says things like “automating AI research is all it takes”. I think I disagree with that anyway, but automating AI research is really, really hard! It might be ‘all it takes’ because that problem is already AGI-complete!
I’m confused what you’re trying to say… Supposing we do in fact invent AGI someday, do you think this AGI won’t be able to do science? Or that it will be able to do science, but that wouldn’t count as “automating science”?
Or maybe when you said “whether ‘PASTA’ is possible at all”, you meant “whether ‘PASTA’ is possible at all via future LLMs”?
Maybe you’re assuming that everyone here has a shared assumption that we’re just talking about LLMs, and that if someone says “AI will never do X” they obviously mean “LLMs will never do X”? If so, I think that’s wrong (or at least I hope it’s wrong), and I think we should be more careful with our terminology. AI is broader than LLMs. …Well maybe Aschenbrenner is thinking that way, but I bet that if you were to ask a typical senior person in AI x-risk (e.g. Karnofsky) whether it’s possible that there will be some big AI paradigm shift (away from LLMs) between now and TAI, they would say “Well yeah duh of course that’s possible,” and then they would say that they would still absolutely want to talk about and prepare for TAI, in whatever algorithmic form it might take.
Apologies for not being clear! I’ll try to be a bit clearer, but there’s probably a lot of inferential distance here and we’re covering some quite deep topics:
> Supposing we do in fact invent AGI someday, do you think this AGI won’t be able to do science? Or that it will be able to do science, but that wouldn’t count as “automating science”?
>
> Or maybe when you said “whether ‘PASTA’ is possible at all”, you meant “whether ‘PASTA’ is possible at all via future LLMs”?
So on the first section, I’m going for the latter and taking issue with the term ‘automation’, which I think speaks to a mindless, automatic process of achieving some output. But if digital functionalism were true, and we successfully made a digital emulation of a human who contributed to scientific research, I wouldn’t call that ‘automating science’; instead, we would have created a being that can do science. That being would be creative and agentic, with the ability to formulate its own novel ideas and hypotheses about the world. It’d be limited by its ability to sample from the world, design experiments, practice good epistemology, wait for physical results, etc. It might be the case that some scientific research happens quickly, and then subsequent breakthroughs happen more slowly, etc.
My opinions on this are also highly influenced by the works of Deutsch and Popper, who essentially argue that the growth of knowledge cannot be predicted; and since science is (in some sense) the stock of human knowledge, and what cannot be predicted cannot be automated, scientific ‘automation’ is in some sense impossible.
> Maybe you’re assuming that everyone here has a shared assumption that we’re just talking about LLMs...but I bet that if you were to ask a typical senior person in AI x-risk (e.g. Karnofsky) whether it’s possible that there will be some big AI paradigm shift (away from LLMs) between now and TAI, they would say “Well yeah duh of course that’s possible,” and then they would say that they would still absolutely want to talk about and prepare for TAI, in whatever algorithmic form it might take.
Agreed, AI systems are broader than LLMs, and maybe I was being a bit loose with language. On the whole, though, I think much of the case made by proponents for the importance of working on AI Safety does assume that current paradigm + scale is all you need, or rests on works that assume it. For instance, Davidson’s Compute-Centric Framework model for OpenPhil states right on its opening page:
> In this framework, AGI is developed by improving and scaling up approaches within the current ML paradigm, not by discovering new algorithmic paradigms.
And I get off the bus with this approach immediately because I don’t think that’s plausible.
As I said in my original comment, I’m working on a full post on the discussion between Chollet and Dwarkesh, which will hopefully make the AGI-sceptical position I’m coming from a bit more clear. If you end up reading it, I’d be really interested in your thoughts! :)
> On the whole though, I think much of the case by proponents for the importance of working on AI Safety does assume that current paradigm + scale is all you need, or rest on works that assume it.

Yeah, this is more true than I would like. I try to push back on it where possible, e.g. my post AI doom from an LLM-plateau-ist perspective.
There were however plenty of people who were loudly arguing that it was important to work on AI x-risk before “the current paradigm” was much of a thing (or in some cases long before “the current paradigm” existed at all), and I think their arguments were sound at the time and remain sound today. (E.g. Alan Turing, Norbert Wiener, Yudkowsky, Bostrom, Stuart Russell, Tegmark…) (OpenPhil seems to have started working seriously on AI in 2016, which was 3 years before GPT-2.)