OK, here's the big picture of this discussion as I see it.

As someone who doesn't think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective, until I got to the sentences:

Eventually, there will be some AI paradigm beyond LLMs that is better at generality or generalization. However, we don't know what that paradigm is yet and there's no telling how long it will take to be discovered. Even if, by chance, it were discovered soon, it's extremely unlikely it would make it all the way from conception to working AGI system within 7 years.

(Plus the subsequent couple paragraphs about brain computation, which I responded to briefly in my top-level comment.)

So that excerpt is what I was responding to originally, and that's what we've been discussing pretty much this whole time. Right?

My claim is that, in the context of this paragraph, "extremely unlikely" (as in "<0.1%") is way, way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there's just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].
If you had said "<20%" instead of "<0.1%", then OK, sure, I would have been in close enough agreement with you that I wouldn't have bothered replying.
Does that help? Sorry if I'm misunderstanding.

Hmm, reading what you wrote again, I think part of your mistake is saying "…conception to working AGI system". Who's to say that this "AI paradigm beyond LLMs" hasn't already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they're not doing impressive things yet. That doesn't mean that there hasn't already been a lot of development progress.
As someone who doesn't think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective
Okay, good to know.
I know that there are different views, but it seems like a lot of people in EA have started taking near-term AGI a lot more seriously since ChatGPT was released, and those people generally don't give the other views (the views on which LLMs aren't evidence of near-term AGI) much credence. That's why the focus on LLMs.
The other views tend to be highly abstract, theoretical, and philosophical, so to argue about them you basically have to write the whole Encyclopedia Britannica; you can't point to clear evidence from tests, studies, economic or financial indicators, and practical performance to make a case about AGI timelines within about 2,000 words.
Trying to argue those other views is not something I want to do, but I do want to argue about near-term AGI in a context where people are using LLMs as their key evidence for it.
Because my brain works that way, I'm tempted to argue about the other views as well, but I never find those kinds of discussions satisfying. It feels like by the time you get a few exchanges deep into those discussions (whether it's me personally or people in general), it gets into "How many angels can dance on the head of a pin?" territory. For any number of sub-questions under that very abstract AGI discussion, maybe the answer is this, maybe it's that, but nobody actually knows, there's no firm evidence, there's no theoretical consensus, and in fact the theorizing is very loose and pre-paradigmatic. (This is my impression after 15-20 years of observing these discussions online and occasionally participating in them.) I think my response to these ideas should be, "Yeah. Maybe. Who knows?" because I don't think there's much to say beyond that.
My claim is that, in the context of this paragraph, "extremely unlikely" (as in "<0.1%") is way, way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there's just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].

If you had said "<20%" instead of "<0.1%", then OK, sure, I would have been in close enough agreement with you that I wouldn't have bothered replying.

Does that help? Sorry if I'm misunderstanding.
I didn't actually give a number for the chances of going from conception of a new AI paradigm to a working AGI system in 7 years. I did say it's extremely unlikely, which is the same language I used for AGI within 7 years overall. I said I think the overall chance of AGI within 7 years is significantly less than 0.1%, so it's understandable that, when I call going from a new paradigm to working AGI in 7 years extremely unlikely, you might take me to mean that it, too, has a significantly-less-than-0.1% chance, or something in that vicinity.
The relationship between the overall chance of AGI within 7 years and the chance of AGI conditional on the right paradigm being conceived isn't clear, because it depends on a third variable: the chance that the right paradigm has already been conceived (or soon will be), and also how long ago it was conceived (or how soon it will be). That seems basically unknowable to me.
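To spell out the dependence I mean (just a sketch, with labels I'm introducing here rather than anything from the post, and setting aside the LLM path as we have been): let T be the number of years from now until the right paradigm is conceived, counted as negative if it was already conceived in the past, and let D be the number of years from conception to a working AGI system. Then

\[
\Pr(\text{AGI within 7 years}) \;=\; \Pr(T + D \le 7),
\]

so the chance conditional on the right paradigm having been conceived depends on how long ago that happened, i.e., on how much of D has already elapsed, and we know neither the distribution of T nor the distribution of D.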
I haven't really thought about what number I would assign to that specific outcome: a new AI paradigm going from conception to a working AGI system within 7 years. It seems very unlikely to me. In general, I don't like the practice of just thinking up numbers to assign to things like that. It could be an okay practice if people didn't take these numbers as literally and seriously as they do. Then it wouldn't really matter. But people take these numbers really seriously, and I think that's unwise, and I don't like contributing to that practice if I can help it.
Where I do think guessing a number helps is when it conveys an intuition that might otherwise be hard to express. If you've just had a first date and your friend asks how it went, and you say, "It was a 7 out of 10," that isn't a rigorous scale; your friend isn't expecting that all first dates of that quality will always be given a 7 rather than a 6 or an 8, but the number helps convey a sense of somewhere between bad and fantastic. I think giving a number to a probability can be helpful like that. I think it can also be helpful to compare the probability of an event, like AGI being created within 7 years, to the probability of another event, which is why I came up with the Jill Stein example. (The problem is that, for this to work, your interlocutor or your audience has to share your intuitive sense of how probable the other event is.)
I don't know how you would try to rigorously estimate how long it would take to go from the right idea about AGI to a working AGI system. This depends largely on what the right idea is, which is precisely what we don't know. So, there is irreducible uncertainty here.
We can come up with points of comparison. You used LLMs from 2018 to 2025 as an example (7 years). I brought up backpropagation in 1970 to AlexNet in 2011 as another potential point of comparison (41 years). You could also choose the conception of connectionism in 1943 to AlphaGo beating Lee Sedol in 2016 (73 years). Or you can take Yann LeCun's guess of at least 12 years, and probably much more, from his position paper to human-level AI, or Richard Sutton's guess of a 25% chance of "understanding the mind" (I'm still not sure if that implies the ability to build AGI) within 8 years of publishing the Alberta Plan for AI Research. Who knows which of these points of comparison is most apt? Maybe none of them are particularly apt. Who knows.
The other thing I tried was considering the computation required for AGI in comparison to the human brain. This is almost as fraught as the above. We don't know for sure how much computation the human brain uses. We don't know at all whether AGI will require as much computation, or much less, or much more. Who knows?
In principle, almost anything could happen at almost any time, even if it goes against how we thought the world works; this is uncomfortable, but it's true. (I don't just mean with AI, I mean with everything: volcanoes, aliens, physics, cosmology, the fabric of society.)
What to do in the face of that uncertainty is a discussion that I think belongs in and under another post. For example, if we assume, at least for the sake of argument, that we have no idea which of several ideas for building AGI will turn out to be correct (program synthesis, LeCun's energy-based models, the Alberta Plan, Numenta's Thousand Brains approach, whole brain emulation, and so on), and also no idea whether all of these ideas will turn out to be the wrong ones, is there a strongly defensible course of action for preparing for AGI? Is there, indeed, a strongly defensible case for why AGI would be dangerous?
I worry that such a discussion would quickly get into the "How many angels can dance on the head of a pin?" territory I said I don't like. But I would be impressed if someone could make a strong case for some course of action that makes sense even under a high level of irreducible uncertainty about which theoretical ideas will underpin the design of AGI and about when it will ultimately arrive.
I imagine this would be hard to do, however. For example, suppose Scenario A is: the MIRI worldview on AI alignment is correct, there will be a hard takeoff, and AGI will be designed with a combination of deep learning and symbolic AI. Suppose Scenario B is: the MIRI worldview is false, whole brain emulation is the fastest possible path to AGI, and it will slowly scale up from a mouse brain emulation around 2065 to a human brain emulation around 2125,[1] and gradually from 2125 to 2165 it (or, more accurately, they) will become like AlphaGo for everything: a world champion at all tasks. Is there any strongly defensible course of action that makes sense if we don't know whether Scenario A or Scenario B is true (or any of the many other possible scenarios I could describe), and if we can't even cogently assign probabilities to these scenarios? That sounds like a very tall order.

It's an especially tall order if part of the required defense is arguing why the proposed course of action wouldn't backfire and make things worse.
Who's to say that this "AI paradigm beyond LLMs" hasn't already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they're not doing impressive things yet. That doesn't mean that there hasn't already been a lot of development progress.
Do you have any response to the arguments made in the post? I would be curious to hear if you have any interesting counterarguments.
As for the rest, I think it's been addressed at sufficient length already.
Yeah, maybe. Who knows?
[1] 2065 for a mouse brain and 2125 for a human brain are real guesses from an expert survey:

Zeleznikow-Johnston A, Kendziorra EF, McKenzie AT (2025) What are memories made of? A survey of neuroscientists on the structural basis of long-term memory. PLoS One 20(6): e0326920. https://doi.org/10.1371/journal.pone.0326920