I don’t think I’m retreating into a weaker claim. I’m just explaining why, from my point of view, your analogy doesn’t seem to make sense as an argument against my post and why I don’t find it persuasive at all (and why I don’t think anyone in my shoes would or should find it persuasive). I don’t understand why you would interpret this as me retreating into a weaker claim.
If you’re making the claim:
The probability that a new future AI paradigm would take as little as 7 years to go from obscure arxiv papers to AGI, is extremely low (say, <10%).
…then presumably you should have some reason to believe that. If your position is “nobody can possibly know how long it will take”, then that obviously is not a reason to believe that claim above. Indeed, your OP didn’t give any reason whatsoever, it just said “extremely unlikely” (“Even if, by chance, it were discovered soon, it’s extremely unlikely it would make it all the way from conception to working AGI system within 7 years.”)
Then my top comment was like:
Gee, a lot can happen in 7 years in AI, including challenges transitioning from ‘this seems wildly beyond SOTA and nobody has any clue where to even start’ to ‘this is so utterly trivial that we take it for granted and collectively forget it was ever hard’, and including transitioning from ‘kinda the first setup of this basic technique that anyone thought to try’ to ‘a zillion iterations and variations of the technique have been exhaustively tested and explored by researchers around the world’, etc. That seems like a reason to start somewhere like, I dunno, 50-50 on ≤7 years, as opposed to <10%. 50-50 is like saying ‘some things in AI take less than 7 years, and other things take more than 7 years, who knows, shrug’.
Then you replied here that “your analogy is not persuasive”. I kinda took that to mean: my example of LLM development does not prove that a future “obscure arxiv papers to AGI” transition will take ≤7 years. Indeed it does not! I didn’t think I was offering proof of anything. But you are still making a quite confident claim of <10%, and I am still waiting to see any reason at all explaining where that confidence is coming from. I think the LLM example above is suggestive evidence that 7 years is not some crazy number wildly outside the range of reasonable guesses for “obscure arxiv papers to AGI”, whereas you are saying that 7 years is in fact a pretty crazy number, and that sane numbers would be way bigger than 7 years. How much bigger? You didn’t say. Why? You didn’t say.
So that’s my evidence, and yes it’s suggestive not definitive evidence, but OTOH you have offered no evidence whatsoever, AFAICT.
Okay, I think I understand now, hopefully. Thank you for explaining. Your complaint is that I didn’t try to substantiate why I think it’s extremely unlikely for a new paradigm in AI to go from conception to a working AGI system in 7 years. That’s a reasonable complaint.
I would never hold any of these sorts of arguments to the standard of “proving” something or establishing certainty. By saying the argument is not persuasive, I mean it didn’t really shift me in one direction or the other.
The reason I didn’t find your analogy persuasive is that I’m already aware of the progress there’s been in AI since 2012 in different domains including computer vision, natural language processing, games (imitation learning and reinforcement learning in virtual environments), and robotics. So, your analogy didn’t give me any new information to update on.
My reason for thinking it’s extremely unlikely is just an intuition from observing progress in AI (and, to some extent, other fields). It seems like your analogy is an attempt to express your own intuition about this from watching AI progress. I can understand the intention now and I can respect that as a reasonable attempt at persuasion. It might be persuasive to someone in my position who is unaware of how fast some AI progress has been.
I think I was misinterpreting it too much as an argument with a clear logical structure and not enough as an attempt to express an intuition. I think as the latter it’s perfectly fine, and it would be too much to expect the former in such a context.
I can’t offer much in this context (I don’t think anyone can). The best I can do is just try to express my intuition, like you did. What you consider fast or slow progress depends on where you start and end and what examples you choose. If you pick deep learning as your example, and if you start at the invention of backpropagation in 1970 and end at AlexNet in 2012, that’s 42 years from conception to realization.
A factor that makes a difference is that there seems to be little interest in funding fundamental AI research outside of the sort of ideas that are already in the mainstream. For example, Richard Sutton has said it’s hard to get funding for fundamental AI research. It’s easier for him given his renown as an AI researcher, but the impression I get is that fundamental research funding overall is scarce, and it’s especially scarce if you’re working on novel, unusual, off-the-beaten-path ideas. So, even if there is an arXiv paper out there somewhere that has the key insight or key insights needed to get to AGI, the person who wrote it probably can’t get funded and they’re probably now working on a McDonald’s drive-through LLM.
[Edit: See my reply to this comment below for an important elaboration on why I think getting from an arXiv paper to AGI within 7 years is unlikely.]
Out of curiosity, what do you think of my argument that LLMs can’t pass a rigorous Turing test because a rigorous Turing test could include ARC-AGI 2 as a subset (and, indeed, any competent panel of judges should include it) and LLMs can’t pass that? Do you agree? Do you think that’s a higher level of rigour than a Turing test should have and that’s shifting the goal posts?
I should add, fairly belatedly, another point of comparison. Two Turing Award-winning AI researchers, Yann LeCun and Richard Sutton, each have novel fundamental ideas — not based on scaling LLMs or other comparably mainstream ideas — for how to get to AGI. (A few days ago, I wrote a comment about this here.)
In a 2024 interview, Yann LeCun said he thought it would take “at least a decade and probably much more” to get to AGI or human-level AI by executing his research roadmap. Trying to pinpoint when ideas first started is a fraught exercise. If we say the start time is the 2022 publication of LeCun’s position paper “A Path Towards Autonomous Machine Intelligence”, then by LeCun’s own estimate, the time from publication to human-level AI is at least 12 years and “probably much more”.
In another 2024 interview, Richard Sutton said he thinks there’s a 25% chance by 2030 we’ll “understand intelligence”, although it’s unclear to me if he imagines by 2030 there’s a 25% chance we’ll actually build AGI (or be in a position to do so straightforwardly) or just have the fundamental theoretical knowledge required to do so. The equivalent paper co-authored by Sutton is “The Alberta Plan for AI Research”, coincidentally also published in 2022. So, Sutton’s own estimate is a 25% chance of success in 8 years, although it’s not clear if success here means actually building AGI or a different goal.
But, crucially, I also definitely don’t think we should just automatically accept these numbers. (I also discussed this in my previous comment about this here.) Researchers like Yann LeCun and Richard Sutton have a very high level of self-belief, which I think is psychologically healthy and rational. It is good to be this ambitious. But we shouldn’t think of these as predictions or forecasts, but rather as goals.
LeCun himself has explicitly said you should be skeptical of anyone who says they have found the secret to AGI and will deliver it within ten years, including him (as I discussed here). Which of course is very reasonable!
In the 2024 interview, Sutton said:
I think we should strive for, like, you know, 2030, and knowing that we probably won’t succeed, but you have to try.
This was in response to one of the interviewers noting that Sutton had said “decades”, plural, when he said “these are the decades when we’re going to figure out how the mind works.”
We have good reason to be skeptical if we look at predictions from people in AI that have now come false, such as Dario Amodei’s incorrect prediction about AI writing 90% of code by mid-September 2025 or, for that matter, his prediction made 2 years and 2 months ago that we could have something that sounds a lot like AGI in 2 or 3 years, which still has 10 months left to go but looks extremely dubious. As I mentioned in the post, there’s also Geoffrey Hinton’s prediction about radiology getting automated and various wrong predictions from various people in AI about widespread fully autonomous driving.
So, to summarize: what Yann LeCun and Richard Sutton are saying is already much more conservative than a trajectory from publishing a paper to building AGI within 7 years. They both tell us to be skeptical of even the timelines they lay out. And, independent of whether they tell us to be skeptical or not, based on the track record of similar predictions, we have good reason to be skeptical.
To me, this seems to be the much more apt point of comparison than the progress of LLMs from 2018 to 2025.
In a 2024 interview, Yann LeCun said he thought it would take “at least a decade and probably much more” to get to AGI or human-level AI by executing his research roadmap. Trying to pinpoint when ideas first started is a fraught exercise. If we say the start time is the 2022 publication of LeCun’s position paper “A Path Towards Autonomous Machine Intelligence”, then by LeCun’s own estimate, the time from publication to human-level AI is at least 12 years and “probably much more”.
Here’s why I don’t think “start time for LeCun’s research program is 2022” is true in any sense relevant to this conversation.
IIUC, the subtext of your OP and this whole conversation is that you think people shouldn’t be urgently trying to prepare for AGI / ASI right now.
In that context, one could say that the two relevant numbers are “(A) how far in advance should we be preparing for AGI / ASI?” and “(B) how far away is AGI / ASI?”. And you should start preparing when (A)=(B).
I think that’s a terrible model, because we don’t and won’t know either (A) or (B) until it’s too late, and there’s plenty of work we can be doing right now, so it’s nuts not to be doing that work ASAP. Indeed, I think it’s nuts that we weren’t doing more work on AGI x-risk in 2015, and 2005, and 1995 etc.
As bad as I think that “start when (A)=(B)” model is, I’m concerned that your implicit model is even worse. You seem to be acting as if (A) is less than 7 years, but you haven’t justified that, and I don’t think you can. I am concerned that what you’re actually thinking is more like: “AGI doesn’t feel imminent, therefore (B)<(A)”.
Does the clock start in 2022 when LeCun published A Path Towards Autonomous Machine Intelligence (APTAMI)? That was 3 years ago. Yet you still, right now, don’t seem to feel like we should be urgently preparing for AGI. If LeCun et al. keep making progress, maybe someday you will start feeling that sense of urgency about imminent LeCun-style AGI. And when that day comes, that’s when the relevant clock starts. And I think that clock will leave very little time indeed until AGI and ASI. (My own guess would be 0–2 years, if your sense of urgency will be triggered by obvious signals of impressiveness like using language and solving problems beyond current LLMs. If you have some other trigger that you’re looking for, what is it?)
What would it look like to feel a sense of urgency starting from the moment that APTAMI was published? It would look like what I did, which was write the response: LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem. I’m pretty sure LeCun knows that this post exists, but he has not responded, and to this day he continues to insist that he has a great plan for AI alignment. Anyway, here I am, arguably the only person on Earth who is working on solving the technical alignment problem for APTAMI. LeCun and his collaborators have not shown the slightest interest in helping, and I don’t expect that situation to change as they get ever closer to AGI / ASI (on the off-chance that their research program is headed towards AGI / ASI).
(If you think we should be urgently preparing for AGI / ASI x-risk right now, despite AGI being extremely unlikely by 2032, then great, we would be in much more agreement than I assumed. If that’s the situation, then I think your post does not convey that mood, and I think that almost all readers will interpret it as having that subtext unless you explicitly say otherwise.)
I find this comment fairly confusing, so I’m going to try to clear up some of the confusion.
Here’s why I don’t think “start time for LeCun’s research program is 2022” is true in any sense relevant to this conversation.
Was the intention of the comment I made about Yann LeCun’s and Richard Sutton’s research roadmaps unclear? It has nothing to do with the question of how far in advance we should start preparing for AGI. I was just giving a different point of comparison than your example of the progress in LLMs from 2018 to 2025. These were examples of how two successful AI researchers think about the amount of time between formulating the fundamental concepts — or at least the fundamental research directions — necessary to build AGI in a paper and actually building AGI. How much in advance of AGI you’d want to prepare is a separate question.
Similarly, I don’t think your example of the amount of progress in LLMs from 2018 to 2025 was intended to make an argument about how long in advance of AGI to start preparing, was it? I thought you were simply trying to argue that the time between a novel AI paradigm being conceptualized and AGI being created could indeed be 7 years, contrary to what I asserted in the conclusion to my post.
Am I misunderstanding something? This response doesn’t seem to be a response to what I was trying to say in the comment it’s responding to. Am I missing the point?
IIUC, the subtext of your OP and this whole conversation is that you think people shouldn’t be urgently trying to prepare for AGI / ASI right now.
The topic of how much in advance we should be preparing for AGI and what, specifically, we should be doing to prepare is, of course, related to the topic of when we think AGI is likely to happen, but someone could make the argument that it’s important to start preparing for AGI now even if it’s 50 or 100 years away. The correctness or incorrectness of that argument wouldn’t depend on whether AGI by 2032 is extremely unlikely. My post is about whether AGI by 2032 is extremely unlikely and isn’t intended to comment on the question of how far in advance of AGI we should prepare, or what we should do to prepare.
If we really should be preparing for AGI 50 or 100 years in advance, then whether I think we should start preparing for AGI now really doesn’t depend on whether I think AGI is likely within 7 years.
I think it’s nuts that we weren’t doing more work on AGI x-risk in 2015, and 2005, and 1995 etc.
If you think there is a strong argument for doing work on AGI safety or alignment 35+ years in advance of when AGI is expected to be created, then you can make that argument without arguing that AGI is likely to be created within 7 years, so that argument could be correct even if my thesis is correct that AGI by 2032 is extremely unlikely. Forgive me if I’m repeating myself here.
You seem to be acting as if (A) is less than 7 years, but you haven’t justified that, and I don’t think you can.
I didn’t say anything about that in the post. As I said just above, if it’s true, as you say, that we should start preparing for AGI long before we think it’s likely to arrive, then this wouldn’t be a logical inference from what I’ve argued.
I am concerned that what you’re actually thinking is more like: “AGI doesn’t feel imminent, therefore (B)<(A)”.
Is “feel” supposed to be pejorative here? Is “AGI doesn’t feel imminent” supposed to mean something other than “I don’t think AGI is imminent”? Are your opinions about AGI timelines also something you “feel”?
Does the clock start in 2022 when LeCun published A Path Towards Autonomous Machine Intelligence (APTAMI)?
Are you asking me whether I think Yann LeCun has published the roadmap that will, in fact, lead to AGI? I brought up LeCun’s roadmap as an example. I brought up Richard Sutton’s Alberta Plan as another example. As far as I can tell, these are mutually incompatible roadmaps to AGI. They could also both be wrong. But I just brought these up as examples. I wasn’t saying one of them will actually lead to the invention of AGI.
...if your sense of urgency will be triggered by obvious signals of impressiveness like using language and solving problems beyond current LLMs. If you have some other trigger that you’re looking for, what is it?
In the post, I mentioned a few different broad areas where I think current AI systems do poorly and used this as evidence to argue that AGI is unlikely within 7 years. It would stand to reason, therefore, that I think if AI systems started significantly improving in these areas, it would be a reason for me to believe AGI is closer than I currently think it is.
I would at least be curious to know what you think about the reasons I gave in the post, even if I end up disagreeing.
(If you think we should be urgently preparing for AGI / ASI x-risk right now, despite AGI being extremely unlikely by 2032, then great, we would be in much more agreement than I assumed. If that’s the situation, then I think your post does not convey that mood, and I think that almost all readers will interpret it as having that subtext unless you explicitly say otherwise.)
How far in advance of AGI we should start preparing for it is logically independent from the thesis of this post — which is about the likelihood of near-term AGI — and I didn’t say anything in this post about whether we should start preparing now or not. I would prefer to discuss that in the context of a post that does make an argument about how far in advance we should start preparing (and, if so, what kind of preparation would be useful or even possible).
That topic depends on a lot of things other than AGI timelines, e.g., hard takeoff vs. soft takeoff, the “MIRI worldview” on AI alignment vs. other views, and the scientific/technological paradigm used to build AGI.
I made this post because I had certain ideas I wanted to talk about and wanted to hear what people thought of them. If you have thoughts about what I said in this post, I would be curious to hear them. If I’m wrong about what I said in the post, why am I wrong? Tell me!
I am desperate to hear good counterarguments.
OK, sorry for getting off track.
(…But I still think your post has a connotation in context that “AGI by 2032 is extremely unlikely [therefore AGI x-risk work is not an urgent priority]”, and that it would be worth clarifying that you are just arguing the narrow point.)
Wilbur Wright overestimated how long it would take him to fly by a factor of 25: he said 50 years, and it was actually 2. This is an example of how even researchers estimating their own very-near-term progress on their own R&D pathway can absolutely suck at timelines, including in the over-pessimistic direction.
If someone in 1900 had looked at everyone before the Wright brothers saying that they’ll get heavier-than-air flight soon, all those predictions would have been falsified, and they might have generalized to “We have good reason to be skeptical if we look at predictions from people in [inventing airplanes] that have now come false”. But that generalization would have then failed when the Wright brothers came along.
Sutton does not seem to believe that “AGI by 2032 is extremely unlikely” so I’m not sure how that’s evidence on your side. You’re saying that he’s over-optimistic, and maybe he is, but we don’t know that. If you want examples of AI researchers and experts being over-pessimistic about the speed of progress, they are very easy to find (e.g.).
You’ve heard of Sutton & LeCun. There are a great many other research programs that you haven’t heard of, toiling away and writing obscure arxiv papers. Some of those people have been writing obscure arxiv papers for many years already, even decades. We both agree that it takes >>7 years for an R&D pathway to get from its first obscure arxiv paper to ASI. What I’m pushing back on is the claim that it takes >>7 years to get from the final obscure arxiv paper (after which point the R&D pathway is impressive enough to stop being obscure) to ASI.
Do you have any response to the arguments made in the post? I would be curious to hear if you have any interesting counterarguments.
As for the rest, I think it’s been addressed at sufficient length already.
OK, here’s the big picture of this discussion as I see it.
As someone who doesn’t think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective, until I got to the sentences:
Eventually, there will be some AI paradigm beyond LLMs that is better at generality or generalization. However, we don’t know what that paradigm is yet and there’s no telling how long it will take to be discovered. Even if, by chance, it were discovered soon, it’s extremely unlikely it would make it all the way from conception to working AGI system within 7 years.
(Plus the subsequent couple paragraphs about brain computation, which I responded to briefly in my top-level comment.)
So that excerpt is what I was responding to originally, and that’s what we’ve been discussing pretty much this whole time. Right?
My claim is that, in the context of this paragraph, “extremely unlikely” (as in “<0.1%”) is way way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there’s just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].
If you had said “<20%” instead of “<0.1%”, then OK sure, I would have been in close-enough agreement with you, that I wouldn’t have bothered replying.
Does that help? Sorry if I’m misunderstanding.
Hmm, reading what you wrote again, I think part of your mistake is saying “…conception to working AGI system”. Who’s to say that this “AI paradigm beyond LLMs” hasn’t already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they’re not doing impressive things yet. That doesn’t mean that there hasn’t already been a lot of development progress.
As someone who doesn’t think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective
Okay, good to know.
I know that there are different views, but it seems like a lot of people in EA have started taking near-term AGI a lot more seriously since ChatGPT was released, and those people generally don’t give the other views — the views on which LLMs aren’t evidence of near-term AGI — much credence. That’s why the focus on LLMs.
The other views tend to be highly abstract, highly theoretical, and highly philosophical, so to argue about them you basically have to write the whole Encyclopedia Britannica; you can’t point to clear evidence from tests, studies, economic or financial indicators, and practical performance to make a case about AGI timelines within about 2,000 words.
Trying to argue those other views is not something I want to do, but I do want to argue about near-term AGI in a context where people are using LLMs as their key evidence for it.
Because my brain works that way, I’m tempted to argue about the other views as well, but I never find those kinds of discussions satisfying. It feels like by the time you get a few exchanges deep into those discussions (either me personally or people in general), it gets into “How many angels can dance on the head of a pin?” territory. For any number of sub-questions under that very abstract AGI discussion, maybe the answer is this, maybe it’s that, but nobody actually knows, there’s no firm evidence, there’s no theoretical consensus, and in fact the theorizing is very loose and pre-paradigmatic. (This is my impression after 15-20 years observing these discussions online and occasionally participating in them.) I think my response to these ideas should be, “Yeah. Maybe. Who knows?” because I don’t think there’s much to say beyond that.
My claim is that, in the context of this paragraph, “extremely unlikely” (as in “<0.1%”) is way way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there’s just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].
If you had said “<20%” instead of “<0.1%”, then OK sure, I would have been in close-enough agreement with you, that I wouldn’t have bothered replying.
Does that help? Sorry if I’m misunderstanding.
I didn’t actually give a number for what I think are the chances of going from the conception of a new AI paradigm to a working AGI system in 7 years. I did say it’s extremely unlikely, which is the same language I used for AGI within 7 years overall. I said I think the overall chance of AGI within 7 years is significantly less than 0.1%, so it’s understandable that, when I say going from a new paradigm to a working AGI system in 7 years is extremely unlikely, you might take me to also mean it has a significantly-less-than-0.1% chance, or a similarly low number.
The relationship between the overall chance of AGI within 7 years and the chance of AGI conditional on the right paradigm being conceived isn’t clear because that depends on a third variable, which is the chance that the right paradigm has already been conceived (or soon will be) — and also how long ago it was conceived (or how soon it will be). That seems basically unknowable to me.
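To make that structure concrete, here is a toy calculation (every number in it is made up purely for illustration; the point is only that the bottom line hinges on the unknowable distribution over when the right paradigm is, or was, conceived):

```python
# Toy sketch of the decomposition described above. All probabilities are
# placeholders chosen for illustration, not estimates I am defending.
#   P(AGI by 2032) = sum over years y of
#       P(right paradigm first conceived in year y)
#       * P(conception-to-working-AGI takes <= 2032 - y years)

# Hypothetical chance that the "right" paradigm was/will be first conceived in year y.
# (Years not listed carry the rest of the probability mass and contribute ~0 here.)
p_conceived = {2012: 0.02, 2020: 0.03, 2026: 0.04, 2030: 0.04}

def p_within(years: int) -> float:
    """Hypothetical step-function CDF for the conception-to-AGI gap, in years."""
    if years < 0:
        return 0.0
    for cutoff, p in [(7, 0.005), (12, 0.02), (20, 0.08), (40, 0.30)]:
        if years <= cutoff:
            return p
    return 0.30

p_agi_by_2032 = sum(p_y * p_within(2032 - y) for y, p_y in p_conceived.items())
print(f"P(AGI by 2032) under these made-up inputs: {p_agi_by_2032:.4f}")
```

Change the made-up conception-year weights and the bottom line moves by orders of magnitude, which is exactly the “third variable” problem I mean.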
I haven’t really thought about what number I would assign to that specific outcome: a new AI paradigm going from conception to a working AGI system within 7 years. It seems very unlikely to me. In general, I don’t like the practice of just thinking up numbers to assign to things like that. It could be an okay practice if people didn’t take these numbers as literally and seriously as they do. Then it wouldn’t really matter. But people take these numbers really seriously and I think that’s unwise, and I don’t like contributing to that practice if I can help it.
I do think guessing a number is helpful when it helps convey an intuition that might otherwise be hard to express. If you just had a first date and your friend asks how it went, and you say, “It was a 7 out of 10,” that isn’t a rigorous scale, and your friend isn’t expecting that all first dates of that quality will always be given a 7 rather than a 6 or an 8, but it helps convey a sense of somewhere between bad and fantastic. I think giving a number to a probability can be helpful like that. I think it can also be helpful to compare the probability of an event, like AGI being created within 7 years, to the probability of another event, which is why I came up with the Jill Stein example. (The problem is that for this to work your interlocutor or your audience has to share your intuitive sense of how probable the other event is.)
I don’t know how you would try to rigorously estimate how long it would take to go from the right idea about AGI to a working AGI system. This depends largely on what the right idea is, which is precisely what we don’t know. So, there is irreducible uncertainty here.
We can come up with points of comparison. You used LLMs from 2018 to 2025 as an example — 7 years. I brought up backpropagation in 1970 to AlexNet in 2012 as another potential point of comparison — 42 years. You could also choose the conception of connectionism in 1943 to AlphaGo beating Lee Sedol in 2016 as another comparison — 73 years. Or you can take Yann LeCun’s guess of at least 12 years and probably much more from his position paper to human-level AI, or Richard Sutton’s guess of a 25% chance of “understanding the mind” (still not sure if that implies the ability to build AGI) in 8 years after publishing the Alberta Plan for AI Research. Who knows which of these points of comparison is most apt? Maybe none of them are particularly apt. Who knows.
The other thing I tried was considering the computation required for AGI in comparison to the human brain. This is almost as fraught as the above. We don’t know for sure how much computation the human brain uses. We don’t know at all whether AGI will require as much computation, or much less, or much more. Who knows?
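To illustrate how quickly that uncertainty compounds, here is another toy calculation (both ranges are placeholders, not estimates I am defending):

```python
# Illustrative only: if each factor is uncertain by several orders of magnitude,
# the combined estimate of "compute needed for AGI" spans an enormous range.
# Both ranges are placeholders, not claims about the actual brain or about AGI.
brain_flops_low, brain_flops_high = 1e13, 1e17  # hypothetical range for brain-equivalent compute
ratio_low, ratio_high = 1e-3, 1e3               # hypothetical: AGI could need far less or far more

low_estimate = brain_flops_low * ratio_low
high_estimate = brain_flops_high * ratio_high
print(f"Combined range: {low_estimate:.0e} to {high_estimate:.0e} FLOP/s "
      f"(a factor of {high_estimate / low_estimate:.0e})")
```

The point is just that multiplying two wide uncertainties together gives a range too wide to support a confident timeline in either direction.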
In principle, almost anything could happen at almost any time, even if it goes against how we thought the world works, and this is uncomfortable, but it’s true. (I don’t just mean with AI, I mean with everything. Volcanoes, aliens, physics, cosmology, the fabric of society — everything.)
What to do in the face of that uncertainty is a discussion that I think belongs in and under another post. For example, if we assume at least for the sake of argument that we have no idea which of several various ideas for building AGI will turn out to be correct, such as program synthesis, LeCun’s energy-based models, the Alberta Plan, Numenta’s Thousand Brains approach, whole brain emulation, and so on — and also if we have no idea whether all of these ideas will turn out to be the wrong ones — is there a strongly defensible course of action for preparing for AGI? Is there, indeed, a strongly defensible case for why AGI would be dangerous?
I worry that such a discussion would quickly get into the “How many angels can dance on the head of a pin?” territory I said I don’t like. But I would be impressed if someone could make a strong case for some course of action that makes sense even under a high level of irreducible uncertainty about which theoretical ideas will underpin the design of AGI and about when it will ultimately arrive.
I imagine this would be hard to do, however. For example, suppose Scenario A is that: the MIRI worldview on AI alignment is correct, there will be a hard takeoff, and AGI will be designed with a combination of deep learning and symbolic AI. Suppose Scenario B is: the MIRI worldview is false, whole brain emulation is the fastest possible path to AGI, and it will slowly scale up from a mouse brain emulation around 2065 to a human brain emulation around 2125,[1] and gradually from 2125 to 2165 it (or, more accurately, they) will become like AlphaGo for everything — a world champion at all tasks. Is there any strongly defensible course of action that makes sense if we don’t know whether Scenario A or Scenario B is true (or many other possible scenarios I could describe) and if we can’t even cogently assign probabilities to these scenarios? That sounds like a very tall order.
It’s especially a tall order if part of the required defense is arguing why the proposed course of action wouldn’t backfire and make things worse.
Who’s to say that this “AI paradigm beyond LLMs” hasn’t already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they’re not doing impressive things yet. That doesn’t mean that there hasn’t already been a lot of development progress.
2065 for a mouse brain and 2125 for a human brain are real guesses from an expert survey:
Zeleznikow-Johnston A, Kendziorra EF, McKenzie AT (2025) What are memories made of? A survey of neuroscientists on the structural basis of long-term memory. PLoS One 20(6): e0326920. https://doi.org/10.1371/journal.pone.0326920
Out of curiosity, what do you think of my argument that LLMs can’t pass a rigorous Turing test because a rigorous Turing test could include ARC-AGI 2 as a subset (and, indeed, any competent panel of judges should include it) and LLMs can’t pass that? Do you agree? Do you think that’s a higher level of rigour than a Turing test should have and that’s shifting the goal posts?
I think we both agree that there are ways to tell apart a human from an LLM of 2025, including handing ARC-AGI-2 to each.
Whether or not that fact means “LLMs of 2025 cannot pass the Turing Test” seems to be purely an argument about the definition / rules of “Turing Test”. Since that’s a pointless argument over definitions, I don’t really care to hash it out further. You can have the last word on that. Shrug :-P
Okay, since you’re giving me the last word, I’ll take it.
There are some ambiguities in terms of how to interpret the concept of the Turing test. People have disagreed about what the rules should be. I will say that in Turing’s original paper, he did introduce the concept of testing the computer via sub-games:
Q: Do you play chess?
A: Yes.
Q: I have K at my K1, and no other pieces. You have only K at K6 and R at R1. It is your move. What do you play?
A: (After a pause of 15 seconds) R-R8 mate.
Including other games or puzzles, like the ARC-AGI 2 puzzles, seems in line with this.
My understanding of the Turing test has always been that there should be basically no restrictions at all — no time limit, no restrictions on what can be asked, no word limit, no question limit.
In principle, I don’t see why you wouldn’t allow sending of images, but if you only allowed text-based questions, I suppose even then a judge could tediously write out the ARC-AGI 2 tasks, since they consist of coloured squares on grids of up to 30 x 30 cells, and ask the interlocutor to re-create them in Paint.
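To give a sense of how little would actually be lost in a text-only test: as I understand it, the public ARC datasets already represent each task as JSON grids of integers 0 through 9, one integer per colour. The grid below is a made-up toy task, not a real ARC-AGI 2 puzzle:

```python
# A made-up toy task in an ARC-style JSON format (integers stand for colours).
# A judge could paste a string like this into a text-only Turing test and ask
# the interlocutor to produce the output grid for the test input.
import json

toy_task = {
    "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},  # rule: swap the two colours
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]},
    ],
    "test": [{"input": [[3, 0], [0, 3]]}],
}

print(json.dumps(toy_task))
```

Real ARC-AGI 2 grids are much larger, but, as far as I know, the format is the same.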
To be clear, I don’t think ARC-AGI 2 is nearly the only thing you could use to make an LLM fail the Turing test, it’s just an easy example.
In Daniel Dennett’s 1985 essay “Can Machines Think?” on the Turing test (included in the anthology Brainchildren), Dennett says that “the unrestricted test” is “the only test that is of any theoretical interest at all”. He emphasizes that judges should be able to ask anything:
People typically ignore the prospect of having the judge ask off-the-wall questions in the Turing test, and hence they underestimate the competence a computer would have to have to pass the test. But remember, the rules of the imitation game as Turing presented it permit the judge to ask any question that could be asked of a human being—no holds barred.
He also warns:
Cheapened versions of the Turing test are everywhere in the air. Turing’s test is not just effective, it is entirely natural—this is, after all, the way we assay the intelligence of each other every day. And since incautious use of such judgments and such tests is the norm, we are in some considerable danger of extrapolating too easily, and judging too generously, about the understanding of the systems we are using.
It’s true that before we had LLMs we had lower expectations of what computers can do and asked easier questions. But it doesn’t seem right to me to say that as computers get better at natural language, we shouldn’t be able to ask harder questions.
I do think the definition and conception of the Turing test is important. If people say that LLMs have passed the Turing test and that’s not true, it gives a false impression of LLMs’ capabilities, just like when people falsely claim LLMs are AGI.
You could qualify this by saying LLMs can pass a restricted, weak version of the Turing test — but not an unrestricted, adversarial Turing test — which was also true of older computer systems before deep learning. This would sidestep the question of defining the “true” Turing test and still give accurate information.
If you’re making the claim:
The probability that a new future AI paradigm would take as little as 7 years to go from obscure arxiv papers to AGI, is extremely low (say, <10%).
…then presumably you should have some reason to believe that. If your position is “nobody can possibly know how long it will take”, then that obviously is not a reason to believe that claim above. Indeed, your OP didn’t give any reason whatsoever, it just said “extremely unlikely” (“Even if, by chance, it were discovered soon, it’s extremely unlikely it would make it all the way from conception to working AGI system within 7 years.”)
Then my top comment was like:
Gee, a lot can happen in 7 years in AI, including challenges transitioning from ‘this seems wildly beyond SOTA and nobody has any clue where to even start’ to ‘this is so utterly trivial that we take it for granted and collectively forget it was ever hard’, and including transitioning from ‘kinda the first setup of this basic technique that anyone thought to try’ to ‘a zillion iterations and variations of the technique have been exhaustively tested and explored by researchers around the world’, etc. That seems like a reason to start somewhere like, I dunno, 50-50 on ≤7 years, as opposed to <10%. 50-50 is like saying ‘some things in AI take less than 7 years, and other things take more than 7 years, who knows, shrug’.
Then you replied here that “your analogy is not persuasive”. I kinda took that to mean: my example of LLM development does not prove that a future “obscure arxiv papers to AGI” transition will take ≤7 years. Indeed it does not! I didn’t think I was offering proof of anything. But you are still making a quite confident claim of <10%, and I am still waiting to see any reason at all explaining where that confidence is coming from. I think the LLM example above is suggestive evidence that 7 years is not some crazy number wildly outside the range of reasonable guesses for “obscure arxiv papers to AGI”, whereas you are saying that 7 years is in fact a pretty crazy number, and that sane numbers would be way bigger than 7 years. How much bigger? You didn’t say. Why? You didn’t say.
So that’s my evidence, and yes it’s suggestive not definitive evidence, but OTOH you have offered no evidence whatsoever, AFAICT.
Okay, I think I understand now, hopefully. Thank you for explaining. Your complaint is that I didn’t try to substantiate why I think it’s extremely unlikely for a new paradigm in AI to go from conception to a working AGI system in 7 years. That’s a reasonable complaint.
I would never hold any of these sorts of arguments to the standard of “proving” something or establishing certainty. By saying the argument is not persuasive, I mean it didn’t really shift me in one direction or the other.
The reason I didn’t find your analogy persuasive is that I’m already aware of the progress there’s been in AI since 2012 in different domains including computer vision, natural language processing, games (imitation learning and reinforcement learning in virtual environments), and robotics. So, your analogy didn’t give me any new information to update on.
My reason for thinking it’s extremely unlikely is just an intuition from observing progress in AI (and, to some extent, other fields). It seems like your analogy is an attempt to express your own intuition about this from watching AI progress. I can understand the intention now and I can respect that as a reasonable attempt at persuasion. It might be persuasive to someone in my position who is unaware of how fast some AI progress has been.
I think I was misinterpreting it too much as an argument with a clear logical structure and not enough as an attempt to express an intuition. I think as the latter it’s perfectly fine, and it would be too much to expect the former in such a context.
I can’t offer much in this context (I don’t think anyone can). The best I can do is just try to express my intuition, like you did. What you consider fast or slow in terms of progress depends where you start and end and what examples you choose. If you pick deep learning as your example, and if you start at the invention of backpropagation in 1970 and end at AlexNet in 2011, that’s 41 years from conception to realization.
A factor that makes a difference is there just seems little interest in funding fundamental AI research outside of the sort of ideas that are already in the mainstream. For example, Richard Sutton has said it’s hard to get funding for fundamental AI research. It’s easier for him given his renown as an AI researcher, but the impression I get is that fundamental research funding overall is scarce, and it’s especially scarce if you’re working on novel, unusual, off-the-beaten-path ideas. So, even if there is an arXiv paper out there somewhere that has the key insight or key insights needed to get to AGI, the person who wrote it probably can’t get funded and they’re probably now working on a McDonald’s drive-through LLM.
[Edit: See my reply to this comment below for an important elaboration on why I think getting from an arXiv paper to AGI within 7 years is unlikely.]
Out of curiosity, what do you think of my argument that LLMs can’t pass a rigorous Turing test because a rigorous Turing test could include ARC-AGI 2 as a subset (and, indeed, any competent panel of judges should include it) and LLMs can’t pass that? Do you agree? Do you think that’s a higher level of rigour than a Turing test should have and that’s shifting the goal posts?
I should add, fairly belatedly, another point of comparison. Two Turing Award-winning AI researchers, Yann LeCun and Richard Sutton, each have novel fundamental ideas — not based on scaling LLMs or other comparably mainstream ideas — for how to get to AGI. (A few days ago, I wrote a comment about this here.)
In a 2024 interview, Yann LeCun said he thought it would take “at least a decade and probably much more” to get to AGI or human-level AI by executing his research roadmap. Trying to pinpoint when ideas first started is a fraught exercise. If we say the start time is the 2022 publication of LeCun’s position paper “A Path Towards Autonomous Machine Intelligence”, then by LeCun’s own estimate, the time from publication to human-level AI is at least 12 years and “probably much more”.
In another 2024 interview, Richard Sutton said he thinks there’s a 25% chance by 2030 we’ll “understand intelligence”, although it’s unclear to me if he imagines by 2030 there’s a 25% chance we’ll actually build AGI (or be in a position to do so straightforwardly) or just have the fundamental theoretical knowledge required to do so. The equivalent paper co-authored by Sutton is “The Alberta Plan for AI Research”, coincidentally also published in 2022. So, Sutton’s own estimate is a 25% chance of success in 8 years, although it’s not clear if success here means actually building AGI or a different goal.
But, crucially, I also definitely don’t think we should just automatically accept these numbers. (I also discussed this in my previous comment about this here.) Researchers like Yann LeCun and Richard Sutton have a very high level of self-belief, which I think is psychologically healthy and rational. It is good to be this ambitious. But we shouldn’t think of these as predictions or forecasts, but rather as goals.
LeCun himself has explicitly said you should be skeptical of anyone who says they have found the secret to AGI and will deliver it ten years, including him (as I discussed here). Which of course is very reasonable!
In the 2024 interview, Sutton said:
This was in response to one of the interviewers noting that Sutton had said “decades”, plural, when he said “these are the decades when we’re going to figure out how the mind works.”
We have good reason to be skeptical if we look at predictions from people in AI that have now come false, such as Dario Amodei’s incorrect prediction about AI writing 90% of code by mid-September 2025 or, for that matter, his prediction made 2 years and 2 months ago that we could have something that sounds a lot like AGI in 2 or 3 years, which still has 10 months left to go but looks extremely dubious. As I mentioned in the post, there’s also Geoffrey Hinton’s prediction about radiology getting automated and various wrong predictions from various people in AI about widespread fully autonomous driving.
So, to summarize: what Yann LeCun and Richard Sutton are saying is already much more conservative than a trajectory from publishing a paper to building AGI within 7 years. They both tell us to be skeptical of even the timelines they lay out. And, independent of whether they tell us to be skeptical or not, based on the track record of similar predictions, we have good reason to be skeptical.
To me, this seems to be the much more apt point of comparison than the progress of LLMs from 2018 to 2025.
Here’s why I don’t think “start time for LeCun’s research program is 2022” is true in any sense relevant to this conversation.
IIUC, the subtext of your OP and this whole conversation is that you think people shouldn’t be urgently trying to prepare for AGI / ASI right now.
In that context, one could say that the two relevant numbers are “(A) how far in advance should we be preparing for AGI / ASI?” and “(B) how far away is AGI / ASI?”. And you should start preparing when (A)=(B).
I think that’s a terrible model, because we don’t and won’t know either (A) or (B) until it’s too late, and there’s plenty of work we can be doing right now, so it’s nuts not to be doing that work ASAP. Indeed, I think it’s nuts that we weren’t doing more work on AGI x-risk in 2015, and 2005, and 1995 etc.
As bad as I think that “start when (A)=(B)” model is, I’m concerned that your implicit model is even worse. You seem to be acting as if (A) is less than 7 years, but you haven’t justified that, and I don’t think you can. I am concerned that what you’re actually thinking is more like: “AGI doesn’t feel imminent, therefore (B)<(A)”.
Does the clock start in 2022 when LeCun published A Path Towards Autonomous Machine Intelligence (APTAMI)? That was 3 years ago. Yet you still, right now, don’t seem to feel like we should be urgently preparing for AGI. If LeCun et al. keep making progress, maybe someday you will start feeling that sense of urgency about imminent LeCun-style AGI. And when that day comes, that’s when the relevant clock starts. And I think that clock will leave very little time indeed until AGI and ASI. (My own guess would be 0–2 years, if your sense of urgency will be triggered by obvious signals of impressiveness like using language and solving problems beyond current LLMs. If you have some other trigger that you’re looking for, what is it?)
What would it look like to feel a sense of urgency starting from the moment that APTAMI was published? It would look like what I did, which was write the response: LeCun’s “A Path Towards Autonomous Machine Intelligence” has an unsolved technical alignment problem. I’m pretty sure LeCun knows that this post exists, but he has not responded, and to this day he continues to insist that he has a great plan for AI alignment. Anyway, here I am, arguably the only person on Earth who is working on solving the technical alignment problem for APTAMI. LeCun and his collaborators have not shown the slightest interest in helping, and I don’t expect that situation to change as they get ever closer to AGI / ASI (on the off-chance that their research program is headed towards AGI / ASI).
(If you think we should be urgently preparing for AGI / ASI x-risk right now, despite AGI being extremely unlikely by 2032, then great, we would be in much more agreement than I assumed. If that’s the situation, then I think your post does not convey that mood, and I think that almost all readers will interpret it as having that subtext unless you explicitly say otherwise.)
I find this comment fairly confusing, so I’m going to try to hopefully clear up some of the confusion.
Was the intention of the comment I made about Yann LeCun’s and Richard Sutton’s research roadmaps unclear? It has nothing to do with the question of how far in advance we should start preparing for AGI. I was just giving a different point of comparison than your example of the progress in LLMs from 2018 to 2025. These were examples of how two successful AI researchers think about the amount of time between formulating the fundamental concepts — or at least the fundamental research directions — necessary to build AGI in a paper and actually building AGI. How much in advance of AGI you’d want to prepare is a separate question.
Similarly, I don’t think your example of the amount of progress in LLMs from 2018 to 2025 was intended to make an argument about how long in advance of AGI to start preparing, was it? I thought you were simply trying to argue that the time between a novel AI paradigm being conceptualized and AGI being created could indeed be 7 years, contrary to what I asserted in the conclusion to my post.
Am I misunderstanding something? This response doesn’t seem to be a response of what I was trying to say in the comment it’s responding to. Am I missing the point?
The topic of how much in advance we should be preparing for AGI and what, specifically, we should be doing to prepare is, of course, related to the topic of when we think AGI is likely to happen, but someone could make the argument that it’s important to start preparing for AGI now even if it’s 50 or 100 years away. The correctness or incorrectness of that argument wouldn’t depend on whether AGI by 2032 is extremely unlikely. My post is about whether AGI by 2032 is extremely unlikely and isn’t intended to comment on the question of how far in advance of AGI we should prepare, or what we should do to prepare.
If we really should be preparing for AGI 50 or 100 years in advance, then whether I think we should start preparing for AGI now really doesn’t depend on whether I think AGI is likely within 7 years.
If you think there is a strong argument for doing work on AGI safety or alignment 35+ years in advance of when AGI is expected to be created, then you can make that argument without arguing that AGI is likely to be created within 7 years, so that argument could be correct even if my thesis is correct that AGI by 2032 is extremely unlikely. Forgive me if I’m repeating myself here.
I didn’t say anything about that in the post. As I said just above, if it’s true, as you say, that we should start preparing for AGI long before we think it’s likely to arrive, then this wouldn’t be a logical inference from what I’ve argued.
Is “feel” supposed to be pejorative here? Is “AGI doesn’t feel imminent” supposed to mean something other than “I don’t think AGI is imminent”? Are your opinions about AGI timelines also something you “feel”?
Are you asking me whether I think Yann LeCun has published the roadmap that will, in fact, lead to AGI? I brought up LeCun’s roadmap as an example. I brought up Richard Sutton’s Alberta Plan as another example. As far as I can tell, these are mutually incompatible roadmaps to AGI. They could also both be wrong. But I just brought these up as examples. I wasn’t saying one of them will actually lead to the invention of AGI.
In the post, I mentioned a few different broad areas where I think current AI systems do poorly and used this as evidence to argue that AGI is unlikely within 7 years. It would stand to reason, therefore, that I think if AI systems started significantly improving in these areas, it would be a reason for me to believe AGI is closer than I currently think it is.
I would at least be curious to know what you think about the reasons I gave in the post, even if I disagree.
How far in advance of AGI we should start preparing for it is logically independent from the thesis of this post — which is about the likelihood of near-term AGI — and I didn’t say anything in this post about whether we should start preparing now or not. I would prefer to discuss that in the context of a post that does make an argument about how far in advance we should start preparing (and, if so, what kind of preparation would be useful or even possible).
That topic depends on a lot of things other than AGI timelines, e.g., hard takeoff vs. soft takeoff, the “MIRI worldview” on AI alignment vs. other views, and the scientific/technological paradigm used to build AGI.
I made this post because I had certain ideas I wanted to talk about that I wanted to hear what people thought about. If you have thoughts about what I said in this post, I would be curious to hear them. If I’m wrong about what I said in the post, why am I wrong? Tell me!
I am desperate to hear good counterarguments.
OK, sorry for getting off track.
(…But I still think your post has a connotation in context that “AGI by 2032 is extremely unlikely [therefore AGI x-risk work is not an urgent priority]”, and that it would be worth clarifying that you are just arguing the narrow point.)
Wilbur Wright overestimated how long it would take him to fly by a factor of 25—he said 50 years, it was actually 2. This is an example of how even researchers estimating their own very-near-term progress on their own R&D pathway can absolutely suck at timelines, including in the over-pessimistic direction.
If someone in 1900 had looked at everyone before the Wright brothers saying that they’ll get heavier-than-air flight soon, all those predictions would have been falsified, and they might have generalized to “We have good reason to be skeptical if we look at predictions from people in [inventing airplanes] that have now come false”. But that generalization would have then failed when the Wright brothers came along.
Sutton does not seem to believe that “AGI by 2032 is extremely unlikely” so I’m not sure how that’s evidence on your side. You’re saying that he’s over-optimistic, and maybe he is, but we don’t know that. If you want examples of AI researchers and experts being over-pessimistic about the speed of progress, they are very easy to find (e.g.).
You’ve heard of Sutton & LeCun. There are a great many other research programs that you haven’t heard of, toiling away and writing obscure arxiv papers. Some of those people have been writing obscure arxiv papers for many years already, even decades. We both agree that it takes >>7 years for an R&D pathway to get from its first obscure arxiv paper to ASI. What I’m pushing back on the claim that it takes >>7 years to get from the final obscure arxiv paper (after which point the R&D pathway is impressive enough to stop being obscure) to ASI.
Do you have any response to the arguments made in the post? I would be curious to hear if you have any interesting counterarguments.
As for the rest, I think it’s been addressed at sufficient length already.
OK, here’s the big picture of this discussion as I see it.
As someone who doesn’t think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective, until I got to the sentences:
(Plus the subsequent couple paragraphs about brain computation, which I responded to briefly in my top-level comment.)
So that excerpt is what I was responding to originally, and that’s what we’ve been discussing pretty much this whole time. Right?
My claim is that, in the context of this paragraph, “extremely unlikely” (as in “<0.1%”) is way way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there’s just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].
If you had said “<20%” instead of “<0.1%”, then OK sure, I would have been in close-enough agreement with you, that I wouldn’t have bothered replying.
Does that help? Sorry if I’m misunderstanding.
Hmm, reading what you wrote again, I think part of your mistake is saying “…conception to working AGI system”. Who’s to say that this “AI paradigm beyond LLMs” hasn’t already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they’re not doing impressive things yet. That doesn’t mean that there hasn’t already been a lot of development progress.
Okay, good to know.
I know that there are different views, but it seems like a lot of people in EA have started taking near-term AGI a lot more seriously since ChatGPT was released, and those people generally don’t give the other views — the views on which LLMs aren’t evidence of near-term AGI — much credence. That’s why the focus on LLMs.
The other views tend to be highly abstract, highly theoretical, highly philosophical and so to argue about them you basically have to write the whole Encyclopedia Britannica and you can’t point to clear evidence from tests, studies, economic or financial indicators, and practical performance to make a case about AGI timelines within about 2,000 words.
Trying to argue those other views is not something I want to do, but I do want to argue about near-term AGI in a context where people are using LLMs as their key evidence for it.
Because my brain works that way, I’m tempted to argue about the other views as well, but I never find those kinds of discussions satisfying. It feels like by the time you get a few exchanges deep into those discussions (either me personally or people in general), it gets into “How many angels can dance on the head of a pin?” territory. For any number of sub-questions under that very abstract AGI discussion, maybe the answer is this, maybe it’s that, but nobody actually knows, there’s no firm evidence, there’s no theoretical consensus, and in fact the theorizing is very loose and pre-paradigmatic. (This is my impression after 15-20 years observing these discussions online and occasionally participating in them.) I think my response to these ideas should be, “Yeah. Maybe. Who knows?” because I don’t think there’s much to say beyond that.
I didn’t actually give a number for the chances of going from the conception of a new AI paradigm to a working AGI system within 7 years. I did say it’s extremely unlikely, which is the same language I used for AGI within 7 years overall. Since I said I think the overall chance of AGI within 7 years is significantly less than 0.1%, it’s understandable that when I called going from a new paradigm to working AGI in 7 years “extremely unlikely”, you read me as assigning that a similarly small probability.
The relationship between the overall chance of AGI within 7 years and the chance of AGI conditional on the right paradigm being conceived isn’t clear because that depends on a third variable, which is the chance that the right paradigm has already been conceived (or soon will be) — and also how long ago it was conceived (or how soon it will be). That seems basically unknowable to me.
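To spell out the dependence (just as a sketch of the structure, not numbers I’d stand behind): if the right paradigm was conceived τ years ago (τ negative if it hasn’t been conceived yet), then AGI within 7 years requires the conception-to-AGI development time to be at most 7 + τ years, so roughly

$$P(\text{AGI within 7 years}) \;=\; \sum_{\tau} P(\text{right paradigm conceived } \tau \text{ years ago}) \cdot P(\text{development time} \le 7 + \tau \mid \text{conceived } \tau \text{ years ago})$$

and it’s the first factor, the distribution over τ, that seems basically unknowable.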
I haven’t really thought about what number I would assign to that specific outcome: a new AI paradigm going from conception to a working AGI system within 7 years. It seems very unlikely to me. In general, I don’t like the practice of just thinking up numbers to assign to things like that. It could be an okay practice if people didn’t take these numbers as literally and seriously as they do; then it wouldn’t really matter. But people do take these numbers really seriously, which I think is unwise, and I don’t like contributing to that practice if I can help it.
Where guessing a number is helpful is when it conveys an intuition that might otherwise be hard to express. If you just had a first date and your friend asks how it went, and you say, “It was a 7 out of 10,” that isn’t a rigorous scale; your friend isn’t expecting that all first dates of that quality will always be given a 7 rather than a 6 or an 8, but it conveys a sense of somewhere between bad and fantastic. Giving a number to a probability can be helpful in the same way. It can also be helpful to compare the probability of an event, like AGI being created within 7 years, to the probability of another event, which is why I came up with the Jill Stein example. (The problem is that for this to work, your interlocutor or audience has to share your intuitive sense of how probable the other event is.)
I don’t know how you would try to rigorously estimate how long it would take to go from the right idea about AGI to a working AGI system. This depends largely on what the right idea is, which is precisely what we don’t know. So, there is irreducible uncertainty here.
We can come up with points of comparison. You used LLMs from 2018 to 2025 as an example — 7 years. I brought up backpropagation in 1970 to AlexNet in 2012 as another potential point of comparison — 42 years. You could also take the conception of connectionism in 1943 to AlphaGo beating Lee Sedol in 2016 — 73 years. Or you could take Yann LeCun’s guess of at least 12 years, and probably much more, from his position paper to human-level AI, or Richard Sutton’s guess of a 25% chance of “understanding the mind” (I’m still not sure if that implies the ability to build AGI) within 8 years of publishing the Alberta Plan for AI Research. Who knows which of these points of comparison is most apt? Maybe none of them are particularly apt. Who knows.
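Just to lay those spans side by side (the arithmetic is trivial, and the start dates are of course debatable):

```python
# Rough reference-class spans mentioned above; the start dates are debatable.
milestones = {
    "LLMs (2018 -> 2025)": (2018, 2025),
    "backpropagation -> AlexNet": (1970, 2012),
    "connectionism -> AlphaGo": (1943, 2016),
}
# Prints 7, 42, and 73 years respectively.
for name, (start, end) in milestones.items():
    print(f"{name}: {end - start} years")
```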
The other thing I tried was considering the computation required for AGI in comparison to the human brain. This is almost as fraught as the above. We don’t know for sure how much computation the human brain uses. We don’t know at all whether AGI will require as much computation, or much less, or much more. Who knows?
In principle, almost anything could happen at almost any time, even if it goes against how we thought the world works, and this is uncomfortable, but it’s true. (I don’t just mean with AI, I mean with everything. Volcanoes, aliens, physics, cosmology, the fabric of society — everything.)
What to do in the face of that uncertainty is a discussion that I think belongs under another post. For example, suppose we assume, at least for the sake of argument, that we have no idea which of several ideas for building AGI (program synthesis, LeCun’s energy-based models, the Alberta Plan, Numenta’s Thousand Brains approach, whole brain emulation, and so on) will turn out to be correct, and also no idea whether all of these ideas will turn out to be wrong. Is there a strongly defensible course of action for preparing for AGI? Is there, indeed, a strongly defensible case for why AGI would be dangerous?
I worry that such a discussion would quickly get into the “How many angels can dance on the head of a pin?” territory I said I don’t like. But I would be impressed if someone could make a strong case for some course of action that makes sense even under a high level of irreducible uncertainty about which theoretical ideas will underpin the design of AGI and about when it will ultimately arrive.
I imagine this would be hard to do, however. For example, suppose Scenario A is: the MIRI worldview on AI alignment is correct, there will be a hard takeoff, and AGI will be designed with a combination of deep learning and symbolic AI. Suppose Scenario B is: the MIRI worldview is false, whole brain emulation is the fastest possible path to AGI, and it will slowly scale up from a mouse brain emulation around 2065 to a human brain emulation around 2125,[1] and gradually from 2125 to 2165 it (or, more accurately, they) will become like AlphaGo for everything — a world champion at all tasks. Is there any strongly defensible course of action that makes sense if we don’t know whether Scenario A or Scenario B is true (or many other possible scenarios I could describe), and if we can’t even cogently assign probabilities to these scenarios? That sounds like a very tall order.
It’s especially a tall order if part of the required defense is arguing why the proposed course of action wouldn’t backfire and make things worse.
Yeah, maybe. Who knows?
2065 for a mouse brain and 2125 for a human brain are real guesses from an expert survey:
Zeleznikow-Johnston A, Kendziorra EF, McKenzie AT (2025) What are memories made of? A survey of neuroscientists on the structural basis of long-term memory. PLoS One 20(6): e0326920. https://doi.org/10.1371/journal.pone.0326920
Thanks!
I think we both agree that there are ways to tell apart a human from an LLM of 2025, including handing ARC-AGI-2 to each.
Whether or not that fact means “LLMs of 2025 cannot pass the Turing Test” seems to be purely an argument about the definition / rules of “Turing Test”. Since that’s a pointless argument over definitions, I don’t really care to hash it out further. You can have the last word on that. Shrug :-P
Okay, since you’re giving me the last word, I’ll take it.
There are some ambiguities in how to interpret the Turing test, and people have disagreed about what the rules should be. I will say that in Turing’s original paper, he did introduce the idea of testing the computer via sub-games:
Including other games or puzzles, like the ARC-AGI 2 puzzles, seems in line with this.
My understanding of the Turing test has always been that there should be basically no restrictions at all — no time limit, no restrictions on what can be asked, no word limit, no question limit.
In principle, I don’t see why you wouldn’t allow sending images, but even if you only allowed text-based questions, I suppose a judge could tediously write out the ARC-AGI-2 tasks, since they consist of coloured squares on grids of up to 30 × 30, and ask the interlocutor to re-create them in Paint.
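To give a sense of what “tediously writing out” a task could look like, here’s a minimal sketch with a made-up grid (not an actual ARC-AGI-2 task), serialized as digits that a judge could paste into a chat window:

```python
# Minimal sketch: a made-up ARC-style grid (colours encoded as digits 0-9)
# serialized to plain text. Real ARC-AGI-2 grids can be up to 30 x 30.
grid = [
    [0, 0, 3, 0],
    [0, 3, 3, 0],
    [0, 0, 3, 0],
]
print("\n".join(" ".join(str(cell) for cell in row) for row in grid))
```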
To be clear, I don’t think ARC-AGI 2 is nearly the only thing you could use to make an LLM fail the Turing test, it’s just an easy example.
In his 1985 essay “Can Machines Think?” on the Turing test (included in the anthology Brainchildren), Daniel Dennett says that “the unrestricted test” is “the only test that is of any theoretical interest at all”. He emphasizes that judges should be able to ask anything:
He also warns:
It’s true that before we had LLMs we had lower expectations of what computers can do and asked easier questions. But it doesn’t seem right to me to say that as computers get better at natural language, we shouldn’t be able to ask harder questions.
I do think the definition and conception of the Turing test is important. If people say that LLMs have passed the Turing test and that’s not true, it gives a false impression of LLMs’ capabilities, just like when people falsely claim LLMs are AGI.
You could qualify this by saying LLMs can pass a restricted, weak version of the Turing test — but not an unrestricted, adversarial Turing test — which was also true of older computer systems before deep learning. This would sidestep the question of defining the “true” Turing test and still give accurate information.