Pronouns: she/her or they/them.
I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I'm trying to figure out where effective altruism can fit into my life these days and what it means to me.
Yarrow Bouchard
I understand the general concept of ingroup/outgroup, but what, specifically, does that mean in this context?
I wonder if you noticed that you changed the question. Did you not notice, or did you change the question deliberately?
What I brought up as a potential form of important evidence for near-term AGI was:
Any sort of significant credible evidence of a major increase in AI capabilities, such as LLMs being able to autonomously and independently come up with new correct ideas in science, technology, engineering, medicine, philosophy, economics, psychology, etc. (not as a tool for human researchers to more easily search the research literature or anything along those lines, but doing the actual creative intellectual act itself)
You turned the question into:
If you wouldn't count this as evidence that genuine AI contributions to research mathematics might not be more than 6-7 years off, what, if anything would you count as evidence of that?
Now, rather than asking me about the evidence I use to forecast near-term AGI, you're asking me to forecast the arrival of the evidence I would use for forecasting near-term AGI? Why?
What does the research literature say about the accuracy of short-term (e.g. 1-year timescales) geopolitical forecasting?
And what does the research literature say about the accuracy of long-term (e.g. longer than 5-year timescales) forecasting about technological progress?
(Should you even bother to check the literature to find out, or should you just guess how accurate you think each one probably is and leave it at that?)
When you say well under 0.05% chance of AGI in 10 years, is that a subjective guess?
Of course. And I'll add that I think such guesses, including my own, have very little meaning or value. It may even be worse to make them than to not make them at all.
But then, instead of stopping, and saying that is my forecast, I remember that benchmark performance is generally a bit misleading in terms of real-world competence, and remember METR found that AIs often couldn't complete more realistic versions of the tasks which the benchmark counted them as passing. (Couldn't find a source for this claim, but I remember seeing it somewhere.)
This seems like a huge understatement. My impression is that the construct validity and criterion validity of the benchmarks METR uses, i.e. how much benchmark performance translates into real-world performance, are much worse than you describe.
I think it would be closer to the truth to say that if you're trying to predict when AI systems will replace human coders, the benchmarks are meaningless and should be completely ignored. I'm not saying that's the absolute truth, just that it's closer to the truth than saying benchmark performance is "generally a bit misleading in terms of real-world competence".
Probably there's some loose correlation between benchmark performance and real-world competence, but it's not nearly one-to-one.
I decide maybe when models will hit real world 6-month task 90% completion rate should maybe be a couple more doubling times of the 90% time-horizon METR metric forward. I move my forecast for human-level coders to, say, 15 months after the original to reflect this. Am I making a subjective guess, or relying on data?
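(For readers unfamiliar with this kind of extrapolation, here is a minimal sketch of the mechanics being described. The starting horizon, doubling time, and target are numbers I am making up purely for illustration, not METR's published figures; the point is only to show what moving "a couple more doubling times forward" means arithmetically.)

```python
# Toy version of the extrapolation described in the quote above: assume the
# task time horizon an AI can complete at some reliability doubles at a fixed
# rate, and ask how long until it reaches a target horizon. All numbers are
# illustrative assumptions, not METR's published figures.
import math

current_horizon_months = 0.5   # assumed current horizon (~2 weeks of human task time)
doubling_time_months = 7.0     # assumed doubling period for the metric
target_horizon_months = 6.0    # the "6-month task" threshold mentioned in the quote

doublings_needed = math.log2(target_horizon_months / current_horizon_months)
months_until_target = doublings_needed * doubling_time_months

print(f"Doublings needed: {doublings_needed:.1f}")
print(f"Months until target, if the trend holds: {months_until_target:.0f}")
```

Whether that arithmetic means anything at all is exactly what is at issue.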
Definitely making a subjective guess. For example, what if performance on benchmarks simply never generalizes to real world performance? Never, ever, ever, not in a million years never?
By analogy, what level of performance at Go would AlphaGo need to achieve before you would guess it would be capable of baking a delicious croissant? Maybe these systems just can't do what you're expecting them to do. And a chart can't tell you whether that's true or not.
What about if I am doing what AI 2027 did and trying to predict when LLMs match human coding ability on the basis of current data.
AI 2027 admits the role that gut intuition plays in their forecast. For example:
Disclaimer added Dec 2025: This forecast relies substantially on intuitive judgment, and involves high levels of uncertainty. Unfortunately, we believe that incorporating intuitive judgment is necessary to forecast timelines to highly advanced AIs, since there simply isn't enough evidence to extrapolate conclusively.
An example intuition:
Intuitively it feels like once AIs can do difficult long-horizon tasks with ground truth external feedback, it doesn't seem that hard to generalize to more vague tasks. After all, many of the sub-tasks of the long-horizon tasks probably involved using similar skills.
Okay, and what if it is hard? What if this kind of generalization is beyond the capabilities of current deep learning/deep RL systems? What if it takes 20+ years of research to figure out? Then the whole forecast is out the window.
What's the reward signal for vague tasks? This touches on open research problems that have existed in deep RL for many years. Why is this going to be fully solved within the next 2-4 years? Because "intuitively, it feels like" it will be?
Another example is online learning, which is a form of continual learning. AI 2027 highlights this capability:
Agent-2, more so than previous models, is effectively "online learning," in that it's built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
But I can't find anywhere else in any of the AI 2027 materials where they discuss online learning or continual learning. Are they thinking that online learning will not be one of the capabilities humans will have to invent? That AI will be able to invent online learning without first needing online learning to be able to invent such things? What does the scenario actually assume about online learning? Is it important or not? Is it necessary or unnecessary? And will it be something humans invent or AI invents?
When I tried to find what the AI 2027 authors have said about this, I found an 80,000 Hours Podcast interview where Daniel Kokotajlo said a few things about online learning, such as the following:
Luisa Rodriguez: OK. So it sounds like some people will think that these persistent deficiencies will be long-term bottlenecks. And you're like, no, we'll just pour more resources into the thing doing the thing that it does well, and that will get us a long way to –
Daniel Kokotajlo: Probably. To be clear, I'm not confident. I would say that there's like maybe a 30% or 40% chance that something like this is true, and that the current paradigm basically peters out over the next few years. And probably the companies still make a bunch of money by making iterations on the current types of systems and adapting them for specific tasks and things like that.
And then there's a question of when will the data efficiency breakthroughs happen, or when will the online learning breakthroughs happen, or whatever the thing is. And then this is an incredibly wealthy industry right now, and paradigm shifts of this size do seem to be happening multiple times a decade, arguably: think about the difference between the current AIs and the AIs of 2015. The whole language model revolution happened five years ago, the whole scaling laws thing like six, seven years ago. And now also AI agents – training the AIs to actually do stuff over long periods – that's happening in the last year.
So it does feel to me like even if the literal, exact current paradigm plateaus, there's a strong chance that sometime in the next decade – maybe 2033, maybe 2035, maybe 2030 – the huge amount of money and research going into overcoming these bottlenecks will succeed in overcoming these bottlenecks.
The other things Kokotajlo says in the interview about online learning and data efficiency are equally hazy and hand-wavy. It just comes down to his personal gut intuition. In the part I just quoted, he says maybe these fundamental research breakthroughs will happen in 2030-2035, but what if it's more like 2070-2075, or 2130-2135? How would one come to know such a thing?
What historical precedent or scientific evidence do we have to support the idea that anyone can predict, with any accuracy, the time when new basic science will be discovered? As far as I know, this is not possible. So, what's the point of AI 2027? Why did the authors write it and why did anyone other than the authors take it seriously?
nostalgebraist originally made this critique here, very eloquently.
(In general many Yudkowskian ideas actually seem derived from quite mainstream sources on rationality and decision-making to me. I would not reject them just because you don't like what LW does with them. Bayesian epistemology is a real research program in philosophy for example.)
It can easily be true both that Yudkowsky's ideas are loosely derived from or inspired by ideas that make sense and that Yudkowsky's own ideas don't make a lick of sense. I don't think most self-identified Bayesians outside of the LessWrong community would agree with Yudkowsky's rejection of institutional science, for instance. Yudkowsky's irrationality says nothing about whether (the mainstream version of) Bayesianism is a good idea or not; and whether (the mainstream version of) Bayesianism, or the other ideas Yudkowsky draws from, are good ideas or not says nothing about whether Yudkowsky's ideas are irrational.
By analogy, pseudoscience and crackpot physics are often loosely derived from or inspired by ideas in mainstream science. The correctness of mainstream science doesn't imply the correctness of pseudoscience or crackpot physics. Conversely, the incorrectness of pseudoscience or crackpot physics doesn't imply the incorrectness of mainstream science. It wouldn't be a defense of a crackpot physics theory that it's inspired by legitimate physics, and the legitimacy of the ideas Yudkowsky is drawing from isn't a defense of Yudkowsky's bizarre views.
I think forecasting is perfectly fine within the limitations that the scientific research literature on forecasting outlines. But Yudkowsky's personal twist on Aristotelian science – subjectively guessing which scientific propositions are true or false and then assuming he's right (without seeking empirical evidence) because he thinks he has some kind of nearly superhuman intelligence – is absurd, and it's obviously not what people like Philip Tetlock have been advocating.
How many angels can dance on the head of a pin? An infinite number, because angels have no spatial extension? Or maybe if we assume angels have a diameter of ~1 nanometre, plus ~1 additional nanometre of clearance for dancing, we can come up with a ballpark figure? Or, wait, are angels closer to human-sized? When bugs die, do they turn into angels? What about bacteria? Can bacteria dance? Are angels beings who were formerly mortal, or were they "born" angels?[1]
AI is just a big thing in the world that's growing fast. Anybody capable of reading graphs can see that.
Well, some of the graphs are just made-up, like those in "Situational Awareness", and some of the graphs are woefully misinterpreted to be about AGI when they're clearly not, like the famous METR time horizon graph.[2] I imagine that a non-trivial amount of EA misjudgment around AGI results from a failure to correctly read and interpret graphs.
And, of course, when people like titotal examine the math behind some of these graphs, like those in AI 2027, the math is sometimes found to be riddled with major mistakes.
What I said elsewhere about AGI discourse in general is true about graphs in particular: the scientifically defensible claims are generally quite narrow, caveated, and conservative. The claims that are broad, unqualified, and bold are generally not scientifically defensible. People at METR themselves caveat the time horizons graph and note its narrow scope (I cited examples of this elsewhere in the comments on this post). Conversely, graphs that attempt to make a broad, unqualified, bold claim about AGI tend to be complete nonsense.
Out of curiosity, roughly what probability would you assign to there being an AI financial bubble that pops sometime within the next five years or so? If there is an AI bubble and if it popped, how would that affect your beliefs around near-term AGI?
[1] How is correctness physically instantiated in space and time, and how does it physically cause physical events in the world, such as speaking, writing, brain activity, and so on? Is this an important question to ask in this context? Do we need to get into this?
You can take an epistemic practice in EA such as "thinking that Leopold Aschenbrenner's graphs are correct" and ask about the historical origin of that practice without making a judgement about whether the practice is good or bad, right or wrong. You can ask the question in a form like, "How did people in EA come to accept graphs like those in 'Situational Awareness' as evidence?" If you want to frame it positively, you could ask the question as something like, "How did people in EA learn to accept graphs like these as evidence?" If you want to frame it negatively, you could ask, "How did people in EA not learn not to accept graphs like these as evidence?" And of course you can frame it neutrally.
The historical explanation is a separate question from the evaluation of correctness/incorrectness, and the two don't conflict with each other. By analogy, you can ask, "How did Laverne come to believe in evolution?" And you could answer, "Because it's the correct view," which would be right, in a sense, if a bit obtuse, or you could answer, "Because she learned about evolution in her biology classes in high school and college", which would also be right, and which would more directly answer the question. So, a historical explanation does not necessarily imply that a view is wrong. Maybe in some contexts it insinuates it, but both kinds of answers can be true.
But this whole diversion has been unnecessary.
[2] Do you know a source that formally makes the argument that the METR graph is about AGI? I am trying to pin down the series of logical steps that people are using to get from that graph to AGI. I would like to spell out why I think this inference is wrong, but first it would be helpful to see someone spell out the inference they're making.
Upvoted because I think this is interesting historical/intellectual context, but I think you might have misunderstood what I was trying to say in the comment you replied to. (I joined Giving What We Can in 2009 and got heavily involved in my university EA group from 2015-2018, so I'm aware that AI has been a big topic in EA for a very long time, but I've never had any involvement with Oxford University or had any personal connections with Toby Ord or Will MacAskill, besides a few passing online interactions.)
In my comment above, I wasn't saying that EA's interpenetration with LessWrong is largely to blame for the level of importance that the ideas of near-term AGI and AGI risk currently have in EA. (I also think that is largely true, but that wasn't the point of my previous comment.) I was saying that the influence of LessWrong and EA's embrace of the LessWrong subculture is largely to blame for the EA community accepting ridiculous stuff like "Situational Awareness", AI 2027, and so on, despite it having glaring flaws.
Focusing on AGI risk at the level EA currently does could be rational, or it might not be. What is definitely true is that the EA community accepts a lot of completely irrational stuff related to AGI risk. LessWrong doesn't believe in academia, institutional science, academic philosophy, journalism, scientific skepticism, common sense, and so on. LessWrong believes in Eliezer Yudkowsky, the Sequences, and LessWrong. So, members of the LessWrong community go completely off the rails and create or join cults at seemingly a much, much higher rate than the general population, because they've been coached to reject the foundations of sanity that most people have and to put their trust and belief in this small, fringe community.
The EA community is not nearly as bad as LessWrong. If I thought it was as bad, I wouldn't bother trying to convince anyone in EA of anything, because I would think they were beyond rational persuasion. But EA has been infected to a very significant degree by the LessWrong irrationality. I think the level of emphasis that EA puts on subjective guesses as a source of truth, and an accompanying sort of lazy, incurious approach to inquiry (why look stuff up or attempt to create a rigorous, defensible thesis when you can just guess stuff?), is one example of the LessWrong influence. Eliezer Yudkowsky quite literally, explicitly believes that his subjective guesses are a better guide to truth than the approach of traditional, mainstream scientific institutions and communities. Yudkowsky has attempted to teach his approach to subjectively guessing things to the LessWrong community (and enjoyers of Harry Potter fanfic). That approach has leeched into the EA community.
The result is you can have things like "Situational Awareness" and AI 2027 where the "data" is made-up and just consists of some random people's random subjective guesses. This is the kind of stuff that should never be taken even a little bit seriously.
If you want to know which approach produces better results, look at the achievements of academic science (which underlie basically the entire modern world) versus the achievements of the LessWrong community: some Harry Potter fanfic and about half a dozen cults, despite believing their approach is unambiguously superior. If you adjust for time and population, the comparison still comes out favourably for science versus Yudkowskian subjective guessology. How many towns of under 5,000 people create even a single cult within even the span of 50 years? Versus the LessWrong community creating multiple cults within 16 years of its existence.
I could be totally wrong in my root cause analysis. EA may have developed these bad habits independently of LessWrong. In any case, I think it's clear that these are bad habits, that they lead nowhere good, and that EA should clear house (i.e. stop believing in subjective guess-based or otherwise super low-quality argumentative writing) and raise the bar for the quality of arguments and evidence that are taken seriously to something a bit closer to the academic or scientific level.
I don't have an idyllic view of academia. I don't think it's all black-and-white. I recently re-read a review of Colin McGinn's ridiculous book on the philosophy of physics. On one hand, the descriptions of and quotes from the book reminded me of all the stuff that drives me crazy in academic philosophy. On the other hand, the reviewer is a philosopher and her review is published in a philosophy journal. So, there's a push and pull.
Maybe a good analogy for academia is liberal democracy. It's often a huge mess, full of ongoing conflicts and struggles, frequently unjust and unreasonable, but ultimately it produces an astronomical amount of value, rivalling the best of anything humans have ever done. By vouching for academia or liberal democracy, I'm not saying it's all good, I'm just saying that the overall process is good. And the process itself (in both cases) can surely be improved, but through reform and evolution involving a lot of people with expertise, not by a charismatic outsider with a zealous following (e.g. illiberal/authoritarian strongmen, in the case of government, or someone like Yudkowsky, in the case of academia, who, incidentally, has a bit of an authoritarian attitude, not politically, but intellectually).
Thanks, Vasco.
"AI as normal technology" is a catchy phrase, and could be a useful one, but I was so confused and surprised when I dug deeper into what the "AI as a normal technology" view actually is, as described by the people who coined the term.
I think "normal technology" is a misnomer, because they seem to think some form of transformative AI or AGI will be created sometime over the next several decades, and in the meantime AI will have radical, disruptive economic effects.
They should come up with some other term for their view, like "transformative AI slow takeoff", because "normal technology" just seems inaccurate.
One need not choose between the two, because they both point toward the same path: re-examine claims with greater scrutiny. There is no excuse for the egregious flaws in works like "Situational Awareness" and AI 2027. This is not serious scholarship. To the extent the EA community gets fooled by stuff like this, its reasoning process, and its weighing of evidence, will be severely impaired.
If you get rid of all the low-quality work and retrace all the steps of the argument from the beginning, might the EA community end up in basically the same place all over again, with a similar estimation of AGI risk and a similar allocation of resources toward it? Well, sure, it might. But it might not.
If your views are largely informed by falsehoods and ridiculous claims, half-truths and oversimplifications, greedy reductionism and measurements with little to no construct validity or criterion validity, and, in some cases, a lack of awareness of countervailing ideas or the all-too-eager dismissal of inconvenient evidence, then you simply don't know what your views would end up being if you started all over again with more rigour and higher standards. The only appropriate response is to clear house. Put the ideas and evidence into a crucible and burn away what doesn't belong. Then, start from the beginning and see what sort of conclusions can actually be justified with what remains.
A large part of the blame lies at the feet of LessWrong and at the feet of all the people in EA who decided, in some important cases quite early on, to mingle the two communities. LessWrong promotes skepticism and suspicion of academia, mainstream/institutional science, traditional forms of critical thinking and scientific skepticism, journalism, and society at large. At the same time, LessWrong promotes reverence and obsequiousness toward its own community, positioning itself as an alternative authority to replace academia, science, traditional critical thought, journalism, and mainstream culture. Not innocently. LessWrong is obsessed with fringe thinking. The community has created multiple groups that Ozy Brennan describes as "cults". Given how small the LessWrong community is, I'd have to guess that the rate at which the community creates cults must be multiple orders of magnitude higher than the base rate for the general population.
LessWrong is also credulous about racist pseudoscience, and, in the words of a former Head of Communications at the Centre for Effective Altruism, is largely "straight-up racist". One of the admins of LessWrong and co-founders of Lightcone Infrastructure once said, in the context of a discussion about the societal myth that gay people are evil or malicious and a danger to children:
I think… finding out (in the 1950s) that someone maintained many secret homosexual relationships for many years is actually a signal the person is fairly devious, and is both willing and capable of behaving in ways that society has strong norms about not doing.
It obviously isn't true about homosexuals once the norm was lifted, but my guess is that it was at the time accurate to make a directional bayesian update that the person had behaved in actually bad and devious ways.
Such statements make "rationalist" a misnomer. (I was able to partially dissuade him of this nonsense by showing him some of the easily accessible evidence he could have looked up for himself, but the community did not seem to particularly value my intervention.)
I don't know that the epistemic practices of the EA community can be rescued as long as the EA community remains interpenetrated with LessWrong to a major degree. The purpose of LessWrong is not to teach rationality, but to disable one's critical faculties until one is willing to accept nonsense. Perhaps it is futile to clamour for better-quality scholarship when such a large undercurrent of the EA community is committed to the idea that normal ideas of what constitutes good scholarship are wrong and that the answers to what constitutes actually good scholarship lie with Eliezer Yudkowsky, an amateur philosopher with no relevant qualifications or achievements in any field, who frequently speaks with absolute confidence and is wrong, whom experts often find non-credible, who has said he literally sees himself as the smartest person on Earth, and who rarely admits mistakes (despite making many) or issues corrections. If Yudkowsky is your highest and most revered authority, if you follow him in rejecting academia, institutional science, mainstream philosophy, journalism, normal culture, and so on, then I don't know what could possibly convince you that the untrue things you believe are untrue, since your fundamental epistemology comes down to whether Yudkowsky says something is true or not, and he's told you to reject all other sources of truth.
To the extent the EA community is under LessWrong's spell, it will probably remain systemically irrational forever. Only within the portions of the EA community who have broken that spell, or never come under it in the first place, is there the hope for academic standards, mainstream scientific standards, traditional critical thinking, journalistic fact-checking, culturally evolved wisdom, and so on to take hold. It would be like expecting EA to be rational about politics while 30% of the community is under the spell of QAnon, or to be rational about global health while a large part of the community is under the spell of anti-vaccination pseudoscience. It's just not gonna happen.
But maybe my root cause analysis is wrong and the EA community can course correct without fundamentally divorcing LessWrong. I don't know. I hope that, whatever is the root cause, whatever it takes to fix it, the EA community's current low standards for evidence and argumentation pertaining to AGI risk get raised significantly.
I don't think it's a brand new problem, by the way. Around 2016, I was periodically arguing with people about AI on the main EA group on Facebook. One of my points of contention was that MIRI's focus on symbolic AI was a dead end and that machine learning had empirically produced much better results, and was where the AI field was now focused. (MIRI took a long time before they finally hired their first researcher to focus on machine learning.) I didn't have any more success convincing people about that back then than I've been having lately with my current points of contention.
I agree, though, that the situation seems to have gotten much worse in recent years, and ChatGPT (and LLMs in general) probably had a lot to do with that.
"The rationalist movement strikes me as, in many ways, not part of the cultic milieu qua community. We don't read the blogs of UFOlogists or sovereign citizens; we read the blogs of economists and historians. But the fundamental personality type is the same.
I was long puzzled about why rationalists keep becoming traditional Catholics, or getting really into chakras, or trying to summon demons, or joining the alt-right. You would think, given all the learning how to think good training we're allegedly getting, we wouldn't do that stuff! I think, given the 'cultic milieu' concept, this observation is exactly what you would expect. While rationalists are more right than UFOlogists or for that matter people who believe in chakras, people do not become rationalists because they are especially good at thinking. People become rationalists because they are attracted to the cultic milieu – that is, people who distrust authority and want to figure things out for themselves and like knowing secrets that no one else knows. People who are attracted to the cultic milieu are attracted to stigmatized knowledge whether or not it is in fact correct. The members of the conventional cultic milieu drift smoothly from astrology to aliens. It is to be expected that some rationalists do the same."
– Ozy Brennan, "Rationalists and the Cultic Milieu" (2022)
See my comment here (on titotal's post) for some concrete evidence that the slow progress scenario stipulates too much progress to serve as a baseline, minimum-progress scenario.
For example, the slow progress scenario predicts the development of household robots that can do various chores by December 2030. Metaculus, which tends to be highly aggressive and optimistic in its forecasts of AI capabilities progress, only predicts the development of such robots in mid-2032. To me, this indicates the slow progress scenario stipulates too much progress.
Metaculus is already aggressive, and the slow progress scenario seems to be more aggressive than Metaculus, at least on household robots, depending on how exactly you interpret it.
In principle, of course, but how? There are various practical obstacles such as:
Are such bets legal?
How do you compel people to pay up?
Why would someone on the other side of the bet want to take it?
I don't have spare money to be throwing at Internet stunts where there's a decent chance that, e.g. someone will just abscond with my money and I'll have no recourse (or at least nothing cost-effective)
If it's a bet that takes a form where if AGI isn't invented by January 1, 2036, people have to pay me a bunch of money (and vice versa), of course I'll accept such bets gladly in large sums.
I would also be willing to take bets of that form for good intermediate proxies for AGI, which would take a bit of effort to figure out, but that seems doable. The harder part is figuring out how to actually structure the bet and ensure payment (if this is even legal in the first place).
From my perspective, it's free money, and I'll gladly take free money (at least from someone wealthy enough to have money to spare; I would feel bad taking it from someone who isn't financially secure). But even though similar bets have been made before, people still don't have good solutions to the practical obstacles.
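(To put a toy number on the "free money" point, here is a minimal expected-value sketch. The probability is the upper bound I've used elsewhere in this thread; the symmetric structure and the stake size are hypothetical, made up purely for illustration, and the sketch ignores counterparty risk, enforceability, inflation, and the time value of money.)

```python
# Toy expected value of a hypothetical symmetric bet resolving on January 1, 2036:
# the skeptic receives the stake if AGI has not been built by then, and pays it if it has.
p_agi_by_2036 = 0.0005   # "well under 0.05%" upper bound used in this thread
stake = 10_000           # hypothetical stake in dollars, for illustration only

expected_value = (1 - p_agi_by_2036) * stake - p_agi_by_2036 * stake
print(f"Expected value to the skeptic: ${expected_value:,.2f}")
# Roughly $9,990, i.e. nearly the full stake, which is why it looks like free
# money to someone with that probability estimate.
```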
I wouldn't want to accept an arrangement that would be financially irrational (or illegal, or not legally enforceable), though, and that would amount to essentially burning money to prove a point. That would be silly; I don't have that kind of money to burn.
I'll say just a little bit more on the topic of the precautionary principle for now. I have a complex multi-part argument on this, which will take some explaining that I won't try to do here. I have covered a lot of this in some previous posts and comments. The three main points I'd make in relation to the precautionary principle and AGI risk are:
- Near-term AGI is highly unlikely, much less than a 0.05% chance in the next decade
- We don't have enough knowledge of how AGI will be built to usefully prepare now
- As knowledge of how to build AGI is gained, investment into preparing for AGI becomes vastly more useful, such that the benefits of investing resources into preparation at higher levels of knowledge totally overwhelm the benefits of investing resources at lower levels of knowledge
The point of the FTX comparison is that, in the wake of the FTX collapse, many people in EA were eager to reflect on the collapse and try to see if there were any lessons for EA. In the wake of the AI bubble popping, people in EA could either choose to reflect in a similar way, or they could choose not to. The two situations are analogous insofar as they are both financial collapses and both could lead to soul-searching. They are disanalogous insofar as the AI bubble popping won't affect EA funding and won't associate EA in the public's mind with financial crimes or a moral scandal.
It's possible that, in the wake of the AI bubble popping, nobody in EA will try to learn anything. I fear that possibility. The comparisons I made to Ray Kurzweil and Elon Musk show that it is entirely possible to avoid learning anything, even when you ought to. So, EA could go multiple different ways with this, and I'm just saying what I hope will happen is the sort of reflection that happened post-FTX.
If the AI bubble popping wouldn't convince you that EA's focus on near-term AGI has been a mistake, or at least convince you to start seriously reflecting on whether it has been or not, what evidence would convince you?
I think it's fair to criticize Yudkowsky and Soares' belief that there is a very high probability of AGI being created within ~5-20 years because that is a central part of their argument. The purpose of the book is to argue for an aggressive global moratorium on AI R&D. For such a moratorium to make sense, probabilities need to be high and timelines need to be short. If Yudkowsky and Soares believed there was an extremely low chance of AGI being developed within the next few decades, they wouldn't be arguing for the moratorium.
So, I think Oscar is right to notice and critique this part of their argument. I don't think it's fair to say Oscar is critiquing a straw man.
You can respond with a logical, sensible appeal to the precautionary principle: shouldn't we prepare anyway, just in case? First, I would say that even if this is the correct response, it doesn't make Oscar's critique wrong or not worth making. Second, I think arguments around whether AGI will be safe or unsafe, easy or hard to align, and what to do to prepare for it depend on specific assumptions about how AGI will be built. So, this is not actually a separate question from the topic Oscar raised in this post.
It would be nice if there were something we could do just in case, to make any potential future AGI system safer or easier to align, but I don't see how we can do this in advance of knowing what technology or science will be used to build AGI. So, the precautionary principle response doesn't add up, either, in my view.
Eliezer Yudkowsky forecasts a 99.5% chance of human extinction from AGI "well before 2050", unless we implement his proposed aggressive global moratorium on AI R&D. Yudkowsky deliberately avoids giving more than a vague forecast on AGI, but he often strongly hints at a timeline. For example, in December 2022, he tweeted:
Pouring some cold water on the latest wave of AI hype: I could be wrong, but my guess is that we do *not* get AGI just by scaling ChatGPT, and that it takes *surprisingly* long from here. Parents conceiving today may have a fair chance of their child living to see kindergarten.
In April 2022, when Metaculus' forecast for AGI was in the 2040s and 2050s, Yudkowsky harshly criticized Metaculus for having too long a timeline and not updating it downwards fast enough.
In his July 2023 TED Talk, Yudkowsky said:
At some point, the companies rushing headlong to scale AI will cough out something that's smarter than humanity. Nobody knows how to calculate when that will happen. My wild guess is that it will happen after zero to two more breakthroughs the size of transformers.
In March 2023, during an interview with Lex Fridman, Fridman asked Yudkowsky what advice he had for young people. Yudkowsky said:
Don't expect it to be a long life. Don't put your happiness into the future. The future is probably not that long at this point.
In that segment, he also said, "we are not in the shape to frantically at the last minute do decades' worth of work."
After reading these examples, do you still think Yudkowsky only believes that AGI is "not unlikely to be built in the future", "if not in 5 then maybe in 50 years"?
I can't thank titotal enough for writing this post and for talking to the Forecasting Research Institute about the error described in this post.
I'm also incredibly thankful to the Forecasting Research Institute for listening to and integrating feedback from me and, in this case, mostly from titotal. It's not nothing to be responsive to criticism and correction. I can only express appreciation for people who are willing to do this. Nobody loves criticism, but the acceptance of criticism is what it takes to move science, philosophy, and other fields forward. So, hallelujah for that.
I want to be clear that, as titotal noted, we're just zeroing in here on one specific question discussed in the report, out of 18 total. It is an unfortunate thing that you can work hard on something that is quite large in scope and have it be almost entirely correct (I haven't reviewed the rest of the report, but I'll give the benefit of the doubt), only for the discussion to focus on the one mistake you made. I don't want research or writing to be a thankless task that only elicits criticism, and I want to be thoughtful about how to raise criticism in the future.
For completeness, to make sure readers have a full understanding, I actually made three distinct and independent criticisms of this survey question and how it was reported. First, I noted that the probability of the rapid scenario was reported as an unqualified probability, rather than the probability of the scenario being the best matching of the three ("best matching" is the wording the question used). The Forecasting Research Institute was quick to accept this point and promise to revise the report.
Second, I raised the problem around the intersubjective resolution/metaprediction framing that titotal describes in this post. After a few attempts, I passed the baton to titotal, figuring that titotal's reputation and math knowledge would make them more convincing. The Forecasting Research Institute has now revised the report in response, as well as their EA Forum post about the report.
Third, the primary issue I raised in my original post on this topic is about a potential anchoring effect or question wording bias with the survey question.[1] The slow progress scenario is extremely aggressive and optimistic about the amount of progress in AI capabilities between now and the end of 2030. I would personally guess the probability of AI gaining the sort of capabilities described in the slow progress scenario by the end of 2030 is significantly less than 0.1%, or 1 in 1,000. I imagine most AI experts would say it's unlikely, if presented with the scenario in isolation and asked directly about its probability.
For example, here is what is said about household robots in the slow progress scenario:
By the end of 2030 in this slower-progress future, AI is a capable assisting technology for humans; it can … conduct relatively standard tasks that are currently (2025) performed by humans in homes and factories.
Also:
Meanwhile, household robots can make a cup of coffee and unload and load a dishwasher in some modern homes – but they can't do it as fast as most humans and they require a consistent environment and occasional human guidance.
Even Metaculus, which is known to be aggressive and optimistic about AI capabilities, and which is heavily used by people in the effective altruist community and the LessWrong community, where belief in near-term AGI is strong, puts the median date for the question "When will a reliable and general household robot be developed?" in mid-2032. The resolution criteria for the Metaculus question are compatible with the sentence in the slow progress scenario, although those criteria also stipulate a lot of details that are not stipulated in the slow progress scenario.
An expert panel surveyed in 2020 and 2021 was asked, "[5/10] years from now, what percentage of the time that currently goes into this task can be automated?" and answered 47% for dish washing in 10 years, so in 2030 or 2031. I find this to be a somewhat confusing framing (what does it mean for 47% of the time involved in dish washing to be automated?), but it points to the baseline scenario in the LEAP survey involving contested questions and not just things we can take for granted.
Adam Jonas, a financial analyst at Morgan Stanley who has a track record of being extremely optimistic about AI and robotics (sometimes mistakenly so), and whom the financial world interprets as having aggressive, optimistic forecasts, predicts that a "general-purpose humanoid" robot for household chores will require "technological progress in both hardware and AI models, which should take about another decade", meaning around 2035. So, on Wall Street, even an optimist seems to be less optimistic than the LEAP survey's slow progress scenario.
If the baseline scenario is more optimistic about AI capabilities progress than Metaculus, the results of a previous expert survey, and a Wall Street analyst on the optimistic end of the spectrum, then it seems plausible that the baseline scenario is already more optimistic than what the LEAP panelists would have reported as their median forecast if they had been asked in a different way. It seems way too aggressive as a baseline scenario. This makes it hard to know how to interpret the panelists' answers (in addition to the interpretative difficulty raised by the problem described in titotal's post above).
[1] I have also used the term "framing effect" to describe this before, following the Forecasting Research Institute and AI Impacts, but, checking the definition of that term in psychology again, it seems to specifically refer to framing the same information as positive or negative, which doesn't apply here.
Update #2: titotal has published a full breakdown of the error involving the intersubjective resolution/metaprediction framing of the survey question. It's a great post that explains the error very well. Many thanks to titotal for taking the time to write the post and for talking to the Forecasting Research Institute about this. Thanks again to the Forecasting Research Institute for revising the report and this post.
If I want to know what "utilitarianism" means, including any disagreements among scholars about the meaning of the term (I have a philosophy degree, I have studied ethics, and I don't have the impression there are meaningful disagreements among philosophers on the definition of "utilitarianism"), I can find this information in many places, such as:
The book Utilitarianism: A Very Short Introduction co-authored by Peter Singer and published by Oxford University Press
A textbook like Normative Ethics or an anthology like Ethical Theory
Philosophy journals
An academic philosophy podcast like Philosophy Bites
Academic lectures on YouTube and Crash Course (a high-quality educational resource)
So, it's easy for me to find out what "utilitarianism" means. There is no shortage of information about that.
Where do I go to find out what "truth-seeking" means? Even if some people disagree on the definition, can I go somewhere and read about, say, the top 3 most popular definitions of the term and why people prefer one definition over the other?
It seems like an important word. I notice people keep using it. So, what does it mean? Where has it been defined? Is there a source you can cite that attempts to define it?
I have tried to find a definition for "truth-seeking" before, more than once. I've asked what the definition is before, more than once. I don't know if there is a definition. I don't know if the term means anything definite and specific. I imagine it probably doesn't have a clear definition or meaning, and that different people who say "truth-seeking" mean different things when they say it, and so people are largely talking past each other when they use this term.
Incidentally, I think what I just said about "truth-seeking" probably also largely applies to "epistemics". I suspect "epistemics" probably means either epistemic practices or epistemology, but it's not clear, and there is evidently some confusion about its intended meaning. Looking at the actual use of "epistemics", I'm not sure different people mean the same thing by it.
Do you stand by your accusation of bad faith?
Your accusation of bad faith seems to rest on your view that the constraints imposed by the laws of physics on space travel make an alien invasion or attack extremely improbable. Such an event may indeed be extremely improbable, but the laws of physics do not say so.
I have to imagine that you are referring to the speeds of spacecraft and the distances involved. The Milky Way Galaxy is about 100,000 light-years in diameter, organized along a plane in a disc shape that is roughly 1,000 light-years thick. NASA's Parker Solar Probe has travelled at 0.064% of the speed of light. Let's round that down to 0.05% of the speed of light for simplicity. At 0.05% of the speed of light, the probe could travel between the two farthest points in the Milky Way Galaxy in 200 million years.
That means that if the maximum speed of spacecraft in the galaxy were limited to only the top speed of NASA's fastest space probe today, an alien civilization that reached an advanced stage of science and technology – perhaps including things like AGI, advanced nanotechnology/atomically precise manufacturing, cheap nuclear fusion, interstellar spaceships, and so on – more than 200 million years ago would have had plenty of time to establish a presence in every star system of the Milky Way. At 1% of the speed of light, the window of time shrinks to 10 million years, and so on.
Designs for spacecraft that credible scientists and engineers thought Earth could actually build in the near future include a light-sail-based probe that would supposedly travel at 15-20% of the speed of light. Such a probe could traverse the diameter of the Milky Way in under 1 million years at top speed. Acceleration and deceleration complicate the picture somewhat, but the fundamental idea still holds.
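(A minimal sketch of the arithmetic behind these figures, using the distances and speeds quoted above and treating travel time as simply distance divided by speed, ignoring acceleration, deceleration, and any stops along the way:)

```python
# Back-of-the-envelope galaxy traversal times at the speeds quoted above.
# Travel time = distance / speed, with distance in light-years and speed as a
# fraction of the speed of light; everything else is ignored for simplicity.
GALAXY_DIAMETER_LY = 100_000  # approximate diameter of the Milky Way in light-years

speeds = {
    "Parker Solar Probe, rounded down (0.05% of c)": 0.0005,
    "1% of the speed of light": 0.01,
    "proposed light-sail probe (15% of c)": 0.15,
}

for label, fraction_of_c in speeds.items():
    years = GALAXY_DIAMETER_LY / fraction_of_c
    print(f"{label}: about {years:,.0f} years to cross the galaxy")
# 0.05% of c -> 200,000,000 years; 1% of c -> 10,000,000 years; 15% of c -> ~670,000 years
```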
If there are alien civilizations in our galaxy, we don't have any clear, compelling scientific reason to think they wouldn't be many millions of years older than our civilization. The Earth formed 4.5 billion years ago, so if a habitable planet elsewhere in the galaxy had formed just 10% sooner and put life on that planet on the same trajectory as ours, the aliens would be 450 million years ahead of us. Plenty of time to reach everywhere in the galaxy.
The Fermi paradox has been considered and discussed by people working in physics, astronomy, rocket/spacecraft engineering, SETI, and related fields for decades. There is no consensus on the correct resolution to the paradox. Certainly, there is no consensus that the laws of physics resolve it.
So, if I'm understanding your reasoning correctly – that surely I must be behaving in a dishonest or deceitful way, i.e. engaging in bad faith, because obviously everyone knows the constraints imposed by the laws of physics on space travel make an alien attack on Earth extremely improbable – then your accusation of bad faith seems to rest on a mistake.
Thanks for giving me the opportunity to talk about this because the Fermi paradox is always so much fun to talk about.
My list is very similar to yours. I believe items 1, 2, 3, 4, and 5 have already been achieved to substantial degrees and we continue to see progress in the relevant areas on a quarterly basis. I don't know about the status of 6.
It's hard to know what "to substantial degrees" means. That sounds very subjective. Without the "to substantial degrees" caveat, it would be easy to prove that 1, 3, 4, and 5 have not been achieved, and fairly straightforward to make a strong case that 2 has not been achieved.
For example, it is simply a fact that Waymo vehicles have a human in the loop – Waymo openly says so – so Waymo has not achieved Level 4-5 autonomy without a human in the loop. Has Waymo achieved Level 4-5 autonomy without humans in the loop "to a substantial degree"? That seems subjective. I don't know what "to a substantial degree" means to you, and it might mean something different to me, or to other people.
Humanoid robots have not achieved any profitable new applications in recent years, as far as I'm aware. Again, I don't know what achieving this "to a substantial degree" might mean to you.
I would be curious to know what progress you think has been made recently on the fundamental research problems I mentioned, or what the closest examples are to LLMs engaging in the sort of creative intellectual act I described. I imagine the examples you have in mind are not something the majority of AI experts would agree fit the descriptions I gave.
For clarity on item 1, AI company revenues in 2025 are on track to cover 2024 costs, so on a product basis, AI models are profitable; it's the cost of new models that pulls annual figures into the red. I think this will stop being true soon, but that's my speculation, not evidence, so I remain open that scaling will continue to make progress towards AGI, potentially soon.
Distinguish here between gold mining and selling picks and shovels. I'm talking about applications of LLMs and AI tools that are profitable for end users. Nvidia is extremely profitable because it sells GPUs to AI companies. In theory, in a hypothetical scenario, AI companies could become profitable by selling AI models as a service (e.g. API tokens, subscriptions) to businesses. But then would those business customers see any profit from the use of LLMs (or other AI tools)? That's what I'm talking about. Nvidia is selling picks and shovels, and to some extent even the AI companies are selling picks and shovels. Where's the gold?
The six-item list I gave was a list of some things that, each on their own but especially in combination, would go a long way toward convincing me that I'm wrong and my near-term AGI skepticism is a mistake. When you say your list is similar, I'm not quite sure what you mean. Do you mean that if those things didn't happen, that would convince you that the probability or level of credence you assign to near-term AGI is way too high? I was trying to ask you what evidence would convince you that you're wrong.
This is directly answered in the post. Edit: Can you explain why you don't find what is said about this in the post satisfactory?
Beautiful comment. I wholeheartedly agree that fun and friendship are not an extra or a nice-to-have, but are the lifeblood of communities and movements.