(Crossposted to LessWrong)
Abstract
The linked paper is our submission to the Open Philanthropy AI Worldviews Contest. In it, we estimate the likelihood of transformative artificial general intelligence (AGI) by 2043 and find it to be <1%.
Specifically, we argue:
The bar is high: AGI as defined by the contest—something like AI that can perform nearly all valuable tasks at human cost or less, which we will call transformative AGI—is a much higher bar than merely massive progress in AI, or even the unambiguous attainment of expensive superhuman AGI or cheap but uneven AGI.
Many steps are needed: The probability of transformative AGI by 2043 can be decomposed as the joint probability of a number of necessary steps, which we group into categories of software, hardware, and sociopolitical factors.
No step is guaranteed: For each step, we estimate a probability of success by 2043, conditional on prior steps being achieved. Many steps are quite constrained by the short timeline, and our estimates range from 16% to 95%.
Therefore, the odds are low: Multiplying the cascading conditional probabilities together, we estimate that transformative AGI by 2043 is 0.4% likely. Reaching >10% seems to require probabilities that feel unreasonably high, and even 3% seems unlikely.
Thoughtfully applying the cascading conditional probability approach to this question yields lower probability values than is often supposed. This framework helps enumerate the many future scenarios where humanity makes partial but incomplete progress toward transformative AGI.
Executive summary
For AGI to do most human work for <$25/hr by 2043, many things must happen.
We forecast cascading conditional probabilities for 10 necessary events, and find they multiply to an overall likelihood of 0.4%:
| Event | Forecast by 2043 or TAGI, conditional on prior steps |
| --- | --- |
| We invent algorithms for transformative AGI | 60% |
| We invent a way for AGIs to learn faster than humans | 40% |
| AGI inference costs drop below $25/hr (per human equivalent) | 16% |
| We invent and scale cheap, quality robots | 60% |
| We massively scale production of chips and power | 46% |
| We avoid derailment by human regulation | 70% |
| We avoid derailment by AI-caused delay | 90% |
| We avoid derailment from wars (e.g., China invades Taiwan) | 70% |
| We avoid derailment from pandemics | 90% |
| We avoid derailment from severe depressions | 95% |
| Joint odds | 0.4% |
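As a quick arithmetic check, the ten conditional forecasts in the table multiply out to roughly 0.4%. Here is a minimal sketch of that product (the short labels are just shorthand for the rows above):

```python
# Cascading conditional forecasts from the table above (each conditional on the prior steps).
forecasts = [
    ("Algorithms for transformative AGI", 0.60),
    ("AGIs learn faster than humans", 0.40),
    ("Inference costs below $25/hr", 0.16),
    ("Cheap, quality robots at scale", 0.60),
    ("Massively scaled chips and power", 0.46),
    ("No derailment by human regulation", 0.70),
    ("No derailment by AI-caused delay", 0.90),
    ("No derailment from wars", 0.70),
    ("No derailment from pandemics", 0.90),
    ("No derailment from severe depressions", 0.95),
]

joint_odds = 1.0
for _, probability in forecasts:
    joint_odds *= probability

print(f"Joint odds: {joint_odds:.2%}")  # ~0.40%
```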
If you think our estimates are pessimistic, feel free to substitute your own here. You’ll find it difficult to arrive at odds above 10%.
Of course, the difficulty is by construction. Any framework that multiplies ten probabilities together is almost fated to produce low odds.
So a good skeptic must ask: Is our framework fair?
There are two possible errors to beware of:
Did we neglect possible parallel paths to transformative AGI?
Did we hew toward unconditional probabilities rather than fully conditional probabilities?
We believe we are innocent of both sins.
Regarding failing to model parallel disjunctive paths:
We have chosen generic steps that don’t make rigid assumptions about the particular algorithms, requirements, or timelines of AGI technology
One opinionated claim we do make is that transformative AGI by 2043 will almost certainly be run on semiconductor transistors powered by electricity and built in capital-intensive fabs, and we spend many pages justifying this belief
Regarding failing to really grapple with conditional probabilities:
Our conditional probabilities are, in some cases, quite different from our unconditional probabilities. In particular, we assume that a world on track to transformative AGI will…
Construct semiconductor fabs and power plants at a far faster pace than today (our unconditional probability is substantially lower)
Have invented very cheap and efficient chips by today’s standards (our unconditional probability is substantially lower)
Have higher risks of disruption by regulation
Have higher risks of disruption by war
Have lower risks of disruption by natural pandemic
Have higher risks of disruption by engineered pandemic
Therefore, for the reasons above—namely, that transformative AGI is a very high bar (far higher than “mere” AGI) and many uncertain events must jointly occur—we are persuaded that the likelihood of transformative AGI by 2043 is <1%, a much lower number than we otherwise intuit. We nonetheless anticipate stunning advancements in AI over the next 20 years, and forecast substantially higher likelihoods of transformative AGI beyond 2043.
For details, read the full paper.
About the authors
This essay is jointly authored by Ari Allyn-Feuer and Ted Sanders. Below, we share our areas of expertise and track records of forecasting. Of course, credentials are no guarantee of accuracy. We share them not to appeal to our authority (plenty of experts are wrong), but to suggest that if it sounds like we’ve said something obviously wrong, it may merit a second look (or at least a compassionate understanding that not every argument can be explicitly addressed in an essay trying not to become a book).
Ari Allyn-Feuer
Areas of expertise
I am a decent expert in the complexity of biology and using computers to understand biology.
I earned a Ph.D. in Bioinformatics at the University of Michigan, where I spent years using ML methods to model the relationships between the genome, epigenome, and cellular and organismal functions. At graduation I had offers to work in the AI departments of three large pharmaceutical and biotechnology companies, plus a biological software company.
I have spent the last five years in the AI department of GSK, first as an AI Engineer, later as a Product Manager, and now as Director of AI Product. It is an industry-leading AI group that uses cutting-edge methods and hardware (including Cerebras units and work with quantum computing), is connected with leading academics in AI and the epigenome, and is particularly engaged in reinforcement learning research.
Track record of forecasting
While I don’t have Ted’s explicit formal credentials as a forecaster, I’ve issued some pretty important public correctives of then-dominant narratives:
I said in print on January 24, 2020 that due to its observed properties, the then-unnamed novel coronavirus spreading in Wuhan, China, had a significant chance of promptly going pandemic and killing tens of millions of humans. It subsequently did.
I said in print in June 2020 that it was an odds-on favorite for mRNA and adenovirus COVID-19 vaccines to prove highly effective and be deployed at scale in late 2020. They subsequently did and were.
I said in print in 2013, when the Hyperloop proposal was released, that the technical approach of air bearings in overland vacuum tubes on scavenged rights of way wouldn’t work. Subsequently, despite having insisted these elements would work and having spent millions of dollars on them, every Hyperloop company abandoned all three, and development of Hyperloops has largely ceased.
I said in print in 2016 that Level 4 self-driving cars would not be commercialized or near commercialization by 2021 due to the long tail of unusual situations, when several major car companies said they would. They subsequently were not.
I used my entire net worth and borrowing capacity to buy an abandoned mansion in 2011, and sold it seven years later for five times the price.
Luck played a role in each of these predictions, and I have also made other predictions that didn’t pan out as well, but I hope my record reflects my decent calibration and genuine open-mindedness.
Ted Sanders
Areas of expertise
I am a decent expert in semiconductor technology and AI technology.
I earned a PhD in Applied Physics from Stanford, where I spent years researching semiconductor physics and the potential of new technologies to beat the 60 mV/dec limit of today’s silicon transistor (e.g., magnetic computing, quantum computing, photonic computing, reversible computing, negative capacitance transistors, and other ideas). These years of research inform our perspective on the likelihood of hardware progress over the next 20 years.
After graduation, I had the opportunity to work at Intel R&D on next-gen computer chips, but instead, worked as a management consultant in the semiconductor industry and advised semiconductor CEOs on R&D prioritization and supply chain strategy. These years of work inform our perspective on the difficulty of rapidly scaling semiconductor production.
Today, I work on AGI technology as a research engineer at OpenAI, a company aiming to develop transformative AGI. This work informs our perspective on software progress needed for AGI. (Disclaimer: nothing in this essay reflects OpenAI’s beliefs or its non-public information.)
Track record of forecasting
I have a track record of success in forecasting competitions:
Top prize in SciCast technology forecasting tournament (15 out of ~10,000, ~$2,500 winnings)
Top Hypermind US NGDP forecaster in 2014 (1 out of ~1,000)
1st place Stanford CME250 AI/ML Prediction Competition (1 of 73)
2nd place ‘Let’s invent tomorrow’ Private Banking prediction market (2 out of ~100)
2nd place DAGGRE Workshop competition (2 out of ~50)
3rd place LG Display Futurecasting Tournament (3 out of 100+)
4th Place SciCast conditional forecasting contest
9th place DAGGRE Geopolitical Forecasting Competition
30th place Replication Markets (~$1,000 winnings)
Winner of ~$4200 in the 2022 Hybrid Persuasion-Forecasting Tournament on existential risks (told that my ranking was “quite well”)
Each finish resulted from luck alongside skill, but in aggregate I hope my record reflects my decent calibration and genuine open-mindedness.
Discussion
We look forward to discussing our essay with you in the comments below. The more we learn from you, the more pleased we’ll be.
If you disagree with our admittedly imperfect guesses, we kindly ask that you supply your own preferred probabilities (or framework modifications). It’s easier to tear down than build up, and we’d love to hear how you think this analysis can be improved.
I don’t think I understand the structure of this estimate, or else I might understand and just be skeptical of it. Here are some quick questions and points of skepticism.
Starting from the top, you say:
This section appears to be an estimate of all-things-considered feasibility of transformative AI, and draws extensively on evidence about how lots of things go wrong in practice when implementing complicated projects. But then in subsequent sections you talk about how even if we “succeed” at this step there is still a significant probability of failing because the algorithms don’t work in a realistic amount of time.
Can you say what exactly you are assigning a 60% probability to, and why it’s getting multiplied with ten other factors? Are you saying that there is a 40% chance that by 2043 AI algorithms couldn’t yield AGI no matter how much serial time and compute they had available? (It seems surprising to claim that even by 2023!) Presumably not that, but what exactly are you giving a 60% chance?
(ETA: after reading later sections more carefully I think you might be saying 60% chance that our software is about as good as nature’s, and maybe implicitly assuming there is a ~0% chance of being significantly better than that or building TAI without that? I’m not sure if that’s right though, if so it’s a huge point of methodological disagreement. I’ll return to this point later.)
In section 2 you say:
And give this a 40% probability. I don’t think I understand this claim or its justification. (This is related to my uncertainty about what your “60%” in the last section was referring to.)
It seems to me that if you had human-like learning you would be able to produce transformative AGI by 2043:
In fact it looks like human-like learning would enable AI to learn human-level physical skills:
10 years is sufficient for humans to learn most physical skills from scratch, and you are talking about 20 year timelines. So why is the serial time for learning even a candidate blocker?
Humans learn new physical skills (including e.g. operating unfamiliar machinery) within tens of hours. This requires transfer from other things humans have learned, but those tasks are not always closely related (e.g. I learn to drive a car based on experience walking) and AI systems will have access to transfer from tasks that seem if anything more similar (e.g. prediction of the relevant physical environments, predictions of expert behavior in similar domains, closed-loop behavior in a wide range of simulated environments, closed-loop behavior on physical tasks with shorter timescales, behavior in virtual environments...).
We can easily run tens of thousands of copies of AI systems in parallel. Existing RL is massively parallelizable. Human evolution gives no evidence about the difficulty of parallelizing learning in this way. Based on observations of human learning it seems extremely likely to me that parallelization 10,000 fold can reduce serial time by at least 10x (which is all that is needed). Extrapolations of existing RL algorithms seem to suggest serial requirements more like 10,000 episodes, with almost all of the compute used to run a massive number of episodes in parallel, which would be 1 year even for a 1-hour task. It seems hard to construct physical tasks that don’t provide rich feedback after even shorter horizons than 1 hour (and therefore suitable for a gradient descent step given enough parallel samples) so this seems pretty conservative.
Regardless of learning physical tasks, humans are able to learn to do R&D after 20 years of experience. AI systems operate at 10x speed, and most environments relevant to hardware and software R&D can be sped up by at least 10x. So it seems like AI systems could be human-level at a wide range of tasks, sufficient to accelerate further AI progress, even if they just used non-parallelized human learning over 2 years (see the arithmetic sketch after this list). If you really thought physical tasks were somehow impossibly difficult (which I don’t think is justified), then this becomes the dominant path to AGI. This is particularly important because multiple of your later points also seem to rest on the distinctive difficulty of automating physical tasks, which should just shift your probability further and further toward an explosion of automated R&D that drives automation of physical labor.
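To make the timescales above concrete, here is a quick back-of-envelope check of the two serial-time figures just cited (a sketch under the stated assumptions, not a claim about how learning actually scales):

```python
# Back-of-envelope check of the serial-time figures cited above.
hours_per_year = 24 * 365

# ~10,000 serial gradient steps on a 1-hour task, with the bulk of compute
# spent running many episodes in parallel at each step.
serial_rl_years = 10_000 * 1 / hours_per_year
print(f"Serial RL wall-clock: ~{serial_rl_years:.1f} years")               # ~1.1 years

# 20 years of non-parallelized human-like learning, run at 10x speed.
human_like_years = 20 / 10
print(f"Human-like learning at 10x speed: ~{human_like_years:.0f} years")  # ~2 years
```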
I think you are disagreeing with these claims, but I’m not sure about that. For example, you mention parallelizable learning but seem to give it <10% probability despite the fact that it is the overwhelmingly dominant paradigm in current practice and you don’t say anything about why it might not work.
(This isn’t super relevant to my mainline view, since in fact I think AI is much worse at learning quickly than humans and will likely be transformative way before reaching parity. This is related to the general point about being unnecessarily conjunctive, but here I’m just trying to understand and express disagreement with the particular path you lay out and the probabilities you assign.)
In section 3 you say:
I think you claim that each synapse firing event requires about 1-10 million floating point operations (with some error bars), and that there is only a 16% chance that computers will be able to do enough compute for $25/hour.
This is probably the part of the report I am most skeptical of:
How do you square this with our experience in AI so far? Overall you seem to think it is possible that AI will be as effective as brains but unlikely to be much better. But if a biological neuron is probably ten million times more efficient than an artificial neuron, then aren’t we already much better than biology in tons of domains? Is there any task for which performance can be quantified and where you think this estimate provides a sane guideline to the inference-time compute required to solve the task? Shouldn’t you be putting significant probability on our algorithms being radically better than biology in many important ways?
Replicating the human visual cortex should take millions of times more compute than we have ever used, yet we can match human performance on a range of quantifiable perceptual tasks and are making rapid progress, and I’m actually not aware of tasks where it’s even plausible that we are 6 orders of magnitude away.
Learned policies for robotic control using only hundreds of thousands of neurons already seem to reach comparable competence to insects, but on your estimates they should be significantly worse than a nematode. Aren’t you surprised to observe successful grasping and walking?
Traditional control systems like those used by Boston Dynamics seem to produce more competent motor control than small animals despite using amounts of compute close to 1 flop per synapse. You focus on ML, but I don’t know why—isn’t classical control a more reasonable point of comparison to small animals that have algorithms designed directly by evolution rather than learned in a lifetime, and doesn’t your argument very strongly predict that it should be impossible?
Qualitatively it’s hard to compare GPT-3 to humans, but just to be clear you are saying that it should behave like a brain with ~1000 neurons. This is at least surprising (e.g. I think would have led to big misses if it had been used to make any qualitative predictions), and to me casts doubt on a story where you can’t get transformative AI using less than the analog of a hundred billion neurons.
Your biological analysis seems to hinge on the assertion that precise simulation of neurons is necessary to get similar levels of computational utility (and even from there the analysis is pretty conservative, e.g. by assuming that you need to perform a very expensive computation thousands of times a second). I don’t personally consider this plausible and I think the main argument given for it is that “if not, why would we have all these proteins?” which I don’t find persuasive (since synapses are under a huge number of important constraints and serve many important functions beyond implementing computationally complex functions at inference time). I’ve seen zero candidates for useful purposes for such an incredible amount of local computation with negligible quantities of long-distance communication, and there are very few examples of human-designed computations structured in this way / it seems to involve an extremely implausible model of what neurons are doing (apparently some nearly-embarrassingly parallelizable task with work concentrated in individual neurons?). I don’t really want to argue with this at length, but want to flag that you are very confident about it and it drives a large part of your estimate, whereas something like 50-50 seems more appropriate even before updating on the empirical success of ML.
In general you seem to be making the case very unnecessarily conjunctive—you are asking how likely it is that we will find algorithms as good as the brain, and then also build computers that operate at the Landauer limit (as you are apparently confident the brain does), and then also deploy AI in a way that is competitive at a $25/hour price point, and so on. But in fact one of these areas can outperform your benchmark (and if you are right in this section, then it’s definitely the case that we are radically more efficient than biology on many tasks already!), and it seems like you are dropping a lot of probability by ignoring that possibility. It’s like asking about the probability that a sum of 5 normal distributions will be above the mean, and estimating it’s 1/2^5 because each of 5 normal distributions needs to be above its mean.
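To make that closing analogy concrete, here is a quick simulation (purely illustrative; the five standard normals stand in for the generic factors in the analogy, not for any estimate in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.standard_normal((1_000_000, 5))  # 5 independent N(0, 1) draws per trial

# Probability the sum exceeds the sum of the means (the quantity of interest).
p_sum_above_mean = (samples.sum(axis=1) > 0).mean()   # ~0.5

# Probability that every component exceeds its own mean (the conjunctive estimate).
p_each_above_mean = (samples > 0).all(axis=1).mean()  # ~1/2^5 ≈ 0.03

print(p_sum_above_mean, p_each_above_mean)
```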
(ETA: this criticism of section 3 is unfair: you do discuss the prospect of much better than human performance in the 2-page section “On the computational intensity of AGI,” and indeed this plays a completely central role in your bottom line estimate. But I’m still left wondering what the earlier 60% and 40% (and all the other numbers!) are supposed to represent, given that you are apparently putting all the work of “maybe humans will design efficient algorithms that are as good as the brain” in this section. You also don’t really discuss existing experience, where your estimates already appear to be many orders of magnitude off in domains where it is easiest to make comparisons between biology and ML (like vision or classical control) and where I don’t see how to argue we aren’t already 1000x better than biology using your 10 million flops per synapse number. Aside from me disagreeing with your mean, you describe these as conservative error bars since they put 20% probability on 1000x improvements over biology, but I think that’s really not the case given that it includes uncertainty about the useful compute done by the brain (where you already disagree by >>3 OOMs with plausible estimates) as well as algorithmic progress (where 1000x improvements over 20 years seem common both within software and ML).)
I’ll stop here rather than going on to sections 4+, though I think I have a lot to object to along similar lines (primarily that the story is being made unreasonably conjunctive).
Overall your estimation strategy looks crazy to me and I’m skeptical of the implicit claim that this kind of methodology would perform well in historical examples. That said, if this sort of methodology actually does work well in practice then I think that trumps some a priori speculation and would be an important thing for me to really absorb. Your personal forecasting successes seem like a big part of the evidence for that, so it might be helpful to understand what kinds of predictions were involved and how methodologically analogous they are. Superficially it looks like the SciCast technology forecasting tournament is by far the most relevant; is there a pointer to the list of questions (other info like participants and list of predictions would also be awesome if available)? Or do you think one of the other items is more relevant?
Excellent comment; thank you for engaging in such detail. I’ll respond piece by piece. I’ll also try to highlight the things you think we believe but don’t actually believe.
Section 1: Likelihood of AGI algorithms
Yes, we assign a 40% chance that we don’t have AI algorithms by 2043 capable of learning to do nearly any human task with realistic amounts of time and compute. Some things we probably agree on:
Progress has been promising and investment is rising.
Obviously the development of AI that can do AI research more cheaply than humans could be a huge accelerant, with the magnitude depending on the value-to-cost ratio. Already GPT-4 is accelerating my own software productivity, and future models over the next twenty years will no doubt be leagues better (as well as more efficient).
Obviously slow progress in the past is not great evidence of slow progress in the future, as any exponential curve shows.
But as we discuss in the essay, 20 years is not a long time, much easier problems are taking longer, and there’s a long track record of AI scientists being overconfident about the pace of progress (counterbalanced, to be sure, by folks on the other side who were overconfident that certain things would never be achieved, which subsequently were). These factors give us pause, so while we agree it’s likely we’ll have algorithms for AGI by 2043, we’re not certain of it, which is why we forecast 60%. We think forecasts higher than 60% are completely reasonable, but we personally struggle to justify anything near 100%.
Incidentally, I’m puzzled by your comment and others that suggest we might already have algorithms for AGI in 2023. Perhaps we’re making different implicit assumptions of realistic compute vs infinite compute, or something else. To me, it feels clear we don’t have the algorithms and data for AGI at present.
Lastly, no, we emphatically do not assume a ~0% chance that AGI will be smarter than nature’s brains. That feels like a ridiculous and overconfident thing to believe, and it pains me that we gave this impression. Already GPT-4 is smarter than me in ways, and as time goes on, the number of ways AI is smarter than me will undoubtedly grow.
Section 2: Likelihood of fast reinforcement training
Agree—if we had AGI today, this would not be a blocker. This becomes a greater and greater blocker the later AGI is developed. E.g., if AGI is developed in 2038, we’d have only 4 years to train it to do nearly every human task. So this factor is heavily entangled with the timeline on which AGI is developed.
(And obviously the development of AGI is not going to be a clean line crossed in a particular year, but the idea is the same even when applied to AGI systems developed gradually and unevenly.)
Agree on nearly everything here. I think the crux on which we differ is that we think interaction with the real world will be a substantial bottleneck (and therefore being able to run 10,000 parallel copies may not save us).
As I mentioned to Zach below:
To recap, we can of course parallelize a million self-driving car AIs and have them drive billions of miles in simulation. But that only works to the extent that (a) our simulations reflect reality and (b) we have the compute resources to do so. And so real self-driving car companies are spending billions on fleets and human supervision in order to gather the necessary data. In general, if an AGI cannot easily and cheaply simulate reality, it will have to learn from real-world interactions. And to the extent that it needs to learn from interactions with the consequences of its earlier actions, that training will need to be sequential.
Agreed. Our expectation is that early AGIs will be expensive and uneven. If they end up being incredibly sample efficient, then this task will be much easier than we’ve forecasted.
In general, I’m pretty open to updating higher here. I don’t think there are any insurmountable barriers here, but I have a sense that this will both be hard to do (as self-driving illustrates) and unlikely to be done (as all sorts of tasks not currently automated illustrate). My coauthor is a bit more negative on this factor than me and may chime in with his own thoughts later.
I personally struggle to imagine how an AlphaZero-like algorithm would learn to become the world’s best swim instructor via massively parallelized reinforcement learning on children, but that may well be a failure of my imagination. Certainly one route is massively parallelized RL to become excellent at AI R&D, then massively parallelized RL to become excellent at many tasks, and then quickly transferring that understanding to teaching children to swim, without any children ever drowning.
Section 3: Operating costs
Here, I think you ascribe many beliefs to us which we do not hold, and I apologize for not being clearer. I’ll start by emphasizing what we do not believe.
We do not believe this.
AI is already vastly better than human brains at some tasks, and the number of tasks on which AI is superhuman will rise with time. We do expect that early AGIs will be expensive and uneven, as all earliest versions of a technology are. And then they will improve from there.
We do not believe this.
We do not believe this.
We do not believe this. We do not believe that brains operate at the Landauer limit, nor do we believe computers will operate at this limit by 2043.
Incidentally, I studied the Landauer limit deeply during my physics PhD and could write an essay on the many ways it’s misinterpreted, but will save that for another day. :)
We do not believe this.
To multiply these probabilities together, one cannot multiply their unconditional expectations; rather, one must multiply their cascading conditional probabilities. You may disagree with our probabilities, but our framework specifically addresses this point. Our unconditional probabilities are far lower for some of these events, because we believe they will be rapidly accelerated conditional on progress in AGI.
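For reference, the structure being described is the ordinary chain rule of probability applied to the ten events in the table (the labels E1, …, E10 here are generic, not new claims):

$$P(E_1 \cap E_2 \cap \cdots \cap E_{10}) = P(E_1)\,P(E_2 \mid E_1)\,P(E_3 \mid E_1, E_2)\cdots P(E_{10} \mid E_1, \dots, E_9)$$

Each entry in the table is intended as one of these conditional factors, which is why it can differ substantially from the corresponding unconditional probability.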
Forecasting credentials
Honestly, I wouldn’t put too much weight on my forecasting success. It’s mostly a mix of common sense, time invested, and luck. I do think it reflects a decent mental model of how the world works, which leads to decent calibration for what’s 3% likely vs 30% likely. The main reason I mention it in the paper is just to help folks realize that we’re not wackos predicting 1% because we “really feel” confident. In many other situations (e.g., election forecasting, sports betting, etc.) I often find myself on the humble and uncertain side of the fence, trying to warn people that the world is more complicated and unpredictable than their gut is telling them. Even here, I consider our component forecasts quite uncertain, ranging from 16% to 95%. It’s precisely our uncertainty about the future which leads to a small product of 0.4%. (From my point of view, you are staking out a much higher confidence position in asserting that AGI algorithms are very likely and that rapid self-improvement is very likely.)
As for SciCast, here’s at least one publication that resulted from the project: https://ieeexplore.ieee.org/abstract/document/7266786
Example questions (from memory) included:
What will be the highest reported efficiency of a perovskite photovoltaic cell by date X
What will be the volume of deployed solar in the USA by date X
At the Brazil World Cup, how far will the paraplegic exoskeleton kick the ball for the opening kickoff
Will Amazon offer drone delivery by date X
Will physicists discover Y by date X
Most forecasts related to scientific discoveries and technological inventions and had timescales of months to years.
Conclusion
From your comment, I think the biggest crux between us is the rate of AI self-improvement. If the rate is lower, the world may look like what we’re envisioning. If the rate is higher, progress may take off in a way not well predicted by current trends, and the world may look more like what you’re envisioning. This causes our conditional probabilities to look too low and too independent, from your point of view. Do you think that’s a fair assessment?
Lastly, can I kindly ask what your cascading conditional probabilities would be in our framework? (Let’s hold the framework constant for this question, even if you prefer another.)
I would guess that more or less anything done by current ML can be done by ML from 2013 but with much more compute and fiddling. So it’s not at all clear to me whether existing algorithms are sufficient for AGI given enough compute, just as it wasn’t clear in 2013. I don’t have any idea what makes this clear to you.
Given that I feel like compute and algorithms mostly trade off, hopefully it’s clear why I’m confused about what the 60% represents. But I’m happy for it to mean something like: it makes sense at all to compare AI performance vs brain performance, and expect them to be able to solve a similar range of tasks within 5-10 orders of magnitude of the same amount of compute.
If 60% is your estimate for “possible with any amount of compute,” I don’t know why you think that anything is taking a long time. We just don’t get to observe how easy problems are if you have plenty of compute, and it seems increasingly clear that weak performance is often explained by limited compute. In fact, even if 60% is your estimate for “doable with similar compute to the brain,” I don’t see why you are updating from our failure to do tasks with orders of magnitude less compute than a brain (even before considering that you think individual neurons are incredibly potent).
I still don’t fully understand the claims being made in this section. I guess you are saying that there’s a significant chance that the serial time requirements will be large and that will lead to a large delay? Like maybe you’re saying something like: a 20% chance that it will add >20 years of delay, a 30% chance of 10-20 years of delay, a 40% chance of 1-10 years of delay, a 10% chance of <1 year of delay?
In addition to not fully understanding the view, I don’t fully understand the discussion in this section or why it’s justifying this probability. It seems like if you had human-level learning (as we are conditioning on from sections 1+3) then things would probably work in <2 years unless parallelization is surprisingly inefficient. And even setting aside the comparison to humans, such large serial bottlenecks aren’t really consistent with any evidence to date. And setting aside any concrete details, you are already assuming we have truly excellent algorithms and so there are lots of ways people could succeed. So I don’t buy the number, but that may just be a disagreement.
You seem to be leaning heavily on the analogy to self-driving cars but I don’t find that persuasive—you’ve already postulated multiple reasons why you shouldn’t expect them to have worked so far. Moreover, the difficulties there also just don’t seem very similar to the kind of delay from serial time you are positing here, they seem much more closely related to “man we don’t have algorithms that learn anything like humans.”
I think I’ve somehow misunderstood this section.
It looks to me like you are trying to estimate the difficulty of automating tasks by comparing to the size of brains of animals that perform the task (and in particular human brains). And you are saying that you expect it to take about 1e7 flops for each synapse in a human brain, and then define a probability distribution around there. Am I misunderstanding what’s going on here or is that a fair summary?
(I think my comment about GPT-3 = small brain isn’t fair, but the reverse direction seems fair: “takes a giant human brain to do human-level vision” --> “takes 7 orders of magnitude larger model to do vision.” If that isn’t valid, then why is “takes a giant human brain to do job X” --> “takes 7 orders of magnitude larger model to automate job X” valid? Is it because you are considering the worst-case profession?)
I don’t think I understand where your estimates come from, unless we are just disagreeing about the word “precise.” You cite the computational cost of learning a fairly precise model of a neuron’s behavior as an estimate for the complexity per neuron. You also talk about some low level dynamics without trying to explain why they may be computationally relevant. And then you give pretty confident estimates for the useful computation done in a brain. Could you fill in the missing steps in that estimate a bit more, both for the mean (of 1e6 per neuron*spike) and for the standard deviation of the log (which seems to be about ~1 oom)?
I think I misunderstood your claims somehow.
I think you are claiming that the brain does 1e20-1e21 flops of useful computation. I don’t know exactly how you are comparing between brains and floating point operations. A floating point operation is more like 1e5 bit erasures today and is necessarily at least 16 bit erasures at fp16 (and your estimates don’t allow for large precision reductions e.g. to 1 bit arithmetic). Let’s call it 1.6e21 bit erasures per second, I think quite conservatively?
I might be totally wrong about the Landauer limit, but I made this statement by looking at Wikipedia which claims 3e-21 J per bit erasure at room temperature. So if you multiply that by 1.6e21 bit erasures per second, isn’t that 5 W, nearly half the power consumption of the brain?
Is there a mistake somewhere in there? Am I somehow thinking about this differently from you?
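For what it’s worth, the arithmetic in the question above does check out under its own stated assumptions (1e20 flop/s at the low end of the brain estimate, 16 bit erasures per fp16 operation, and the ~3e-21 J per bit erasure room-temperature figure cited from Wikipedia):

```python
# Back-of-envelope check of the Landauer arithmetic in the comment above.
flops_per_second = 1e20        # low end of the 1e20-1e21 flop/s brain estimate above
bit_erasures_per_flop = 16     # fp16, as assumed above
landauer_j_per_bit = 3e-21     # ~kT * ln(2) at room temperature, per the cited figure

power_watts = flops_per_second * bit_erasures_per_flop * landauer_j_per_bit
print(f"{power_watts:.1f} W")  # ~4.8 W, i.e. roughly the "5 W" quoted above
```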
I understand this, but the same objection applies for normal distributions being more than 0. Talking about conditional probabilities doesn’t help.
Are you saying that e.g. a war between China and Taiwan makes it impossible to build AGI? Or that serial time requirements make AGI impossible? Or that scaling chips means AGI is impossible? It seems like each of these just makes it harder. These are factors you should be adding up. Some things can go wrong and you can still get AGI by 2043. If you want to argue you can’t build AGI if something goes wrong, that’s a whole different story. So multiplying probabilities (even conditional probabilities) for none of these things happening doesn’t seem right.
I don’t know what the events in your decomposition refer to well enough to assign them probabilities:
I still don’t know what “algorithms for AGI” means. I think you are somehow ignoring compute costs, but if so I don’t know on what basis you are making any kind of generalization from our experience with the difficulty of designing extremely fast algorithms. In most domains algorithmic issues are ~the whole game and that seems true in AI as well.
I don’t really know what “invent a way for AGI to learn faster than humans” means, as distinct from the estimates in the next section about the cost of AGI algorithms. Again, are you trying to somehow abstract out compute costs of learning here? Then my probabilities are very high but uninteresting.
Taken on its own, it seems like the third probability (“AGI inference costs drop below $25/hr (per human equivalent)”) implies the conclusion. So I assume you are doing something where you say “Ignoring increases in demand and the possibility of supply chain disruptions and...” or something like that? So the forecast you are making about compute prices aren’t unconditional forecasts?
I don’t know what level of cheap, quality robots you refer to. The quality of robotics needed to achieve transformative AI depends completely on the quality of your AI. For powerful AI it can be done with existing robot bodies, for weak AI it would need wildly superhuman bodies, at intermediate levels it can be done if humanoid robots cost millions of dollars each. And conversely the previous points aren’t really defined unless you specify something about the robotic platform. I assume you address this in the section but I think it’s going to be hard to define enough that I can give a number.
I don’t know what massively scaling chips mean—again, it seems like this just depends crucially on how good your algorithms are. It feels more like you should be estimating multiple numbers and then seeing the probability that the product is large enough to be impactful.
I don’t know what “avoid derailment” means. It seems like these are just factors that affect the earlier estimates, so I guess the earlier quantities were supposed to be something like “the probability of developing AGI given that nothing weird happens in the world”? Or something? But weird stuff is guaranteed to be happening in the world. I feel like this is the same deal as above, you should be multiplying out factors.
I think this seems right.
In particular, it seems like some of your estimates make more sense to me if I read them as saying “Well there will likely exist some task that AI systems can’t do.” But I think such claims aren’t very relevant for transformative AI, which would in turn lead to AGI.
By the same token, if the AIs were looking at humans they might say “Well there will exist some tasks that humans can’t do” and of course they’d be right, but the relevant thing is the single non-cherry-picked variable of overall economic impact. The AIs would be wrong to conclude that humans have slow economic growth because we can’t do some tasks that AIs are great at, and the humans would be wrong to conclude that AIs will have slow economic growth because they can’t do some tasks we are great at. The exact comparison is only relevant for assessing things like complementarity, which make large impacts happen strictly more quickly than they would otherwise.
(This might be related to me disliking AGI though, and then it’s kind of on OpenPhil for asking about it. They could also have asked about timelines to 100000x electricity production and I’d be making broadly the same arguments, so in some sense it must be me who is missing the point.)
That makes sense, and I’m ready to believe you have more calibrated judgments on average than I do. I’m also in the business of predicting a lot of things, but not as many and not with nearly as much tracking and accountability. That seems relevant to the question at hand, but still leaves me feeling very intuitively skeptical about this kind of decomposition.
C’mon Paul—please extend some principle of charity here. :)
You have repeatedly ascribed silly, impossible beliefs to us and I don’t know why (to be fair, in this particular case you’re just asking, not ascribing). Genuinely, man, I feel bad that our writing has either (a) given the impression that we believe such things or (b) given the impression that we’re the type of people who’d believe such things.
Like, are these sincere questions? Is your mental model of us that there’s a genuine uncertainty over whether we’ll say “Yes, a war precludes AGI” vs “No, a war does not preclude AGI”?
To make it clear: No, of course a war between China and Taiwan does not make it impossible to build AGI by 2043. As our essay explicitly says.
To make it clear: our forecasts are not the odds of wars, pandemics, and depressions not occurring. They are the odds of wars, pandemics, and depressions not delaying AGI beyond 2043. Most wars, most pandemics, and most depressions will not delay AGI beyond 2043, we think. Our methodology is to forecast only the most severe events, and then assume a good fraction won’t delay AGI. As our essay explicitly says.
We probably forecast higher odds of delay than you, because our low likelihoods of TAGI mean that TAGI, if developed, is likeliest to be developed nearer to the end of the period, without many years of slack. If TAGI is easy, and can be developed early or with plenty of slack, then it becomes much harder for these types of events to derail TAGI.
My point in asking “Are you assigning probabilities to a war making AGI impossible?” was to emphasize that I don’t understand what 70% is a probability of, or why you are multiplying these numbers. I’m sorry if the rhetorical question caused confusion.
My current understanding is that 0.7 is basically just the ratio (Probability of AGI before thinking explicitly about the prospect of war) / (Probability of AGI after thinking explicitly about prospect of war). This isn’t really a separate event from the others in the list, it’s just a consideration that lengthens timelines. It feels like it would also make sense to list other considerations that tend to shorten timelines.
(I do think disruptions and weird events tend to make technological progress slower rather than faster, though I also think they tend to pull tiny probabilities up by adding uncertainty.)
I don’t follow you here.
Why is a floating point operation 1e5 bit erasures today?
Why does a fp16 operation necessitate 16 bit erasures? As an example, if we have two 16-bit registers (A, B) and we do a multiplication to get (A, A*B), where is the 16 bits of information loss?
(In any case, no real need to reply to this. As someone who has spent a lot of time thinking about the Landauer limit, my main takeaway is that it’s more irrelevant than often supposed, and I suspect getting to the bottom of this rabbit hole is not going to yield much for us in terms of TAGI timelines.)
Yep. We’re using the main definition supplied by Open Philanthropy, which I’ll paraphrase as “nearly all human work at human cost or less by 2043.”
If the definition was more liberal, e.g., AGI as smart as humans, or AI causing world GDP to rise by >100%, we would have forecasted higher probabilities. We expect AI to get wildly more powerful over the next decades and wildly change the face of human life and work. The public is absolutely unprepared. We are very bullish on AI progress, and we think AI safety is an important, tractable, and neglected problem. Creating new entities with the potential to be more powerful than humanity is a scary, scary thing.
Interesting—this is perhaps another good crux between us.
My impression is that existing robot bodies are not good enough to do most human jobs, even if we had human-level AGI today. Human bodies self-repair, need infrequent maintenance, last decades, have multi-modal high bandwidth sensors built in, and are incredibly energy efficient.
One piece of evidence for this is how rare tele-operated robots are. There are plenty of generally intelligent humans around the world who would be happy to control robots for $1/hr, and yet they are not being employed to do so.
I didn’t mean to imply that human-level AGI could do human-level physical labor with existing robotics technology; I was using “powerful” to refer to a higher level of competence. I was using “intermediate levels” to refer to human-level AGI, and assuming it would need cheap human-like bodies.
Though mostly this seems like a digression. As you mention elsewhere, the bigger crux is that it seems to me like automating R&D would radically shorten timelines to AGI and be amongst the most important considerations in forecasting AGI.
(For this reason I don’t often think about AGI timelines, especially not for this relatively extreme definition. Instead I think about transformative AI, or AI that is as economically impactful as a simulated human for $X, or something along those lines.)
Bingo. We didn’t take the time to articulate it fully, but yeah you got it. We think it makes it easier to forecast these things separately rather than invisibly smushing them together into a smaller set of factors.
We are multiplying out factors. Not sure I follow you here.
Agree 100%. Our essay does exactly this, forecasting over a wide range of potential compute needs, before taking an expected value to arrive at a single summary likelihood.
Sounds like you think we should have ascribed more probability to lower ranges, which is a totally fair disagreement.
Pretty fair summary. 1e6, though, not 1e7. And honestly I could be pretty easily persuaded to go a bit lower by arguments such as:
Max firing rate of 100 Hz is not the informational content of the channel (that buys maybe 1 OOM)
Maybe a smaller DNN could be found, but wasn’t
It might take a lot of computational neurons to simulate the I/O of a single synapse, but it also probably takes a lot of synapses to simulate the I/O of a single computational neuron
Dropping our estimate by 1-2 OOMs would increase step 3 by 10 to 20 percentage points. It wouldn’t have much effect on later estimates, as they are already conditional on success in step 3.
Maybe, but maybe not, which is why we forecast a number below 100%.
For example, it is very very rare to ever see a CEO hired with <2 years of experience, even if they are very intelligent and have read a lot of books and have watched a lot of interviews. Some reasons might be irrational or irrelevant, but surely some of it is real. A CEO job requires a large constellation of skills practiced and refined over many years. E.g., relationship building with customers, suppliers, shareholders, and employees.
For an AGI to be installed as CEO of a corporation in under two years, human-level learning would not be enough—it would need to be superhuman in its ability to learn. Such superhuman learning could come from simulation (e.g., modeling and simulating how a potential human partner would react to various communication styles), from parallelization (e.g., being installed as a manager in 1,000 companies and then compiling and sharing learnings across copies), or from something else.
I agree that skills learned from reading or thinking or simulating could happen very fast. Skills requiring real-world feedback that is expensive, rare, or long-delayed would progress more slowly.
You seem to be missing the possibility of superhuman learning coming from superhuman sample efficiency, in the sense of requiring less feedback to acquire skills, including actively experimenting in useful directions more effectively.
Nope, we didn’t miss the possibility of AGIs being very sample efficient in their learning. We just don’t think it’s certain, which is why we forecast a number below 100%. Sounds like your estimate is higher than ours; however, that doesn’t mean we missed the possibility.
What’s an algorithm from 2013 that you think could yield AGI, if given enough compute? What would its inputs, outputs, and training look like? You’re more informed than me here and I would be happy to learn more.
I’m not sure I buy ’2013 algorithms are literally enough’, but it does seem very likely to me that in practice you get AGI very quickly (<2 years) if you give out GPUs which have (say) 10^50 FLOPS. (These GPUS are physically impossible, but I’m just supposing this to make the hypothetical easier. In particular, 2013 algorithms don’t parallelize very well and I’m just supposing this away.)
And, I think 2023 algorithms are literally enough with this amount of FLOP (perhaps with 90% probability).
For a concrete story of how this could happen, let’s imagine training a model with around 10^50 FLOP to predict all human data ever produced (say represented as uncompressed bytes and doing next token prediction) and simultaneously training with RL to play every game ever. We’ll use the largest model we can get with this flop budget, probably well over 10^25 parameters. Then, you RL on various tasks, prompt the AI, or finetune on some data (as needed).
This can be done with either 2013 or 2023 algorithms. I’m not sure if it’s enough with 2013 algorithms (in particular, I’d be worried that the AI would be extremely smart but the elicitation technology wasn’t there to get the AI to do anything useful). I’d put success with 2013 algos and this exact plan at 50%. It seems likely enough with 2023 algorithms (perhaps 80% chance of success).
In 2013 this would look like training an LSTM. Deep RL was barely developed, but did exist.
In 2023 this looks similar to GPT4 but scaled way up, trained on all sources of data, and trained to play games, etc.
Let me replay my understanding to you, to see if I understand. You are predicting that...
IF:
we gathered all files stored on hard drives
...decompressed them into streams of bytes
...trained a monstrous model to predict the next chunk in each stream
...and also trained it to play every winnable computer game ever made
THEN:
You are 50% confident we’d get AGI* using 2013 algos
You are 80% confident we’d get AGI* using 2023 algos
WHERE:
*AGI means AI that is general; i.e., able to generalize to all sorts of data way outside its training distribution. Meaning:
It avoids overfitting on the data despite its massive parameter count. E.g., not just memorizing every file or brute forcing all the exploitable speedrunning bugs in a game that don’t generalize to real-world understanding.
It can learn skills and tasks that are barely represented in the computer dataset but that real-life humans are nonetheless able to quickly understand and learn due to their general world models
It can be made to develop planning, reasoning, and strategy skills not well represented by next-token prediction (e.g., it would learn how to write a draft, reflect on it, and edit it, even though it’s never been trained to do that and has only been optimized to append single tokens in sequence)
It simultaneously avoids underfitting due to any regularization techniques used to avoid the above overfitting problems
ASSUMING:
We don’t train on data not stored on computers
We don’t train on non-computer games (but not a big crux if you want to posit high fidelity basketball simulations, for example)
We don’t train on games without win conditions (but not a big crux, as most have them)
Is this a correct restatement of your prediction?
And are your confidence levels for this resulting in AGI on the first try? Within ten tries? Within a year of trial and error? Within a decade of trial and error?
(Rounding to the nearest tenth of a percent, I personally am 0.0% confident we’d get AGI on our first try with a system like this, even with 10^50 FLOPS.)
This seems like a pretty good description of this prediction.
Your description misses needing a finishing step of doing some RL, prompting, and generally finetuning on the task of interest (similar to GPT4). But this isn’t doing much of the work, so it’s not a big deal. Additionally, this sort of finishing step wasn’t really developed in 2013, so it seems less applicable to that version.
I’m also assuming some iteration on hyperparameters and data manipulation etc. in keeping with the techniques used in the respective time periods. So, ‘first try’ isn’t doing that much work here because you’ll be iterating a bit in the same way that people generally iterate a bit (but you won’t be doing novel research).
My probabilities are for the ‘first shot’ but after you do some preliminary experiments to verify hyper-params etc. And with some iteration on the finetuning. There might be a non-trivial amount of work on the finetuning step also, I don’t have a strong view here.
It’s worth noting that I think that GPT5 (with finetuning and scaffolding, etc.) is perhaps around 2% likely to be AGI. Of course, you’d need serious robotic infrastructure and much larger pool of GPUs to automate all labor.
My general view is ‘if the compute is there, the AGI will come’. I’m going out on more of a limb with this exact plan and I’m much less confident in the plan than in this general principle.
Here are some example reasons why I think my high probabilities are plausible:
The training proposal I gave is pretty close to how models like GPT4 are trained. These models are pretty general and are quite strategic etc. Adding more FLOP makes a pretty big qualitative difference.
It doesn’t seem to me like you have to generalize very far for this to succeed. I think existing data trains you to do basically everything humans can do. (See GPT4 and prompting)
Even if this proposal is massively inefficient, we’re throwing an absurd amount of FLOP at it.
It seems like the story for why humans are intelligent looks reasonably similar to this story: have big, highly functional brains, learn to predict what you see, train to achieve various goals, generalize far. Perhaps you think human intelligence is very unlikely ex-ante (<0.04% likely).
Am I really the only person who thinks it’s a bit crazy that we use this blobby comment thread as if it’s the best way we have to organize disagreement/argumentation for audiences? I feel like we could almost certainly improve by using, e.g., a horizontal flow as is relatively standard in debate.[1]
With a generic example below:
To be clear, the commentary could still incorporate non-block/prose text.
Alternatively, people could use something like Kialo.com. But surely there has to be something better than this comment thread, in terms of 1) ease of determining where points go unrefuted, 2) ease of quickly tracing all responses in specific branches (rather than having to skim through the entire blob to find any related responses), and 3) seeing claims side-by-side, rather than having to scroll back and forth to see the full text. (Quoting definitely helps with this, though!)
(Depending on the format: this is definitely standard in many policy debate leagues.)
How hard do you suppose it might be to use an AI to scrub the comments and generate something like this? It may be worth doing manually for some threads, even, but it’s easier to get people to adopt if the debate already exists and only needs tweaking. There may even already exist software that accepts text as input and outputs a Kialo-like debate map (thank you for alerting me that Kialo exists, it’s neat).
Over the past few months I have occasionally tried getting LLMs to do some tasks related to argument mapping, but I actually don’t think I’ve tried that specifically, and probably should. I’ll make a note to myself to try here.
But I don’t think we could have predicted people would dive into the comments like this. Usually comments have minimal engagement. There’s a LessWrong debate format for posts, but that’s usually with a moderator and such. This seems spontaneous.
Are you referring to this format on LessWrong? If so, I can’t say I’m particularly impressed, as it still seems to suffer from the problems of linear dialogue vs. a branching structure (e.g., it is hard to see where points have been dropped, and it is harder to trace specific lines of argument). But I don’t recall seeing this, so thanks for the flag.
As for “I don’t think we could have predicted people…”, that’s missing my point(s). I’m partially saying “this comment thread seems like it should be a lesson/example of how text-blob comment-threads are inefficient in general.” However, even in this specific case Paul knew that he was laying out a multi-pronged criticism, and if the flow format existed he could have presented his claims that way, to make following the debate easier—assuming Ted would reply.
Ultimately, it just seems to me like it would be really logical to have a horizontal flow UI,[1] although I recognize I am a bit biased by my familiarity with such note taking methods from competitive debate.
In theory it need not be as strictly horizontal as I lay out; it could be a series of vertically nested claims, kept largely within one column—where the idea is that instead of replying to the entire comment you can just reply to specific blocks in the original comment (e.g., accessible in a drop down at the end of a specific argument block rather than the end of the entire comment).
I don’t know. As someone who was/still is quite good at debating and connected to debating communities I would find a flow-centric comment thread bothersome and unhelpful for reading the dialogues. I quite like internet comments as is in this UI.
I find this strange/curious. Is your preference more a matter of “Traditional interfaces have good features that a flowing interface would lack” (or some other disadvantage to switching) or “The benefits of switching to a flowing interface would be relatively minor”?
For example on the latter, do you not find it more difficult with the traditional UI to identify dropped arguments? Or suppose you are fairly knowledgeable about most of the topics but there’s just one specific branch of arguments you want to follow: do you find it easy to do that? (And more on the less-obvious side, do you think the current structure disincentivizes authors from deeply expanding on branches?)
On the former, I do think that there are benefits to having less-structured text (e.g., introductions/summaries and conclusions) and that most argument mapping is way too formal/rigid with its structure, but I think these issues could be addressed in the format I have in mind.
I asked others at the debate/EA intersection and they agreed with my line of reasoning that it would be contrived and lead to poorly structured arguments. I can elaborate if you really want, but I hesitate to spend time writing this out because I’m behind on work and, to be honest, don’t think it’ll have any impact on anything.
This is a really impressive paper full of highly interesting arguments. I am enjoying reading it. That said, and I hope I’m not being too dismissive here, I have a strong suspicion that the central argument in this paper suffers from what Eliezer Yudkowsky calls the multiple stage fallacy.
I think canonicalizing this as a Fallacy was very premature: Yudkowsky wrote his post based on two examples:
Nate Silver’s Trump’s Six Stages of Doom (which got a very-clearly-wrong-in-hindsight answer)
My Breaking Down Cryonics Probabilities (which is harder to evaluate since it’s about something farther in the future)
I wrote a response at the time, ending with:
In the discussion people gave a few other examples of people using this sort of model:
Robin Hanson in Break Cryonics Down
Forecasters in the Good Judgement project (comment)
Animal Charity Evaluators trying to predict the power of a planned behavior change study.
Still, this isn’t very much. Before deciding whether this is a method that tends to lead people to give bad estimates (relative to whatever they would use instead) I’d like to see more examples?
Related to this, you can only straightforwardly multiply probabilities if they’re independent, but I think a lot of the listed probabilities are positively correlated, which means the joint probability is higher than their product. For example, it seems to me that “AGI inference costs drop below $25/hr” and “We massively scale production of chips and power” are strongly correlated.
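A quick illustrative simulation of that point (hypothetical numbers and a made-up shared “AI progress” factor, not anything from the paper): when two events are driven by a common cause, their joint probability can sit far above the naive product of their marginals.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical shared latent "AI progress" factor driving both events.
progress = rng.normal(size=n)

# Two events, each with ~50% marginal probability, both more likely
# when the shared factor is high.
cheap_inference = progress + 0.3 * rng.normal(size=n) > 0
scaled_chips = progress + 0.3 * rng.normal(size=n) > 0

p_a, p_b = cheap_inference.mean(), scaled_chips.mean()
p_joint = (cheap_inference & scaled_chips).mean()

print(f"P(A)*P(B) = {p_a * p_b:.2f}")   # ~0.25
print(f"P(A and B) = {p_joint:.2f}")    # ~0.43, well above the naive product
```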
Agreed. Factors like AI progress, inference costs, and manufacturing scale needed are massively correlated. We discuss this in the paper. Our unconditional independent forecast of semiconductor production would be much, much lower than our conditional forecast of 46%, for example.
Thanks for the kind words.
Regarding the multiple stage fallacy, we recognize it’s a risk of a framework like this and go to some lengths explaining why we think our analysis does not suffer from it. (Namely, in the executive summary, the discussion, and the appendix “Why 0.4% might be less confident than it seems.”)
What are the disjunctive alternatives you think our framework misses?
Like Matthew, I think your paper is really interesting and impressive.
Some issues I have with the methodology:
Your framework excludes some factors that could cause the overall probability to increase.
For example, I can think of ways that a great power conflict (over Taiwan, say) actually increases the chances of TAI. But your framework doesn’t easily account for this.
You could have factored it into some or all of the other stages, but I’m not sure you have, and this asymmetry (the “positive” effect of an event is factored into various other stages, if at all, while the “negative” effect of the same event gets its own conjunctive stage) seems like it will generally push the overall probability lower than it should be.
It seems like you sometimes don’t fully condition on preceding propositions.
You calculate a base rate of “10% chance of [depression] in the next 20 years”, and write: “Conditional on being in a world on track toward transformative AGI, we estimate a ~0.5%/yr chance of depression, implying a ~10% chance in the next 20 years.”
But this doesn’t seem like fully conditioning on a world with TAI that is cheap, that can automate ~100% of human tasks, that can be deployed at scale, and that is relatively unregulated. It seems like once that happens, and even when it’s nearly happening (e.g. AIs automate 20% of 2022-tasks), the probability of a severe depression should be way below historical base rates?
Similarly for “We quickly scale up semiconductor manufacturing and electrical generation”, it seems like you don’t fully condition on a world where we have TAI that is cheap, that can automate ~100% of human tasks, and that can operate cheap, high-quality robots, and that can probably be deployed to some fairly wide extent even if not (yet) to actually automate ~all human labour.
Like, your X100 is 100x as cost-effective as the H100, but that doesn’t seem that far off what you’d get from just projecting the Epoch trend for ML GPU price-performance out two decades?
More generally, I think these sorts of things are really hard to get right (i.e. it’s hard to imagine oneself in a conditional world, and estimate probabilities there without anchoring on the present world), and will tend to bias people to smaller overall estimates when using more conjunctive steps.
Thanks!
Totally reasonable to disagree with us on some of these forecasts—they’re rough educated guesses, after all. We welcome others to contribute their own forecasts. I’m curious: what do you think are the rough odds that an invasion of Taiwan increases the likelihood of TAGI by 2043?
Agree wholeheartedly. In a world with scaled, cheap TAGI, things are going to look wildly different and it will be hard to predict what happens. Change could be a lot faster than what we’re used to, and historical precedent and intuition might be relatively poor guides relative to first principles thinking.
However, we feel somewhat more comfortable with our predictions prior to scaled, cheap AGI. Like, if it takes 3e30–3e35 operations to train an early AGI, then I don’t think we can condition on that AGI accelerating us towards construction of the resources needed to generate those 3e30–3e35 operations. That would be putting the cart before the horse.
What we can (and try to) condition on are potential predecessors to that AGI; e.g., improved narrow AI or expensive human-level AGI. Both of those we have experience with today, which gives us more confidence that we won’t get an insane productivity explosion in the physical construction of fabs and power plants.
We could be wrong, of course, and we’ll find out in 2043.
Maybe 20% that it increases the likelihood? Higher if war starts by 2030 or so, and near 0% if it starts in 2041 (but maybe >0% if it starts in 2042?). What number would you put on it, and how would you update your model if that number changed?
I think what you’re saying here is, “yes, we condition on such a world, but even in such a world these things won’t be true for all of 2023-2043, but mainly only towards the latter years in that range”. Is that right?
I agree to some extent, but as you wrote, “transformative AGI is a much higher bar than merely massive progress in AI”: I think in a lot of those previous years we’ll still have AI doing lots of work to speed up R&D and carry out lots of other economically useful tasks. Like, we know in this world that we’re headed for AGI in 2043 or even earlier, so we should be seeing really capable and useful AI systems already in 2030 and 2035 and so on.
Maybe you think the progression from today’s systems to potentially-transformative AGI will be discontinuous or something like that, with lots of progress (on algorithms, hardware, robotics, etc.) happening near the end?
No, I actually fully agree with you. I don’t think progress will be discontinuous, and I do think we will see increasingly capable and useful systems by 2030 and 2035 that accelerate rates of progress.
I think where we may differ is that:
I think the acceleration will likely be more “in line” than “out of line” with the exponential acceleration we already see from improving computer tools and specifically LLM computer tools (e.g., GitHub Copilot, GPT-4). Already a software engineer today is many multiples more productive (by some metrics) than a software engineer in the 90s.
I think that tools that, say, cheaply automate half of work, or expensively automate 100% of work, probably won’t lead to wild, extra-orders-of-magnitude levels of progress. OpenAI has what, 400 employees?
Scenario one: If half their work was automated, OK, now those 400 people could do the work of 800 people. That’s great, but honestly I don’t think it’s path-breaking. And sure, that’s only the first-order effect. If half the work was automated, we’d of course elastically start spending way more on the cheap automated half. But on the other hand, there would be diminishing returns, and for every step that becomes free, we just hit bottlenecks in the hard-to-automate parts. Even in the limit of cheap AGI, those AGIs may be limited by the GPUs they have to experiment on. Labor becoming free just means capital is the constraint.
Scenario two: Or, suppose we have human-cost human-level AGIs. I’m not convinced that would, to first order, change much either. There are millions of smart people on earth who aren’t working on AI research now. We could hire them, but we don’t. We’re not limited by brains. We’re limited by willingness to spend. So even if we invent human-cost human-level brains, it actually doesn’t change much, because that wasn’t the constraint. (Of course, this is massively oversimplifying, and obviously human-cost human-level AGIs would be a bigger deal than human workers because of their ability to be rapidly improved and copied. But I hope it nevertheless conveys why I think AGI will need to get close to transformative levels before growth really explodes.)
Overall where it feels like I’m differing from some folks here is that I think higher levels of AI capability will be needed before we get wild self-improvement takeoff. I don’t think it will be early, because even if we get massive automation due to uneven AI we’ll still be bottlenecked by the things it’s bad at. I acknowledge this is a pretty squishy argument and I find it difficult to quantify and articulate, so I think it’s quite reasonable to disagree with me here. In general though, I think we’ve seen a long history of things being harder to automate than we thought (e.g., self-driving, radiology, etc.). It will be exciting to see what happens!
Do you have any material on this? It sounds plausible to me but I couldn’t find anything with a quick search.
Supposing you take “progress” to mean something like GDP per capita or AI capabilities as measured on various benchmarks, I agree that it probably won’t (though I wouldn’t completely rule it out). But also, I don’t think progress would need to jump by OOMs for the chances of a financial crisis large enough to derail transformative AGI to be drastically reduced. (To be clear, I don’t think drastic self-improvement is necessary for this, and I expect to see something more like increasingly sophisticated versions of “we use AI to automate AI research/engineering”.)
I also think it’s pretty likely that, if there is a financial crisis in these worlds, AI progress isn’t noticeably impacted. If you look at papers published in various fields, patent applications, adoption of various IT technologies, numbers of researchers per capita—none of these things seem to slow down in the wake of financial crises. Same thing for AI: I don’t see any derailment from financial crises when looking at model sizes (both in terms of parameters and training compute), dataset sizes or chess program Elo.
Maybe capital expenditure will decrease, and that might only start being really important once SOTA models are extremely expensive, but on the other hand: if there’s anything in these worlds you want to keep investing in it’s probably the technology that’s headed towards full-blown AGI? Maybe I think 1 in 10 financial crises would substantially derail transformative AGI in these worlds, but it seems you think it’s more like 1 in 2.
Yeah, but why only focus on OAI? In this world we have AIs that cheaply automate half of work. That seems like it would have immense economic value and promise, enough to inspire massive new investments in AI companies.
Ah, I think we have a crux here. I think that, if you could hire—for the same price as a human—a human-level AGI, that would indeed change things a lot. I’d reckon the AGI would have a 3-4x productivity boost from being able to work 24/7, and would be perfectly obedient, wouldn’t be limited to working in a single field, could more easily transfer knowledge to other AIs, could be backed up and/or replicated, wouldn’t need an office or a fun work environment, can be “hired” or “fired” ~instantly without difficulty, etc.
That feels somehow beside the point, though. I think in any such scenario, there’s also going to be very cheap AIs with sub-human intelligence that would have broad economic impact too.
Nope, it’s just an unsubstantiated guess based on seeing what small teams can build today vs 30 years ago. Also based on the massive improvement in open-source libraries and tooling compared to then. Today’s developers can work faster at higher levels of abstraction compared to folks back then.
Absolutely agree. AI and AGI will likely provide immense economic value even before the threshold of transformative AGI is crossed.
Still, supposing that AI research today:
is a 50/50 mix of capital and labor
faces diminishing returns
and has elastic demand
...then even a 4x labor productivity boost may not be all that path-breaking when you zoom out enough. Things will speed up, surely, but they probably won’t create transformative AGI overnight. Even AGI researchers will need time and compute to do their experiments.
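To make the diminishing-returns point concrete, here is a toy sketch (a hypothetical Cobb-Douglas production function with the 50/50 capital/labor split above; not a model from the paper): with compute/capital held fixed, a 4x boost to effective labor only doubles research output.

```python
def research_output(capital: float, effective_labor: float, labor_share: float = 0.5) -> float:
    # Toy Cobb-Douglas: output = K^(1 - a) * L^a, with a = labor share.
    return capital ** (1 - labor_share) * effective_labor ** labor_share

baseline = research_output(capital=1.0, effective_labor=1.0)
boosted = research_output(capital=1.0, effective_labor=4.0)  # 4x labor productivity, same GPUs
print(boosted / baseline)  # 2.0
```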
I put little weight on this analysis because it seems like a central example of the multiple stage fallacy. But it does seem worth trying to identify clear examples of the authors not accounting properly for conditionals. So here are three concrete criticisms (though note that these are based on skimming rather than close-reading the PDF):
A lot of the authors’ analysis about the probability of war derailment is focused on Taiwan, which is currently a crucial pivot point. But conditional on chip production scaling up massively, Taiwan would likely be far less important.
If there is extensive regulation of AI, it will likely slow down both algorithmic and hardware progress. So conditional on the types of progress listed under events 1-5, the probability of extensive regulation is much lower than it would be otherwise.
The third criticism is more involved; I’ll summarize it as “the authors are sometimes treating the different events as sequential in time, and sometimes sequential in logical flow”. For example, the authors assign around 1% to events 1-5 happening before 2043. If they’re correct, then conditioning on events 1-5 happening before 2043, they’ll very likely only happen just before 2043. But this leaves very little time for any “derailing” to occur after that, and so the conditional probability of derailing should be far smaller than what they’ve given (62%).
The authors might instead say that they’re not conditioning on events 1-5 literally happening when estimating conditional probability of derailing, but rather conditioning on something more like “events 1-5 would have happened without the 5 types of disruption listed”. That way, their 10% estimate for a derailing pandemic could include a pandemic in 2025 in a world which was otherwise on track for reaching AGI. But I don’t think this is consistent, because the authors often appeal to the assumption that AGI already exists when talking about the probability of derailing (e.g. the probability of pandemics being created). So it instead seems to me like they’re explicitly treating the events as sequential in time, but implicitly treating the events as sequential in logical flow, in a way which significantly decreases the likelihood they assign to TAI by 2043.
I suspect that I have major disagreements with the way the authors frame events 1-5 as well, but don’t want to try to dig into those now.
Great comment! Thanks especially for trying to point out the actual stages that go wrong, rather than hand-waving at the multiple stage fallacy, which we are of course all well aware of.
Replying to the points:
From my POV, if events 1-5 have happened, then we have TAGI. It’s already done. The derailments are not things that could happen after TAGI to return us to a pre-TAGI state. They are events that happen before TAGI and modify the estimates above.
Yes, we think AGI will precede TAGI by quite some time, and therefore it’s reasonable to talk about derailments of TAGI conditional on AGI.
If events 1-5 constitute TAGI, and events 6-10 are conditional on AGI, and TAGI is very different from AGI, then you can’t straightforwardly get an overall estimate by multiplying them together. E.g. as I discuss above, 0.3 seems like a reasonable estimate of P(derailment from wars) if the chip supply remains concentrated in Taiwan, but doesn’t seem reasonable if the supply of chips is on track to be “massively scaled up”.
I think that’s a great criticism. Perhaps our conditional odds of Taiwan derailment are too high because we’re too anchored to today’s distribution of production.
One clarification/correction to what I said above: I see the derailment events 6-10 as being conditional on us being on the path to TAGI had the derailments not occurred. So steps 1-5 might not have happened yet, but we are in a world where they will happen if the derailment does not occur. (So not really conditional on TAGI already occurring, and not necessarily conditional on AGI, but probably AGI is occurring in most of those on-the-path-to-TAGI scenarios.)
Edit: More precisely, the cascade is:
- Probability of us developing TAGI, assuming no derailments
- Probability of us being derailed, conditional on otherwise being on track to develop TAGI without derailment
Got it. As mentioned I disagree with your 0.7 war derailment. Upon further thought I don’t necessarily disagree with your 0.7 “regulation derailment”, but I think that in most cases where I’m talking to people about AI risk, I’d want to factor this out (because I typically want to make claims like “here’s what happens if we don’t do something about it”).
Anyway, the “derailment” part isn’t really the key disagreement here. The key disagreement is methodological. Here’s one concrete alternative methodology which I think is better: a more symmetric model which involves three estimates:
Probability of us developing TAGI, assuming that nothing extreme happens
Probability of us being derailed, conditional on otherwise being on track to develop TAGI
Probability of us being rerailed, conditional on otherwise not being on track to develop TAGI
By “rerailed” here I mean roughly “something as extreme as a derailment happens, but in a way which pushes us over the threshold to be on track towards TAGI by 2043”. Some possibilities include:
An international race towards AGI, akin to the space race or race towards nukes
A superintelligent but expensive AGI turns out to be good enough at science to provide us with key breakthroughs
Massive economic growth superheats investment into TAGI
Suppose we put 5% credence on each of these “rerailing” us. Then our new calculation (using your numbers) would be:
The chance of being on track assuming that nothing extreme happens: 0.6*0.4*0.16*0.6*0.46 = 1%
P(no derailment conditional on being on track) = 0.7*0.9*0.7*0.9*0.95 = 38%
P(rerailment conditional on not being on track) = 1 − 0.95*0.95*0.95 = 14%
P(TAGI by 2043) = 0.99*0.14 + 0.01*0.38 = 14.2%
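For concreteness, a minimal sketch of the three-part calculation above, using the same numbers (the small differences from 14.2% come from rounding the intermediate values):

```python
# Five "on track" steps, five "no derailment" steps, three hypothetical 5% rerailments.
on_track = 0.6 * 0.4 * 0.16 * 0.6 * 0.46        # ~0.011
no_derail = 0.7 * 0.9 * 0.7 * 0.9 * 0.95        # ~0.38
rerail = 1 - 0.95 ** 3                          # ~0.14

baseline = on_track * no_derail                               # ~0.004 (the 0.4% headline)
with_rerail = on_track * no_derail + (1 - on_track) * rerail  # ~0.14

print(f"baseline: {baseline:.1%}, with rerailments: {with_rerail:.1%}")
```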
That’s over 30x higher than your original estimate, and totally changes your conclusions! So presumably you must think either that there’s something wrong with the structure I’ve used here, or else that 5% is way too high for each of those three rerailments. But I’ve tried to make the rerailments as analogous to the derailments as possible. For example, if you think a depression could derail us, then it seems pretty plausible that the opposite of a depression could rerail us using approximately the same mechanisms.
You might say “look, the chance of being on track to hit all of events 1-5 by 2043 is really low. This means that in worlds where we’re on track, we’re probably barely on track; whereas in worlds where we’re not on track, we’re often missing it by decades. This makes derailment much easier than rerailment.” Which… yeah, conditional on your numbers for events 1-5, this seems true. But the low likelihood of being on track also means that even very low rerailment probabilities could change your final estimate dramatically—e.g. even 1% for each of the rerailments above would increase your headline estimate by almost an order of magnitude. And I do think that many people would interpret a headline claim of “<1%” as pretty different from “around 3%”.
Having said that, speaking for myself, I don’t care very much about <1% vs 3%; I care about 3% vs 30% vs 60%. The difference between those is going to primarily depend on events 1-5, not on derailments or rerailments. I have been trying to avoid getting into the weeds on that, since everyone else has been doing so already. So I’ll just say the following: to me, events 1-5 all look pretty closely related. “Way better algorithms” and “far more rapid learning” and “cheaper inference” and “better robotic control” all seem in some sense to be different facets of a single underlying trend; and chip + power production will both contribute to that trend and also be boosted by that trend. And so, because of this, it seems likely to me that there are alternative factorizations which are less disjoint and therefore get very different results. I think this was what Paul was getting at, but that discussion didn’t seem super productive, so if I wanted to engage more with it a better approach might be to just come up with my own alternative factorization and then argue about whether it’s better or worse than yours. But this comment is already too long so will leave it be for now.
Great comment. We didn’t explicitly allocate probability to those scenarios, and if you do, you end up with much higher numbers. Very reasonable to do so.
Not reading the paper, and not planning to engage in much discussion, and stating beliefs without justification, but briefly commenting since you asked readers to explain disagreement:
I think this framework is bad and the probabilities are far too low, e.g.:
We probably already have “algorithms for transformative AGI.”
The straightforward meaning of “a way for AGIs to learn faster than humans” doesn’t seem to be relevant (seems to be already achieved, seems to be unnecessary, seems to be missing the point); e.g. language models are trained faster than humans learn language (+ world-modeling), and AlphaGo Zero went from nothing to superhuman in three days. Maybe you explain this in the paper though.
GPT-4 inference is much cheaper than paying humans $25/hr to write similar content.
We probably already have enough chips for AGI by 2043 without further scaling up production.
Separately, note that “AI that can quickly and affordably be trained to perform nearly all economically and strategically valuable tasks at roughly human cost or less” is a much higher bar than the-thing-we-should-be-paying-attention-to (which is more like takeover ability; see e.g. Kokotajlo).
Setting aside assessments of the probabilities (which are addressed in the paper), what do you think is bad about the framework? How would you suggest we improve it?
I mean, I don’t think all of your conditions are necessary (e.g. “We invent a way for AGIs to learn faster than humans” and “We massively scale production of chips and power”) and I think together they carve reality quite far from the joints, such that breaking the AGI question into these subquestions doesn’t help you think more clearly [edit: e.g. because compute and algorithms largely trade off, so concepts like ‘sufficient compute for AGI’ or ‘sufficient algorithms for AGI’ aren’t useful].
Thank you for the clarification. To me, it is not 100.0% guaranteed that AGIs will be able to rapidly parallelize all learning and it is not 100.0% guaranteed that we’ll have enough chips by 2043. Therefore, I think it helps to assign probabilities to them. If you are 100.0% confident in their likelihood of occurrence, then you can of course remove those factors. We personally find it difficult to be so confident about the future.
I agree that the successes of AlphaZero and GPT-4 are promising signs, but I don’t think they imply a 100.0% likelihood that AGI, whatever it looks like, will learn just as fast on every task.
With AlphaZero in particular, fast reinforcement training is possible because (a) the game state can be efficiently modeled by a computer and (b) the reward can be efficiently computed by a computer.
In contrast, look at a task like self-driving. Despite massive investment, our self-driving AIs are learning more slowly than human teenagers. Part of the reason for this is that conditions (a) and (b) no longer hold. First, our simulations of reality are imperfect, and therefore fleets must be deployed to drive millions of miles. Second, calculating reward functions (i.e., “this action causes a collision”) is expensive and typically requires human supervision (e.g., test drivers, labelers), as the actual reward (e.g., a real-life collision) is even more expensive to acquire. This bottleneck of expensive feedback is partly why we can’t just throw more GPUs at the problem and learn self-driving overnight in the way we can with Go.
Because computers are taking longer than humans to learn how to drive, despite billions invested and vast datasets, it feels plausible to us (i.e., more than 0% likely) that early AGIs will also take as long as humans to learn some tasks, particularly if those tasks cannot afford to spend billions on data acquisition (e.g., swim instructor).
In conclusion, I think it’s totally reasonable to be more optimistic than we are that fast reinforcement learning on nearly all tasks will be solved for AGI by 2043. But I’d caution against presuming a 100.0% probability, which, to me, is what removing this factor from the framework would imply.
You start off saying that existing algorithms are not good enough to yield AGI (and you point to the hardness of self-driving cars as evidence) and fairly likely won’t be good enough for 20 years. And you also claim that existing levels of compute would be way too low to learn to drive even if we had human-level algorithms. Doesn’t each of those factors on its own explain the difficulty of self-driving? How are you also using the difficulty of self-driving to independently argue for a third conjunctive source of difficulty?
Maybe another related question: can you make a forecast about human-level self-driving (e.g. similar accident rates vs speed tradeoffs to a tourist driving in a random US city) and explain its correlation with your forecast about human-level AI overall? If you think full self-driving is reasonably likely in the next 10 years, that superficially appears to undermine the way you are using it as evidence for very unlikely AGI in 20 years. Conversely, if you think self-driving is very unlikely in the next 10 years, then it would be easier for people to update their overall views about your forecasts after observing (or failing to observe) full self-driving.
I think there is significantly more than a 50% chance that there will be human-level self-driving cars, in that sense, within 10 years. Maybe my chance is 80% though I haven’t thought about it hard. (Note that I already lost one bet about self-driving cars: in 2017 my median for # of US cities where a member of the public could hail a self-driving taxi in mid-2023 was 10-20, whereas reality turned out to be 0-1 depending on details of the operationalization in Phoenix. But I’ve won and lost 50-50 bets about technology in both the too-optimistic and too-pessimistic directions, and I’d be happy to bet about self-driving again.)
(Note that I also think this is reasonably likely to be preempted by explosive technological change driven by AI, which highlights an important point of disagreement with your estimate, but here I’m willing to try to isolate the disagreement about the difficulty of full self-driving.)
ETA: let me try to make the point about self-driving cars more sharply. You seem to think there’s a <15% chance that by 2043 we can do what a human brain can do even using 1e17 flops (a 60% chance of “having the algorithms” and a 20% chance of being 3 OOMs better than 1e20 flops). Driving uses quite a lot of the functions that human brains are well-adapted to perform—perception, prediction, planning, control. If we call it one tenth of a brain, that’s 1e16 flops. Whereas I think existing self-driving cars use closer to 1e14 flops. So shouldn’t you be pretty much shocked if self-driving cars could be made to work using any amount of data with so little computing hardware? How can you be making meaningful updates from the fact that they don’t?
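A back-of-envelope restatement of the numbers in this paragraph (Paul’s figures and assumptions, not the paper’s):

```python
p_algorithms = 0.60     # paper's estimate: we invent algorithms for transformative AGI
p_3_oom_cheaper = 0.20  # rough chance brain-equivalence needs ~1e17 rather than ~1e20 flops
print(f"P(brain-equivalent at 1e17 flops by 2043) ~ {p_algorithms * p_3_oom_cheaper:.0%}")  # ~12%, i.e. <15%

brain_equiv_flops = 1e17        # the optimistic brain-equivalent figure above
driving_share = 0.10            # assumption: driving engages ~1/10 of the brain
driving_flops = brain_equiv_flops * driving_share   # ~1e16
deployed_flops = 1e14           # rough figure for today's self-driving stacks
print(f"implied shortfall vs. deployed cars: {driving_flops / deployed_flops:.0f}x")  # ~100x
```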
Here are my forecasts of self-driving from 2018: https://www.tedsanders.com/on-self-driving-cars/
Five years later, I’m pretty happy with how my forecasts are looking. I predicted:
100% that self-driving is solvable (looks correct)
90% that self-driving cars will not be available for sale by 2025 (looks correct)
90% that self-driving cars will debut as taxis years before sale to individuals (looks correct)
Rollout will be slow and done city-by-city, starting in the US (looks correct)
Today I regularly take Cruises around SF, and it seems decently likely that self-driving taxis are on track to be widely deployed across the USA by 2030. Feels pretty probable, but there are still plenty of ways it could be delayed or heterogeneous (e.g., regulation, stalling progress, unit economics).
Plus, even wide robotaxi deployment doesn’t mean human taxi drivers are rendered obsolete. Seems very plausible we operate for many many years with a mixed fleet, where AI taxis with high fixed cost and low marginal cost serve baseload taxi demand while human taxis with lower fixed cost but higher marginal cost serve peak evening and commuter demand. In general it seems likely that as AI gets better we will see more complementary mixing and matching where AIs and humans are partnered to take advantage of their comparative advantages.
What counts as human-level is a bit fuzzier: is that human-level crash rates? human-level costs? human-level ability to deal with weird long tail situations?
On the specific question of random tourist vs self-driving vehicle in a new city, I predict that today the tourist is better (>99%) and that by 2030 I’d still give the edge to the tourist (75%), acknowledging that the closer it gets the more the details of the metric begin to matter.
Overall there’s some connection between self-driving progress and my AGI forecasts.
If most cars and trucks are not self-driving by 2043, then it seems likely that we haven’t achieved human-level human-cost AGI, and that would be a strong negative update.
If self-driving taxis still struggle with crowded parking lots and rare situations by 2030, that would be a negative update.
If self-driving taxis are widely deployed across the US by 2030, but personal self-driving vehicles and self-driving taxis across Earth remain rare, that would be in line with my current worldview. Neutral update.
If self-driving improves and by 2030 becomes superhuman across a wide variety of driving metrics (i.e., not just crash rates, which can be maximized at the expense of route choice, speed, etc.), then that would be a positive update.
If self-driving improves by 2030 to the point that it can quickly learn to drive in new areas with new signage or new rules (i.e., we see a rapid expansion across all countries), that would be unexpected and a strong positive update on AGI.
Excellent argument. Maybe we should update on that, though I find myself resistant. Part of my instinctive justification against updating is that current self-driving AIs, even if they achieve human-ish level crash rates, are still very sub-human in terms of:
Time to learn to drive
Ability to deal with complex situations like crowded parking lots
Ability to deal with rare situations like a firefighter waving you in a particular direction near a burning car
Ability to deal with custom signage
Ability to deal with unmapped routes
Ability to explain why they did what they did
Ability to reason about things related to driving
Ability to deal with heavy snow, fog, inclement weather
It feels quite plausible to me that these abilities could “cost” orders of magnitude of compute. I really don’t know.
Edit: Or, you could make the same argument about walking. E.g., maybe it engages 10% of our brain in terms of spatial modeling and prediction. But then there are all sorts of animals with much smaller brains that are still able to walk, right? So maybe navigation, at a crude level of ability, really needs much less than 10% of human intelligence. After all, we can sleepwalk, but we cannot sleep-reason. :)
This is not a claim we’ve made.
That’s fair, this was some inference that is probably not justified.
To spell it out: you think brains are as effective as 1e20-1e21 flops. I claimed that humans use more than 1% of their brain when driving (e.g. our visual system is large and this seems like a typical task that engages the whole utility of the visual system during the high-stakes situations that dominate performance), but you didn’t say this. I concluded (but you certainly didn’t say) that a human-level algorithm for driving would not have much chance of succeeding using 1e14 flops.
I think you make a good argument and I’m open to changing my mind. I’m certainly no expert on visual processing in the human brain. Let me flesh out some of my thoughts here.
On whether this framework would have yielded bad forecasts for self-driving:
When we guess that brains use 1e20-1e21 FLOPS, and therefore that early AGIs might need 1e16-1e25, we’re not making a claim about AGIs in general, or the most efficient AGI possible, but AGIs by 2043. We expect early AGIs to be horribly inefficient by later standards, and AGIs to get rapidly more efficient over time. AGI in 2035 will be less efficient than AGI in 2042 which will be less efficient than AGI in 2080.
With that clarification, let’s try to apply our logic to self-driving to see whether it bears weight.
Suppose self-driving needs 1% of human brainpower, or 1e18-1e19 FLOPS, and we then similarly widen our uncertainty to 1e14-1e23 FLOPS. The framework might say: yes, we’d be surprised but not stunned at 1e14 FLOPS being enough to drive (10% → 100%). But, and I know my reasoning is motivated here, that actually seems kind of reasonable? Like, for the first decade and change of trying, 1e14 FLOPS actually was not enough to drive. Even now, it’s beginning to be enough to drive, but it is still wildly less sample-efficient than human drivers and wildly worse at generalizing than human drivers. So if in 2010 we had predicted self-driving would take 1e14-1e23 FLOPS, and a time traveler from the future then told us that it was actually 1e14 FLOPS, but that it would take 13 years to get there and would still be subhuman, then honestly that doesn’t feel too shocking. It was the low end of the range, it took many years, and it still didn’t quite match human performance.
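Spelling out that range arithmetic (same numbers as above, purely illustrative): the paper widens its 1e20-1e21 FLOPS brain estimate by roughly four orders of magnitude on each side to get 1e16-1e25 for early AGI; applying the same widening to a 1%-of-a-brain driving estimate gives the 1e14-1e23 range.

```python
# Assumed widening: ~4 orders of magnitude on each side, matching 1e20-1e21 -> 1e16-1e25.
widen = 1e4

driving_low, driving_high = 1e18, 1e19          # ~1% of the 1e20-1e21 brain estimate
widened = (driving_low / widen, driving_high * widen)
print(widened)  # (1e+14, 1e+23)
```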
No doubt with more time and more training 1e14 FLOPS will become more and more capable. Just as we have little doubt that with more time AGIs will require fewer and fewer FLOPS to achieve human performance.
So as I reflect on this framework applied to the test case of self-driving, I come away thinking (a) it actually would have made reasonable predictions and (b) in hindsight I think we should have spent some pages modeling rates of AGI progress, as (obviously) AGI needs are not a fixed target but will decline rapidly over time.
In sum, it’s not obvious to me that this logic would have generated bad predictions for self-driving, and so I’m still unconvinced that we’ve made a big blunder here.
On whether driving takes 1% of the human brain:
I’m going to go way outside my lane here, no pun intended, and I welcome comments from those more informed.
From my very brief Googling, sources say that something like half the brain is involved in vision, though that includes vision+motor, vision+attention, vision+spatial, etc. Going out on a limb, it seems that if one knew that fact alone, they might feel it reasonable to say that if we can solve vision, then AGI will only need twice as much compute, since vision is half of our brainpower. But it feels like we’ve made massive progress in vision (AlexNet, self-driving, etc.) without being on the cusp of AGI. Somehow, even though the brain does lots of vision, these visual identification benchmarks feel far short of AGI.
My feeling is that what makes the human brain superior to today’s AI is its ability to generalize from few examples. To illustrate, suppose you take an AI and a human and you train them over and over to drive a route from A to B. With 1e14 FLOPS and enough training, the AI may be able to eventually outperform the human. But now test the AI and human on route C to D. The human will be wildly better, as their pre-trained world model will have prevented overfitting to the features of route AB. The je ne sais quoi of our intelligence seems to be that we are better able to build models of the world that allow us to more easily reason and generalize to new situations.
To me, getting 1e14 FLOPS to drive a route AB and nothing else is a monumentally different challenge than getting 1e14 FLOPS to drive a route AB, plus any other hypothetical route you throw at it. The first is narrow intelligence. The second is general intelligence. So if we discover that an AI can self-drive route AB with only 1e14 FLOPS, should it be a giant update for how much compute AGI will need? I think it depends: if it’s a narrow brittle overfit AI, then no. If it’s a general (in the sense of general to any road) robust AI, then yes, it should be a big update.
So where does self-driving lie? Well, obviously self-driving cars with 1e14 FLOPS are able to generalize well to all sorts of pre-mapped and pre-simulated routes in a city. But at the same time, they appear to generalize pretty poorly overall—I wouldn’t expect a Waymo vehicle to generalize well to a new city, let alone a new country.
Summarizing, I think the impressiveness of only needing 1e14 FLOPS to drive really depends on how well that driving generalizes. If it can generalize as well as human drivers, yes, that’s a big update that AGI may need less than 1e20-1e21 FLOPS. But today’s self-driving doesn’t cross that bar for me. It’s quite brittle, and so I am less inclined to update.
Put another way, perhaps brittle self-driving takes 0.1% of human brainpower and robust self-driving takes 10% of human brainpower. Or something like that, who knows. It’s really not clear to me that self-driving in general needs 1% of the human brain, if that self-driving generalizes very poorly to new situations.
Lastly, there is typically going to be a tradeoff between training compute and inference compute. If 1e14 FLOPS is enough to self-drive, but only with millions of miles driven and billions of miles simulated, that should be far less impressive than 1e14 FLOPS being enough to drive after only 100 miles of training. For all I know, it may be possible to get a good driver that only uses 1e12 FLOPS if we train it on even more compute and data; e.g., a billion miles of driving and a trillion miles of simulation. But even if that loosely extrapolates to AGI, it’s only useful if we can afford the “brain equivalent” of a billion miles of training. If that’s too costly, then the existence proof of a 1e12 FLOPS driver may be moot in terms of analogizing AGI inference costs. It’s a pretty interesting question, and we certainly could have added more modeling and discussion to our already lengthy essay.
I like that you can interact with this. It makes understanding models so much easier.
Playing with the calculator, I see that the result is driven to a surprising degree by the likelihood that “Compute needed by AGI, relative to a human brain (1e20-1e21 FLOPS)” is <1/1,000x (i.e. the bottom two options).[1]
I think this shows that your conclusion is driven substantially by your choice to hardcode “1e20-1e21 FLOPS” specifically, and then to treat this figure as a reasonable proxy for the computation an AGI would need. (That is, you suggest ~1x as the midpoint for “Compute needed by AGI, relative to… 1e20-1e21 FLOPS”.)
I think it’s also a bit of an issue to call the variable “relative to a human brain (1e20-1e21 FLOPS)”. Most users will read it as “relative to a human brain” while it’s really “relative to 1e20-1e21 FLOPS”, which is quite a specific take on what a human brain is achieving.
I value the fact that you argue for choosing this figure here. However, it seems like you’re hardcoding in confidence that isn’t warranted. Even from your own perspective, I’d guess that including your uncertainty over this figure would bump up the probability by a factor of 2-3, while it looks like other commenters have pointed out that programs seem to use much less computation than we’d predict with a similar methodology applied to tasks computers already do.
This is assuming a distribution on computation centred on ballpark ~100x as efficient in the future (just naively based on recent trends). If putting all weight on ~100x, nothing above 1/1,000x relative compute requirement matters. If putting some weight on ~1,000x, nothing above 1/100x relative compute requirement matters.
My quick rebuttal is the flaw you seem to also acknowledge. These different factors that you calculate are not separate variables. They all likely influence the probabilities of each other. (greater capabilities can give rise to greater scaling of manufacturing, since people will want more of it. Greater intelligence can find better forms of efficiency, which means cheaper to run, etc.) This is how you can use probabilities to estimate almost anything is extremely improbable, as you noted.
Yep, that’s admittedly a risk of a framework like this. We’ve tried our best to not to make that mistake, and have gone to some length explaining why we think we haven’t. If you disagree, please help us by telling us which disjunctive paths you think we’ve missed or which probabilities you think we’ve underestimated.
As we asked in the post:
The primary issue, I guess, is that the normal rules don’t easily apply here. We don’t have good past data to make predictions from, so every new requirement added introduces more complexity (and chaos), which might make the estimate less accurate than using fewer variables. Thinking in terms of “all other factors remaining, what are the odds of x” sounds less accurate, but might be the only way to avoid being consumed by all potential variables. Like, ones you don’t even mention that I could name include “US democracy breaks down”, “AIs hack the grid”, “AIs break the internet/infect every interconnected device with malware”, etc.* You could just keep adding more requirements until your probabilities drop to near 0, because it’ll be difficult to say with much confidence that any of them are <.01 likely to occur, even though a lot of them probably are. It’s probably better to group several constraints together and give a probability that one or more of them occurs (for example: “chance that recession/war/regulation/other slows or halts progress”), rather than trying to assess the likelihood of each one. Ordinarily, this wouldn’t be a problem, but we don’t have any data we could normally work with.
Here’s a brief writeup of some agreements/disagreements I have with the individual constraints.
“We invent algorithms for transformative AGI”
I don’t know how this is only 60%. I’d place >.5 before 2030, let alone 2043. This is just guesswork, but we seem to be one or two breakthroughs away.
“We invent a way for AGIs to learn faster than humans 40%”
I don’t really know what this means, why it’s required, or why it’s so low. I see that the paper mentions humans being sequential learners, which takes years, but AIs don’t seem to work that way. Imagine if GPT4 took years just to learn basic words. AIs also already seem able to learn faster than humans. They currently need more data, but less compute than a human brain. Computers can already process information much faster than a brain. And you don’t even need them to learn faster than humans, since once they learn a task, they can just copy that skill to all other AIs. This is a critical point. A human will spend years in med school just because a senior in the field can’t copy their weights and send them to a grad student.
Also, I’m confused how this is at .4, given that it’s conditional on TAI happening. If you have algorithms for TAI, why couldn’t they also invent algorithms that learn faster than humans? We already see how current AIs can improve algorithmic efficiency (as just one recent example: https://www.deepmind.com/blog/alphadev-discovers-faster-sorting-algorithms). Improving algorithms is probably one of the easiest things a TAI could do, without having to do any physical-world experimentation.
“AGI inference costs drop below $25/hr (per human equivalent) 16%”
I really don’t see how this is 16%. Once an AI is able to obtain a new capability, it doesn’t seem to cost much to reuse that capability. Example: GPT4 was very expensive to train, but it can be used for cents on the dollar afterward. These aren’t mechanical humans; they don’t need to go through repeated training, accumulation of knowledge and expertise, etc. They only need to do it once, and then it just gets copied.
And, like above, if this is conditional on TAI and faster-than-human learning occurring, how is this only at .16? A faster-than-human TAI can (very probably) improve algorithmic efficiency to radically drive down the cost.
“We invent and scale cheap, quality robots 60%”
This is one where infrastructure and regulation can bottleneck things, so I can understand at least why this is low.
“We massively scale production of chips and power 46%”
If we get TAIs, I imagine scaling will continue or else radically increase. We’re already seeing this, and current AIs have much more limited economic potential. We also don’t know if we actually need to keep scaling or not, since (as I mentioned), algorithmic efficiency might make this unimportant.
“We avoid derailment by human regulation 70%”
Maybe?
“We avoid derailment by AI-caused delay 90%”
In the paper, it describes this as “superintelligent but expensive AGI may itself warn us to slow progress, to forestall potential catastrophe that would befall both us and it.”
That’s interesting, but if the AI hasn’t coup’d humanity already, wouldn’t this just fall under ‘regulation derails TAI’? Unless there is some other way progress halts that doesn’t involve regulations or AI coups...
“We avoid derailment from wars (e.g., China invades Taiwan) 70%”
Possible, but I don’t think this would derail things for 20 years. Maybe 5.
“We avoid derailment from pandemics 90%”
The chance of a pandemic also increases along with the chances of TAI (or maybe it goes down, since AI could possibly detect and predict a pandemic much better). This is one of the issues with all of this: everything is so entangled, and it’s not actually that easy to say which way the variables will influence each other. I’m pretty sure it’s not 50/50 which way it goes, so it probably does greatly influence the result.
“We avoid derailment from severe depressions”
Not sure, here. It’s not as though everyone will be going out and buying TPUs with or without economic worries. Not all industries slow or halt, even during a depression. Algorithmic efficiency especially seems unlikely to be affected by this.
Overall, I think the hardware and regulatory constraints are the most likely limiting factors. I’m not that sure about anything else.
*I originally wrote up another AI-related scenario, but decided it shouldn’t be publicly stated at the moment.
I’d have to think more carefully about the probabilities you came up with and the model for the headline number, but everything else you discuss is pretty consistent with my view. (I also did a PhD in post-silicon computing technology, but unlike Ted I went right into industry R&D afterwards, so I imagine I have a less synoptic view of things like supply chains. I’m a bit more optimistic, apparently—you assign <1% probability to novel computing technologies running global-scale AI by 2043, but I put down a full percent!)
The table “Examples transistor improvements from history (not cherry-picked)” is interesting. I agree that the examples aren’t cherry picked, since I had nearly the same list (I decided to leave out lithography and included STI and the CFET on imec’s roadmap), but you could choose different prototype dates depending on what you’re interested in.
I think you’ve chosen a fairly relaxed definition for “prototype”, which is good for making the point that it’s almost certain that the transistors of 2043 will use a technology we already have a good handle on, as far as theoretical performance is concerned.
Another idea would be to follow something like this IRDS table that splits out “early invention” and “focused research”. They use what looks like a stricter interpretation of invention—they don’t explain further or give references, but I suspect they just have in mind more similarity to the eventual implementation in production. (There are still questions about what counts, e.g., 1987 for tri-gate or 1998 for FinFET?) That gives about 10–12 years from focused research to volume production.
So even if some unforeseeable breakthrough is more performant or more easily scalable than what we’re currently thinking about, it still looks pretty tough to get it out by 2043.
(Here’s my submission—I make some similar points but don’t do as much to back them up. The direction is more like “someone should try taking this sort of thing into account”—so I’m glad you did!)
I appreciate the “We avoid derailment by…” sections – I think some forecasts have implicitly overly relied on a “business as usual” frame, and it’s worth thinking about derailment.
TSMC is obviously a market leader, but it seems weird to assume that TAI is infeasible without them? Samsung got to 3nm before TSMC, and that headline is misleading (e.g. they reportedly had poor yields, and the entire labeling system is kind of made up) but my impression is that Samsung is only ~1 generation behind and Intel ~2?
If you had written your report even one year ago you wouldn’t have been able to say that TSMC was responsible for cutting edge GPUs, as Nvidia was using Samsung, right?
I can’t tell how much of your probability mass comes from TSMC in particular versus thinking that e.g. a Taiwan invasion will likely escalate to a global conflict, but if it’s coming from the former, it seems like this factor is overemphasized.
Thanks! We agree that a common mistake by forecasters is to equate low probability of derailment with negligible probability of derailment. The future is hard to predict, and we think it’s worth taking tail risks seriously.
We do not assume TAI is infeasible without TSMC. That would be a terrible reasoning error, and I apologize for giving you that impression.
What we assume is that losing TSMC would likely delay TAI by a handful of years, as it would take:
Time for NVIDIA to bid on capacity from Samsung
Time for Samsung to figure out to what extent it could get out of prior contracted commitments
Time for NVIDIA and Samsung engineers to retune GPU designs for Samsung’s fab design rules
Time to manufacture the masks and put the GPUs into production
Time to iron out early manufacturing and yield issues
Time to build new fabs to absorb the tsunami of demand from TSMC customers (like Apple) and scale up to NVIDIA’s original TSMC volumes
And on top of this, there would be massive geopolitical uncertainty that would slow things like investments into new fabs, as companies wonder whether the conflict will escalate or evaporate (both of which massively change the investment case).
What this might look like in reality will also depend on how close we are to transformative AGI.
Two example scenarios:
Today, for example, NVIDIA is probably not going to outbid Apple (Apple makes ~$10B in PROFIT per month, which would evaporate if they were starved of chips).
Or, imagine it’s 2035 and NVIDIA is worth $10T and the semiconductor industry has been building fabs left and right to fuel the impending AGI boom. In such a world, where NVIDIA is the world’s biggest chip designer, it may already dominate manufacturing on both Samsung and TSMC, meaning that if TSMC goes down, it cannot shift production to Samsung—because it already has production on Samsung.
In any case, we fully agree TAI is feasible without TSMC. But we think losing TSMC delays things by a few years, and if TAI is likely to come in the final 10 or 5 years of this period, then a few years might have a 50/50 shot of delaying things beyond 2043.
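A toy version of that last sentence (hypothetical arrival window, not a model from the paper): if, conditional on being on track, TAGI would otherwise arrive somewhere spread evenly across the final six years before the deadline, a ~3-year delay pushes about half of that probability mass past 2043.

```python
window_years = 6      # hypothetical: arrival spread uniformly over 2038 through 2043
delay_years = 3       # rough delay from losing TSMC
share_pushed_past_2043 = min(1.0, delay_years / window_years)
print(share_pushed_past_2043)  # 0.5, i.e., roughly a 50/50 shot at derailing past 2043
```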
Cool, I agree that, if most of your probability mass is in the final few years before 2043, then a couple-year delay is likely to push you over the 2043 deadline.
One thing I find deeply unconvincing about such a low probability (< 1%) and that does not require expert knowledge is that other ways to slice this would yield much higher estimates.
E.g. it seems difficult to justify a less than 10% probability that there will be really strong pressures to develop AGI and it seems similarly difficult to justify a less than 10% success probability given such an effort and what we now know.
I agree there will be really strong pressures to develop AGI. Already, many research groups are investing billions today (e.g., Google DeepMind, OpenAI, Anthropic). I’d assign 100% probability to this rather than <10%. I guess it depends on how many billions of dollars of investment qualify as “strong pressures.”
Well, our essay is an attempt to forecast the likelihood of success, given what we know.
If you disagree with our estimates, would you care to supply your own? What conditional probabilities do you believe that result in a 10%+ chance of TAGI by 2043?
As I asked in the post:
Thanks for posting this, Ted, it’s definitely made me think more about the potential barriers and the proper way to combine probability estimates.
One thing I was hoping you could clarify: In some of your comments and estimates, it seems like you are suggesting that it’s decently plausible(?)[1] we will “have AGI“ by 2043, it’s just that it won’t lead to transformative AGI before 2043 because the progress in robotics, semiconductors, and energy scaling will be too slow by 2043. However, it seems to me that once we have (expensive/physically-limited) AGI, this should be able to significantly help with the other things, at least over the span of 10 years. So my main question is: Does your model attach significantly higher probabilities to transformative AGI by 2053? Is it just that 2043 is right near the base of a rise in the cumulative probability curve?
I wasn’t clear if this is just 60%, or 60%*40%, or what. If you could clarify this, that would be helpful!
Agree that:
The odds of AGI by 2043 are much, much higher than transformative AGI by 2043
AGI will rapidly accelerate progress toward transformative AGI
The odds of transformative AGI by 2053 are higher than by 2043
We didn’t explicitly forecast 2053 in the paper, just 2043 (0.4%) and 2100 (41%). If I had to guess without much thought I might go with 3%. It’s a huge advantage to get 10 extra years to build fabs, make algorithms efficient, collect vast training sets, train from slow/expensive real-world feedback, and recover from rare setbacks.
My mental model is some kind of S curve: progress in the short term is extremely unlikely, progress in the medium term is more likely, and after a while, the longer it takes to happen, the less likely it is to happen in any given year, as that suggests some ingredient is still missing and hard to get.
I think you may be right that twenty years is before the S of my S curve really kicks in. Twenty just feels so short with everything that needs to be solved and scaled. I’m much more open-minded about forty.
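One way to put numbers on that S-curve intuition (purely illustrative, not the paper's model): fit a logistic curve through the two published points, 0.4% by 2043 and 41% by 2100, and read off 2053.

```python
import numpy as np

# Illustrative: a logistic S-curve constrained to pass through the paper's
# two published points (0.4% by 2043, 41% by 2100).
t1, p1 = 2043, 0.004
t2, p2 = 2100, 0.41

logit = lambda p: np.log(p / (1 - p))
s = (t2 - t1) / (logit(p2) - logit(p1))   # scale: years per logit unit
m = t1 - s * logit(p1)                     # midpoint year of the S-curve

cdf = lambda t: 1 / (1 + np.exp(-(t - m) / s))
print(f"implied midpoint ~{m:.0f}, implied P(TAGI by 2053) ~ {cdf(2053):.1%}")
```

This particular parameterization puts the midpoint around 2104 and implies roughly 1% by 2053, a bit below the off-the-cuff 3% above; the point is only that a late-midpoint S-curve leaves 2043 well before the steep part.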
Interesting. Perhaps we have quite different interpretations of what AGI would be able to do with some set of compute/cost and time limitations. I haven’t had the chance yet to read the relevant aspects of your paper (I will try to do so over the weekend), but I suspect that we have very cruxy disagreements about the ability of a high-cost AGI—and perhaps even pre-general AI that can still aid R&D—to help overcome barriers in robotics, semiconductor design, and possibly even aspects of AI algorithm design.
Just to clarify, does your S-curve almost entirely rely on base rates of previous trends in technological development, or do you have a component in your model that says “there’s some X% chance that conditional on the aforementioned progress (60% * 40%) we get intermediate/general AI that causes the chance of sufficiently rapid progress in everything else to be Y%, because AI could actually assist in the R&D and thus could have far greater returns to progress than most other technologies”?
No, it's not just extrapolating base rates (that would be a big blunder). We assume that the development of proto-AGI or AGI will rapidly accelerate progress and investment, and our conditional forecasts are much more optimistic about progress than they would be otherwise.
However, it's totally fair to disagree with us on the degree of that acceleration. Even with superhuman AGI, for example, I don't think we're moving away from semiconductor transistors in less than 15 years. Of course, it really depends on how superhuman this superhuman intelligence would be. We discuss this more in the essay.
Your probabilities are not independent; your estimates mostly flow from a world model which seems to me to be flatly and clearly wrong.
The plainest examples are the low probabilities assigned to AGI learning speed and inference cost, despite current models learning vastly faster than humans (the training time of LLMs is not a human lifetime, and covers vastly more data), current models nearing AGI, and inference being dramatically cheaper and plummeting with algorithmic improvements. There is a general factor of progress, where progress leads to more progress, which you seem to be missing in the positive factors. For the negative factors, derailment that delays things enough to push us out that far would need to be extreme, on the order of an all-out nuclear exchange, given more reasonable models of progress.
I’ll leave you with Yud’s preemptive reply:
Taking a bunch of numbers and multiplying them together causes errors to stack, especially when those errors are correlated.
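To see why correlation matters here, a minimal sketch (the Beta marginals, the concentration parameter, and the correlation level are all assumptions for illustration): when the errors in the ten factors are positively correlated, the expected value of the product sits above the naive product of the point estimates.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
point = np.array([0.60, 0.40, 0.16, 0.60, 0.46, 0.70, 0.90, 0.70, 0.90, 0.95])
k = 10  # illustrative concentration of each Beta marginal (mean = point estimate)

def mean_product(rho, n=100_000):
    """Average product of the ten factors under a Gaussian copula with correlation rho."""
    d = len(point)
    cov = np.full((d, d), rho) + (1 - rho) * np.eye(d)
    z = rng.multivariate_normal(np.zeros(d), cov, size=n)
    u = stats.norm.cdf(z)
    x = np.column_stack([
        stats.beta.ppf(u[:, i], point[i] * k, (1 - point[i]) * k) for i in range(d)
    ])
    return x.prod(axis=1).mean()

print(f"product of point estimates:  {point.prod():.4f}")
print(f"independent errors (rho=0):  {mean_product(0.0):.4f}")  # ~ same as above
print(f"correlated errors (rho=0.8): {mean_product(0.8):.4f}")  # noticeably higher
```

None of these numbers are claims about the true correlation structure; the sketch only shows the direction of the bias when a "general factor of progress" links the errors.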
Some models learning some things faster than humans does not imply AGI will learn all things faster than humans. Self-driving cars, for example, are taking much longer to learn to drive than teenagers do.
Disagree with the example. Human teenagers spend quite a few years learning object recognition and other skills necessary for driving before they ever drive, and I'd bet at good odds that an end-to-end training run of a self-driving car network is shorter than even the driving lessons a teenager goes through to become proficient at a similar level to the car. Designing the training framework, no, but the comparator there is evolution's millions of years, so that doesn't buy you much.
The end-to-end training run is not what makes learning slow. It's the iterative reinforcement learning process of deploying in an environment, gathering data, training on that data, and then redeploying with a new data collection strategy, and so on. It's a mistake, I think, to focus only on the narrow task of updating model weights and omit the critical task of iterative data collection (i.e., reinforcement learning).
I don’t think this is necessarily the right metric, for the same reason that I think the following statement doesn’t hold:
Basically, while the contest rules do say, “By ‘AGI’ we mean something like ‘AI that can quickly and affordably be trained to perform nearly all economically and strategically valuable tasks at roughly human cost or less’”, they then go on to clarify, “What we’re actually interested in is the potential existential threat posed by advanced AI systems.” I think the natural reading of this definition is that AGI that caused (or severely threatened to cause) human extinction or the permanent disempowerment of humanity would qualify as TAI, and I think my interpretation would also be more consistent with the common definition that TAI is “AI having an impact at least as large as the Industrial Revolution.” Further, I think expensive superhuman AGI would threaten to cause an existential catastrophe in a way that would qualify it under my interpretation.
If we then look at your list, under my interpretation, we no longer have to worry about “AGI inference costs drop below $25/hr (per human equivalent)”, nor “We invent and scale cheap, quality robots”, and possibly not others as well (such as “We massively scale production of chips and power”). If we just ignore those two cruxes (and assume your other numbers hold), then we're up to ~4%. If we further ignore the one about chips & power, then we're up to ~9%.
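For reference, the arithmetic of dropping factors from the headline 0.4% (using the paper's published point estimates):

```python
# Start from the paper's 0.4% joint estimate and divide out the factors
# that are being set to 100% under this interpretation.
joint = 0.004           # overall estimate
inference_cost = 0.16   # "AGI inference costs drop below $25/hr"
cheap_robots = 0.60     # "We invent and scale cheap, quality robots"
chips_power = 0.46      # "We massively scale production of chips and power"

without_two = joint / (inference_cost * cheap_robots)
without_three = without_two / chips_power
print(f"ignoring inference cost and robots: {without_two:.1%}")   # ~4.2%
print(f"also ignoring chips & power:        {without_three:.1%}") # ~9.1%
```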
It feels like you're double counting a lot of the categories of derailment at first glance. There's a highly conjunctive story behind each of the derailments that makes me suspicious of multiplying them together as if they're independent. I'm also confused about how you're calculating the disjunctive probabilities, because on page 78 you write, “Conditional on being on a trajectory to transformative AGI, we forecast a 40% chance of severe war erupting by 2042.” This doesn't seem to be an argument for derailment; it seems more likely to be an argument for race dynamics increasing.
To me, the odds of pandemics, wars, and regulation feel decently independent, but perhaps I haven't thought deeply enough about pandemics causing depressions causing wars, or wars leading to engineered pandemics being released, etc.
Looking at the past 100ish years of history, the worst wars (World Wars I and II), the worst pandemics (various flus, COVID-19), and the worst recessions (the Great Depression, the Great Recession) all seem fairly independent.
In any case, we tried our best to come up with probabilities of each, conditional on the others not occurring.
As we asked in our post: what probabilities would you assign?
Yeah, I think I would just bin all of the delays into one bucket such that they are not independent. For instance, the events in the causal chain of WWI, the Great Depression, and WWII seem quite contingent upon one another. I'll chew on how the binning works, but nonetheless I really appreciate this piece of work; it's easy to read and understand, as well as internally well reasoned. Didn't mean to come off too harsh.
Not harsh at all; I genuinely appreciate the discussion. If there are good criticisms of our approach, I hope that we absorb them into an improved model rather than entrenching ourselves against them.
The issue I see with grouping these factors is then figuring out what forecast to make for the collective group. The intuitive approach I'd take is to look at the historical rates of pandemics, world wars, etc. So it feels like we'd still be basing the estimate on mostly independent considerations, even if we smush the final product together at the end.
Seems like a tricky forecasting problem in general. You don’t want a model with too many finicky specific scenarios, but you also don’t want amorphous uninterpretable blobs that arise from irreversibly blending many ingredients together.
A model with 1,000 parameters isn’t going to convince anyone and neither will a model with just 1. We tried to keep to a manageable range of 10 overall factors, backed by a few dozen subfactors. But definitely room to move in either direction.
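To make the binning question concrete: treated independently, the paper's five "avoid derailment" factors jointly imply a sizable chance of at least one derailment.

```python
# The five "avoid derailment" factors (regulation, AI-caused delay, war,
# pandemic, severe depression), multiplied as in the paper.
avoid = [0.70, 0.90, 0.70, 0.90, 0.95]
p_clear = 1.0
for p in avoid:
    p_clear *= p
print(f"P(no derailment at all)    = {p_clear:.1%}")      # ~37.7%
print(f"P(at least one derailment) = {1 - p_clear:.1%}")  # ~62.3%
```

Binning the delays into one bucket amounts to asking whether that ~62% figure should move up or down once the events are allowed to cause one another.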
Cool! I mostly like your decomposition/framing. A major nitpick is that robotics doesn’t matter so much: dispatching to human actuators is probably cheap and easy, like listing mturk jobs or persuasion/manipulation.
Agreed. AGI can have great influence in the world just by dispatching humans.
But by the definition of transformative AGI that we use—i.e., that AGI is able to do nearly all human jobs—I don't think it's fair to equate “doing a job” with “hiring someone else to do the job.” To me, it would be a little silly to say “all human work has been automated” and only mean “the CEO is an AGI, but yeah, everyone still has to go to work.”
Of course, if you don’t think robotics is necessary for transformative AGI, then you are welcome to remove the factor (or equivalently set it to 100%). In that case, our prediction would still be <1%.
While I agree with many of the object-level criticisms that various priors seem out of touch with the current state of ML, I would like to instead make precise a certain obvious flaw in the methodology of the paper, which was pointed out several times and which you seem to be unjustifiably dismissive of.
tl;dr: when doing Bayesian inference, it is crucial to be cognizant that, regardless of how certain your priors are, the more conditional steps involved in your model, the less credence you should give to the overall prediction.
As for the case at hand, it is very natural to assign, instead of a single number, a distribution over time for when transformative AGI will be reached.
You will then find that as you dissect the prediction into more individual prior guesses, the mean of the overall prediction tends to go down, whereas the variance of the overall prediction tends to go up (the case of normal distributions is very instructive here).
So generally, when dissecting an estimate of a probability into atomic guesses as you did, you should be cognizant that, with enough steps, you can cause the variance of your overall prediction to blow up while keeping the variance of each of your individual priors fixed.
Regardless of how confident you are in your priors, you should be quite skeptical of the overall <1% estimate, as it most likely fails to account for this variance.
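For what it's worth, here is a minimal sketch of how per-step uncertainty propagates into the product (the Beta distributions and the concentration parameter are assumptions for illustration, not anything from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# The ten point estimates whose product is the headline ~0.4%.
point = np.array([0.60, 0.40, 0.16, 0.60, 0.46, 0.70, 0.90, 0.70, 0.90, 0.95])

# Hypothetical estimation error: each factor drawn independently from a Beta
# distribution whose mean equals the point estimate; k sets how tight each guess is.
k = 10
samples = np.column_stack([rng.beta(p * k, (1 - p) * k, 100_000) for p in point])
product = samples.prod(axis=1)

print(f"product of point estimates: {point.prod():.4f}")
print(f"mean of sampled products:   {product.mean():.4f}")     # ~equal, by independence
print(f"median of sampled products: {np.median(product):.4f}")  # noticeably lower
print(f"5th-95th percentile: {np.percentile(product, 5):.4f} to {np.percentile(product, 95):.4f}")
```

Under independence the mean of the product still equals the product of the point estimates; what blows up as you add steps is the spread and the skew, with the median falling well below the mean.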
Model error higher than 1%?
Three questions for you that would help us improve our model:
What important error do you think is made by our model?
What modification would you propose to address the error?
What impact do you think your modification would have on the resultant forecast?
I think he’s asking if your margin of error is >.01
What is a margin of error, here, exactly?
The event will either happen (1) or not (0). The 0.4% already reflects our uncertainty. In general, I don’t think it makes mathematical sense to discuss probabilities of probabilities.*
*although of course it can make sense to describe sensitivities of probabilities to new information coming in
It would be far, far higher, of course! With that many variables? Think about the uncertainty we ascribe to cost-effectiveness analyses with far fewer variables and far better evidence. Even calculating the error here would be close to impossible.
95% confidence interval 0.1% to 50%? (Kind of joking here, but it might be in that range)
Confidence intervals over probabilities don’t make much sense to me. The probability itself is already the confidence interval over the binary domain [event happens, event doesn’t happen].
I guess to me the idea of confidence intervals over probabilities implies two different kinds of probabilities. E.g., a reducible flavor and an irreducible flavor. I don’t see what a two-tiered system of probability adds, exactly.
This was an extensive debate in the 1980s and 90s between Judea Pearl, Dempster–Shafer theorists, and a few others. I think it's trivially true, however, that even in the probability-centric view you espouse, it can be helpful to track second-order uncertainty, and reducible versus irreducible uncertainty is critical for VoI analysis.
What is VoI analysis?
Value of Information
Here’s my brief intro post about it:
https://forum.effectivealtruism.org/posts/8w2hNT5WtDMzoaGuy/when-to-find-more-information-a-short-explanation
And for more on the debates about second-order probabilities and confidence intervals, and why Pearl says you don't need them (you should just use a Bayesian network), see his paper here: https://core.ac.uk/download/pdf/82281071.pdf
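As a minimal illustration of why reducible versus irreducible uncertainty matters for VoI (all numbers hypothetical): two forecasters can both report 0.4%, yet only the one whose uncertainty is partly reducible can benefit from further research.

```python
# Hypothetical decision rule: act only if P(TAGI by 2043) exceeds 5%.
threshold = 0.05
point_estimate = 0.004

# Case A: purely irreducible uncertainty. No amount of research moves the
# estimate, so the decision never flips and the value of information is zero.

# Case B: partly reducible uncertainty. Suppose research would reveal whether
# the "true" probability is high or low:
p_high, p_low = 0.20, 0.001
w = (point_estimate - p_low) / (p_high - p_low)   # prior weight on the high state
assert abs(w * p_high + (1 - w) * p_low - point_estimate) < 1e-12
assert p_high > threshold > point_estimate

# With probability w (~1.5%), research reveals p_high > threshold and flips the
# decision from "don't act" to "act": positive value of information, even
# though both cases share the exact same 0.4% point estimate today.
print(f"prior weight on the high state: {w:.3f}")
print(f"P(research flips the decision): {w:.1%}")
```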