XPT forecasts on (some) Direct Approach model inputs

This post was co-authored by the Forecasting Research Institute and Rose Hadshar. Thanks to Josh Rosenberg for managing this work, Zachary Jacobs and Molly Hickman for the underlying data analysis, Kayla Gamin for fact-checking and copy-editing, and the whole FRI XPT team for all their work on this project. Special thanks to staff at Epoch for their feedback and advice.

Summary

  • Superforecaster and expert forecasts from the Existential Risk Persuasion Tournament (XPT) differ substantially from Epoch’s default Direct Approach model inputs on algorithmic progress and investment:

| Input | Epoch (default) | XPT superforecaster | XPT expert[1] | Notes |
| --- | --- | --- | --- | --- |
| Baseline growth rate in algorithmic progress (OOM/year) | 0.21-0.65 | 0.09-0.20 | 0.15-0.23 | Epoch: 80% confidence interval (CI). XPT: 90% CI,[2] based on 2024-2030 forecasts. |
| Current spending ($, millions) | $60 | $35 | $60 | Epoch: 2023 estimate. XPT: 2024 median forecast.[3] |
| Yearly growth in spending (%) | 34%-91.4% | 6.4%-11% | 5.7%-19.5% | Epoch: 80% CI. XPT: 90% CI,[4] based on 2024-2050 forecasts. |

  • Note that there are no XPT forecasts relating to other inputs to the Direct Approach model, most notably the compute requirements parameters.

  • Taking the Direct Approach model as given and using relevant XPT forecasts as inputs where possible leads to substantial differences in model output:

| Output | Epoch default inputs | XPT superforecaster inputs | XPT expert inputs |
| --- | --- | --- | --- |
| Median TAI arrival year | 2036 | 2065 | 2052 |
| Probability of TAI by 2050 | 70% | 38% | 49% |
| Probability of TAI by 2070 | 76% | 53% | 65% |
| Probability of TAI by 2100 | 80% | 66% | 74% |

Note that regeneration affects model outputs, so these results can’t be replicated directly, and the TAI probabilities presented here differ slightly from those in Epoch’s blog post.[5] Figures given here are the average of 5 regenerations.

  • Epoch is drawing on recent research which was not available at the time the XPT forecasters made their forecasts (the XPT closed in October 2022).

  • Most of the difference in outputs comes down to differences in forecasts on baseline growth rate in algorithmic progress and yearly growth in spending, where XPT forecasts differ radically from the Epoch default inputs (which extrapolate historical trends).

  • XPT forecasters’ all-things-considered transformative artificial intelligence (TAI) timelines are much longer than those which the Direct Approach model outputs using XPT inputs:

| Source of 2070 forecast | XPT superforecaster | XPT expert |
| --- | --- | --- |
| Direct Approach model | 53% | 65% |
| XPT postmortem survey question on probability of TAI[6] by 2070 | 3.75% | 16% |

  • If you buy the assumptions of the Direct Approach model, and XPT forecasts on relevant inputs, this pushes timelines out by two to three decades compared with the default Epoch inputs.

    • However, it still implies a greater than 50% chance of TAI by 2070.

  • It seems very likely that XPT forecasters would not buy the assumptions of the Direct Approach model: their explicitly stated probabilities on TAI by 2070 are <20%.

Introduction

This post:

  • Compares Direct Approach inputs with XPT forecasts on algorithmic progress and investment, and shows how the differences in forecasts impact the outputs of the Direct Approach model.

  • Discusses why Epoch’s inputs and XPT forecasts differ.

  • Notes that XPT forecasters’ all-things-considered TAI timelines are longer than those which the Direct Approach model outputs using XPT inputs.

  • Includes an appendix on the arguments given by Epoch and in the XPT for their respective forecasts.

Background on the Direct Approach model

In May 2023, researchers at Epoch released an interactive Direct Approach model, which models the probability that TAI arrives in a given year. The model relies on:

  • An estimate of the compute required for TAI, based on extrapolating neural scaling laws.

  • Various inputs relating to algorithmic progress, investment and compute.

Epoch’s default inputs produce a model output of a 70% probability that TAI arrives by 2050, and a median TAI arrival year of 2036.[7] Note that these default inputs are based on extrapolating historical trends, and do not represent the all-things-considered view of Epoch staff.[8]

(The Direct Approach model is similar to Cotra’s biological anchors model, except that it uses scaling laws to estimate compute requirements rather than using biological anchors. It also incorporates more recent data for its other inputs, as it was made after Cotra’s model. See here for a comparison of XPT forecasts with Cotra’s model inputs.)
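
To make this structure concrete, here is a deliberately simplified sketch of how a model of this kind combines its inputs: effective training compute grows each year through increased spending, improving hardware price-performance, and algorithmic progress, and “TAI arrives” once it crosses a sampled compute requirement. This is an illustration only, not Epoch’s implementation; the FLOP-per-dollar figure, the compute-requirement range, the hardware-improvement range, and the uniform distributions are all placeholder assumptions.

```python
# Illustrative sketch only; NOT Epoch's implementation. It shows the general shape of a
# Direct-Approach-style calculation: effective training compute grows via (i) spending growth,
# (ii) hardware price-performance gains, and (iii) algorithmic progress, and "TAI arrives"
# once it exceeds a sampled compute requirement. All constants and ranges are placeholders.

import math
import random

def sample_arrival_year(
    current_spend_usd: float,      # the "current spending" input, in dollars
    spend_growth: float,           # yearly growth in spending, e.g. 0.6 for 60%/year
    flop_per_dollar: float,        # current hardware price-performance (placeholder)
    hardware_oom_per_year: float,  # OOM/year improvement in FLOP/$ (placeholder)
    algo_oom_per_year: float,      # baseline algorithmic progress, OOM/year
    log10_flop_required: float,    # sampled compute requirement for TAI (placeholder)
    start_year: int = 2023,
) -> int:
    # "Effective" compute: physical FLOP purchasable, boosted by algorithmic progress.
    log10_effective = math.log10(current_spend_usd * flop_per_dollar)
    year = start_year
    while log10_effective < log10_flop_required and year < 2200:
        year += 1
        log10_effective += (
            math.log10(1 + spend_growth) + hardware_oom_per_year + algo_oom_per_year
        )
    return year

# Monte Carlo over uncertain inputs (uniform placeholder distributions).
years = sorted(
    sample_arrival_year(
        current_spend_usd=6e7,
        spend_growth=random.uniform(0.34, 0.914),
        flop_per_dollar=1e18,                        # placeholder
        hardware_oom_per_year=random.uniform(0.1, 0.3),
        algo_oom_per_year=random.uniform(0.21, 0.65),
        log10_flop_required=random.uniform(27, 40),  # placeholder
    )
    for _ in range(10_000)
)
print("Median arrival year:", years[len(years) // 2])
print("P(arrival by 2050):", sum(y <= 2050 for y in years) / len(years))
```

The real model differs in many ways (for example, calibrated input distributions and a scaling-law-based compute requirement); the point here is only the overall structure of compounding growth against an uncertain threshold.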

Background on the Existential Risk Persuasion Tournament (XPT)

In 2022, the Forecasting Research Institute (FRI) ran the Existential Risk Persuasion Tournament (XPT). Over the course of 4 months, 169 forecasters, including 80 superforecasters and 89 experts, forecasted on various questions related to existential and catastrophic risk. Forecasters moved through a four-stage deliberative process that was designed to incentivize them not only to make accurate predictions but also to provide persuasive rationales that boosted the predictive accuracy of others’ forecasts. Forecasters stopped updating their forecasts on 31st October 2022, and are not currently updating on an ongoing basis. FRI hopes to run future iterations of the tournament.

You can see the results from the tournament overall here, results relating to AI risk here, and to AI timelines in general here.

Comparing Direct Approach inputs and XPT forecasts

Some of the XPT questions relate directly to some of the inputs to the Direct Approach model. Specifically, there are XPT questions which relate to Direct Approach inputs on algorithmic progress and investment:[9]

| XPT question | Comparison | Input to Direct Approach model |
| --- | --- | --- |
| 46. How much will be spent on compute in the largest AI experiment by the end of 2024, 2030, 2050? | Comparison of median XPT 2024 forecasts with Direct Approach 2023 estimate | Current spending (The dollar value, in millions, of the largest reasonable training run in 2023.) |
| 46. (as above) | Inferred annual spending growth between median 5th and 95th percentile XPT forecasts for 2024 and 2050, compared with Epoch 80% CI | Yearly growth in spending (%) (How much the willingness to spend on potentially-transformative training runs will increase, each year.) |
| 48. By what factor will training efficiency on ImageNet classification have improved over AlexNet by the end of 2024, 2030? | Inferred annual growth rate between median 5th and 95th percentile XPT forecasts for 2024 and 2030, compared with Epoch 80% CI | Baseline growth rate (The yearly improvement in language and vision algorithms, expressed as an order of magnitude.) |
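
To make the “inferred annual growth” comparisons in the table above concrete, here is a minimal sketch of the calculation for each question, assuming simple compound growth between the two endpoint years. The input values are hypothetical placeholders, not the actual XPT percentile medians.

```python
# Minimal sketch of the "inferred annual growth" calculations, assuming compound growth
# between two endpoint forecasts. Input values are hypothetical placeholders, not actual
# XPT percentile medians.

import math

def annual_growth_pct(value_start: float, value_end: float, years: int) -> float:
    """Compound annual growth rate, in percent per year, implied by two endpoint values."""
    return ((value_end / value_start) ** (1 / years) - 1) * 100

def annual_growth_oom(factor_start: float, factor_end: float, years: int) -> float:
    """Annual growth in orders of magnitude per year (the unit of the baseline growth rate)."""
    return math.log10(factor_end / factor_start) / years

# Question 46-style: spending on the largest training run, 2024 vs 2050 (placeholder values).
spend_2024, spend_2050 = 35e6, 300e6
print(f"Implied spending growth: {annual_growth_pct(spend_2024, spend_2050, 2050 - 2024):.1f}% per year")

# Question 48-style: training-efficiency improvement over AlexNet, 2024 vs 2030 (placeholder values).
eff_2024, eff_2030 = 200.0, 1500.0
print(f"Implied algorithmic progress: {annual_growth_oom(eff_2024, eff_2030, 2030 - 2024):.2f} OOM per year")
```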

Caveats and notes

It is important to note that there are several limitations to this analysis:

  • Outputs from the Direct Approach model using some XPT inputs do not reflect the overall views of XPT forecasters on TAI timelines.

    • Based on commentary during the XPT, it’s unlikely that XPT forecasters would accept the assumptions of the Direct Approach model, or agree with the inputs for which there were no relevant XPT forecasts.

    • In the postmortem survey we ran at the end of the XPT, superforecasters predicted a 3.75% chance of TAI by 2070 and experts predicted a 16% chance. Both of these forecasts are much lower than the corresponding ~54% and ~64% outputted by the Direct Approach model using the respective XPT forecasts as inputs.

  • None of the XPT forecasts are of exactly the same inputs used in the Direct Approach model.

  • We chose to present results for all experts because the sample sizes were bigger than those of domain experts (14-19 compared to 5-7 domain experts). However, we expect readers will vary in how much weight they want to put on the forecasts of domain experts vs. general x-risk experts vs. non-domain experts on these questions. For details on each subgroup’s forecasts on these questions, see Appendix 5 here, where you can navigate to each question to see each subgroup’s forecast.

The forecasts

| Input | Epoch default | XPT superforecaster | XPT expert | Notes |
| --- | --- | --- | --- | --- |
| Baseline growth rate (OOM/year) | 0.35-0.75 | 0.09-0.20 | 0.15-0.23 | Epoch: 80% CI. XPT: 90% CI,[10] based on 2024-2030 forecasts. |
| Current spending ($, millions) | $60 | $35 | $60 | Epoch: 2023 estimate. XPT: 2024 median forecast. |
| Yearly growth in spending (%) | 34%-91.4% | 6.4%-11% | 5.7%-19.5% | Epoch: 80% CI. XPT: 90% CI,[11] based on 2024-2050 forecasts. |
| Median TAI arrival year (according to the Epoch Direct Approach model) | 2036 | 2065 | 2052 | |

Note that regeneration affects model outputs, so these results can’t be replicated directly. Figures given here are the average of 5 regenerations.

See workings here.

What drives the differences between Epoch’s inputs and XPT forecasts?

Across the relevant inputs, Epoch is drawing on recent research which was not available at the time the XPT forecasters made their forecasts (the tournament closed in October 2022). Epoch doesn’t cite arguments for these inputs beyond the particular pieces of research below, so it’s hard to say what drives the disagreement beyond access to more recent research and differences in question formulation.

Specifically:

  • On algorithmic progress, Epoch draws on a December 2022 analysis of the historical rate of algorithmic progress.

  • On current spending, Epoch bases their input on an estimate of GPT-4’s training cost which was produced in 2023. At the time XPT forecasters stopped forecasting (October 2022), it was unclear how much was being spent on GPT-4.

  • On yearly growth in spending, Epoch draws on a 2023 analysis of the historical cost of compute for the final training run of ML systems.

The single biggest factor driving differences in outputs is yearly growth in spending, closely followed by baseline growth rate in algorithmic progress. It is noteworthy that:

  • The XPT predictions on both parameters differ radically from the Epoch default inputs. For both of these parameters, the XPT 90% CI (both for superforecasters and experts) does not overlap with the default Epoch 80% CI.

  • Epoch’s default inputs are based on extrapolating historical growth trends, so the XPT forecasts imply significantly slower growth in spending and algorithmic progress than has recently been the case.

On current spending estimates:

  • If Epoch’s estimate that GPT-4 cost $50m to train in August 2022 is correct, then the XPT superforecaster forecast of $35m for 2024 was already substantially too low when it was made in October 2022.

  • That said, the current spending estimates have very little impact on outputs of the Direct Approach model, so the accuracy of these forecasts is not very material for considering what the Direct Approach model and XPT forecasts combined imply.

See here for some analysis showing how much changing just one input altered the model output. Note that because regeneration alters model outputs, these results cannot be directly replicated.

XPT forecasters’ all-things-considered view on TAI timelines

As we mentioned above, this analysis takes the Direct Approach model and most of Epoch’s original inputs as a given, and uses XPT forecasts for particular inputs. It cannot be read as a statement of XPT forecasters’ all-things-considered view on TAI.

In fact, from questions in the XPT postmortem survey, we know that XPT forecasters’ all-things-considered TAI timelines are longer than this analysis of the Direct Approach model suggests.

XPT forecasters made the following explicit predictions in the postmortem survey:

  • Probability of AGI by 2070: 13% (superforecasters), 25% (experts)

    • “Artificial general intelligence is defined here as any scenario in which cheap AI systems are fully substitutable for human labor, or if AI systems power a comparably profound transformation (in economic terms or otherwise) as would be achieved in such a world.”

  • Probability of TAI by 2070: 3.75% (superforecasters), 16% (experts)

    • “Transformative AI is defined here as any scenario in which global real GDP during a year exceeds 115% of the highest GDP reported in any full prior year.”

  • Probability of >15% GWP growth by 2100: 3% (superforecasters), 10% (experts)

    • “By 2100, will the global real GDP in a year ever exceed 115% of the highest GDP reported in any full prior year?”

The output of the Direct Approach model using XPT inputs is more aggressive than XPT forecasters’ overall views. Subsetting XPT forecasts to those forecasters who responded to the postmortem survey for comparability, the Direct Approach model outputs:

  • ~54% by 2070, and ~66% by 2100 for XPT superforecasters

  • ~64% by 2070, and ~74% by 2100 for XPT experts

Note that:

  • XPT forecasters think AGI is considerably more likely than TAI by 2070.

  • XPT forecasters’ views appear inconsistent.

    • ~26% of superforecasters put the probability of AGI by 2070 at 50% or higher, but ~38% agreed or strongly agreed that AGI will arise by the end of 2072. Similarly, ~36% of experts put the probability of AGI by 2070 at 50% or higher, but ~61% agreed or strongly agreed that AGI will arise by the end of 2072.

    • Superforecasters predict a 3% chance of >15% growth by 2100,[12] and a 3.75% chance of TAI (defined as >15% growth) by 2070, even though the probability of such growth occurring by 2100 should be at least as high as the probability of it occurring by 2070.

      • Experts predict a 10% chance of >15% growth by 2100,[13] and a 16% chance of TAI by 2070, so their views are even less coherent on this question.

Appendix A: Arguments made for different forecasts

Both Epoch and the XPT forecasters gave arguments for their forecasts.

In Epoch’s case, the arguments are put forward directly in the relevant sections of the Direct Approach post.

In the XPT case:

  • During the tournament, forecasters were assigned to teams.

  • Within teams, forecasters discussed and exchanged arguments in writing.

  • Each team was asked to produce a ‘rationale’ summarising the arguments raised in team discussion.

  • The rationales from different teams were summarised on each relevant XPT question.

The footnotes for XPT rationale summaries contain direct quotes from XPT team rationales.

Algorithmic progress

| Input | Epoch | XPT superforecaster | XPT expert |
| --- | --- | --- | --- |
| Baseline growth rate (OOM/year) | 0.21-0.65 | 0.09-0.20 | 0.15-0.23 |

Direct Approach arguments

  • “Erdil and Besiroglu (2022) estimate that the historical rate of algorithmic progress in computer vision models has been 0.4 orders of magnitude per year (80% CI: 0.35 to 0.75).”[14]

XPT arguments

General comments:

  • Above-median forecasts extrapolate current growth rates; median and below-median forecasts assume that current growth rates will slow.[15]

Arguments for slower algorithmic progress (further from Epoch’s estimate):

  • It’s possible no further work will be done in this area such that no further improvements are made.[16]

  • Recently the focus has been on building very large models rather than increasing efficiency.[17]

  • There may be hard limits on how much computation is required to train a strong image classifier.[18]

  • Accuracy may be more important for models given what AI is used for, such that leading researchers target accuracy rather than efficiency gains.[19]

  • If there is a shift towards explainable AI, this may require more compute and so slow efficiency growth rates.[20]

  • Improvements may not be linear, especially as past improvements have been lumpy and the reference source is only rarely updated.[21]

  • Very high growth rates are hard to sustain and tend to revert to the mean.[22]

Arguments for faster algorithmic progress (closer to Epoch’s estimate):

  • Pure extrapolation of improvements to date.[23]

  • Quantum computing might increase compute power and speed.[24]

  • As AI models grow and become limited by available compute, efficiency will become increasingly important and necessary for improving accuracy.[25]

  • “The Papers with Code ImageNet benchmark sorted by GFLOPs shows several more recent models with good top 5 accuracy and a much lower GFLOPs used than the current leader, EfficientNet.” If GFLOPS is a good indicator of training efficiency, then large efficiency increases may already have been made.[26]

  • This technology is in its infancy so there may still be great improvements to be made.[27]

Investment

| Input | Epoch | XPT superforecaster | XPT expert |
| --- | --- | --- | --- |
| Current spending ($, millions) | $60 | $35 | $60 |
| Yearly growth in spending (%) | 34%-91.4% | 6.4%-11% | 5.7%-19.5% |

Direct Approach arguments

  • Current spending: “We estimate that the current largest training run in 2023 will cost $61 million, using our tentative guess at GPT-4’s training cost (Cottier, 2023b), and updating it in line with previous growth rates found in Cottier, 2023a.”

    • Cottier estimates GPT-4 training costs at $43m (at some point in 2022).

  • Yearly growth in spending: “Following Cottier, 2023a, we assume spending will increase between 34.0% and 91.4% per year (80% CI).”[28]

    • Cottier estimates that the cost of compute for the final training run of ML systems has grown by 0.49 orders of magnitude (OOM) per year (90% CI: 0.37 to 0.56) from 2009-2022.
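
Since spending growth is quoted in percent per year for the model input but Cottier’s trend estimate is in OOM per year, the generic conversion between the two units may be helpful. This sketch shows only that unit conversion, assuming steady compound growth; it does not reproduce how Epoch derived its 34.0%-91.4% range.

```python
# Generic conversion between the two growth units used in this post: orders of magnitude
# per year (OOM/year) and percent growth per year. This is only the unit conversion; it
# does not reproduce Epoch's derivation of its 34.0%-91.4% spending-growth range.

import math

def oom_per_year_to_pct(oom: float) -> float:
    """Growth of `oom` orders of magnitude per year, expressed as percent per year."""
    return (10 ** oom - 1) * 100

def pct_per_year_to_oom(pct: float) -> float:
    """Growth of `pct` percent per year, expressed in orders of magnitude per year."""
    return math.log10(1 + pct / 100)

print(f"34.0%/year = {pct_per_year_to_oom(34.0):.2f} OOM/year")      # ~0.13
print(f"91.4%/year = {pct_per_year_to_oom(91.4):.2f} OOM/year")      # ~0.28
print(f"0.49 OOM/year = {oom_per_year_to_pct(0.49):.0f}% per year")  # ~209
```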

XPT arguments

General comments:

  • Lower forecasts anchor on present project costs with a modest multiplier; higher forecasts assume fast scaling up to anchors set by demonstrated value, company budgets, and megaproject shares of GDP.[29]

  • Lower forecasts assume current manufacturing processes; higher forecasts anticipate novel processor technologies.[30]

Arguments for lower spending (further from Epoch’s estimate):

  • Training costs have been stable around $10m for the last few years.[31]

  • Current trend increases are not sustainable for many more years.[32] One team cited this AI Impacts blog post.

  • Major companies are cutting costs.[33]

  • Increases in model size and complexity will be offset by a combination of falling compute costs, pre-training, and algorithmic improvements.[34]

  • Large language models will probably see most attention in the near future, and these are bottlenecked by availability of data, which will lead to smaller models and less compute.[35]

  • Growth may already be slowing down.[36]

  • In future AI systems may be more modular, such that single experiments remain small even if total spending on compute increases drastically.[37]

  • Recent spending on compute may have been status driven.[38]

  • There seems to be general agreement that experiments of more than a few months are unwise, which might place an upper bound on how much compute can cost for a single experiment.[39]

Arguments for higher spending (closer to Epoch’s estimate):

  • As AI creates more value, more money will be spent on development.[40]

  • A mega-project could be launched nationally or internationally which leads to this level of spending.[41]

  • There is strong competition between actors with lots of resources and incentives to develop AI.[42]

  • The impact of AI on AI development or the economy at large might raise the spending ceiling arbitrarily high.[43]

  1. ^

    In this post, the term “XPT expert” includes general x-risk experts, domain experts, and non-domain experts, and does not include superforecasters. This is because the sample size for AI domain experts on these questions was small. About two-thirds of experts forecasting on these questions were either AI domain experts or general x-risk experts, while about one-third were experts in other domains. For details on each subgroup’s forecasts, see Appendix 5 here.

  2. ^

    For this question, XPT forecasters were asked to give their forecasts at the 5th, 25th, 50th, 75th, and 95th percentiles. The XPT CI presented here is the range between the XPT forecasters’ median 5th percentile and median 95th percentile forecasts, so it is not directly comparable to the Epoch CI.

  3. ^

    Here we use the XPT median rather than the 90% CI, because the Direct Approach model takes a single estimate as an input for this parameter.

  4. ^

    For this question, XPT forecasters were asked to give their forecasts at the 5th, 25th, 50th, 75th, and 95th percentiles. The XPT CI presented here is the range between the XPT forecasters’ median 5th percentile and median 95th percentile forecasts, so it is not directly comparable to the Epoch CI.

  5. ^

    In three instances (across roughly 50 regenerations), we generated a result of >2100. In all three cases, this result was many decades away from the results we generated on other generations. After consultation with the Epoch team, we think it’s very likely that this is some minor glitch rather than a true model output, and so we excluded all three >2100 results from our analysis.

  6. ^

    “Transformative AI is defined here as any scenario in which global real GDP during a year exceeds 115% of the highest GDP reported in any full prior year.”

  7. ^

    Note that regeneration affects model outputs, so these results can’t be replicated directly. Figures given here are the average of 5 regenerations.

  8. ^

    “The outputs of the model should not be construed as the authors’ all-things-considered views on the question; these are intended to illustrate the predictions of well-informed extrapolative models.” https://epochai.org/blog/direct-approach-interactive-model

  9. ^

    There was also an XPT question on compute, but it is not directly comparable to inputs to the Direct Approach model:

    XPT question: 47. What will be the lowest price, in 2021 US dollars, of 1 GFLOPS with a widely-used processor by the end of 2024, 2030, 2050?

    Direct Approach model input: Growth in FLOP/s/$ from hardware specialization (OOM/year): The rate at which you expect hardware performance will improve each year due to workload specialization, over and above the default projections. The units are orders of magnitude per year. (Lognormal, 80% CI).

  10. ^

    For this question, XPT forecasters were asked to give their forecasts at the 5th, 25th, 50th, 75th, and 95th percentiles. The XPT CI presented here is the range between the XPT forecasters’ median 5th percentile and median 95th percentile forecasts, so it is not directly comparable to the Epoch CI.

  11. ^

    For this question, XPT forecasters were asked to give their forecasts at the 5th, 25th, 50th, 75th, and 95th percentiles. The XPT CI presented here is the range between the XPT forecasters’ median 5th percentile and median 95th percentile forecasts, so it is not directly comparable to the Epoch CI.

  12. ^

    The probability of >15% growth by 2100 was asked about in both the main component of the XPT and the postmortem survey. The results here are from the postmortem survey. The superforecaster median estimate for this question in the main component of the XPT was 2.75% (for both all superforecaster participants and the subset that completed the postmortem survey).

  13. ^

    The probability of >15% growth by 2100 was asked about in both the main component of the XPT and the postmortem survey. The results here are from the postmortem survey. The expert median estimate for this question in the main component of the XPT was 19% for all expert participants and 16.9% for the subset that completed the postmortem survey.

  14. ^
  15. ^

    Question 48: See 339, “On the other hand, an economist would say that one day, the improvement will stagnate as models become ‘good enough’ for efficient use, and it’s not worth it to become even better at image classification. Arguably, this day seems not too far off. So growth may either level off or continue on its exponential path. Base rate thinking does not help much with this question… It eluded the team to find reasonable and plausible answers… stagnation may be just as plausible as further exponential growth. No one seems to know.”

  16. ^

    Question 48: 340, “Low range forecasts assume that nobody does any further work on this area, hence no improvement in efficiency.” 341, “The Github page for people to submit entries to the leaderboard created by OpenAI hasn’t received any submissions (based on pull requests), which could indicate a lack of interest in targeting efficiency. https://github.com/openai/ai-and-efficiency”.

  17. ^

    Question 48: 340, “In addition, it seems pretty unclear, whether this metric would keep improving incidentally with further progress in ML, especially given the recent focus on extremely large-scale models rather than making things more efficient.”

  18. ^

    Question 48: 340, “[T]here seem to be some hard limits on how much computation would be needed to learn a strong image classifier”.

  19. ^

    Question 48: 341, “The use cases for AI may demand accuracy instead of efficiency, leading researchers to target continued accuracy gains instead of focusing on increased efficiency.”

  20. ^

    Question 48: 341, “A shift toward explainable AI (which could require more computing power to enable the AI to provide explanations) could depress growth in performance.”

  21. ^

    Question 48: 336, “Lower end forecasts generally focused on the fact that improvements may not happen in a linear fashion and may not be able to keep pace with past trends, especially given the “lumpiness” of algorithmic improvement and infrequent updates to the source data.” 338, “The lowest forecasts come from a member that attempted to account for long periods with no improvement. The reference table is rarely updated and it only includes a few data points. So progress does look sporadic.”

  22. ^

    Question 48: 337, “The most significant disagreements involved whether very rapid improvement observed in historical numbers would continue for the next eight years. A rate of 44X is often very hard to sustain and such levels usually revert to the mean.”

  23. ^

    Question 48: 340, “The higher range forecasts simply stem from the extrapolation detailed above.

    Pure extrapolation of the 44x in 7 years would yield a factor 8.7 for the 4 years from 2020 to 2024 and a factor of 222 for the years until 2030. ⇒ 382 and 9768.” 336, “Base rate has been roughly a doubling in efficiency every 16 months, with a status quo of 44 as of May 2019, when the last update was published. Most team members seem to have extrapolated that pace out in order to generate estimates for the end of 2024 and 2030, with general assumption being progress will continue at roughly the same pace as it has previously.”

  24. ^

    Question 48: 336, “The high end seems to assume that progress will continue and possibly increase if things like quantum computing allow for a higher than anticipated increase in computing power and speed.”

  25. ^

    Question 48: 341, “AI efficiency will be increasingly important and necessary to achieve greater accuracy as AI models grow and become limited by available compute.”

  26. ^

    Question 48: 341.

  27. ^

    Question 48: 337, “The most significant disagreements involved whether very rapid improvement observed in historical numbers would continue for the next eight years. A rate of 44X is often very hard to sustain and such levels usually revert to the mean. However, it seems relatively early days for this tech, so this is plausible.”

  28. ^
  29. ^

    Question 46: 338: “The main split between predictions is between lower estimates (including the team median) that anchor on present project costs with a modest multiplier, and higher estimates that follow Cotra in predicting pretty fast scaling will continue up to anchors set by demonstrated value-added, tech company budgets, and megaproject percentages of GDP.”

  30. ^

    Question 46: 340: “Presumably much of these disagreement[s] stem from different ways of looking at recent AI progress. Some see the growth of computing power as range bound by current manufacturing processes and others expect dramatic changes in the very basis of how processors function leading to continued price decreases.”

  31. ^

    Question 46: 337, “[T]raining cost seems to have been stuck in the $10M figure for the last few years.”; “we have not seen such a large increase in the estimated training cost of the largest AI model during the last few years: AlphaZero and PALM are on the same ballpark.” 341, “For 2024, the costs seem to have flattened out and will be similar to now. To be on trend in 2021, the largest experiment would need to be at $0.2-1.5bn. GPT-3 was only $4.6mn”

  32. ^

    Question 46: 341, “The AI impacts note also states that the trend would only be sustainable for a few more years. 5-6 years from 2018, i.e. 2023-24, we would be at $200bn, where we are already past the total budgets for even the biggest companies.”

  33. ^

    Question 46: 336, “The days of ‘easy money’ may be over. There’s some serious belt-tightening going on in the industry (Meta, Google) that could have a negative impact on money spent.”

  34. ^

    Question 46: 337, “It also puts more weight on the reduced cost of compute and maybe even in the improved efficiency of minimization algorithms, see question 48 for instance.” 336, “After 2030, we expect increased size and complexity to be offset by falling cost of compute, better pre-trained models and better algorithms. This will lead to a plateau and possible even a reduction in costs.”; “In the near term, falling cost of compute, pre-trained models, and better algorithms will reduce the expense of training a large language model (which is the architecture which will likely see the most attention and investment in the short term).” See also 343, “$/​FLOPs is likely to be driven down by new technologies and better chips. Better algorithm design may also improve project performance without requiring as much spend on raw compute.” See also 339, “The low end scenarios could happen if we were to discover more efficient training methods (e.g. take a trained model from today and somehow augment it incrementally each year rather than a single batch retrain or perhaps some new research paradigm which makes training much cheaper).”

  35. ^

    Question 46: 336, “Additionally, large language models are currently bottlenecked by available data. Recent results from DeepMind suggest that models over ~100 billion parameters would not have enough data to optimally train. This will lead to smaller models and less compute used in the near term. For example, GPT-4 will likely not be significantly larger than Chinchilla. https://arxiv.org/abs/2203.15556”. 341, “The data availability is limited.” See also 340, “The evidence from Chinchilla says that researchers overestimated the value of adding parameters (see https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications). That is probably discouraging researchers from adding more parameters for a while. Combined with the difficulty of getting bigger text datasets, that might mean text-oriented systems are hitting a wall. (I’m unsure why this lasts long—I think other datasets such as video are able to expand more).”

  36. ^

    Question 46: 340, “The growth might be slowing down now.”; “Or maybe companies were foolishly spending too little a few years ago, but are now reaching diminishing returns, with the result that declining hardware costs mostly offset the desire for bigger models.”

  37. ^

    Question 46: 340, “Later on, growth might slow a lot due to a shift to modular systems. I.e. total spending on AI training might increase a good deal. Each single experiment could stay small, producing parts that are coordinated to produce increasingly powerful results.” See also 339, “2050 At this point I’m not sure it will be coherent to talk about a single AI experiment, models will probably be long lived things which are improved incrementally rather than in a single massive go. But they’ll also be responsible for a large fraction of the global GDP so large expenditures will make sense, either at the state level or corporation.”

  38. ^

    Question 46: 340, “Some forecasters don’t expect much profit from increased spending on AI training. Maybe the recent spending spree was just researchers showing off, and companies are about to come to their senses and stop spending so much money.”

  39. ^

    Question 46: 340; “There may some limits resulting from training time. There seems to be agreement that it’s unwise to attempt experiments that take more than a few months. Maybe that translates into a limit on overall spending on a single experiment, due to limits on how much can be done in parallel, or datacenter size, or supercomputer size?”

  40. ^

    Question 46: 343, “Monetization of AGI is in its early stages. As AI creates new value, it’s likely that additional money will be spent on increasingly more complex projects.” Note that this argument refers to forecasts higher than the team median forecasts, and the team median for 2024 was $25m.

  41. ^

    Question 46: 337, “This will make very much sense in the event that a great public project or international collaboration will be assembled for researching a particular aspect of AI (a bit in the line of project Manhattan for the atomic bomb, the LHC for collider physics or ITER for fusion). The probability of such a collaboration eventually appearing is not small. Other scenario is great power competition between China and the US, with a focus on AI capabilities.”

  42. ^

    Question 46: 336, “There is strong competition between players with deep pockets and strong incentives to develop and commercialize ‘AI-solutions’.”

  43. ^

    Question 46: 344, “Automatic experiments run by AI are beyond valuation”. 337, “One forecast suggest[s] astronomical numbers for the largest project in the future, where the basis of this particular forecast is the possibility of an AI-driven economic explosion (allowing for the allocation of arbitrarily large resources in AI).”
