One way to approach this would simply be to make a hypothesis (e.g. that the bar for grants is being lowered and we're throwing money at nonsense grants), and then see what evidence you can gather for and against it.
Thinking about FTX and their bar for funding seems very important. I'm thrilled that so much money is being put towards EA causes, but a few early signs have been concerning. Here are two considerations on the hypothesis that FTX has a lower funding bar than previous EA funders.
First, it seems that FTX would like to spend a lot more, a lot faster, than has been the EA consensus for a long time. Perhaps it's a rational response to having more available funds, but the funds are nowhere near unlimited. If the entire value of FTX were liquidated and distributed to the 650 million people in extreme poverty, each impoverished person would receive only $49, leaving global poverty as pressing a problem as ever. It also strikes against recent work on patient philanthropy, which is supported by Will MacAskill's argument that we are not living in the most influential time in human history. (EDIT: See additional section below.) It seems that this money is intended to be disbursed with much less research and deliberation than is used by GiveWell, Open Philanthropy, and other grantmakers, with no visible plans to build a sizeable research organization before giving out the money. To use the Rowing vs Steering analogy of EA, FTX has strapped a motor to the EA boat without doing much substantial steering.
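For a sense of scale, here is the arithmetic behind that $49 figure; the roughly $32B FTX valuation is my assumed ballpark, not a number taken from the discussion above.

```python
# Rough arithmetic behind the "$49 per person" figure.
# The ~$32B valuation is an assumed ballpark, not a number from the post itself.
ftx_valuation_usd = 32e9            # assumed early-2022 valuation
people_in_extreme_poverty = 650e6   # figure used above

per_person_usd = ftx_valuation_usd / people_in_extreme_poverty
print(f"${per_person_usd:.0f} per person")  # ~$49
```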
On the object level, the Atlas Fellowship for high schoolers looks concerning on a few levels. The program intends to award $50K to 100 fellows by April 10th of this year. The fellowship has received little major press and was announced on Tyler Cowen’s blog only 2 days ago. This does not seem like enough time or publicity to generate the strongest possible pool of applicants. The selection process itself is unusual in a number of ways, requiring standardized tests but not grades or letters of recommendation, and explicitly not pursuing goals of diversity in race, gender, or socioeconomic status. The program is working with a number of young EAs with good reputations for direct work in their respective cause areas, but little to no experience in running admissions processes or academic fellowships (bottom of this page). Will this $5.5M given to high schoolers for professional development really do more good than spending it on lifesaving medication or other provably beneficial interventions?
There’s a lot to cover here and I’ve raised more questions or concerns than I’ve offered answers. But FTX is a massively influential development within EA, and should receive a lot of time and attention to make sure it achieves its full potential for positive impact.
More Thoughts on FTX
I'm confused by FTX's astronomical spend on PR and brand awareness. Wikipedia gives a good breakdown of the spend; highlights include renaming an NBA arena, a college football stadium, an esports organization, and the Mercedes-Benz Formula One team; sponsoring athletes such as Tom Brady, Steph Curry, and Shohei Ohtani; and making donations to the personal charitable foundations of celebrities Phil Mickelson, Alex Honnold, and Bryson DeChambeau. The marketing spend has been on the order of hundreds of millions if not billions of dollars. This would all be wonderful if it's a profit-making strategy that creates more funding for good causes on net. But I would ask both how this will be perceived publicly, and the object-level question of whether this spending is, say, 8x better than donating directly to people in poverty through GiveDirectly.
Separately, I think there's a real tension between the fact that FTX is headquartered in the Bahamas to avoid paying taxes and the fact that Sam Bankman-Fried was the second-largest donor to the Joe Biden campaign. They care enough about American politics to spend millions trying to influence the outcome of our elections, but don't feel any responsibility to pay taxes? You can make the pure utilitarian argument, but I think most people would object to it.
EDIT: Spending Now as Patient Philanthropy
Thank you to several people who pointed out that spending now might be the best means to patient philanthropy, particularly for longtermists. Here is Owen Cotton-Barratt’s explanation of why “patient vs urgent longtermism” has little direct bearing on giving now vs later that conceptualizes some forms of current grantmaking as investments that open up greater opportunities for giving at a more impactful time in the future. FTX is specifically interested in these kinds of investments in future opportunities, with five or more of their focus areas potentially leading to greater opportunities in the future. Lukas Gloor also points out that there is significantly more disagreement about the Hinge of History hypothesis than I realized, much of it about priors and anthropic reasoning arguments that I don’t quite understand. This all seems reasonable, particularly for an organization that is trying to find giving opportunities to fulfill their mission of longtermist grantmaking.
First, it seems that FTX would like to spend a lot more a lot faster than has been the EA consensus for a long time. … It also strikes against recent work on patient philanthropy, which is supported by Will MacAskill’s argument that we are not living in the most influential time in human history.
I don't think fast spending in and of itself strikes against patient longtermism: see Owen Cotton-Barratt's post "Patient vs urgent longtermism" has little direct bearing on giving now vs later.
In addition, the arguments for not living in the most influential time in human history are rejected by many EAs, as you can see in the discussion section of MacAskill's original article and here.
(In general, I think it’s legitimate even for very large organizations to bet on a particular worldview, especially if they’re being transparent to donors and supporters.)
(That said, I want to note that “spend money now” is very different from “have a low bar.” I haven’t looked into FTX grants yet, but I want to flag that while I’m in favor of deploying capital now, I wouldn’t necessarily lower the bar. Instead, I’d aggressively fund active grantmaking and investigations into large grants in areas where EAs haven’t been active yet.)
Appreciate and agree with both of these comments. I’ve made a brief update to the original post to reflect it, and hope to respond in more detail soon.
This would all be wonderful if it’s a profit-making strategy that creates more funding for good causes on net. But I would ask both how this will be perceived publicly, and the object-level question of whether this spending is, say, 8x better than donating directly to people in poverty through GiveDirectly.
The second sentence seems to be confusing investment with consumption.
The investment in advertising, versus the consumption-style spending on GiveDirectly? I just meant to compare the impact of the two. The first's impact would come from raising more money to eventually be donated; the second is directly impactful. So I'd like to think about which is the better use of the funds.
I suggest caution with trying to compare the Future Fund's investments against donating to global poverty without engaging with the longtermist worldview. That worldview could be right or wrong, but it is important to engage with it to understand why FTX might consider these investments worthwhile.
Another part of the argument is that there is currently an absurd amount of money per effective altruist. This might not matter for global poverty, where much of the work can be outsourced, but it is a much bigger problem for many projects in other areas. In this scenario, it might make sense to spend absurd-seeming amounts of money to grow the pool of committed members, at least if this really is the bottleneck, particularly if you believe that certain projects need to be completed on short timelines.
I agree that being situated in the Bahamas is less than deontologically spotless, but I don't believe that avoiding the negative PR is worth billions of dollars. I don't see it as a particularly egregious moral violation, nor do I see it as significantly reducing trust in EA or FTX.
Update on Atlas Fellowship: They’ve extended their application period by one week! Good decision for getting more qualified applications into the pipeline. I wonder how many applications they’ve received overall.
Concerns with BioAnchors Timelines
A few points on the Bio Anchors framework, and why I expect TAI to require much more compute than used by the human brain:
1. Today we routinely use computers with as much compute as the human brain. Joe Carlsmith’s OpenPhil report finds the brain uses between 10^13 and 10^17 FLOP/s. He points out that Nvidia’s V100 GPU retailing for $10,000 currently performs 10^14 FLOP/s.
2. Ajeya Cotra’s Bio Anchors report shows that AlphaStar’s training run used 10^23 FLOP, the equivalent of running a human brain-sized computer with 10^15 FLOP/s for four years. The Human Lifetime anchor therefore estimates that a transformative model could already be trained with today’s levels of compute with 22% probability, but we have not seen such a model so far.
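A quick sanity check of the arithmetic behind these two points, using only the figures quoted above (the seconds-per-year conversion is the only added ingredient):

```python
# Sanity-check the figures quoted from the Carlsmith and Bio Anchors reports.
SECONDS_PER_YEAR = 365 * 24 * 3600           # ~3.15e7

# Point 1: a $10,000 V100 at ~1e14 FLOP/s vs. a brain at 1e13-1e17 FLOP/s.
v100_flops = 1e14
brain_flops_low, brain_flops_high = 1e13, 1e17
print(brain_flops_low <= v100_flops <= brain_flops_high)  # True: one GPU sits inside the range

# Point 2: AlphaStar's ~1e23 FLOP vs. a brain-sized computer (1e15 FLOP/s) run for 4 years.
four_years_flop = 1e15 * 4 * SECONDS_PER_YEAR
print(f"{four_years_flop:.1e} FLOP")          # ~1.3e23, the same order as AlphaStar's 1e23
```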
Why, then, do we not have transformative AI? Maybe it’s right around the corner, with the human lifetime anchor estimating a 50% chance of transformative AI by 2032. I’m more inclined to say that this reduces my credence in the report’s short timelines based on the compute of the human brain. The Evolution anchor seems to me like a more realistic prediction, with 50% probability of TAI by 2090.
I’d also like to see more research on the evolution anchor. The Evolution anchor is the part of the report that Ajeya says she “spent the least amount of time thinking about.” Its estimates of the size of evolutionary history are primarily from this 2009 blog post, and its final calculation assumes that all of our ancestors had brains the size of nematodes and that the organism population of the Earth has been constant for 1 billion years. These are extremely rough assumptions, and Ajeya also says that “there are plausible arguments that I have underestimated true evolutionary computation here in ways that would be somewhat time-consuming to correct.” On the other hand, it seems reasonable to me that our scientists could generate algorithmic improvements much faster than evolution did, though Ajeya notes that “some ML researchers would want to argue that we would need substantially more computation than was performed in the brains of all animals over evolutionary history; while I disagree with this, it seems that the Evolution Anchor hypothesis should place substantial weight on this possibility.”
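To make the roughness of those assumptions concrete, here is a parametric sketch of the kind of calculation the Evolution anchor rests on; the population and per-organism numbers below are placeholders of my own, not the report's inputs.

```python
# Illustrative (not exact) version of the Evolution anchor calculation.
# Every input below is a placeholder assumption, chosen only to show how the
# estimate is built and how sensitive it is; the report's own inputs differ.
SECONDS_PER_YEAR = 365 * 24 * 3600

years_of_evolution = 1e9       # the "constant population for 1 billion years" assumption
organisms_alive = 1e20         # assumed average number of brained organisms at any time
flop_per_organism = 1e4        # assumed nematode-scale brain, in FLOP/s

total_flop = years_of_evolution * SECONDS_PER_YEAR * organisms_alive * flop_per_organism
print(f"{total_flop:.0e} FLOP")  # ~3e40 under these placeholders

# Each input enters multiplicatively, so being off by a couple of orders of
# magnitude on population or per-brain compute moves the anchor enormously.
```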
This (pop science) article provides two interesting critiques of the analogy between the human brain and neural nets.
“Neural nets are typically trained by “supervised learning”. This is very different from how humans typically learn. Most human learning is “unsupervised”, which means we’re not explicitly told what the “right” response is for a given stimulus. We have to work this out ourselves.”
“Another difference is the sheer scale of data used to train AI. The GPT-3 model was trained on 400 billion words, mostly taken from the internet. At a rate of 150 words per minute, it would take a human nearly 4,000 years to read this much text.”
I'm not sure of the direct implication for timelines here. You might be able to argue that these disanalogies mean that neural nets will require less compute than the brain. But it's an interesting point of disanalogy, useful for correcting any misconception that neural networks are "just like the brain".
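For what it's worth, the quoted reading-time figure is easy to reproduce to within reading-speed assumptions; the sketch below just redoes that arithmetic (the nonstop-reading assumption is mine).

```python
# Reproduce the order of magnitude of the quoted reading-time claim.
words_in_training_data = 400e9   # figure quoted for GPT-3
words_per_minute = 150           # reading speed used in the quote

minutes = words_in_training_data / words_per_minute
years_nonstop = minutes / (60 * 24 * 365)    # assumes reading 24 hours a day, every day
print(f"{years_nonstop:,.0f} years")         # ~5,000 years; same order as the quoted ~4,000
```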
I strongly disagree with the claim that there is a >10% chance of TAI in the next 10 years. Here are two small but meaningful pieces of why I have much longer AI timelines.
Note that TAI is here defined as one or both of: (a) any 5-year doubling of real global GDP, or (b) any catastrophic or existential AI failure.
Market Activity
Top tech companies do not believe that AI takeoff is around the corner. Mark Zuckerberg recently saw many of his top AI research scientists leave the company, as Facebook has chosen to acquire Oculus and bet on the metaverse rather than AI as the next big thing. This 2019 interview with Facebook’s VP of AI might shed some light on why.
Microsoft has similarly bet heavily on entertainment over AI with its acquisition of Activision Blizzard. Microsoft purchased Activision for $68B, roughly 68 times what it invested in OpenAI three years ago, an investment it has not publicly followed up on since.
IBM just sold off Watson's entire healthcare business. This is a strong sign of AI's failure to meet the tremendous expectations of revolutionizing the healthcare industry. Meanwhile on LessWrong, somebody is getting lots of upvotes for predicting (in admittedly a fun, off-the-cuff manner) that "Chatbots [will be] able to provide better medical diagnoses than nearly all doctors" in 2024.
Data Constraints
Progress has been swift in areas where it is easy to generate lots of training data. ML systems are lauded for achieving human-level performance on academic competitions like ImageNet, but those performances are only possible because of the millions of labeled data points provided. NLP systems trained on self-supervised objectives leverage massive datasets, but regurgitate hate speech, fake news, and private information memorized from the internet. Reinforcement learning (RL) systems play games like chess and Atari for thousands of years of virtual time in the popular method of self play.
Many real world goals have much longer time horizons than those where AI succeeds today, and cannot be readily decomposed into smaller goals. We cannot simulate the experience of founding a startup, running an experiment, or building a relationship in the same way we can do with writing a paper or playing a game. Machines will need to learn in open-ended play with the world, where today they mostly learn from labeled examples.
See Andrew Ng on the incredible challenge of data-sparse domains. Perhaps this is why radiologists have not been replaced by machines, as Geoffrey Hinton so confidently predicted back in 2016.
These are thoughtful data points, but consider that they may just be good evidence for hard takeoff rather than soft takeoff.
What I mean is that most of these examples show a failure of narrow AIs to deliver on some economic goals. In soft takeoff, we expect to see things like broad deployment of AIs contributing to massive economic gains and GDP doublings in short periods of time well before we get to anything like AGI.
But in hard takeoff, failure to see massive success from narrow AIs could happen due to regulations and other barriers (or it could just be limitations of the narrow AI). In fact, these limitations could even point more forcefully to the massive benefits of an AI that can generalize. And having the recipe for that AGI discovered and deployed in a lab doesn’t depend on the success of prior narrow AIs in the regulated marketplace. AGI is a different breed and may also become powerful enough that it doesn’t have to play by the rules of the regulated marketplace and national legal systems.
Machines will need to learn in open-ended play with the world, where today they mostly learn from labeled examples.
Have you seen DeepMind’s Generally capable agents emerge from open-ended play? I think it is a powerful demonstration of learning from open-ended play actually working in a lab (not just a possible future approach). Though it is still in a virtual environment rather than the real physical world.
Hey Evan, these are definitely stronger points against short timelines if you believe in slow takeoff, rather than points against short-timelines in a hard takeoff world. It might come as no surprise that I think slow takeoff is much more likely than hard takeoff, with the Comprehensive AI Systems model best representing what I would expect. A short list of the key arguments there:
Discontinuities on important metrics are rare, see the AI Impacts writeup. EDIT: Dan Hendrycks and Thomas Woodside provide a great empirical survey of AI progress across several domains. It largely shows continuous progress on individual metrics, but also highlights the possibilities of emergent capabilities and discontinuity.
Much of the case for fast takeoff relies heavily on the concept of “general intelligence”. I think the history of AI progress shows that narrow progress is much more common, and I don’t expect advances in e.g. language and vision models to generalize to success in the many low-data domains required to achieve transformative AI.
Recursive self-improvement is entirely possible in theory, but far from current capabilities. AI is not currently being used to write research papers or build new models, nor is it significantly contributing to the acceleration of hardware progress. (The two most important counterexamples are OpenAI’s Codex and Google’s DL for chip placement. If these were shown to be significantly pushing the cutting edge of AI progress, I would change my views on the likelihood of recursive self-improvement in a short-timelines scenario.)
EDIT 07/2022: Here is Thomas Woodside’s list of examples of AI increasing AI progress. While it’s debatable how much of an impact these are having on the pace of progress, it’s undeniable that it’s happening to some degree and efforts are ongoing to increase capacity for recursive self-improvement. My summary above was an overstatement.
I don’t think there’s any meaningful “regulatory overhang”. I haven’t seen any good examples of industries where powerful AI systems are achieved in academic settings, but not deployed for legal reasons. Self-driving cars, maybe? But those seem like more of a regulatory success story than a failure, with most caution self-imposed by companies.
The short timelines scenarios I find most plausible are akin to those outlined by Gwern and Daniel Kokotajlo (also here), where a pretrained language model is given an RL objective function and the capacity to operate a computer, and it turns out that one smart person behind a computer can do a lot more damage than we realized. More generally, short timelines and hard takeoff can happen when continuous scaling up of inputs results in discontinuous performance on important real world objectives. But I don’t see the argument for where that discontinuity will arise—there are too many domains where a language model trained with no real world goal will be helpless.
And yeah, that paper is really cool, but is really only a proof of concept of what would have to become a superhuman science in order for our “Clippy” to take over the world. You’re pointing towards the future, but how long until it arrives?
But in hard takeoff, failure to see massive success from narrow AIs could happen due to regulations and other barriers. It could just be limitations of the narrow AIs. In fact, these limitations could even point more forcefully to the massive benefits of an AI that can generalize.
I think you're saying that regulations/norms could mask dangerous capabilities and development, having the effect of eroding credibility of/resources for safety. Yet, unhindered by enforcement, bad actors continue to progress to worse states, even using the regulations as signposts.
I'm not fully sure I understand all of the sentences in the rest of your paragraph. There are several jumps in there?
Gwern's story "Clippy" lays out some possibilities for how safety mechanisms could be dislocated. If there is additional content you find convincing (on mechanisms and enforcement), that would be good to share.
You're right, that paragraph was confusing. I just edited it to try to make it clearer.
Career Path: Nuclear Weapons Security Engineering
Nuclear weapons are one of the only direct means to an existential catastrophe for humanity. Other existential risk factors such as global warming, great power war, and misaligned AI could not alone pose a specific credible threat to Earth’s population of seven billion. Instead, these stories only reach human extinction through bioweapons, asteroids, or something closer to the conclusion of Gwern’s recent story about AI catastrophe:
All over Earth, the remaining ICBMs launch.
How can we engineer a safer nuclear weapons system? A few ideas:
Information security has been previously recommended as an EA career path. There's strong reason to believe that controlling computer systems will become increasingly valuable over the next century. But traditional cybersecurity credentials might not help somebody directly work on critical systems. How can cybersecurity engineering support nuclear weapons safety?
Stanislav Petrov is only famous because he had a broken missile detector. How do you prevent false alarms of a nuclear launch? Who in the American, Chinese, Russian, and European security states is working on exploiting the vulnerabilities present in each other's systems?
Security for defense contractors of the US government. The nuts and bolts of American military security are quite literally built by Boeing, Raytheon, and other for-profit defense contractors. Do these companies have ethics boards? Do they engage with academics on best practices, or build internal teams to work on safety? What if they received funding from FTX’s Future Fund -- what safety projects would they be willing to consider?
Here’s a US Senate hearing on detecting smuggled nuclear weapons. Nuclear non-proliferation is one of the most common forms of advocacy on the topic. How can non-proliferation efforts be improved by technological security of weapons systems? How could new groups acquire nuclear weapons today, and how could we close those holes?
The 80,000 Hours Podcast had an excellent conversation with Daniel Ellsberg about his book The Doomsday Machine. They've also spoken with ALLFED, which is working on a host of engineering solutions to existential risk problems. Here is ALLFED's job board.
Security engineering is only one way to improve nuclear safety. Advocacy, grantmaking, and other non-technical methods can advance nuclear security. On the other hand, we've seen major grantmakers withdraw from the area due to insufficient results. It's my impression that, compared to political methods, engineering has been relatively underexplored as a means to nuclear security. Perhaps it will be seen as particularly important by those focused on the risks of misaligned artificial intelligence.
The Department of Energy manages the American nuclear stockpile. Here is their job board. Here is a recommendation report prepared by the International Atomic Energy Agency. The Johns Hopkins University Applied Physics Laboratory, a prominent defense contractor for the US where an EA recently received a grant to work on AI safety internally, also works on nuclear safety. What other governmental organizations are setting the world's nuclear security policies?
In the 2016 report where 80,000 Hours declared nuclear security a "sometimes recommended" path for improving the world, they note a key cause for concern: "This issue is not as neglected as most other issues we prioritize. Current spending is between $1 billion and $10 billion per year." In 2022, with longtermist philanthropy looking to deploy billions of dollars over the next decade, do we still believe nuclear security engineering is too ambitious to work on?
Fun fact: For 20 years at the peak of the Cold War, the US nuclear launch code was "00000000".
https://gizmodo.com/for-20-years-the-nuclear-launch-code-at-us-minuteman-si-1473483587
H/t: Gavin Leech
Collected Thoughts on AI Safety
Here are some of my thoughts on AI timelines:
Who believes what about AI timelines
Why I have longer timelines than the BioAnchors report
Reasons to expect gradual takeoff rather than sudden takeoff
Why market activity and data constraints lengthen my timelines
Three scenarios for AI progress
And here are some thoughts on other AI Safety topics:
Questions about Decision Transformers and Deepmind's Gato
Why AI policy seems valuable (selected quotes from Richard Ngo)
Why AI alignment prizes are so difficult
Generally speaking, I believe in longer timelines and slower takeoff speeds. But short timelines seem more dangerous, so I'm open to alignment work tailored to short-timelines scenarios. Right now, I'm looking for research opportunities on risks from large language models.
Three Scenarios for AI Progress
How will AI develop over the next few centuries? Three scenarios seem particularly likely to me:
“Solving Intelligence”: Within the next 50 years, a top AI lab like Deepmind or OpenAI builds a superintelligent AI system, by using massive compute within our current ML paradigm.
“Comprehensive AI Systems”: Over the next century or few, computers keep getting better at a bunch of different domains. No one AI system is incredible at everything, each new job requires fine-tuning and domain knowledge and human-in-the-loop supervision, but soon enough we hit annual GDP growth of 25%.
“No takeoff”: Looks qualitatively similar to the above, except growth remains steady around 2% for at least several centuries. We remain in the economic paradigm of the Industrial Revolution, and AI makes an economic contribution similar to that of electricity or oil without launching us into a new period of human history. Progress continues as usual.
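One way to see how far apart these scenarios are: under the "5-year doubling of real GDP" definition of TAI used elsewhere in these notes, 25% annual growth clears the bar easily while 2% growth never comes close. The doubling-time arithmetic is standard compound growth:

```python
import math

# Doubling time under steady compound growth: ln(2) / ln(1 + g).
def doubling_time_years(growth_rate: float) -> float:
    return math.log(2) / math.log(1 + growth_rate)

print(f"{doubling_time_years(0.25):.1f} years")  # ~3.1 years at 25% growth: clears a 5-year doubling
print(f"{doubling_time_years(0.02):.1f} years")  # ~35 years at 2% growth: nowhere close
```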
To clarify my beliefs about AI timelines, I found it helpful to flesh out these concrete "scenarios" by answering a set of closely related questions about how transformative AI might develop:
When do we achieve TAI? AGI? Superintelligence? How fast is takeoff? Who builds it? How much compute does it require? How much does that cost? Agent or Tool? Is machine learning the paradigm, or do we have another fundamental shift in research direction? What are the key AI Safety challenges? Who is best positioned to contribute?
The potentially useful insight here is that answering one of these questions helps you answer the others. If massive compute is necessary, then TAI will be built by a few powerful governments or corporations, not by a diverse ecosystem of small startups. If TAI isn't achieved for another century, that affects which research agendas are most important today. Follow this exercise for a while, and you might end up with a handful of distinct scenarios, and then you can judge the relative likelihood and timelines of each.
Here’s my rough sketch of what each of these mean. [Dumping a lot of rough notes here, which is why I’m posting as a shortform.]
Solving Intelligence: Within the next 20-50 years, a top AI lab like Deepmind or OpenAI builds a superintelligent AI system.
Machine learning is the paradigm that brings us to superintelligence. Most progress is driven by compute. Our algorithms are similar to the human brain, and therefore require similar amounts of compute.
It becomes a compute war. You're taking the same fundamental algorithms and spending a hundred billion dollars on compute, and it works. (Informed by Ajeya's report, IMO the most important upshot of which is that spending a truly massive amount of money can cover a sizeable portion of the difference between our current compute and the compute of the human brain. If human brain-level compute is an important threshold, then the few actors who could spend $100B+ have an advantage of decades over actors who can only spend millions. Would like to discuss this further.)
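To make the "advantage of decades" point concrete, here is a rough order-of-magnitude sketch built on the V100 price point quoted earlier; the budgets and the decision to ignore utilization and hardware progress are my simplifying assumptions.

```python
# Rough sketch: sustained compute purchasable for $100B at the V100-era price
# point quoted earlier ($10,000 for ~1e14 FLOP/s). Ignores datacenter costs,
# utilization, and hardware progress; all of that is assumption, not analysis.
gpu_price_usd = 10_000
gpu_flops = 1e14                # FLOP/s per GPU

budget_big = 100e9              # an actor spending $100B+
budget_small = 10e6             # an actor spending ~$10M

flops_big = (budget_big / gpu_price_usd) * gpu_flops      # ~1e21 FLOP/s
flops_small = (budget_small / gpu_price_usd) * gpu_flops  # ~1e17 FLOP/s
print(f"{flops_big:.0e} vs {flops_small:.0e} FLOP/s")

# A 10,000x gap: the same factor that separates the low end (1e13 FLOP/s) from
# the high end (1e17 FLOP/s) of the brain-compute range quoted earlier.
```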
This is most definitely not CAIS. There would be one or two or ten superintelligent AI systems, but not two million.
Very few people can contribute effectively to AI Safety, because to contribute effectively you have to be at one of only a handful of organizations in the world. You need to be in “the room where it happens”, whether that’s the AI lab developing the superintelligence or the government attempting to monitor the project. The handful of people who can contribute are incredibly valuable.
What AI safety stuff matters?
Technical AI safety research. The people right now who are building AI that scales safely. It turns out you can do effective research now because our current methods are the methods that bring us to superintelligence, and whether or not our current research is good enough determines whether or not we survive.
Highest levels of government, for their ability to regulate AI labs. A project like this could be nationalized, or carried out under strict oversight from government regulators. Realistically I’d expect the opposite, that governments would be too slow to see the risks and rewards in such a technical domain.
People who imagine long-term policies for governing AI. I don't know how much useful work exists here, but I have to imagine there's some good stuff about how to run the world under superintelligence. What's the game theory of multipolar scenarios? What are the points of decisive strategic advantage?
Comprehensive AI Systems: Over the next century or few, computers keep getting better at a bunch of different domains. No one AI system is incredible at everything, each new job requires fine-tuning and domain knowledge and human-in-the-loop supervision, but soon enough we hit annual GDP growth of 25%.
Governments go about international relations the same as usual, just with better weapons. There are some strategic effects of this that Henry Kissinger and Justin Ding understand quite well, but there's no instant collapse into one world government or anything. There are a few outside risks here that would be terrible (a new WMD, or missile defense systems that threaten MAD), but basically we just get killer robots, which will probably be fine.
Killer robots are a key AI safety training ground. If they’re inevitable, we should be integrated within enemy lines in order to deploy safely.
We have lots of warning shots.
What are the existential risks? Nuclear war. Autonomous weapons accidents, which I suppose could turn out to be existential?? Long-term misalignment: over the next 300 years, we hand off the fate of the universe to the robots, and it’s not quite the right trajectory.
What AI Safety work is most valuable?
Run-of-the-mill AI Policy work. Accomplishing normal government objectives often unrelated to existential risk specifically, by driving forward AI progress in a technically-literate and altruistically-thoughtful way.
Driving forward AI progress. It’s a valuable technology that will help lots of people, and accelerating its arrival is a good thing.
With particular attention to safety. Building a CS culture, a Silicon Valley, a regulatory environment, and international cooperation that will sustain the three hundred year transition.
Working in military AI systems. They’re the killer robots most likely to run amok and kill some people (or 7 billion). Malfunctioning AI can also cause nuclear war by setting off geopolitical conflict. Also new WMDs would be terrible.
No takeoff: Looks qualitatively similar to the above, except growth remains steady around 2% for at least several centuries. We remain in the economic paradigm of the Industrial Revolution, and AI makes an economic contribution similar to that of electricity or oil without launching us into a new period of human history.
This seems entirely possible, maybe even the most likely outcome. I’ve been surrounded by people talking about short timelines from a pretty young age so I never really thought about this possibility, but “takeoff” is not guaranteed. The world in 500 years could resemble the world today; in fact, I’d guess most thoughtful people don’t think much about transformative AI and would assume that this is the default scenario.
Part of why I think this is entirely plausible is because I don’t see many independently strong arguments for short AI timelines:
IMO the strongest argument for short timelines is that, within the next few decades, we’ll cross the threshold for using more compute than the human brain. If this turns out to be a significant threshold and a fair milestone to anchor against, then we could hit an inflection point and rapidly see Bostrom Superintelligence-type scenarios.
I see this belief as closely associated with the entire first scenario described above: Held by OpenAI/DeepMind, the idea that we will “solve intelligence” with an agenty AI running a simple fundamental algorithm with massive compute and effectively generalizing across many domains.
IIRC, the most prominent early argument for short AI timelines, as discussed by Bostrom, Yudkowsky, and others, was recursive self-improvement. The AI will build smarter AIs, meaning we’ll eventually hit an inflection point of runaway improvement positively feeding into itself and rapidly escalating from near-human to lightyears-beyond-human intelligence. This argument seems less popular in recent years, though I couldn’t say exactly why. My only opinion would be that this seems more like an argument for “fast takeoff” (once we have near-human level AI systems for building AI systems, we’ll quickly achieve superhuman performance in that area), but does not tell you when that takeoff will occur. For all we know, this fast takeoff could happen in hundreds of years. (Or I could be misunderstanding the argument here, I’d like to think more about it.)
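To illustrate why recursive self-improvement says more about takeoff speed than about timing, here is a toy model of my own (not drawn from Bostrom or Yudkowsky): growth only explodes after an assumed "AI can improve AI" threshold is crossed, and nothing in the feedback loop itself pins down when that happens.

```python
# Toy model (my own, not from any cited source): capability grows slowly on its
# own, but once it crosses a threshold where AI meaningfully contributes to AI
# R&D, the improvement compounds on itself and takeoff is fast.
def simulate(threshold: float, base_rate: float = 0.02, rsi_gain: float = 0.5, years: int = 100):
    capability, trajectory = 1.0, []
    for _ in range(years):
        feedback = rsi_gain * capability if capability >= threshold else 0.0
        capability *= 1 + base_rate + feedback
        trajectory.append(capability)
    return trajectory

# Takeoff is sharp in both runs; only its calendar date moves with the threshold.
early = simulate(threshold=2.0)    # threshold crossed after ~35 years of slow growth
late = simulate(threshold=10.0)    # same dynamics, but the threshold isn't reached within 100 years
print(early[40] > 100, late[40] < 20)   # True True
```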
Surveys asking AI researchers when they expect superhuman AI have received lots of popular coverage and might be driving widespread acceptance of short timelines. My very subjective and underinformed intuition puts little weight on these surveys compared to the object-level arguments. The fact that people trying to build superintelligence believe it's possible within their lifetime certainly makes me take that possibility seriously, but it doesn't provide much of an upper bound on how long it might take. If the current consensus of AI researchers proves to be wrong about progress over the next century, I wouldn't expect their beliefs about the next five or ten centuries to hold up; the worldview assumptions might just be entirely off-base.
These are the only three arguments for short timelines I’ve ever heard and remembered. Interested if I’m forgetting anything big here.
Compare this to the simple prior that history will continue with slow and steady single-digit growth as it has since the Industrial Revolution, and I see a significant chance that we don’t see AI takeoff for centuries, if ever. (That’s before considering object level arguments for longer timelines, which admittedly I don’t see many of, and therefore I don’t put much weight on.)
I haven't fully thought through all of this, but would love to hear others' thoughts on the probability of "no takeoff".
This is pretty rough around the edges, but these three scenarios seem like the key possibilities for the next few centuries that I can see at this point. For the hell of it, I’ll give some very weak credences: 10% that we solve superintelligence within decades, 25% that CAIS brings double-digit growth within a century or so, maybe 50% that human progress continues as usual for at least a few centuries, and (at least) 15% that what ends up happening looks nothing like any of these scenarios.
Very interested in hearing any critiques or reactions to these scenarios or the specific arguments within.
I like the no takeoff scenario intuitive analysis, and find that I also haven’t really imagined this as a concrete possibility. Generally, I like that you have presented clearly distinct scenarios and that the logic is explicit and coherent. Two thoughts that came to mind:
Somehow in the CAIS scenario, I also expect the rapid growth and the delegation of some economic and organizational work to AI to have some weird risks that involve something like humanity getting pushed away from the economic ecosystem while many autonomous systems are self-sustaining and stuck in a stupid and lifeless revenue-maximizing loop. I couldn’t really pinpoint an x-risk scenario here.
Recursive self-improvement can also happen within long periods of time, not necessarily leading to a fast takeoff, especially if the early gains are much easier than later gains (which might make more sense if we think of AI capability development as resulting mostly from computational improvements rather than algorithmic).
Some thoughts on FTX copied from this thread:
Thinking about FTX and their bar for funding seems very important. I’m thrilled that so much money is being put towards EA causes, but a few early signs have been concerning. Here’s two considerations on the hypothesis that FTX has a lower funding bar than previous EA funding.
First, it seems that FTX would like to spend a lot more a lot faster than has been the EA consensus for a long time. Perhaps it’s a rational response to having more available funds, but the funds are nowhere near unlimited. If the entire value of FTX was liquidated and distributed to the 650 million people in extreme poverty, each impoverished person would receive only $49, leaving global poverty as pressing a problem as ever.
It also strikes against recent work onpatient philanthropy, which is supported byWill MacAskill’s argumentthat we are not living in the most influential time in human history.(EDIT: See additional section below.) It seems that this money is intended to be disbursed with much less research and deliberation than used by GiveWell, OpenPhilanthropy, and other grantmakers, with no visible plans to build a sizeable research organization before giving out their money. Using the Rowing vs Steering analogy of EA, FTX has strapped a motor engine to the EA boat without engaging in any substantial steering of the boat.On the object level, the Atlas Fellowship for high schoolers looks concerning on a few levels. The program intends to award $50K to 100 fellows by April 10th of this year. The fellowship has received little major press and was announced on Tyler Cowen’s blog only 2 days ago. This does not seem like enough time or publicity to generate the strongest possible pool of applicants. The selection process itself is unusual in a number of ways, requiring standardized tests but not grades or letters of recommendation, and explicitly not pursuing goals of diversity in race, gender, or socioeconomic status. The program is working with a number of young EAs with good reputations for direct work in their respective cause areas, but little to no experience in running admissions processes or academic fellowships (bottom of this page). Will this $5.5M given to high schoolers for professional development really do more good than spending it on lifesaving medication or other provably beneficial interventions?
There’s a lot to cover here and I’ve raised more questions or concerns than I’ve offered answers. But FTX is a massively influential development within EA, and should receive a lot of time and attention to make sure it achieves its full potential for positive impact.
More Thoughts on FTX
I’m confused by FTX’s astronomical spend on PR and brand awareness. Wikipedia gives a good breakdown of the spend, highlights include renaming an NBA arena, a college football stadium, an esports organization, and the Mercedes Benz Formula One team; sponsoring athletes such as Tom Brady, Steph Curry, and Shohei Ohtani; and making donations to the personal charitable foundations of celebrities Phil Mickelson, Alex Honnold, and Bryson DeChambeau. The marketing spend has been on the order of hundreds of millions if not billions of dollars. This would all be wonderful if it’s a profit-making strategy that creates more funding for good causes on net. But I would ask both how this will be perceived publicly, and the object-level question of whether this spending is, say, 8x better than donating directly to people in poverty through GiveDirectly.
Separately, I think there’s a real tension between the fact that FTX is headquartered in the Bahamas to avoid paying taxes and the fact that Sam Bankman Fried was the second largest donor to the Joe Biden campaign. They care enough about American politics to spend millions trying to influence the outcome of our elections, but don’t feel any responsibility to pay taxes? You can make the pure utilitarian argument, but I think most people would object to it.
EDIT: Spending Now as Patient Philanthropy
Thank you to several people who pointed out that spending now might be the best means to patient philanthropy, particularly for longtermists. Here is Owen Cotton-Barratt’s explanation of why “patient vs urgent longtermism” has little direct bearing on giving now vs later that conceptualizes some forms of current grantmaking as investments that open up greater opportunities for giving at a more impactful time in the future. FTX is specifically interested in these kinds of investments in future opportunities, with five or more of their focus areas potentially leading to greater opportunities in the future. Lukas Gloor also points out that there is significantly more disagreement about the Hinge of History hypothesis than I realized, much of it about priors and anthropic reasoning arguments that I don’t quite understand. This all seems reasonable, particularly for an organization that is trying to find giving opportunities to fulfill their mission of longtermist grantmaking.
I don’t think fast spending in and of itself strikes against patient longtermism: see Owen-Cotton-Barratt’s post “Patient vs urgent longtermism” has little direct bearing on giving now vs later.
In addition, the arguments for not living in the most influential time in human history are rejected by many EAs, as you can see in the discussion section of MacAskill’s orginal article and here.
(In general, I think it’s legitimate even for very large organizations to bet on a particular worldview, especially if they’re being transparent to donors and supporters.)
(That said, I want to note that “spend money now” is very different from “have a low bar.” I haven’t looked into FTX grants yet, but I want to flag that while I’m in favor of deploying capital now, I wouldn’t necessarily lower the bar. Instead, I’d aggressively fund active grantmaking and investigations into large grants in areas where EAs haven’t been active yet.)
Appreciate and agree with both of these comments. I’ve made a brief update to the original post to reflect it, and hope to respond in more detail soon.
The second sentence seems to be confusing investment with consumption.
The investment in advertising, versus the consumption-style spending on GiveDirectly? Just meant to compare the impact of the two. The first’s impact would come by raising more money to eventually be donated, the second is directly impactful, so I’d like to think about which is a better use of the funds.
I suggest caution with trying to compare the future fund’s investments against the donating to global poverty without engaging with the long-termist worldview. This worldview you could be right or wrong but it is important to engage with it to understand why FTX might consider these investments worthwhile.
Another part of the argument is that there is currently an absurd amount of money per effective altruist. This might not matter for global poverty where much of the work can be outsourced, but it is a much bigger problem for many projects in other areas. In this scenario, it might make sense to exchange apps to seeming amounts of money to grow the pool of committed members, at least if this really is the bottleneck, particularly if you believe that certain projects need to be completed on short timelines.
I agree being situated in the Bahamas is less than deontologically spotless but I don’t believe that avoiding the negative PR is worth billions of dollars and I don’t see it as a particularly egregious moral violation nor do I see this as significantly reducing trust in EA or FTX.
Update on Atlas Fellowship: They’ve extended their application period by one week! Good decision for getting more qualified applications into the pipeline. I wonder how many applications they’ve received overall.
Concerns with BioAnchors Timelines
A few points on the Bio Anchors framework, and why I expect TAI to require much more compute than used by the human brain:
1. Today we routinely use computers with as much compute as the human brain. Joe Carlsmith’s OpenPhil report finds the brain uses between 10^13 and 10^17 FLOP/s. He points out that Nvidia’s V100 GPU retailing for $10,000 currently performs 10^14 FLOP/s.
2. Ajeya Cotra’s Bio Anchors report shows that AlphaStar’s training run used 10^23 FLOP, the equivalent of running a human brain-sized computer with 10^15 FLOP/s for four years. The Human Lifetime anchor therefore estimates that a transformative model could already be trained with today’s levels of compute with 22% probability, but we have not seen such a model so far.
Why, then, do we not have transformative AI? Maybe it’s right around the corner, with the human lifetime anchor estimating a 50% chance of transformative AI by 2032. I’m more inclined to say that this reduces my credence in the report’s short timelines based on the compute of the human brain. The Evolution anchor seems to me like a more realistic prediction, with 50% probability of TAI by 2090.
I’d also like to see more research on the evolution anchor. The Evolution anchor is the part of the report that Ajeya says she “spent the least amount of time thinking about.” Its estimates of the size of evolutionary history are primarily from this 2009 blog post, and its final calculation assumes that all of our ancestors had brains the size of nematodes and that the organism population of the Earth has been constant for 1 billion years. These are extremely rough assumptions, and Ajeya also says that “there are plausible arguments that I have underestimated true evolutionary computation here in ways that would be somewhat time-consuming to correct.” On the other hand, it seems reasonable to me that our scientists could generate algorithmic improvements much faster than evolution did, though Ajeya notes that “some ML researchers would want to argue that we would need substantially more computation than was performed in the brains of all animals over evolutionary history; while I disagree with this, it seems that the Evolution Anchor hypothesis should place substantial weight on this possibility.”
This (pop science) article provides two interesting critiques of the analogy between the human brain and neural nets.
“Neural nets are typically trained by “supervised learning”. This is very different from how humans typically learn. Most human learning is “unsupervised”, which means we’re not explicitly told what the “right” response is for a given stimulus. We have to work this out ourselves.”
“Another difference is the sheer scale of data used to train AI. The GPT-3 model was trained on 400 billion words, mostly taken from the internet. At a rate of 150 words per minute, it would take a human nearly 4,000 years to read this much text.”
I’m not sure the direct implication for timelines here. You might be able to argue that these disanalogies mean that neural nets will require less compute than the brain. But an interesting point of disanalogy, to correct any misconceptions that neural networks are “just like the brain”.
I strongly disagree with the claim that there is a >10% chance of TAI in the next 10 years. Here are two small but meaningful pieces of why I have much longer AI timelines.
Note that TAI is here defined as one or both of: (a) any 5 year doubling of real global GDP, or (b) any catastrophic or existential AI failures.
Market Activity
Top tech companies do not believe that AI takeoff is around the corner. Mark Zuckerberg recently saw many of his top AI research scientists leave the company, as Facebook has chosen to acquire Oculus and bet on the metaverse rather than AI as the next big thing. This 2019 interview with Facebook’s VP of AI might shed some light on why.
Microsoft has similarly bet heavily on entertainment over AI with their acquisition of Activision Blizzard. Microsoft purchased Activision for $68B, which is 68 times more than they invested into OpenAI three years ago, after which they have not followed up with more public investments.
IBM Watson just sold off their entire healthcare business. This is a strong sign of the AI’s failure to meet tremendous expectations of revolutionizing the healthcare industry. Meanwhile on LessWrong, somebody is getting lots of upvotes for predicting (in admittedly a fun, off-the-cuff manner) that “Chatbots [will be] able to provide better medical diagnoses than nearly all doctors” in 2024.
Data Constraints
Progress has been swift in areas where it is easy to generate lots of training data. ML systems are lauded for achieving human-level performance on academic competitions like ImageNet, but those performances are only possible because of the millions of labeled data points provided. NLP systems trained on self-supervised objectives leverage massive datasets, but regurgitate hate speech, fake news, and private information memorized from the internet. Reinforcement learning (RL) systems play games like chess and Atari for thousands of years of virtual time in the popular method of self play.
Many real world goals have much longer time horizons than those where AI succeeds today, and cannot be readily decomposed into smaller goals. We cannot simulate the experience of founding a startup, running an experiment, or building a relationship in the same way we can do with writing a paper or playing a game. Machines will need to learn in open-ended play with the world, where today they mostly learn from labeled examples.
See Andrew Ng on the incredible challenge of data sparse domains. Perhaps this is why radiologists have not been replaced by machines, as Geoffrey Hinton so confidently predicted back in 2016.
These are thoughtful data points, but consider that they may just be good evidence for hard takeoff rather than soft takeoff.
What I mean is that most of these examples show a failure of narrow AIs to deliver on some economic goals. In soft takeoff, we expect to see things like broad deployment of AIs contributing to massive economic gains and GDP doublings in short periods of time well before we get to anything like AGI.
But in hard takeoff, failure to see massive success from narrow AIs could happen due to regulations and other barriers (or it could just be limitations of the narrow AI). In fact, these limitations could even point more forcefully to the massive benefits of an AI that can generalize. And having the recipe for that AGI discovered and deployed in a lab doesn’t depend on the success of prior narrow AIs in the regulated marketplace. AGI is a different breed and may also become powerful enough that it doesn’t have to play by the rules of the regulated marketplace and national legal systems.
Have you seen DeepMind’s Generally capable agents emerge from open-ended play? I think it is a powerful demonstration of learning from open-ended play actually working in a lab (not just a possible future approach). Though it is still in a virtual environment rather than the real physical world.
Hey Evan, these are definitely stronger points against short timelines if you believe in slow takeoff, rather than points against short-timelines in a hard takeoff world. It might come as no surprise that I think slow takeoff is much more likely than hard takeoff, with the Comprehensive AI Systems model best representing what I would expect. A short list of the key arguments there:
Discontinuities on important metrics are rare, see the AI Impacts writeup. EDIT: Dan Hendrycks and Thomas Woodside provide a great empirical survey of AI progress across several domains. It largely shows continuous progress on individual metrics, but also highlights the possibilities of emergent capabilities and discontinuity.
Much of the case for fast takeoff relies heavily on the concept of “general intelligence”. I think the history of AI progress shows that narrow progress is much more common, and I don’t expect advances in e.g. language and vision models to generalize to success in the many low-data domains required to achieve transformative AI.
Recursive self-improvement is entirely possible in theory, but far from current capabilities. AI is not currently being used to write research papers or build new models, nor is it significantly contributing to the acceleration of hardware progress. (The two most important counterexamples are OpenAI’s Codex and Google’s DL for chip placement. If these were shown to be significantly pushing the cutting edge of AI progress, I would change my views on the likelihood of recursive self-improvement in a short-timelines scenario.)
EDIT 07/2022: Here is Thomas Woodside’s list of examples of AI increasing AI progress. While it’s debatable how much of an impact these are having on the pace of progress, it’s undeniable that it’s happening to some degree and efforts are ongoing to increase capacity for recursive self-improvement. My summary above was an overstatement.
I don’t think there’s any meaningful “regulatory overhang”. I haven’t seen any good examples of industries where powerful AI systems are achieved in academic settings, but not deployed for legal reasons. Self-driving cars, maybe? But those seem like more of a regulatory success story than a failure, with most caution self-imposed by companies.
The short timelines scenarios I find most plausible are akin to those outlined by Gwern and Daniel Kokotajlo (also here), where a pretrained language model is given an RL objective function and the capacity to operate a computer, and it turns out that one smart person behind a computer can do a lot more damage than we realized. More generally, short timelines and hard takeoff can happen when continuous scaling up of inputs results in discontinuous performance on important real world objectives. But I don’t see the argument for where that discontinuity will arise—there are too many domains where a language model trained with no real world goal will be helpless.
And yeah, that paper is really cool, but is really only a proof of concept of what would have to become a superhuman science in order for our “Clippy” to take over the world. You’re pointing towards the future, but how long until it arrives?
I think you’re saying that regulations/norms could mask dangerous capability and development, having the effect of eroding credibility/recourses in safety. Yet, unhindered by enforcement, bad actors continue to progress to the worse states, even using the regulations as signposts.
I’m not fully sure I understand all of the sentences in the rest of your paragraph. There’s several jumps in there?
Gwern’s writing “Clippy” lays out some potential possibilities of dislocation of safety mechanisms. If there is additional content you think is convincing (of mechanisms and enforcement) that would be good to share.
You’re right, that paragraph was confusing. I just edited it to try and make it more clear.
Career Path: Nuclear Weapons Security Engineering
Nuclear weapons are one of the only direct means to an existential catastrophe for humanity. Other existential risk factors such as global warming, great power war, and misaligned AI could not alone pose a specific credible threat to Earth’s population of seven billion. Instead, these stories only reach human extinction through bioweapons, asteroids, or something closer to the conclusion of Gwern’s recent story about AI catastrophe:
How can we engineer a safer nuclear weapons system? A few ideas:
Information security has been previously recommended as an EA career path. There’s strong reason to believe that controlling computers systems will become increasingly valuable over the next century. But traditional cybersecurity credentials might not help somebody directly work on critical systems. How can cybersecurity engineering support nuclear weapons safety?
Stanislav Petrov is only famous because he had broken missile detector. How do you prevent false alarms of a nuclear launch? Who in the American, Chinese, Russian, and European security states is working on exploiting the vulnerabilities present in each other’s systems?
Security for defense contractors of the US government. The nuts and bolts of American military security are quite literally built by Boeing, Raytheon, and other for-profit defense contractors. Do these companies have ethics boards? Do they engage with academics on best practices, or build internal teams to work on safety? What if they received funding from FTX’s Future Fund -- what safety projects would they be willing to consider?
Here’s a US Senate hearing on detecting smuggled nuclear weapons. Nuclear non-proliferation is one of the most common forms of advocacy on the topic. How can non-proliferation efforts be improved by technological security of weapons systems? How could new groups acquire nuclear weapons today, and how could we close those holes?
The 80,000 Hours Podcast had an excellent conversation with Daniel Ellsberg about his book The Doomsday Machine. They’ve also spoken with ALLFED, which is working on a host of engineering solutions to existential risk problems. Here is ALLFED’s job board.
Security engineering is only one way to improve nuclear safety. Advocacy, grantmaking, and other non-technical methods can advance nuclear security. On the other hand, we’ve seen major grant makers withdraw from the area thanks to insufficient results. It’s my impression that, compared to political methods, engineering has been relatively underexplored as a means to nuclear security. Perhaps it will be seen as significantly important by those focused on the risks of misaligned artificial intelligence.
The Department of Energy manages the American nuclear stockpile. Here is their job board. Here is a recommendation report prepared by the International Atomic Energy Agency. The Johns Hopkins University Applied Physics Laboratory, a prominent defense contractor for the US where an EA recently received a grant to work on AI safety internally, also works on nuclear safety. What other governmental organizations are setting the world’s nuclear security policies?
In the 2016 report where 80,000 Hours declared nuclear security a “sometimes recommended” path for improving the world, they note a key cause for concern: “This issue is not as neglected as most other issues we prioritize. Current spending is between $1 billion and $10 billion per year.” In 2022, with longtermist philanthropy looking to deploy billions of dollars over the next decade, do we still believe nuclear security engineering is too ambitious to work on?
Fun fact: For 20 years at the peak of the Cold War, the US nuclear launch code was “00000000”
https://gizmodo.com/for-20-years-the-nuclear-launch-code-at-us-minuteman-si-1473483587
H/t: Gavin Leech
Collected Thoughts on AI Safety
Here are of some of my thoughts on AI timelines:
Who believes what about AI timelines
Why I have longer timelines than the BioAnchors report
Reasons to expect gradual takeoff rather than sudden takeoff
Why market activity and data constraints lengthen my timelines
Three scenarios for AI progress
And here are some thoughts on other AI Safety topics:
Questions about Decision Transformers and Deepmind’s Gato
Why AI policy seems valuable (selected quotes from Richard Ngo)
Why AI alignment prizes are so difficult
Generally speaking, I believe in longer timelines and slower takeoff speeds. But short timelines seem more dangerous, so I’m open to alignment work tailored to short timelines scenarios. Right now, I’m looking for research opportunities on risks from large language models.
Three Scenarios for AI Progress
How will AI develop over the next few centuries? Three scenarios seem particularly likely to me:
“Solving Intelligence”: Within the next 50 years, a top AI lab like Deepmind or OpenAI builds a superintelligent AI system, by using massive compute within our current ML paradigm.
“Comprehensive AI Systems”: Over the next century or few, computers keep getting better at a bunch of different domains. No one AI system is incredible at everything; each new job requires fine-tuning, domain knowledge, and human-in-the-loop supervision, but soon enough we hit annual GDP growth of 25%.
“No takeoff”: Looks qualitatively similar to the above, except growth remains steady around 2% for at least several centuries. We remain in the economic paradigm of the Industrial Revolution, and AI makes an economic contribution similar to that of electricity or oil without launching us into a new period of human history. Progress continues as usual. (A quick sketch after this list puts numbers on the difference between 2% and 25% growth.)
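To make the growth rates in these scenarios concrete, here's a trivial sketch (plain Python, purely illustrative arithmetic, no claims beyond the 2% and 25% figures above) of how quickly the economy doubles in each case:

```python
import math

def doubling_time(annual_growth_rate: float) -> float:
    """Years for output to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)

# "No takeoff": roughly the post-Industrial-Revolution trend.
print(f"2% growth -> doubling every {doubling_time(0.02):.0f} years")   # ~35 years

# "Comprehensive AI Systems": the 25% growth figure above.
print(f"25% growth -> doubling every {doubling_time(0.25):.1f} years")  # ~3 years
```

The point is just that double-digit growth compresses a doubling of the economy from roughly a generation to a few years, which is what makes the CAIS scenario transformative even without a single superintelligent system.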
To clarify my beliefs about AI timelines, I found it helpful to flesh out these concrete “scenarios” by answering a set of closely related questions about how transformative AI might develop:
When do we achieve TAI? AGI? Superintelligence? How fast is takeoff? Who builds it? How much compute does it require? How much does that cost? Agent or Tool? Is machine learning the paradigm, or do we have another fundamental shift in research direction? What are the key AI Safety challenges? Who is best positioned to contribute?
The potentially useful insight here is that answering one of these questions helps you answer the others. If massive compute is necessary, then TAI will be built by a few powerful governments or corporations, not by a diverse ecosystem of small startups. If TAI isn't achieved for another century, that affects which research agendas are most important today. Follow this exercise for a while, and you might end up with a handful of distinct scenarios, and then you can judge the relative likelihood and timelines of each.
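As a purely illustrative sketch of this exercise (the field names, dates, and dollar figures below are hypothetical placeholders of my own, not numbers from any report), you can treat each scenario as a small record and notice how fixing one answer constrains the others:

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    """One internally consistent story about how transformative AI arrives."""
    name: str
    tai_year: int             # rough median year for transformative AI
    takeoff: str              # "fast" or "slow"
    training_cost_usd: float  # rough cost of the decisive training run(s)
    builders: str             # who can plausibly pay that cost

solving_intelligence = Scenario(
    name="Solving Intelligence",
    tai_year=2055,
    takeoff="fast",
    training_cost_usd=1e11,   # ~$100B of compute...
    builders="a handful of big labs or governments",  # ...so only a few actors qualify
)

comprehensive_ai_systems = Scenario(
    name="Comprehensive AI Systems",
    tai_year=2120,
    takeoff="slow",
    training_cost_usd=1e8,    # many cheaper, domain-specific systems
    builders="a broad ecosystem of companies",
)

# The exercise: change one field (say, push training_cost_usd up to 1e11) and the
# plausible values of the other fields shift with it -- massive compute implies
# a small number of builders, which in turn shapes which safety work matters.
```

Obviously the value is in arguing about the numbers, not in the data structure itself.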
Here's my rough sketch of what each of these means. [Dumping a lot of rough notes here, which is why I'm posting as a shortform.]
Solving Intelligence: Within the next 20-50 years, a top AI lab like Deepmind or OpenAI builds a superintelligent AI system.
Machine learning is the paradigm that brings us to superintelligence. Most progress is driven by compute. Our algorithms are similar to the human brain, and therefore require similar amounts of compute.
It becomes a compute war. You're taking the same fundamental algorithms and spending a hundred billion dollars on compute, and it works. (Informed by Ajeya's report, IMO the most important upshot of which is that spending a truly massive amount of money can cover a sizeable portion of the difference between our current compute and the compute of the human brain. If human brain-level compute is an important threshold, then the few actors who could spend $100B+ have an advantage of decades over actors who can only spend millions. Would like to discuss this further; a rough back-of-envelope sketch follows at the end of this scenario's notes.)
This is most definitely not CAIS. There would be one or two or ten superintelligent AI systems, but not two million.
Very few people can contribute effectively to AI Safety, because to contribute effectively you have to be at one of only a handful of organizations in the world. You need to be in “the room where it happens”, whether that’s the AI lab developing the superintelligence or the government attempting to monitor the project. The handful of people who can contribute are incredibly valuable.
What AI safety stuff matters?
Technical AI safety research. The people right now who are building AI that scales safely. It turns out you can do effective research now because our current methods are the methods that bring us to superintelligence, and whether or not our current research is good enough determines whether or not we survive.
Highest levels of government, for their ability to regulate AI labs. A project like this could be nationalized, or carried out under strict oversight from government regulators. Realistically I’d expect the opposite, that governments would be too slow to see the risks and rewards in such a technical domain.
People who imagine long-term policies for governing AI. I don't know how much useful work exists here, but I have to imagine there's some good stuff about how to run the world under superintelligence. What's the game theory of multipolar scenarios? What are the points of decisive strategic advantage?
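Here is the rough back-of-envelope sketch promised above. The FLOP-per-dollar figure is an illustrative placeholder (not a number from Ajeya's report); the takeaway is only that a $100B budget buys about five orders of magnitude more compute than a $1M budget, whatever the exact price is.

```python
import math

FLOP_PER_DOLLAR = 1e17  # illustrative placeholder for compute price-performance

def flop_for_budget(budget_usd: float) -> float:
    """Total training compute purchasable for a given budget, at the assumed price."""
    return budget_usd * FLOP_PER_DOLLAR

small_lab = flop_for_budget(1e6)     # a $1M training budget
megaproject = flop_for_budget(1e11)  # a $100B national or corporate project

print(f"Small lab:   ~1e{math.log10(small_lab):.0f} FLOP")
print(f"Megaproject: ~1e{math.log10(megaproject):.0f} FLOP")
print(f"Gap: {math.log10(megaproject / small_lab):.0f} orders of magnitude")
```

Whether those five orders of magnitude cover "a sizeable portion" of the distance to brain-level compute depends on which anchor from Ajeya's report you take seriously, which is the part I'd most like to discuss.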
Comprehensive AI Systems: Over the next century or few, computers keep getting better at a bunch of different domains. No one AI system is incredible at everything; each new job requires fine-tuning, domain knowledge, and human-in-the-loop supervision, but soon enough we hit annual GDP growth of 25%.
Governments go about international relations the same as usual, just with better weapons. There are some strategic effects here that Henry Kissinger and Justin Ding understand quite well, but there's no instant collapse into one world government or anything. There are a few outside risks that would be terrible (a new WMD, or missile defense systems that threaten MAD), but basically we just get killer robots, which will probably be fine.
Killer robots are a key AI safety training ground. If they're inevitable, safety-minded people should be embedded inside those programs to make sure they're deployed safely.
We have lots of warning shots.
What are the existential risks? Nuclear war. Autonomous weapons accidents, which I suppose could turn out to be existential?? Long-term misalignment: over the next 300 years, we hand off the fate of the universe to the robots, and it’s not quite the right trajectory.
What AI Safety work is most valuable?
Run-of-the-mill AI policy work. Accomplishing normal government objectives, often unrelated to existential risk specifically, by driving AI progress forward in a technically literate and altruistically thoughtful way.
Driving forward AI progress. It’s a valuable technology that will help lots of people, and accelerating its arrival is a good thing.
With particular attention to safety. Building a CS culture, a Silicon Valley, a regulatory environment, and international cooperation that can sustain the three-hundred-year transition.
Working in military AI systems. They’re the killer robots most likely to run amok and kill some people (or 7 billion). Malfunctioning AI can also cause nuclear war by setting off geopolitical conflict. Also new WMDs would be terrible.
No takeoff: Looks qualitatively similar to the above, except growth remains steady around 2% for at least several centuries. We remain in the economic paradigm of the Industrial Revolution, and AI makes an economic contribution similar to that of electricity or oil without launching us into a new period of human history.
This seems entirely possible, maybe even the most likely outcome. I've been surrounded by people talking about short timelines from a pretty young age, so I never really thought about this possibility, but "takeoff" is not guaranteed. The world in 500 years could resemble the world today; in fact, I'd guess most thoughtful people don't think much about transformative AI and would assume this is the default scenario.
Part of why I think this is entirely plausible is that I don't see many independently strong arguments for short AI timelines:
IMO the strongest argument for short timelines is that, within the next few decades, we’ll cross the threshold for using more compute than the human brain. If this turns out to be a significant threshold and a fair milestone to anchor against, then we could hit an inflection point and rapidly see Bostrom Superintelligence-type scenarios.
I see this belief as closely associated with the entire first scenario described above: the idea, held by OpenAI/DeepMind, that we will “solve intelligence” with an agenty AI running a simple fundamental algorithm on massive compute and generalizing effectively across many domains.
IIRC, the most prominent early argument for short AI timelines, as discussed by Bostrom, Yudkowsky, and others, was recursive self-improvement. The AI will build smarter AIs, meaning we’ll eventually hit an inflection point of runaway improvement positively feeding into itself and rapidly escalating from near-human to lightyears-beyond-human intelligence. This argument seems less popular in recent years, though I couldn’t say exactly why. My only opinion would be that this seems more like an argument for “fast takeoff” (once we have near-human level AI systems for building AI systems, we’ll quickly achieve superhuman performance in that area), but does not tell you when that takeoff will occur. For all we know, this fast takeoff could happen in hundreds of years. (Or I could be misunderstanding the argument here, I’d like to think more about it.)
Surveys asking AI researchers when they expect superhuman AI have received lots of popular coverage and might be driving widespread acceptance of short timelines. My very subjective and underinformed intuition puts little weight on these surveys compared to the object-level arguments. The fact that people trying to build superintelligence believe it's possible within their lifetime certainly makes me take that possibility seriously, but it doesn't provide much of an upper bound on how long it might take. If the current consensus of AI researchers proves to be wrong about progress over the next century, I wouldn't expect their beliefs about the next five or ten centuries to hold up; the worldview assumptions might just be entirely off-base.
These are the only three arguments for short timelines I've ever heard and remembered. I'd be interested to hear if I'm forgetting anything big here.
Compare this to the simple prior that history will continue with slow and steady single-digit growth, as it has since the Industrial Revolution, and I see a significant chance that we don't see AI takeoff for centuries, if ever. (That's before considering object-level arguments for longer timelines, which admittedly I don't see many of and therefore don't put much weight on.)
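For what it's worth, here's what that simple prior compounds to (just the 2% figure from the no-takeoff scenario, carried over the horizons discussed above):

```python
# "No takeoff" still compounds: steady 2% growth means a much larger economy,
# just no change in the growth regime itself.
for years in (100, 500):
    multiple = 1.02 ** years
    print(f"2% growth for {years} years multiplies output ~{multiple:,.0f}x")
```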
I haven't fully thought through all of this, but would love to hear others' thoughts on the probability of "no takeoff".
Maybe the future of AI looks like this guy on the internet’s business slide deck: https://static1.squarespace.com/static/50363cf324ac8e905e7df861/t/5e45cbd35750af6b4e60ab0f/1581632599540/2017+Benedict+Evans+Ten+Year+Futures.pdf
This is pretty rough around the edges, but these three scenarios seem like the key possibilities for the next few centuries that I can see at this point. For the hell of it, I’ll give some very weak credences: 10% that we solve superintelligence within decades, 25% that CAIS brings double-digit growth within a century or so, maybe 50% that human progress continues as usual for at least a few centuries, and (at least) 15% that what ends up happening looks nothing like any of these scenarios.
Very interested in hearing any critiques or reactions to these scenarios or the specific arguments within.
I like the intuitive analysis of the no-takeoff scenario, and I find that I also haven't really imagined it as a concrete possibility. Generally, I like that you have presented clearly distinct scenarios and that the logic is explicit and coherent. Two thoughts that came to mind:
In the CAIS scenario, I also expect the rapid growth and the delegation of economic and organizational work to AI to carry some weird risks: something like humanity getting pushed out of the economic ecosystem while many self-sustaining autonomous systems are stuck in a stupid, lifeless revenue-maximizing loop. I couldn't really pinpoint a concrete x-risk scenario here, though.
Recursive self-improvement can also happen over long periods of time, not necessarily leading to a fast takeoff, especially if the early gains are much easier than the later ones (which makes more sense if we think of AI capability development as driven mostly by computational improvements rather than algorithmic ones).
Ah! Richard Ngo has just written something related to the CAIS scenario :)