Longtermists Should Work on AI—There is No “AI Neutral” Scenario
Summary: If you're a longtermist (i.e. you believe that most of the moral value lies in the future) and you want to prioritize impact in your career choice, you should strongly consider either working on AI directly or working on things that will positively influence the development of AI.
Epistemic Status: The claim is strong, but I'm fairly confident (>75%) in it. I think the biggest crux is how bad biorisks could be and how their risk profile compares with that of AI safety. I've spent at least a year thinking about advanced AI and its implications for everything, including much of today's decision-making, and I've reoriented my career towards AI based on these thoughts.
The Case for Working on AI
If you care a lot about the very far future, you probably want two things to happen: first, you want to ensure that humanity survives at all; second, you want to increase the growth rate of good things that matter to humanity—for example, wealth, happiness, knowledge, or anything else that we value.
If we increase the growth rate earlier and by more, this will have massive ripple effects on the very longterm future. A minor increase in the growth rate now means a huge difference later. Consider the spread of Covid—minor differences in the R-number had huge effects on how fast the virus could spread and how many people eventually caught it. So if you are a longtermist, you should want to increase the growth rate of whatever you care about as early as possible, and as much as possible.
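As a toy illustration (the specific numbers here are arbitrary and only meant to show the compounding effect):

```python
# Toy illustration with arbitrary numbers: compare 2.0% vs 2.1% annual growth
# compounded over 500 years.
for rate in (0.020, 0.021):
    value = 1.0
    for _ in range(500):
        value *= 1 + rate
    print(f"rate {rate:.1%}: relative value after 500 years = {value:,.0f}")

# The 2.1% trajectory ends up roughly 63% higher than the 2.0% one,
# and the gap keeps widening the further out you look.
```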
For example, if you think that every additional happy life in the universe is good, then you should want the number of happy humans in the universe to grow as fast as possible. AGI is likely to be able to help with this, since it could create a state of abundance and enable humanity to quickly spread across the universe through much faster technological progress.
AI is directly relevant to both longterm survival and longterm growth. When we create a superintelligence, there are three possibilities. Either:
The superintelligence is misaligned and it kills us all
The superintelligence is misaligned with our own objectives but is benign
The superintelligence is aligned, and therefore can help us increase the growth rate of whatever we care about.
Longtermists should, of course, be eager to prevent the development of a destructive misaligned superintelligence. But they should also be strongly motivated to bring about the development of an aligned, benevolent superintelligence, because increasing the growth rate of whatever we value (knowledge, wealth, resources…) will have huge effects into the longterm future.
Some AI researchers focus more on the ‘carrot’ of aligned benevolent AI, others on the ‘stick’ of existential risk. But the point is, AI will likely either be extremely good or extremely bad—it’s difficult to be AI-neutral.
I want to emphasize that my argument only applies to people who want to strongly prioritize impact. It’s fine for longtermists to choose not to work on AI for personal reasons. Most people value things other than impact, and big career transitions can be extremely costly. I just think that if longtermists really want to prioritize impact above everything else, then AI-related work is the best thing for (most of) them to do; and if they want to work on other things for personal reasons, they shouldn’t be tempted by motivated reasoning to believe that they are working on the most impactful thing.
Objections
Here are some reasons why you might be unconvinced by this argument, along with reasons why I find these objections unpersuasive or unlikely.
You might not buy this argument because you believe one of the following things:
You want to take a ‘portfolio approach’
Some EAs take a ‘portfolio approach’ to cause prioritization, thinking that since the most important cause is uncertain, we should divide our resources between many plausibly-important causes.
A portfolio approach makes sense when you have comparable causes, and/or when there are decreasing marginal returns on each additional resource spent on one cause. But in my opinion, this isn’t true for longtermists and AI. First, the causes here are not comparable; no other cause has such large upsides and downsides. Second, the altruistic returns on AI work are so immensely high that even with decreasing marginal returns, there is still a large difference between this opportunity and our second biggest priority.
There’s a greater existential risk in the short term
You might think that something else currently poses an even greater existential risk than AI. I think this is unlikely, however. First, I’m confident that of the existential risks known to EAs, none is more serious than the risk from AI. Second, I think it’s unlikely that there is some existential risk that is known to a reader but not to most EAs, and that is more serious than AI risk.
In The Precipice, Toby Ord estimates that we are 3 times more likely to go extinct due to AI than due to biological risks—the second biggest risk factor after AI, in his opinion. Many people—including me—think that Ord vastly overestimates biorisks, and that our chances of going extinct from biological disasters are actually very small.
One feature that seems crucial to extinction events via viruses is whether, and for how long, the virus spreads stealthily. I think we're likely to be able to prevent the 'stealth virus' scenario within the next few years thanks to metagenomic sequencing, which should make extinction from stealthy pathogens even less likely; I therefore believe the risk of extinction from pathogens over the next few decades is very low. If there is any X-risk from biology this century, I think it is heavily concentrated in the second half of the century. For those interested, I wrote a more detailed post on scenarios that could lead to X-risks via biorisks. The most likely way I could be wrong here is if the minimum viable population is not around 1,000 but greater than 1% of the world population, or if an irrecoverable collapse were very likely even above these thresholds.
On the other hand, transformative AI (TAI) will probably be developed within the next few decades, according to Ajeya Cotra's report on biological anchors (which arguably provides an upper bound on TAI timelines).
Others have argued that nuclear war and climate change, while they could have catastrophic consequences, are unlikely to cause human extinction.
A caveat: I'm less certain about the risks posed by nanotechnology. Still, I don't think it poses a risk comparable to AI, though I'd expect it to be the second biggest source of risk after AI.
See here for a database of various experts’ estimates of existential risk from various causes.
It’s not a good fit for you
I.e., you have skills or career capital that make it suboptimal for you to switch into AI. This is possible, but given that both AI Governance and AI Safety need a wide range of skills, I expect this to be pretty rare.
By wide range, I mean very wide. So wide that I think that even most longtermists with a biology background who want to maximize their impact should work on AI. Let me give some examples of AI-related career paths that are not obvious:
Community building (general EA community building or building the AI safety community specifically).
Communications about AI (to targeted public such as the ML community).
Increasing the productivity of people who do direct AI work by working with them as a project manager, coach, executive assistant, writer, or other key support roles.
Making a ton of money (I expect this to be very useful for AI governance as I will argue in a future post).
Building influence in politics (I expect this to be necessary for AI governance).
Studying psychology (e.g. what makes humans altruistic) or biology (e.g. evolution). These questions are relevant to AI because they sharpen our understanding of optimization dynamics, which is key to predicting what we should expect from gradient descent. PIBBSS is an example of this kind of approach to the AI problem.
UX designer for EA organizations such as 80k.
Writing fiction about AGI that depicts plausible scenarios (rather than, e.g., Terminator-style robots); the only example I know of this type of fiction is Clippy.
There is something that will create more value in the long-term future than intelligence
This could be the case; but I give it a low probability, since intelligence seems to be highly multipurpose, and a superintelligent AI could help you find or increase this other thing more quickly.
It’s not possible to align AGI
In this case, you should focus on stopping the development of AGI, or try to develop beneficial unaligned AGI.
AGI will be aligned by default
If you don’t accept the orthogonality thesis or aren’t worried about misaligned AGI, then you should work to ensure that the governance structure around AGI is favorable to what you care about and that AGI happens as soon as possible within this structure, because then we can increase the growth rate of whatever we care about.
You’re really sure that developing AGI is impossible
This is hard to justify: the existence of humans proves that general intelligence is feasible.
Have I missed any important considerations or counter-arguments? Let me know in the comments. If you're not convinced of my main point, I expect this to be because you disagree with the following crux: there isn't any short-term X-risk which is nearly as important as AGI. If that is the case (especially if you think that biorisks could be equally dangerous), tell me in the comments and I'll consider writing about this topic in more depth.
Non-longtermists should also consider working on AI
In this post I've argued that longtermists should consider working on AI. I also believe the following stronger claim: "whatever you care most about will likely be radically transformed by AI fairly soon, so you should care about AI and work on something related to it". I didn't argue for this claim because it would have required significantly more effort. However, if you care about causes such as poverty, health or animals, and you think your community could update based on a post saying "Cause Y will be affected by AI", leave a comment and I will think about writing about it.
This post was written collaboratively by Siméon Campos and Amber Dawn Ace as part of Nonlinear’s experimental Writing Internship program. The ideas are Siméon’s; Siméon explained them to Amber, and Amber wrote them up. We would like to offer this service to other EAs who want to share their as-yet unwritten ideas or expertise.
If you would be interested in working with Amber to write up your ideas, fill out this form.
I think this is a bit too strong a claim. It is true that the overwhelming majority of value in the future is determined by whether, when, and how we build AGI. I think it is also true that a longtermist trying to maximize impact should, in some sense, be doing something which affects whether, when, or how we build AGI.
However, I think your post is too dismissive of working on other existential risks. Reducing the chance that we all die before building AGI increases the chance that we build AGI. While there probably won't be a nuclear war before AGI, it is quite possible that a person very well-suited to working on nuclear risk could reduce x-risk more by doing that than by working more directly on AI.
Thanks for the comment.
I think it would be true if there were other X-risks. I just think that there is no other literal X-risk. There are huge catastrophic risks, but there's still a huge difference between killing 99% of people and killing 100%.
I'd recommend reading (or skimming) this to get a better sense of how different the two are.
I think that, in general, the sense that it's fine to work on any of the risks comes precisely from the fact that very few people have thought about all of them; people in AI, for instance, IMO tend to overestimate risks in other areas.
“no other literal X-risk” seems too strong. There are certainly some potential ways that nuclear war or a bioweapon could cause human extinction. They’re not just catastrophic risks.
In addition, catastrophic risks don’t just involve massive immediate suffering. They drastically change global circumstances in a way which will have knock-on effects on whether, when, and how we build AGI.
All that said, I directionally agree with you, and I think that probably all longtermists should have a model of the effects their work has on the potentiality of aligned AGI, and that they should seriously consider switching to working more directly on AI, even if their competencies appear to lie elsewhere. I just think that your post takes this point too far.
Just tell me a story, with probabilities, of how nuclear war or bioweapons could cause human extinction, and you'll see that when you multiply the probabilities together, the result comes out very low.
To repeat: I think you still don't have a good sense of how difficult it is to kill every human if the minimum viable population (MVP) is around 1,000, as argued in the post linked above.
”knock-on effects”
I think that's true, but to first order, not dying from AGI is the most important thing, compared with developing it in, say, 100 years.
I have a slight problem with the "tell me a story" framing. Scenarios are useful, but they also lend themselves to crude rather than complex risks; in asking this question, you implicitly downplay complex risks. For a more thorough discussion, the "Democratising Risk" paper by Cremer and Kemp has some useful ideas in it (I disagree with parts of the paper, but still). The framing also continues to prioritise epistemically neat and "sexy" risks, which, while possibly the most worrying, are not the only ones. Also, probabilities on scenarios can be somewhat problematic in many contexts, and the methodologies used to come up with very high x-risk values for AGI versus other x-risks carry very high uncertainties. For these reasons, I think the certainty you have is somewhat problematic.
Yes, scenarios are a good way to establish a lower bound, but if you're not able to construct a single scenario, that's a bad sign in my opinion.
For AGI there are many plausible scenarios where I reach a ~1-10% likelihood of dying. With biorisks it's impossible, given my current beliefs about the MVP (minimum viable population).
Sketching specific bio-risk extinction scenarios would likely involve substantial info-hazards.
You could avoid such infohazards by drawing up the scenarios in a private message or private doc that’s only shared with select people.
I think that if you take these infohazards seriously enough, you probably shouldn't even do that. Because if each person has a 95% likelihood of keeping it secret, then with 10 people in the know the chance it stays secret is only about 60%.
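Spelled out, using the 95% per-person figure above:

$$P(\text{stays secret}) = 0.95^{10} \approx 0.60$$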
I see what you mean, but if you value cause prioritization seriously enough, it is really stifling to have literally no place to discuss x-risks in detail. Carefully managed private spaces are the best compromise I’ve seen so far, but if there’s something better then I’d be really glad to learn about it.
I'd be glad to stay, for as long as we can, in the domain of aggregate probabilities and proxies for real scenarios, particularly for biorisks.
Mostly because I think that most people can’t do a lot about infohazardy things so the first-order effect is just net negative.
Yes, I mostly agree, but even conditional on infohazardy things, I still think that the aggregate likelihood of collapse is a very important parameter.
I’m not sure what you mean—I agree the aggregate probability of collapse is an important parameter, but I was talking about the kinds of bio-risk scenarios that simeon_c was asking for above?
Do I understand you right that overall risk levels should be estimated/communicated even though their components might involve info-hazards? If so, I agree, and it’s tricky. They’ll likely be some progress on this over the next 6-12 months with Open Phil’s project to quantify bio-risk, and (to some extent) the results of UPenn’s hybrid forecasting/persuasion tournament on existential risks.
Thanks for this information!
What’s the probability we go extinct due to biorisks by 2045 according to you?
Also, I think that extremely infohazardy things shouldn't weigh too heavily, because without the information being revealed they will likely remain very unlikely.
I'm currently involved in the UPenn tournament so can't share my forecasts or rationales (to maintain experimental conditions), but it's at least substantially higher than 1/10,000.
And yeah, I agree complicated plans where an info-hazard makes the difference are unlikely, but info-hazards also preclude much activity and open communication about scenarios even in general.
And on AI, do you have timelines + P(doom|AGI)?
I don’t have a deep model of AI—I mostly defer to some bodged-together aggregate of reasonable seeming approaches/people (e.g. Carlsmith/Cotra/Davidson/Karnofsky/Ord/surveys).
I think that's one of the problems that explains why many people find my claim far too strong: in the EA community, very few people have a strong inside view on both advanced AI and biorisks. (I think this is more generally true for most combinations of cause areas.)
And I think that indeed, with the kind of uncertainty one must have when deferring, it becomes harder to make claims as strong as the one I'm making here.
I don’t think this reasoning works in general. A highly dangerous technology could become obvious in 2035, but we could still want actors to not know about it until as late as possible. Or the probability of a leak over the next 10 years could be high, yet it could still be worth trying to maintain secrecy.
Yes, I think you’re right actually.
Here's a weaker claim which I think is true:
- When someone knows and has thought about an infohazard, the baseline is that they're far more likely to cause harm via it than to do good.
- Thus, I'd recommend that anyone who isn't actively working on preventing the classes of scenario where this infohazard would end up being very bad try to forget it and not talk about it, even to trusted individuals. Otherwise it will most likely be net negative.
Luisa's post addresses our chance of getting killed 'within decades' of a civilisational collapse, but that's not the same as the chance that a collapse prevents us from ever becoming a happy intergalactic civilisation, which is the end state we're seeking. If you think the probability that we still get there, given a global collapse, is 90%, then the effective x-risk of that collapse is 0.1 * <its probability of happening>. One order of magnitude doesn't seem like that big a deal here, given all the other uncertainties around our future.
That’s right! I just think that the base rate for “civilisation collapse prevents us from ever becoming a happy intergalactic civilisation” is very low.
And multiplying any probability by 0.1 does matter, because when we're talking about AGI, we're talking about something that a lot of people think is >=10% likely to happen (I put a higher likelihood on it than that, but Toby Ord putting it at 10% is sufficient).
So it means that even if you condition on biorisks being the same as AGI for everything else (which is the point I argue against), you still need biorisks to be >5% likely to lead to a civilizational collapse by the end of the century for my point not to hold, i.e. that 95% of longtermists should work on AI (19/20 of people, plus an assumption of linear returns for the first few thousand people).
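To spell out the arithmetic behind the 5% threshold (using Ord's ~10% figure for AI risk and the 0.1 collapse-to-permanent-loss factor discussed above):

$$\underbrace{0.05}_{P(\text{bio collapse by 2100})} \times \underbrace{0.1}_{P(\text{collapse is never recovered from})} = 0.005 = \tfrac{1}{20} \times \underbrace{0.10}_{P(\text{AI x-risk})}$$

which is what motivates the 19/20 (i.e. 95%) split above.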
I think there are many more options than this, and every argument that follows banks entirely on your logical models being correct. Engineers can barely design a bike that will work on the first try, what possibly makes you think you can create an accurate theoretical model on a topic that is so much more complex?
I think you are massively overconfident, considering that your only source of evidence is abstract models with zero feedback loops. There is nothing wrong with creating such models, but be aware of just how difficult it is to get even something much simpler right.
It’s great that you spent a year thinking about this, but many have spent decades and feel MUCH less confident about all of this than you.
I think that this comment is way too outside viewy.
Could you mention concretely one of the “many options” that would change directionally the conclusion of the post?
for example:
intelligence peaks close to the human level, and superintelligence doesn't yield significant increases to growth.
superintelligence in one domain doesn’t yield superintelligence in others, leading to some, but limited growth, like most other technologies.
we develop EMs (whole brain emulations), which radically change the world, including growth trajectories, before we develop superintelligence.
intelligence peaks close to the human level, and superintelligence doesn't yield significant increases to growth.
Even if the AI has only human-ish intelligence, most of its advantage comes from its other features:
- It can process any type of data orders of magnitude faster than a human, and once it knows how to do a task, it deterministically knows how to do it.
- You can just double the number of GPUs and double the number of AIs. If you pair two AIs and make them interact at high speed, that's much more powerful than anything human-ish.
These are two of the many features that make AI radically different and mean that it will shape the future.
2. superintelligence in one domain doesn’t yield superintelligence in others, leading to some, but limited growth, like most other technologies.
That's very (very) unlikely given recent observations on transformers, where you can take a model trained on text, plug it into images, train it a tiny bit more (compared with the initial training), and it works; plus the fact that such models do maths, and are becoming more and more sample efficient.
3. we develop EMs which radically changes the world, including growth trajectories, before we develop superintelligence.
I think that's the most plausible of the three claims, but I still think it's only between 0.1% and 1% likely. Whereas we have a pretty clear path in mind for reaching AIs powerful enough to change the world, we have no idea how to build EMs. Also, this doesn't change my argument directionally, because no one in the EA community works on EMs. If you think that EMs are likely to change the world and that EAs should work on them, you should probably write about it and make the case for it. But I think it's unlikely that EMs are a significant thing we should care about right now.
If you have other examples, I’m happy to consider them but I suspect you don’t have better examples than those.
Meta-point: I think that you should be more inside viewy when considering claims.
”Engineers can barely design a bike that will work on the first try, what possibly makes you think you can create an accurate theoretical model on a topic that is so much more complex?”
This class of argument, for instance, is pretty bad IMO. Uncertainty doesn't prevent you from thinking about the expected value (EV), and here I was mainly arguing that if you care about long-term EV, AI is very likely to be its first-order determinant. Uncertainty should make us willing to do some exploration, and I'm not arguing against that, but in other cause areas we're doing much more than exploration. 5% of longtermists would be sufficient to do all kinds of exploration on many topics, even EMs.
I think the meta-point might be the crux of our disagreement.
I mostly agree with your inside view that other catastrophic risks struggle to be existential the way AI would, and I’m often a bit perplexed as to how quick people are to jump from ‘nearly everyone dies’ to ‘literally everyone dies’. Similarly I’m sympathetic to the point that it’s difficult to imagine particularly compelling scenarios where AI doesn’t radically alter the world in some way.
But we should be immensely uncertain about the assumptions we make, and I would argue the most likely first-order determinant of future value is something our inside-view models didn't predict. My issue is not with your reasoning, but with how much trust to place in our models in general. My critique is absolutely not that you shouldn't have an inside view, but that a well-developed inside view is one of many tools we use to gather evidence. Over-reliance on a single type of evidence leads to worse decision making.
According to you, what should be the proportion of longtermists who should work on AI?
If personal preferences were not a factor, at least 95%, and probably more like 99%. I think this should update according to our timelines.
And just to clarify, that includes community building etc. as I mentioned.
How relevant do you find work that aims to figure out what to do in deployment scenarios, what values we should have, what we should do next, etc?
That is high-value work. Holden Karnofsky's list of "important, actionable research questions" about AI alignment and strategy includes one about figuring out what should be done in the deployment of advanced AI and in the lead-up to it.
I think it’s extremely relevant.
To be honest, I think that if someone without a technical background wanted to contribute, looking into these things would be one of the best default opportunities, because:
1) The points you mention are blind spots of the AI alignment community, because the typical member of the AI alignment community doesn't really care about all this political stuff. Questions about values, and about how those who are 1000x more powerful than everyone else would somehow not start ruling the entire world with their aligned AI, are especially relevant IMO.
2) The fact that AI safety arguments are still too conceptual is a big weakness of the field. Increasing the concreteness of "how it will happen" and of what the concrete problems will be is a great way to get clearer scenarios in mind, and also to increase the number of people taking these risks seriously.
To be clear, when I say that you should work on AI, I totally include people who have thoughts that are very different from the AI alignment field. For instance, I really like the fact that Jacy is thinking about this with animals in mind (I think that animal people should do that more) & being uncertain about the value of the longterm future if it’s driven by human values.
I think this makes far too strong a claim for the evidence you provide. Firstly, under the standard ITN (Importance, Tractability, Neglectedness) framework, you only focus on importance. If there are orders-of-magnitude differences in, let's say, tractability (which seems most important here), then longtermists maybe shouldn't work on AI. Secondly, your claim that there is a low possibility AGI isn't possible needs to be fleshed out more. The terms AGI and general intelligence are notoriously slippery, and many argue we simply don't understand intelligence well enough to clarify the concept of general intelligence. If we don't understand what general intelligence is, one may suggest that it is intractable enough for present actors that, no matter how important or unimportant AGI is, under an ITN framework it's not the most important thing. On the other hand, I am not clear this claim about AGI is necessary; TAI (transformative AI) is clearly possible and potentially very disruptive without the AI being generally intelligent. Thirdly, your section on other X-risks takes an overly single-hazard approach to X-risk, which probably leads to an overly narrow interpretation of what might pose an X-risk. I also think the dismissal of climate change and nuclear war seems to imply that human extinction = X-risk. This isn't true (definitionally); although you may make an argument that nuclear war and climate change aren't X-risks, that argument is not made here. I can clarify or provide evidence for these points if you think it would be useful, but I think the claims you make about AI vs other priorities are too strong for the evidence you provide. I am not here claiming you are wrong, but rather that you need stronger evidence to support your conclusions.
I argued here that orders-of-magnitude differences in tractability are rare.
The problem is that claims as strong as the ones made here (that ~95% of longtermists should be working on AI above all else) require a tremendous amount of certainty that each of these assumptions holds. As your uncertainty grows, the strength of the argument weakens.
By default, though, you shouldn't have a prior that biorisk is 100x more tractable than AI. Some (important) people think that the EA community has had a net negative impact on biorisk because of infohazards, for instance.
Also, I’ll argue below that timelines matter for ITN and I’m pretty confident the risk/year is very different for the two risks (which favors AI in my model).
I would be interested in your uncertainties about all of this. If we are basing our ITN analysis on priors, then given the limitations and biases of our priors, I would again be highly uncertain, once more leaning away from the certainty that you present in this post.
Basically, as I said in my post, I'm fairly confident about most things except the MVP (minimum viable population), where I almost completely defer to Luisa Rodriguez.
Likewise, for the likelihood of irrecoverable collapse, my prior is that the likelihood is very low for the reasons I gave above; but given that I haven't explored the inside-view arguments in favor of it very much, I could quickly update upward, and I think that would be the best way to get me to update toward biorisks actually posing an X-risk in the next 30 years.
My view on the 95% figure is pretty robust to external perturbations, though, because my beliefs favor short timelines (<2030). So I think you'd also need to change my mind on how easy it is to make a virus by 2030 that kills >90% of people and spreads so fast, or so stealthily, that almost everyone gets infected.
"Firstly, under the standard ITN (Importance, Tractability, Neglectedness) framework, you only focus on importance. If there are orders-of-magnitude differences in, let's say, tractability (which seems most important here), then longtermists maybe shouldn't work on AI."
I think this makes sense in the domain of non-existential areas. In practice, though, when you're confident about existential outcomes and don't yet know how to solve them, you probably should still focus on them.
"which probably leads to an overly narrow interpretation of what might pose an X-risk. I also think the dismissal of climate change and nuclear war seems to imply that human extinction = X-risk. This isn't true (definitionally),"
Not sure what you mean by "this isn't true (definitionally)". Do you mean irrecoverable collapse, or do you mean for animals?
"although you may make an argument that nuclear war and climate change aren't X-risks, that argument is not made here."
The posts I linked to were meant to have that purpose.
"I am not here claiming you are wrong, but rather that you need stronger evidence to support your conclusions."
An intuition for why it's hard to kill everyone until only 1,000 people survive:
- For humanity to die out, you need an agent: humans are very adaptive in general, and you might expect that at least the richest people on the planet have plans and will try to survive at all costs.
So, for instance, even if a virus infects 100% of people (almost impossible if people are aware the virus exists) and literally kills 99% of those infected (again, almost impossible), you still have about 70 million people alive (1% of ~7 billion). No agent on earth has ever killed 70 million people. So even a malevolent state that wanted to do that (very unlikely) would have a hard time getting the population below 1,000.
The same goes for nuclear war: it's not too hard to kill 90% of people with a nuclear winter, but it's very hard to kill the remaining 10%, then 1%, then 0.1%, and so on.
"I think this makes sense in the domain of non-existential areas. In practice, though, when you're confident about existential outcomes and don't yet know how to solve them, you probably should still focus on them." -I think this somewhat misinterprets what I said. This is only the case if you are CERTAIN that biorisk, climate, nuclear etc. aren't X-risks. Otherwise it matters. If (toy numbers here) AI risk is 2 orders of magnitude more likely to occur than biorisk, but four orders of magnitude less tractable, then it doesn't seem that AI risk is the thing to work on, as the sketch below illustrates.
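To make the toy numbers concrete (treating expected x-risk reduction as roughly the probability of the risk times the tractability of work on it, which is of course a simplification):

```python
# Toy numbers only: expected x-risk reduction ~= probability of the risk
# times the tractability of work on it.
ai_prob,  bio_prob  = 1e-1, 1e-3   # AI risk 2 orders of magnitude more likely...
ai_tract, bio_tract = 1e-6, 1e-2   # ...but 4 orders of magnitude less tractable

print("AI :", ai_prob * ai_tract)    # 1e-07
print("Bio:", bio_prob * bio_tract)  # 1e-05 -> bio work wins in this toy case
```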
"Not sure what you mean by "this isn't true (definitionally)". Do you mean irrecoverable collapse, or do you mean for animals?" -Sorry, I worded this badly. What I meant is that the argument assumes that X-risk and human extinction are identical. They are of course not, as irrecoverable collapse, s-risks, and permanent curtailing of human potential (which I think is a somewhat problematic concept) are all X-risks as well. Apologies for the lack of clarity.
"The posts I linked to were meant to have that purpose." -I think my problem is that I don't think the articles necessarily do a great job of evidencing the claims they make. Take the 80K one: it seems to ignore the concept of vulnerabilities and exposures, instead just going for a hazard-centric approach. Secondly, it ignores a lot of important stuff in the climate discussion, for example what is discussed in this (https://www.pnas.org/doi/10.1073/pnas.2108146119) and this (https://www.cser.ac.uk/resources/assessing-climate-changes-contribution-global-catastrophic-risk/). Basically, I think it fails to adequately address systemic risk, cascading risk, and latent risk. Also, it seems to (mostly) equate X-risk with human extinction without really exploring the question of whether, if civilisation collapses, we WILL recover, not just whether we could. The Luisa Rodriguez piece also doesn't do this (that isn't a critique of her piece; as far as I can tell it didn't intend to).
"An intuition for why it's hard to kill everyone until only 1,000 people survive: [...]" -Again, this comes back to the idea that for something to be an X-risk it needs to wipe out humanity, or most of it, in one single event. But an X-risk may be a collapse we don't recover from. Note this isn't the same as a collapse we can't recover from: because "progress" (itself a very problematic term) seems highly contingent, even if we COULD recover, that doesn't mean there is a very high probability that we WILL. Moreover, if we retain this loss of complexity for a long time, ethical drift (making s-risks far more likely even given recovery) becomes more likely, as does being wiped out by other catastrophes, even ones that would be recoverable from alone, whether in concert, through cascades, or through discontinuous local catastrophes. It needs a lot more justification to place a very high probability on a civilisation we think is valuable recovering from a collapse, even one that leaves hundreds of millions of people alive. The discussion of how likely a collapse or GCR is to be converted into an X-risk is still very open, as is the discussion of contingency vs convergence. But for your position to hold, you need very high certainty on this point, which I think is highly debatable and perhaps at this point premature and unjustified. Sorry I can't link the papers I need to right now, as I am on my phone, but will link later.
"If (toy numbers here) AI risk is 2 orders of magnitude more likely to occur than biorisk, but four orders of magnitude less tractable". I think that at least 2 or 3 OOMs of difference in tractability would indeed be needed to compensate (especially given that positively shaping biorisks is not extremely positive), and as I argued above, I think that's unlikely.
"They are of course not, as irrecoverable collapse, s-risks, and permanent curtailing of human potential". I think that irrecoverable collapse is the biggest crux. What likelihood do you put on it? For the other types of risk, the comparison once again favors working on AI.
Your point below is also about irrecoverable collapse. Personally, I put a small weight on it, but I could update pretty quickly because I haven't thought that hard about it. I just have these few arguments:
- Asymptotically, it would be surprising if we couldn't find other ways to recover. The worlds in which our species happened to use the ONLY possible path to progress are a tiny fraction of all possible worlds.
- There are arguments about the (huge) existing stocks of materials, which could be used to recover.
- Humans are very adaptable.
I think that biorisks killing >90% of people are not imminent and will most likely appear in the second half of the century, which means they don't compete with AGI in terms of timelines. The reasons I think this are:
- Building viruses is still quite hard: doing gain-of-function research to the degree needed to reach very high lethality and transmissibility is really not trivial.
- The world is still not connected enough for a virus that spreads non-stealthily to infect everyone.
I actually think our big crux here is the amount of uncertainty. Each of the points I raise, and each new assumption you put in, should raise your uncertainty. Given that you claim 95% of longtermists should work on AI, high uncertainty does not seem to weigh in favour of your argument. Note I am not saying, and haven't said, either that AI isn't the most important X-risk or that we shouldn't work on it. I'm just arguing against the certainty of your post.
I think you'd make a good point if the causes were close in expected value, but what matters primarily is the EV, and I expect it to dominate the uncertainty here.
I didn't do the computations, but I feel like if you have something which is OOMs more important than everything else, then even with very large error bars you'd probably put >19/20 of your resources on the highest-EV thing.
In the same way, we don't give to a less cost-effective org to hedge against AMF just because the error bars on the estimates are very large, even though that org might have some tail chance of a very significant positive impact on society.
"EA community building", "making a ton of money", or [being a] "UX designer for EA organisations like 80K" can be pursued in order to mitigate AI risk, but I wouldn't say they're intrinsically "AI-related". Instead of "AI-related career paths" I'd call them something like "career paths that can be useful for addressing AI risk".
Yep, good point! I just wanted to make clear that IMO a good first-order approximation of your impact on the long-term future is: “What’s the causal impact of your work on AI?”
And even though UX design for 80k and community building are not focused on AI, they are instrumentally very useful for AI, in particular if the person doing them has this theory of change in mind.
Yeah agreed; I got that.
I think that in the EA community the bottleneck is the supply of AI-safety-related jobs/projects; there is already a very strong desire to move into AI safety. The problem is not longtermists who are already working on something else. They should generally continue to do so, because the portfolio argument is compelling. The problem is the bootstrapping problem for people who want to start working in AI safety.
Even if you only value AI safety, having a good portfolio as a community is important and makes our community attractive. AI safety is still weird. FTX was originally only vegan, and only then shifted to long-term considerations. That's the trajectory of most people here. Being diverse is at least cool for that reason.
I think that if the community were convinced that it was by far the most important thing, we would try harder to find projects, and I'm confident there are a bunch of relevant things that can be done.
I think we're suffering from an argument-to-moderation fallacy that makes us massively underinvest in AI safety, because:
1) AI Safety is hard
2) There are other causes that, when you don't think too deeply about them, seem equally important
The portfolio argument is an abstraction that hides the fact that if something is way more important than everything else, you just shouldn't diversify; that's precisely why we give to AMF rather than to other things, without diversifying our portfolio.
"AI safety is still weird. FTX was originally only vegan, and only then shifted to long-term considerations."
That's right, but then your theory of change in other areas needs to be oriented towards AI safety, and that might lead to very different strategies. For instance, you might want to not lose "weirdness points" on other cause areas, or might not want to bring in the same types of profiles.
I can’t follow what you’re saying in the ‘AGI will be aligned by default’ section. I think you’re saying in that scenario it will be so good that you should disregard everything else and try and make it happen ASAP? If so, that treats all other x-risk and trajectory change scenarios as having probability indistinguishable from 0, which can’t be right. There’s always going to be one you think has distinctly higher probability than the others, and (as a longtermist) you should work on that.
I think that given the EA community's AGI timelines, yes, other X-risks have a probability of causing extinction roughly indistinguishable from 0.
And conditional on AGI going well, we'll most likely also escape the other risks.
Whereas without AGI, bio X-risks might become a thing, not in the short run but in the second half of the century.
Wow quite a strong claim, and as a longtermist mostly working on AI, not sure how to feel about it yet.🤔 But I’ll take a stab at a different thing:
A few initial concrete examples:
Poverty: Transformative AI could radically transform the global economy, which could lead to the end of poverty as we know it and immense wealth for all or perhaps tremendous inequality.
(Mental) health: We’re likely to see really good AI chatbots pop up in the shorter term (i.e. before TAI, kinda starting now) which could potentially serve as ultra-low-cost but very useful therapists/friends.
Animals: AI could make alternative proteins much more efficient to produce, bringing a much sooner switch away from animal agriculture (and might even be necessary for cultivated meat to be competitive). Some are already working on this.
Based on AI timelines, I would be surprised if all these things didn’t happen in the next ~50 years, which feels quite “neartermist” to me.
Thanks for giving these concrete examples!
“A(G)I could radically change X” is a way weaker claim than “working on A(G)I is the best way to tackle X”. You only defend the former here.
The claim is "AGI will radically change X". And I tried to argue that if you care about X and want to have an impact on it, then to a first-order approximation you can calculate your impact on it just by measuring your impact on AGI.
How does rejecting the orthogonality thesis imply that AGI will be aligned by default?
It was a way of saying that if you think intelligence is perfectly correlated with being morally good, then you're fine. But you're right that it doesn't cover all the ways you could reject the orthogonality thesis.
I don't need to think this in order to think AI is not the top priority. I just need to think it's hard enough that other risks dominate it. E.g. I might think biorisk has a 10% chance of ending everything each century, and that risks from AI are at 5% this century and 10% every century after that. Then, if all else is equal, such as tractability, I should work on biorisk.
It seems plausible that some people should be focused on areas other than AGI, even if only because these areas could ultimately influence AGI deployment.
You’ve already mentioned ‘building influence in politics’. But this could include things like nuclear weapons policy.
For example, if a nuclear-armed state believes a rival state is close to deploying AGI, they may decide they have no option but to attack with nukes first (or at least threaten to attack) in order to try to prevent this.
Yes, that's right, but it's very different to be somewhere and affect AGI by chance than to be there because you think it's your best way to affect AGI.
And I think that if you're optimizing for the latter, you're not very likely to end up working in nuclear weapons policy (even if there might be a few people for whom it is the best fit).
I don't see how this is possible. There is no such thing as "a little misalignment". Keep in mind that creating an unstoppable and uncontrollable AI is a one-shot event that can't be undone and will have extremely wide and long-term effects on everything. If this AI is misaligned even very slightly, the differences between its goals and humanity's will aggregate and increase over time. It's similar to launching a rocket without any steering mechanism with the aim of landing it on the Jupiter moon Europa: you have to set every parameter exactly right or the rocket will miss the target by far. Even the slightest deviation, like an unaccounted-for asteroid passing close to the rocket and altering its course very slightly through gravitational effects, will completely ruin the mission.
On the other hand, if we manage to build an AGI that is “docile” and “corrigible” (which I doubt very much we can do), this would be similar to having a rocket that can be steered from afar: In this case, I would say it is fully aligned, even if corrections are necessary once in a while.
Should we end up with both—a misaligned and an aligned AGI, or more of them—it is very likely that the worst AGI (from humanity’s perspective) will win the battle for world supremacy, so this is more or less the same as just having one misaligned AGI.
My personal view on your subject is that you don’t have to work in AI to shape its future. You can also do that by bringing the discussion into the public and create awareness for the dangers. This is especially relevant, and may even be more effective than a career in an AI lab, if our only chance for survival is to prevent a misaligned AI, at least until we have solved alignment (see my post on “red lines”).
“The superintelligence is misaligned with our own objectives but is benign”.
You could have an AI with some meta-cognition, able to figure out what's good and maximize it, in the same way EAs try to figure out what's good and maximize it with parts of their lives. This view mostly makes sense if you give some credence to moral realism.
“My personal view on your subject is that you don’t have to work in AI to shape its future.”
Yes, that’s what I wrote in the post.
“You can also do that by bringing the discussion into the public and create awareness for the dangers.”
I don't think that's a good method, and I think you should target a much more specific audience, but yes, I know what you mean.
I’m not sure how that would work, but we don’t need to discuss it further, I’m no expert.
What exactly do you think is “not good” about a public discussion of AI risks?