Benjamin Hilton
Benjamin was a research analyst at 80,000 Hours. Before joining 80,000 Hours, he worked for the UK Government and did some economics and physics research.
There are important reasons to think that the shift in the EA community's timelines is within the measurement error of these surveys, which makes this less noteworthy.
(For example, say you put ±10 years and ±10% on all of these answers. There are plenty of reasons why you wouldn't actually assess the uncertainty like this (e.g. probabilities can't go below 0 or above 1), but it helps to get a feel for the uncertainty. You then get something like:
10%-30% chance of TAI by 2026-2046
40%-60% by 2050-2070
and 75%-95% by 2100
Then many many EA timelines and shifts in EA timelines fall within those errors.)
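The crude band arithmetic above can be sketched in code. This is a toy illustration only: the median estimates below are hypothetical round numbers chosen to match the bullets above, and probabilities are simply clipped to [0, 1] (which, as noted, is one reason you wouldn't assess uncertainty this way in practice).

```python
def crude_bands(estimates, year_err=10, prob_err=0.10):
    """Apply naive +/- error bands to (probability, year) survey medians.

    Probabilities are clipped to [0, 1]. This is a toy sketch of
    measurement error, not a principled uncertainty model.
    """
    bands = []
    for prob, year in estimates:
        lo_p = max(0.0, prob - prob_err)
        hi_p = min(1.0, prob + prob_err)
        bands.append(((lo_p, hi_p), (year - year_err, year + year_err)))
    return bands

# Hypothetical round-number medians matching the bullets above:
medians = [(0.20, 2036), (0.50, 2060), (0.85, 2100)]
for (lo_p, hi_p), (lo_y, hi_y) in crude_bands(medians):
    print(f"{lo_p:.0%}-{hi_p:.0%} chance of TAI by {lo_y}-{hi_y}")
```

Any timeline estimate (or shift in estimates) that falls inside these wide bands is then hard to distinguish from survey noise.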
Reasons why these surveys have huge error
1. Low response rates.
The response rates were really quite low.
2. Low response rates + selection biases + not knowing the direction of those biases
The surveys plausibly had a bunch of selection biases in various directions.
Selection biases mean you'd need a larger sample to converge on the population means, so the surveys probably aren't representative. Worse, we're much less certain about which direction they're biased in.
Quoting me:
For example, you might think researchers who go to the top AI conferences are more likely to be optimistic about AI, because they have been selected to think that AI research is doing good. Alternatively, you might think that researchers who are already concerned about AI are more likely to respond to a survey asking about these concerns.
3. Other problems, like inconsistent answers in the survey itself
AI Impacts wrote some interesting caveats here, including:
Asking people about specific jobs massively changes HLMI forecasts. When we asked some people when AI would be able to do several specific human occupations, and then all human occupations (presumably a subset of all tasks), they gave very much later timelines than when we just asked about HLMI straight out. For people asked to give probabilities for certain years, the difference was a factor of a thousand twenty years out! (10% vs. 0.01%) For people asked to give years for certain probabilities, the normal way of asking put 50% chance 40 years out, while the ‘occupations framing’ put it 90 years out. (These are all based on straightforward medians, not the complicated stuff in the paper.)
People consistently give later forecasts if you ask them for the probability in N years instead of the year that the probability is M. We saw this in the straightforward HLMI question, and most of the tasks and occupations, and also in most of these things when we tested them on mturk people earlier. For HLMI for instance, if you ask when there will be a 50% chance of HLMI you get a median answer of 40 years, yet if you ask what the probability of HLMI is in 40 years, you get a median answer of 30%.
The 80k podcast on the 2016 survey goes into this too.
Thanks for this! Looks like we actually roughly agree overall :)
Thanks for this thoughtful post! I think I stand by my 1 in 10,000 estimate despite this.
A few short reasons:
Broad things: First, these scenarios and scenarios like them are highly conjunctive (many rare things need to happen), which makes any one scenario unlikely (although of course there may be many such scenarios). Second, I think these and similar scenarios are reason to think there may be a large catastrophe, but large and existential are a long way apart. (I discuss this a bit here but don’t come to a strong overall conclusion. More work on this would be great.)
On inducing nuclear war: My estimate of the direct risk of nuclear war is 1 in 10,000, and the indirect risk is 1 in 1,000. The chance that climate change causes a nuclear war (weighted by the extent to which the war was made more likely by climate change rather than by, e.g., geopolitical tensions unrelated to climate change) is subjective and difficult to judge, but probably much less than 10%. If it's, say, 1%, this gives less than 1 in 100,000 indirect x-risk from climate change. This seems a bit small, but consistent with my 1 in 10,000 estimate. Note this includes inducing nuclear war in ways other than crop failure.
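To make the arithmetic in that paragraph explicit, here's a minimal sketch. The numbers are the subjective estimates quoted in this comment, not outputs of any model:

```python
# Subjective estimates from the comment above; purely illustrative arithmetic.
indirect_nuclear_xrisk = 1 / 1_000  # indirect existential risk from nuclear war
climate_share = 0.01                # if ~1% of that risk is attributable to climate change

# Climate-induced-nuclear-war contribution to climate x-risk:
climate_xrisk_via_nuclear = indirect_nuclear_xrisk * climate_share

# Roughly 1 in 100,000, which sits comfortably below an overall
# climate x-risk estimate of 1 in 10,000.
assert climate_xrisk_via_nuclear < 1 / 10_000
```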
On runaway warming: My understanding is that the main limit here is how many fossil fuels it’s possible to recover from the ground—see more here. Even taking into account uncertainty and huge model error, it seems highly unlikely that we’ll end up with runaway warming that itself leads to extinction. I’d also add that lots of the reduction in risk occurs because climate change is a gradual catastrophe (unlike a pandemic or nuclear war), which means that, for example, we may find other emissionless technology (e.g. nuclear fusion) or get over our fear of nuclear fission, etc., reducing the risk of resource depletion. Relatedly, unless there is extremely fast runaway warming over only a few years, the gradual nature of climate change increases the chances of successful adaptation to a warmer environment. (Again, I mean adaptation to prevent an existential catastrophe—a large catastrophe that isn’t quite existential seems far far more likely.)
On coastal cities: I'd guess the existential risk from war breaking out between great powers is also around 1 in 10,000 (within an order of magnitude or so), although I've thought about this less. So again, while cyanobacteria blooms sound like a not-impossible way in which climate change could lead to war (personally I'd be more worried about flooding and migration crises in South Asia), I think this is all consistent with my 1 in 10,000 estimate.
If it helps at all, my subjective estimate of the risk from AI is probably around 1%, and approximately none of that comes from worrying about killer nanobots. I wrote about what an AI-caused existential catastrophe might actually look like here.
Hi! Wanted to follow up as the author of the 80k software engineering career review, as I don’t think this gives an accurate impression. A few things to say:
I try to have unusually high standards for explaining why I believe the things I write, so I really appreciate people pushing on issues like this.
At the time, when you responded to <the Anthropic person>, you said “I think <the Anthropic person> is probably right” (although you added “I don’t think it’s a good idea to take this sort of claim on trust for important career prioritisation research”).
When I leave claims like this unsourced, it's usually because I (and my editors) think they're fairly weak claims, and/or they lack a clear source to reference. That is, the claim is effectively a piece of research based on general knowledge (e.g. I wouldn't source the claim "Biden is the President of the USA") and/or interviews with a range of experts, and the claim is weak or unimportant enough not to investigate further. (FWIW I think it's likely I should have prioritised writing a longer footnote on why I believe this claim.)
The closest data is the three surveys of NeurIPS researchers, but these are imperfect. They ask how long it will take until there is "human-level machine intelligence". The median expert asked thought there was around a 1 in 4 chance of this by 2036. Of course, it's not clear that HLMI and transformative AI are the same thing, or that thinking HLMI will be developed soon necessarily means thinking it will be built by scaling and adapting existing ML methods. In addition, no survey data pre-dates 2016, so it's hard to say that these views have changed based solely on survey data. (I've written more about these surveys and their limitations here, with lots of detail in footnotes; and I discuss the timelines parts of those surveys in the second paragraph here.)
As a result, when I made this claim I was relying on three things. First, that there are likely correlations that make the survey data relevant (i.e., that many people answering the survey think that HLMI will be relatively similar to or cause transformative AI, and that many people answering the survey think that if HLMI is developed soon that suggests it will be ML-based). Second, that people did not think that ML could produce HLMI in the past (e.g. because other approaches like symbolic AI were still being worked on, because texts like Superintelligence do not focus on ML and this was not widely remarked upon at the time despite that book’s popularity, etc.). Third, that people in the AI and ML fields who I spoke to had a reasonable idea of what other experts used to think and how that has changed (note I spoke to many more people than the one person who responded to you in the comments on my piece)!
It’s true that there may be selection bias on this third point. I’m definitely concerned about selection bias for shorter timelines in general in the community, and plan to publish something about this at some point. But in general I think that the best way, as an outsider, to understand the prevailing opinions in a field is to talk to people in that field – rather than relying on your own ability to figure out trends across many papers, many of which are difficult to evaluate, many of which may not replicate. I also think that asking about what others in the field think, rather than what the people you’re talking to think, is a decent (if imperfect) way of dealing with that bias.
Overall, I thought the claim I made was weak enough (e.g. “many experts” not “most experts” or “all experts”) that I didn’t feel the need to evaluate this further.
It’s likely, given you’ve raised this, that I should have put this all in a footnote. The only reason I didn’t is that I try to prioritise, and I thought this claim was weak enough to not need much substantiation. I may go back and change that now (depending on how I prioritise this against other work).
This looks really cool, thanks Tom!
I haven’t read the report in full (just the short summary), but I have some initial scepticism, and I’d love answers to some of the following questions, so I can figure out how much evidence this report provides on takeoff speeds. I’ve put the questions roughly in order of subjective importance to my ability to update:

Did you consider Baumol effects, the possibility of technological deflation, and the possibility of technological unemployment, and how they affect the profit incentive as tasks are increasingly automated? [My guess is that the effect of all of these is to slow takeoff down, so I’d guess a report that uses simpler models will noticeably overestimate takeoff speeds.]
How much does this rely on the accuracy of semi-endogenous growth models? Does this model rely on exponential population growth? [I’m asking because as far as I can tell, work relying on semi-endogenous growth models should be pretty weak evidence. First, the “semi” in semi-endogenous growth usually refers to exogenous exponential population growth, which seems unlikely to be a valid assumption. Second, endogenous growth theory has very limited empirical evidence in favour of it (e.g. 1, 2) and I have the impression that this is true for semi-endogenous growth models too. This wouldn’t necessarily be a problem in other fields, but in general I think that economic models with little empirical evidence behind them provide only very weak evidence overall.]
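For readers unfamiliar with the terminology, here is a minimal sketch of the standard Jones-style semi-endogenous setup (my sketch; the report's actual model may differ in its details):

```latex
% Idea production with diminishing returns to the existing idea stock (phi < 1):
\dot{A} = \delta \, L_A^{\lambda} \, A^{\phi}, \qquad \phi < 1
% With researchers L_A growing exogenously at rate n, the balanced-growth
% rate of ideas is pinned down entirely by population growth:
g_A = \frac{\lambda n}{1 - \phi}
```

This is why the "semi" matters: with $\phi < 1$, long-run growth depends entirely on the exogenous population growth rate $n$, which is exactly the assumption being questioned here.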
In section 8, the only uncertainty pointing in favour of fast takeoff is “there might be a discontinuous jump in AI capabilities”. Does this mean that, if you don’t think a discontinuous jump in AI capabilities is likely, you should expect slower take-off than your model suggests? How substantial is this effect?
How did you model the AI production function? Relatedly, how did you model constraints like energy costs, data costs, semiconductor costs, silicon costs etc.? [My thoughts: looks like you roughly used a task-based CES model, which seems like a decent choice to me, knowing not much about this! But I’d be curious about the extent to which using this changed your results from Cobb-Douglas.]
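For context, a generic task-based CES production function looks roughly like the following (my sketch of the standard setup, not necessarily the report's exact specification):

```latex
% Output aggregates a continuum of tasks i with substitution parameter rho < 1:
Y = \left( \int_0^1 a_i \, X_i^{\rho} \, di \right)^{1/\rho},
\qquad \sigma = \frac{1}{1 - \rho}
% Cobb-Douglas is the limit rho -> 0 (sigma = 1). With sigma < 1 (tasks are
% complements), output is bottlenecked by the least-automated tasks, so the
% CES vs. Cobb-Douglas choice can matter a lot for takeoff dynamics.
```

The bottleneck behaviour under complementarity ($\sigma < 1$) is the main reason the choice of production function could change the headline results.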
I’m vaguely worried that the report proves too much, in that I’d guess that the basic automation of the industrial revolution also automated maybe 70%+ of tasks by pre-industrial revolution GDP. (Of course, generally automation itself wasn’t automated—so I’d be curious on your thoughts about the extent to which this criticism applies at least to the human investment parts of the report.)
That’s all the thoughts that jumped into my head when I read the summary and skimmed the report—sorry if they’re all super obvious if I’d read it more thoroughly! Again, super excited to see models with this level of detail, thanks so much!
How many people are working (directly) on reducing existential risk from AI?
I agree with (a). I disagree that (b) is true! And as a result I disagree that existing CEAs give you an accurate signpost.
Why is (b) untrue? Well, we do have some information about the future, so it seems extremely unlikely that you won’t be able to have any indication as to the sign of your actions, if you do (a) reasonably well.
Again, I don’t purely mean this from an extreme longtermist perspective (although I would certainly be interested in longtermist analyses given my personal ethics). For example, simply thinking about population changes in the above report would be one way to move in this direction. Other possibilities include thinking about the effects of GHW interventions on long-term trajectories, like growth in developing countries (and that these effects may dominate short-term effects like DALYs averted for the very best interventions). I haven’t thought much about what other things you’d want to measure to make these estimates, but I would love to see someone try, and it seems pretty crucial if you’re going to be doing accurate CEAs.
Sure, happy to chat about this!
Roughly I think that you are currently not really calculating cost-effectiveness. That is, whether you’re giving out malaria nets or preventing nuclear war, almost all of the effects of your actions will be affecting people in the future.
To clarify, by “future” I don’t necessarily mean “long run future”. Where you put that bar is a fascinating question. But focusing on current lives lost seems to approximately ignore most of the (positive or negative) value, so I expect your estimates to not be capturing much about what matters.
(You’ve probably seen this talk by Greaves, but flagging it in case you haven’t! Sam isn’t a huge fan, I think in part because Greaves reinvents a bunch of stuff that non-philosophers have already thought a bunch about, but I think it’s a good intro to the problem overall anyway.)
I’m curious about the ethical decisions you’ve made in this report. What’s your justification for evaluating current lives lost? I’d be far more interested in cause-X research that considers a variety of worldviews, e.g. a number of different ways of evaluating the medium or long-term consequences of interventions.
I agree that I’d love to see more work on this! (And I agree that the last story I talk about, of a very fast takeoff AI system with particularly advanced capabilities, seems unlikely to me—although others disagree, and think this “worst case” is also the most likely outcome.)
It’s worth noting again, though, that any particular story is unlikely to be correct. We’re trying to forecast the future, and good ways of forecasting should feel uncertain at the end, because we don’t know what the future will hold. Also, good work on this will (in my opinion) give us ideas about what many possible scenarios will look like. This sort of work (e.g. the first half of this article, rather than the second) often feels less concrete, but is, I think, more likely to be correct—and can inform actions that target many possible scenarios rather than one single unlikely event.
All that said, I’m excited to see work like OpenPhil’s nearcasting project which I find particularly clarifying and which will, I hope, improve our ability to prevent a catastrophe.
That particular story, in which I write “one day, every single person in the world suddenly dies”, is about a fast takeoff self-improvement scenario. In such scenarios, a sudden takeover is exactly what we should expect to occur, and the intermediate steps set out by Holden and others don’t apply to such scenarios. Any guessing about what sort of advanced technology would do this necessarily makes the scenario less likely, and I think such guesses (e.g. “hypnodrones”) are extremely likely to be false and aren’t useful or informative.
For what it’s worth, I personally agree that slow takeoff scenarios like those described by Holden (or indeed those I discuss in the rest of this article) are far more likely. That’s why I focus on the many different ways in which an AI could take over—rather than on any particular failure story. And, as I discuss, any particular combination of steps is necessarily less likely than the claim that any or all of these capabilities could be used.
But a significant fraction of people working on AI existential safety disagree with both of us, and think that a story which literally claims that a sufficiently advanced system will suddenly kill all humans is the most likely way for this catastrophe to play out! That’s why I also included a story which doesn’t explain these intermediate steps, even though my inside view is that this is less likely to occur.
Yeah, it’s a good question! Some thoughts:
- I’m being quite strict with my definitions. I’m only counting people working directly on AI safety. So, for example, I wouldn’t count the time I spent writing this profile on AI (or anyone else who works at 80k, for that matter). (Note: I do think lots of relevant work is done by people who don’t directly work on it.) I’m also not counting people who think of themselves as on an AI safety career path and are, at the moment, skilling up rather than working directly on the problem. There are some ambiguities, e.g. is the ops team of an AI org working on safety? In general though these ambiguities seem much smaller than the error in the data itself.
- AI safety is hugely neglected outside EA (which is a key reason why it seems so useful to work on). This isn’t a big surprise, and may be in large part a result of the fact that it used to be even more neglected, which means that anything started as an AI safety org is likely to have been started by EAs, and so is also seen as an EA org. That makes AI safety a subset of EA rather than the other way round.
- Also, I’m looking at AI existential safety rather than broader AI ethics or AI safety issues. The focus on x-risk (combined with reasons to think that lots of work on AI non-existential safety isn’t that relevant—as compared with e.g. bio, where lots of policy work is relevant to both major pandemics and existential pandemics) makes it even more likely that this is just looking at a strict subset of EAs.
- There are, I think, up to around 10,000 engaged EAs—of those, maybe 1,000-2,000 are longtermism- or x-risk-focused. So we’re looking at 10% of these people working full-time on AI x-risk! That seems like a pretty high proportion to me, given the various causes in the wider EA (not even longtermist) community.
- So in many ways the question of “how are so few people working on AI safety after 10 years” is similar to “how are there so few EAs after 10 years”, which is a pretty complicated question. But it seems to me like EA is way, way bigger and more influential than I would ever have expected in 2012!
- There are also some other bottlenecks (notably mentoring capacity). The field was nearly non-existent 10 years ago, with very few senior people to help others enter the field – and it’s (rightly) a very technical field, focused on theoretical and practical computer science / ML. Even now, the proportion of time those 300 people should spend mentoring is very much unclear to me.

I’d also like to highlight the footnote alongside this number: “There’s a lot of subjective judgement in the estimate (e.g. “does it seem like this research agenda is about AI safety in particular?”), and it could be too low if AI Watch is missing data on some organisations, or too high if the data counts people more than once or includes people who no longer work in the area. My 90% confidence interval would range from around 100 people to around 1,500 people.”
What could an AI-caused existential catastrophe actually look like?
Hi Gideon,
I wrote the 80,000 Hours problem profile on climate change. Thank you so much for this feedback! I’m genuinely really grateful to see such engagement with the things I write—and criticism is always a welcome contribution to making sure that I’m saying the right things.
Just to be clear, when I said “we think it’s potentially harmful to do work that could advance solar geoengineering”, I meant that (with a fair degree of uncertainty), it could be harmful to do work that advances the technology (which I think you agree with) not that all research around the topic seems bad! It definitely seems plausible that some research on the topic might be good—but I was trying to recommend the very best things to do to mitigate climate change. My reviewers pretty much all agreed that, partly as a result of potential harmful effects, it doesn’t seem like SRM research would be one of those very best things, and so suggested that we stop recommending working in the area. In large part I’m deferring to this consensus among the reviewers on this.
Hope that helps!
Benjamin
I think these are all great points! We should definitely worry about negative effects of work intended to do good.
That said, here are two other places where maybe we have differing intuitions:

You seem much more confident than I am that work on AI that is unrelated to AI safety is in fact negative in sign.
It seems hard to conclude that the counterfactual where any one or more of “no work on AI safety / no interpretability work / no robustness work / no forecasting work” were true is in fact a world with less x-risk from AI overall. That is, while I can see there are potential negative effects of these things, when I truly try to imagine the counterfactual, the overall impact seems likely positive to me.
Of course, intuitions like these are much less concrete than actually trying to evaluate the claims, and I agree it seems extremely important for people evaluating or doing anything in AI safety to ensure they’re doing positive work overall.
Ah thanks :) Fixed.
Preventing an AI-related catastrophe—Problem profile
Yes, there was!
Thank you so much for this feedback! I’m sorry to hear our messaging has been discouraging. I want to be very clear that I think it’s harmful to discourage people from working on such important issues, and would like to minimise the extent to which we do that.
I wrote the newsletter you’re referencing, so I particularly wanted to reply to this. I also wrote the 80,000 Hours article on climate change, explaining our view that it’s less pressing than our highest priority areas.
I don’t consider myself fundamentally a longtermist. Instead, I try my best to be impartial and cause-neutral. I try to find the ways in which I can best help others – including others in future generations, and animals, as I think they are moral patients.
Here are some specifically relevant things that I currently believe:
Existential risks are the most pressing problems we currently face (where by pressing I mean some combination of importance, which is determined in part by the expected number of individuals that could be affected, tractability, and neglectedness).
Climate change is less pressing than some other existential risks.
Cost-effectiveness is heavy-tailed. By trying to find the very best things to work on, you can substantially increase your impact.
It’s tractable to convince people to work on very important issues. It’s similarly tractable to convince people to work on existential risks as on other very important issues.
Therefore, it’s good to convince people to work on very important problems, but even better to convince people to work on existential risks.
I wrote that existential risks are the biggest problems we face, and that climate change is less pressing than other existential risks because I believe these things are both true and that communicating them is a highly cost-effective way to do good.
I don’t think everyone should work on existential risk reduction – personal fit is really important, and if so many people worked on it that they became very non-neglected, I’d think it was less useful for more people to work on them at the margin. Partly for these reasons, 80,000 Hours has generally promoted a range of areas – and has some positive evidence of people being convinced to work on poverty reduction and animal welfare as a result of 80,000 Hours.
On the newsletter audience
The 80,000 Hours newsletter is sent to a large audience who are largely unfamiliar with effective altruism. So that’s why the newsletter spoke about the importance of poverty reduction and animal welfare. For example, I wrote that “my best guess is that the negative effects of factory farming alone make the world worse than it’s ever been.” It would be brilliant if that newsletter convinced people to work on poverty reduction and animal welfare.
The newsletter also explained that, as far as we can tell, there are even bigger problems than these two.
I think it’s unlikely that the 80,000 Hours newsletter on net discouraged work on poverty reduction or animal welfare, primarily because the vast majority (>99%) of newsletter subscribers aren’t working on any of poverty reduction, animal welfare, or existential risk reduction.
If it did convince someone with equally good personal fit to work on existential risk reduction when they would otherwise have worked on poverty reduction or animal welfare, that would be worse than convincing someone who wouldn’t otherwise have done anything very useful. However, since I think existential risks are the most pressing issues, I don’t think it’d be doing net expected harm.
On whether I / 80,000 Hours value(s) work on non-existential threats
We value people working on animal welfare and poverty reduction (as well as other causes that aren’t our top priorities) a lot. We just don’t think those issues are the very most pressing problems in the world.
For example, where we list factory farming and global health on the problem profiles page you cite, we say:
It’s genuinely really difficult to send the message that something seems more pressing than other things, without implying that the other things are not important or that we wouldn’t want to see more people working on them. My colleague Arden, who wrote those two paragraphs above, also feels this way, and had this in mind when she wrote them.
On whether I / 80,000 Hours should defer more
One thing to consider is whether, given that many people disagree with 80,000 Hours on the relative importance of existential risks, we should lower our ranking.
I agree with this idea. Our ranking is post-deferral – we still think that existential risks seem more pressing than other issues, even after deferral. We have had conversations within the last year about whether, for example, factory farming should be included in our list of the top problems, and decided against making that change (for now), based on our assessment of its neglectedness and relative importance.
I also think that saying what we believe (all things considered) to be true is a good heuristic for deciding what to say. This is what the newsletter and problem profiles page try to do.
My personal current guess is that existential risk reduction is something like 100x more important than factory farming, and is also more neglected (although less tractable).
Because of our fundamental cause-neutrality, this is something that could (and hopefully will) change, for example if existential risks become less neglected, or the magnitude of these risks decreases.
Finally, on climate change
As I mentioned above, I think climate change is likely less important than other existential risks. Saying climate change is less pressing than the world’s literal biggest problem is a far cry from “unimportant” – I think that climate change is a hugely important problem. It just seems less likely to cause an existential catastrophe, and is far less neglected, than other possible risks (like nuclear-, bio-, or AI-related risks). My article on climate change defends this at length, and I’ve also responded to critiques of that article on the forum, e.g here.