The general background worldview that motivates this credence is that predicting the future is very hard, and we have almost no evidence that we can do it well. (Caveat: I don’t think we have great evidence that we can’t do it, either.) When it comes to short-term forecasting, the best strategy is to use reference-class forecasting (‘outside view’ reasoning, often continuing whatever trend has occurred in the past) and to make relatively small adjustments based on inside-view reasoning. In the absence of anything better, I think we should do the same for long-term forecasts too. (Zach Groff is working on a paper making this case in more depth.)
So when I look to predict the next hundred years, say, I think about how the past 100 years have gone (as well as giving consideration to how the last 1,000 and 10,000 years, etc., have gone). When you ask me how AI will go, as a best guess I continue the centuries-long trend of automation of both physical and intellectual labour. In the particular context of AI, I continue the trend whereby, within a task or task-category, the jump from significantly sub-human to vastly-greater-than-human performance is rapid (on the order of years), but progress from one category of task to another (e.g. from chess to Go) is rather slow, as tasks seem to differ from one another by orders of magnitude in how difficult they are to automate. So I expect progress in AI to be gradual.
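The “outside view plus a small inside-view adjustment” strategy described above can be sketched in code. This is only a toy illustration — the data series, growth rate, and adjustment factor below are all invented for the example, not real forecasts:

```python
# Toy sketch of outside-view trend extrapolation with a modest
# inside-view adjustment. All numbers are invented for illustration.
import math

def fit_log_linear(years, values):
    """Ordinary least squares on log(value) vs. year."""
    n = len(years)
    logs = [math.log(v) for v in values]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs)) \
            / sum((x - mean_x) ** 2 for x in years)
    intercept = mean_y - slope * mean_x
    return slope, intercept

def outside_view_forecast(years, values, target_year, inside_view_factor=1.0):
    """Continue the fitted historical trend, then nudge it by a
    (relatively small) inside-view multiplier."""
    slope, intercept = fit_log_linear(years, values)
    return math.exp(slope * target_year + intercept) * inside_view_factor

# Hypothetical historical series: an index growing ~2% per year since 1920.
years = list(range(1920, 2021, 10))
values = [100 * 1.02 ** (y - 1920) for y in years]

# Project a century ahead, with a +10% inside-view adjustment.
forecast = outside_view_forecast(years, values, 2120, inside_view_factor=1.1)
```

The point of the sketch is only that the trend does almost all the work: the inside-view factor perturbs the extrapolation rather than replacing it.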
Then I also expect future AI systems to be narrow rather than general. When I look at the history of technological progress, I almost always see the creation of specific, highly optimised, and generally very narrow tools, and very rarely the creation of general-purpose systems like general-purpose factories. And when general-purpose tools are developed, they tend to be worse than narrow tools on any given dimension: a Swiss Army knife is a crappier knife, bottle opener, saw, etc. than any of those tools individually. The development of AI systems to date doesn’t give me any reason to think that AI is different: they’ve been very narrow so far, and when they’ve attempted things that are somewhat more general, like driving a car, progress has been slow and gradual, with major difficulties in dealing with unusual situations.
Finally, I expect the development of any new technology to be safe by default. As an intuition pump: suppose there were some new design of bomb, and BAE Systems decided to build it. There were, however, arguments that the new design was unstable, and that if built badly the bomb would kill everyone in the company, including the designers, the CEO, the board, and all their families. These arguments had been made in the media, and the designers and the company were aware of them. What odds would you put on BAE Systems building the bomb wrong and blowing themselves up? I’d put them very low — certainly less than 1%, and probably less than 0.1%. That would be true even if BAE Systems were in a race with Lockheed Martin to be first to market. People in general really want to avoid dying, so there’s a huge incentive (a willingness-to-pay measured in the trillions of dollars for the USA alone) to ensure that AI doesn’t kill everyone. And when I look at other technological developments, I see society being very risk-averse and almost never taking major risks: a combination of public opinion and regulation means that things go slow and safe; again, self-driving cars are an example.
For each of these views, I’m very happy to acknowledge that maybe AI is different. And, when we’re talking about what could be the most important event ever, the possibility of some major discontinuity is really worth guarding against. But discontinuity is not my mainline prediction of what will happen.
(Later edit: I worry that the text above might have conveyed the idea that I’m just ignoring the Yudkowsky/Bostrom arguments, which isn’t accurate. Instead, another factor in my change of view was placing less weight on the Y-B arguments because of: (i) finding the arguments that we’ll get discontinuous progress in AI a lot less compelling than I used to (e.g. see here and here); (ii) trying to map the Yudkowsky/Bostrom arguments, which were made before the deep learning paradigm, onto actual progress in machine learning, and finding them hard to fit well. Going into this properly would require a lot more discussion though!)
Finally, I expect the development of any new technology to be safe by default.
The argument you give in this paragraph only makes sense if “safe” is defined as “not killing everyone” or “avoiding risks that most people care about”. But what about “safe” as in “not causing differential intellectual progress in a wrong direction, which can lead to increased x-risks in the long run”, or “protecting against, or at least not causing, value drift, so that civilization will optimize for the ‘right’ values in the long run, whatever the appropriate meaning of that is”?
If short-term extinction risk (and in general risks that most people care about) is small compared to other kinds of existential risks, it would seem to make sense for longtermists to focus their efforts more on the latter.
(ii) trying to map the Yudkowsky/Bostrom arguments, which were made before the deep learning paradigm, onto actual progress in machine learning, and finding them hard to fit well. Going into this properly would require a lot more discussion though!)
I’d be happy to read more about this point.
If we end up with powerful deep learning models that optimize a given objective extremely well, the main arguments in Superintelligence seem to go through.
(If we end up with powerful deep learning models that do NOT optimize a given objective, it seems to me plausible that x-risks from AI are more severe, rather than less.)
[EDIT: replaced “a specified objective function” with “a given objective”]
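The concern about models that “optimize a given objective extremely well” can be illustrated with a toy sketch. Both objective functions below are invented purely for illustration — no real system or training setup is being modelled:

```python
# Toy illustration of objective mis-specification: an optimizer that
# pursues a proxy objective ends up scoring badly on the intended one.
# Both objectives are invented for this example.

def proxy_objective(x):
    # What the system actually optimizes: "more x is always better".
    return x

def true_objective(x):
    # What the designers intended: improves with x at first,
    # peaks at x = 50, and goes negative past x = 100.
    return x - (x * x) / 100.0

def hill_climb(objective, x=0.0, step=1.0, iterations=1000):
    """Greedy hill climbing: take a step whenever it improves the objective."""
    for _ in range(iterations):
        if objective(x + step) > objective(x):
            x += step
    return x

# Optimizing the proxy pushes x far past the point the designers wanted.
x_star = hill_climb(proxy_objective)
```

Here the proxy optimizer drives `x` to 1000, where the intended objective is deeply negative, even though a mild optimizer would have stopped near the intended peak. The sketch says nothing about whether real ML systems behave this way; it only makes the structure of the argument concrete.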
The argument you give in this paragraph only makes sense if “safe” is defined as “not killing everyone” or “avoiding risks that most people care about”. But what about “safe” as in “not causing differential intellectual progress in a wrong direction, which can lead to increased x-risks in the long run”, or “protecting against, or at least not causing, value drift, so that civilization will optimize for the ‘right’ values in the long run, whatever the appropriate meaning of that is”?
If short-term extinction risk (and in general risks that most people care about) is small compared to other kinds of existential risks, it would seem to make sense for longtermists to focus their efforts more on the latter.
I agree re value-drift and societal trajectory worries, and do think that work on AI is plausibly a good lever to positively affect them.