One could say the CIA benefited from their expertise in geopolitics, and yet the superforecasters still beat them. Superforecasters perform well because they are good at synthesising diverse views, including expert opinion: in this tournament they had access to the same survey of AI experts as everyone else.
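(As a purely illustrative aside, and not something from the tournament write-ups: one mechanical way to synthesise diverse forecasts is to pool them, for example by taking the geometric mean of their odds. The pooling rule and the numbers below are my own made-up example.)

```python
import math

def pool_geo_mean_odds(probs):
    """Pool probability forecasts via the geometric mean of their odds,
    then convert back to a probability -- one simple way of synthesising
    several people's views into a single number."""
    odds = [p / (1.0 - p) for p in probs]
    pooled_odds = math.exp(sum(math.log(o) for o in odds) / len(odds))
    return pooled_odds / (1.0 + pooled_odds)

# Hypothetical inputs: three superforecasters plus an expert-survey median.
print(pool_geo_mean_odds([0.02, 0.05, 0.01, 0.30]))
```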
Saying the AI experts had more expertise in AI is obviously true, but it doesn’t explain the discrepancy in forecasts. Why were superforecasters unconvinced by the AI experts?
I think there are a bunch of things that make expertise more valuable in AI forecasting than in geopolitical forecasting:
We’ve all grown up with geopolitics and read about it in the news, in books and so on, so most of us non-experts already have passable models of it. That’s not true of AI (until maybe a year ago, and even now the news barely reports on technical safety).
Geopolitical events have fairly clear reference classes that can give you base rates and so on (and this is a tool available to both experts and non-experts) -- this is much harder with AI. That means the outside view is less valuable for AI forecasting (see the toy sketch after these points).
I think AI is really complex and technical, and especially hard to forecast given that we’re dealing with systems that don’t yet exist. Geopolitics is also complex, and geopolitical futures will differ from the present, but the basic elements stay the same. I think this, too, favours non-experts more in geopolitics than in AI.
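To make the reference-class point concrete, here is a toy sketch with entirely invented numbers (nothing here comes from the tournament): with a rich reference class the base rate can carry most of the weight, while with no real reference class -- as for AI systems that don’t yet exist -- the forecast has to lean on the inside view.

```python
# Toy sketch, invented numbers only -- not data from the tournament.

def base_rate(occurrences: int, trials: int) -> float:
    """Outside view: share of reference-class cases where the event occurred."""
    return occurrences / trials

def blend(outside: float, inside: float, w_outside: float) -> float:
    """Linear pool of the outside-view (base-rate) and inside-view estimates.
    A rich reference class justifies a high w_outside; a thin or absent one
    forces the weight down, so the forecast rests mostly on the inside view."""
    return w_outside * outside + (1.0 - w_outside) * inside

# Geopolitics-style question: plenty of comparable past cases.
geo = blend(outside=base_rate(occurrences=14, trials=200), inside=0.15, w_outside=0.8)

# AI-style question: no comparable past cases, so the base rate is close to
# meaningless and the inside view dominates.
ai = blend(outside=0.0, inside=0.20, w_outside=0.1)

print(f"geopolitics-style forecast: {geo:.3f}, AI-style forecast: {ai:.3f}")
```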
And quoting Peter McCluskey, a participating superforecaster:
The initial round of persuasion was likely moderately productive. The persuasion phases dragged on for nearly 3 months. We mostly reached drastically diminishing returns on discussion after a couple of weeks.
[...]
The persuasion seemed to be spread too thinly over 59 questions. In hindsight, I would have preferred to focus on core cruxes, such as when AGI would become dangerous if not aligned, and how suddenly AGI would transition from human levels to superhuman levels. That would have required ignoring the vast majority of those 59 questions during the persuasion stages. But the organizers asked us to focus on at least 15 questions that we were each assigned, and encouraged us to spread our attention to even more of the questions.
[...]
Many superforecasters suspected that recent progress in AI was the same kind of hype that led to prior disappointments with AI. I didn’t find a way to get them to look closely enough to understand why I disagreed.
My main success in that area was with someone who thought there was a big mystery about how an AI could understand causality. I pointed him to Pearl, which led him to imagine that problem might be solvable. But he likely had other similar cruxes which he didn’t get around to describing.
That left us with large disagreements about whether AI will have a big impact this century.
I’m guessing that something like half of that was due to a large disagreement about how powerful AI will be this century.
I find it easy to understand how someone who gets their information about AI from news headlines, or from laymen-oriented academic reports, would see a fairly steady pattern of AI being overhyped for 75 years, with it always looking like AI was about 30 years in the future. It’s unusual for an industry to quickly switch from decades of overstating progress, to underhyping progress. Yet that’s what I’m saying has happened.
[...]
That superforecaster trend seems to be clear evidence for AI skepticism. How much should I update on it? I don’t know. I didn’t see much evidence that either group knew much about the subject that I didn’t already know. So maybe most of the updates during the tournament were instances of the blind leading the blind.
Scott Alexander points out that the superforecasters have likely already gotten one question pretty wrong: their median prediction for the most expensive training run by 2024 was $35M (the experts’ median was $65M), whereas GPT-4 seems to have cost ~$60M, though with ample uncertainty. But bearish predictions will tend to fail earlier than bullish ones, so we’ll see how the two groups compare over the coming years, I guess.
I think you make good points in favour of the AI expert side of the equation. To balance that out, I want to offer one more point in favour of the superforecasters, in addition to my earlier points about anchoring and selection bias (we don’t actually know what the true median of AI expert opinion is or would be if questions were phrased differently).
The primary point I want to make is that AI x-risk forecasting is, at least partly, a geopolitical forecast. Extinction from rogue AI requires some form of war or struggle between humanity and the AI. You have to estimate the probability that that struggle ends with humanity losing.
An AI expert is an expert in software development, not in geopolitical threat management. Nor are they an expert in potential future weapons technology. If someone has worked on the latest bombshell LLM, I will take their predictions about specific AI developments seriously, but if they tell me an AI will be able to build omnipotent nanomachines that take over the planet in a month, I have no hesitation in telling them they’re wrong, because I have more expertise in that realm than they do.
I think the superforecasters have better geopolitical knowledge than the AI experts, and that this is reflected in these estimates.
By the way, I feel now that my first reply in this thread was needlessly snarky, and am sorry about that.