This doesn’t sound like an outlandish claim to me. Still, I’m not yet convinced.
Yeah, I think the evidence I felt comfortable sharing right now is enough to get to some confidence but perhaps not high confidence, so this is fair. The INFER point is probably stronger than the two bad predictions which is why I put it first.
I was really into Covid forecasting at the time, so I was tempted to go back through my comment history and noticed that this seemed like an extremely easy call at the time… Relatedly, if we only focus on instances where it’s obvious that some group’s consensus is wrong, it’s probably somewhat easy to find such instances (even for elite groups) because of the favorable selection effect at work. A thorough analysis would look at the track record on a pre-registered selection of questions.
I agree that a more thorough analysis, one that looks at the track record on a pre-registered selection of questions, would be great. It’s pretty hard to know how that would turn out, because the vast majority of superforecaster predictions are private and not on their public dashboard. Speaking for myself, I’d be pretty excited about a Samotsvety vs. supers vs. [any other teams who were interested] tournament happening.
That being said, I’m confused about how you seem to be taking “I was really into Covid forecasting at the time, so I was tempted to go back through my comment history and noticed that this seemed like an extremely easy call at the time” as an update toward superforecasters being better? If anything this feels like an update against superforecasters? The point I was trying to make was that it was a foreseeably wrong prediction and you further confirmed it?
I’d also say that on the cherry-picking point, I wasn’t exactly checking the superforecaster public dashboard super often over the last few years (like maybe I’ve checked ~25-50 days total) and there are only like 5 predictions up at a time.
Edit: The particular Covid question is strong evidence for “sometimes superforecasters don’t seem to be trying as much as they could.” So maybe your point is something like “On questions where we try as hard as possible, I trust us more than the average superforecaster prediction.” I think that stance might be reasonable.
I think it’s fair to interpret the Covid question to some extent as superforecasters not trying, but I’m confused about how you seem to be attributing little of it to prediction error? It could be a combination of both.
I think it’s fair to interpret the Covid question to some extent as superforecasters not trying, but I’m confused about how you seem to be attributing little of it to prediction error? It could be a combination of both.
Good point. I over-updated on my feeling of “this particular question felt so easy at the time” so that I couldn’t imagine why anyone who puts serious time into it would get it badly wrong.
However, on reflection, I think it’s most plausible that different types of information were salient to different people, which could have caused superforecasters to make prediction errors even if they were trying seriously. (Specifically, the question felt easy to me because I happened to have a lot of detailed info on the UK situation, which presented one of the best available examples to use for forming a reference class.)
You’re right that I essentially gave even more evidence for the claim you were making.
Really appreciate this deep dive!