Not sure how this is a ‘just so story’ in the sense that I understand the term.
“the fact that “Extremizing” works to better calibrate general forecasts, but that extremizing of superforecaster’s predictions makes them worse.”
How is that in conflict with my point? As superforecasters spend more time talking and sharing information with one another, maybe they have already incorporated extremising into their own forecasts.
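(For concreteness, here is a minimal sketch of the standard extremizing transformation applied to an averaged forecast; the exponent and the example numbers are illustrative assumptions on my part, not GJP values.)

```python
# Minimal sketch of extremizing an averaged probability forecast.
# The exponent `a` is an illustrative tuning parameter: a > 1 pushes
# the average away from 0.5, a = 1 leaves it unchanged.
def extremize(p: float, a: float = 2.0) -> float:
    return p**a / (p**a + (1 - p)**a)

forecasts = [0.65, 0.70, 0.72]            # hypothetical individual forecasts
p_mean = sum(forecasts) / len(forecasts)  # simple average: 0.69
print(extremize(p_mean))                  # ~0.83, further from 0.5
```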
I know very well about superforecasters (I’ve read all of Tetlock’s books and interviewed him last week), but I am pretty sure an aggregation of superforecasters beats almost all of them individually, which speaks to the benefits of averaging a range of people’s views in most cases. Though in many cases you should not give much weight to those who are clearly in a worse epistemic position (non-superforecasters, whose predictions Tetlock told me were about 10-30x less useful).
Doesn’t this clearly demonstrate that the superforecasters are not using modest epistemology? At best, this shows that you can improve upon a “non-modest” epistemology by aggregating them together, but does not argue against the original post.
Hi Halffull, now I see what you’re saying, but actually the reverse is true. That superforecasters have already extremised shows their higher level of modesty. Extremising is about updating based on other people’s views: because they have independent information to add, after hearing their views you can be more confident about how far to shift away from your prior.
Imagine two epistemic peers estimating the weighting of a coin. They start with their probabilities bunched around 50% because they have been told the coin will probably be close to fair. They both see the same number of flips, and then reveal their estimates of the weighting. Both give an estimate of p=0.7. A modest person, who correctly weights the other person’s estimate as equally informative as their own, will now offer a number quite a bit higher than 0.7, taking into account that the two independent samples together pull further away from the prior than either does alone.
Once they’ve done that, there won’t be gains from further extremising. But a non-humble participant would fail to properly extremise based on the information in the other person’s view, leaving accuracy to be gained if this is done at a later stage by someone running the forecasting tournament.
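To make the arithmetic explicit, here is a minimal worked version of that coin example under assumed numbers: a shared Beta(10, 10) prior and each peer privately seeing 18 heads in 20 flips, values chosen so that each peer individually reports 0.7.

```python
# Worked coin example under assumed numbers: shared Beta(10, 10) prior
# ("probably close to fair"); each peer privately sees 18 heads in 20 flips.
PRIOR_HEADS, PRIOR_TAILS = 10, 10

def posterior_mean(heads: int, tails: int) -> float:
    # Posterior mean of a Beta prior updated on binomial coin-flip data.
    return (PRIOR_HEADS + heads) / (PRIOR_HEADS + PRIOR_TAILS + heads + tails)

print(posterior_mean(18, 2))           # each peer alone: 28/40 = 0.7
print(posterior_mean(18 + 18, 2 + 2))  # pooling both samples: 46/60 ≈ 0.767
```

The modest peer who pools both independent samples ends up above 0.7, which is exactly the extremising adjustment; done properly, it leaves nothing for the tournament organiser to recover.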
This is what I’m talking about when I say “just so stories” about the data from the GJP. One explanation is that superforecasters are going through this thought process; another would be that they discard non-superforecasters’ knowledge, and therefore end up more extreme without explicitly running the extremizing algorithm on their own forecasts.
Similarly, the existence of super-forecasters themselves argues for a non-modest epistemology, while the fact that the extremized aggregation beats the superforecasters may argue for a somewhat more modest epistemology. To my mind, saying that the data here points one way or the other is cherrypicking.
“...the existence of super-forecasters themselves argues for a non-modest epistemology...”
I don’t see how. No theory on offer argues that everyone is an epistemic peer. All theories predict some people have better judgement and will be reliably able to produce better guesses.
As a result I think superforecasters should usually pay little attention to the predictions of non-superforecasters (unless it’s a question on which expertise pays few dividends).