Ah man! I have THOUGHTS about this.
So. First up, I have indeed made a few concrete forecasts, and it is worth noting that they absolutely stink. I don’t know what my Brier score is but it will definitely suck. That does not exactly make me want to do them more, because it’s a bit embarrassing, although there is a nice virtuous feeling when you hold your hand up and say “I got another one wrong”. And the EA/rationalist community is really good at socially rewarding that behaviour, and being on the fringes of the community I do get some good social feedback for doing it, so it’s not too cringey.
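For anyone unfamiliar with the scoring rule mentioned above: a Brier score is just the mean squared gap between the probabilities you stated and what actually happened (1 if the event occurred, 0 if it didn’t), so lower is better. A minimal sketch, with forecasts invented purely for illustration rather than anything from this thread:

```python
# Minimal sketch of a Brier score: mean squared difference between stated
# probabilities and outcomes (1 if the thing happened, 0 if it didn't).
# Lower is better; always guessing 50% scores 0.25. Example forecasts are made up.

def brier_score(forecasts):
    """forecasts: list of (stated_probability, outcome) pairs."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

example = [(0.6, 1), (0.8, 0), (0.9, 1)]
print(brier_score(example))  # 0.27
```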
But it’s related to another problem, which is that when I make forecasts, it’s not the day job. I write some article, say, talking about the incentive problems in vaccine manufacturing. Do I stick a forecast on the end of that? Well, I could – “I think it is 60% likely that, I dunno, Covax will use advanced market commitment structures to purchase vaccines by the end of 2021”. But it’s kind of artificial. I’m just whacking it on the end of the piece.
And it also means they will, usually, suck. I know a few superforecasters, and I gather that one of the best predictors of the accuracy of a forecast is how long you spend making it. If I’ve just spent a day writing a piece, interviewing scientists or whatever, and my deadline is 5pm, then I won’t be able to spend much time doing a good forecast. It’s not what I’m being paid for and it’s not what the readers want from me.
I do think it’s valuable, and it means that I have to think carefully about what I actually mean when I say “it’s likely that schools will reopen in May” or whatever. So I try to do it. And sometimes pieces are more about forecasting, and they lend themselves more naturally to concrete predictions (although the problem of me doing it quickly, having spent most of my time chasing interviewees and writing the piece, is still there). I’ll definitely try to keep doing it. But I think the value isn’t always as huge as EAs/forecasters think, or in fact as huge as I used to think before I tried doing them more, so I understand journalists not being super interested. I hope more start doing it, but I doubt it will ever be a standard procedure in every opinion piece.
(That said, maybe I just suck and that’s what a person who sucks would say.)
If you haven’t spent time on calibration training, I recommend it! Open Phil has a tool here: https://www.openphilanthropy.org/blog/new-web-app-calibration-training. Making good forecasts is a mix of ‘understand the topic you’re making a prediction about’ and ‘understand yourself well enough to interpret your own feelings of confidence’. I think most people can become pretty well-calibrated with an hour or two of practice, even if they mostly don’t have expertise in the topics they’re writing about.
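To unpack what ‘well-calibrated’ means here: looking back over someone’s predictions, the ones they tagged ‘70%’ should have come true roughly 70% of the time, and likewise for each confidence level. A rough sketch of that check (my own illustration, not the Open Phil tool; the forecast history below is invented):

```python
# Rough calibration check: group past predictions by stated confidence
# and compare the stated probability to the actual hit rate in each group.
from collections import defaultdict

def calibration_table(forecasts, bucket_width=0.1):
    """forecasts: list of (stated_probability, outcome) pairs, outcome 1 if it happened, else 0."""
    buckets = defaultdict(list)
    for p, outcome in forecasts:
        # Round each stated probability to the nearest bucket.
        buckets[round(p / bucket_width) * bucket_width].append(outcome)
    for stated, outcomes in sorted(buckets.items()):
        hit_rate = sum(outcomes) / len(outcomes)
        print(f"said ~{stated:.0%}: came true {hit_rate:.0%} of the time ({len(outcomes)} forecasts)")

# Invented history: a well-calibrated forecaster's 70% claims come true about 70% of the time.
history = [(0.7, 1), (0.7, 1), (0.7, 0), (0.9, 1), (0.9, 1), (0.5, 0), (0.5, 1)]
calibration_table(history)
```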
And that’s a valuable service in its own right, I think. It would be a major gift to the public even if the only take-away readers got from predictions at the end of articles were ‘wow, even though these articles sound confident, the claims almost always tend to be 50% or 60% probable according to the reporter; guess I should keep in mind these topics are complex and these articles are being banged out in a few hours rather than being the product of months of study, so of course things are going to end up being pretty uncertain’.
If you also know enough about a topic to make a calibrated 80% or 90% (or 99%!) prediction about it, that’s great. But one of the nice things about probabilities is just that they clarify what you’re saying—they can function like an epistemic status disclaimer that notes how uncertain you really are, even if it was hard to make your prose flow without sounding kinda confident in the midst of the article. Making probabilistic predictions doesn’t have to be framed as ‘here’s me using my amazing knowledge of the world to predict the future’; it can just be framed as an attempt to disambiguate what you were saying in the article.
Relatedly, in my experience ‘writing an article or blog post’ can have bad effects on my ability to reason about stuff. I want to say things that are relevant and congruent and that flow together nicely; but my actual thought process includes a bunch of zig-zagging and updating and sorting-through-thoughts-that-don’t-initially-make-perfect-crisp-sense. So focusing on the writing makes me focus less on my thought process, and it becomes tempting for me to confuse the writing process or written artifact with my thought process or beliefs.
You’ve spent a lot of time living and breathing EA/rationalist stuff, so I don’t know that I have any advice that will be useful to you. But if I were giving advice to a random reporter, I’d warn about the above phenomenon and say that this can lead to overconfidence when someone’s just getting started adding probabilistic forecasts to their blogging.
I think this calibration-and-reflection bug is important—it’s a bug in your ability to recognize what you believe, not just in your ability to communicate it—and I think it’s fixable with some practice, without having to do the superforecaster ‘sink lots of hours into getting expertise about every topic you predict’ thing.
(And I don’t know, maybe the journey to fixing this could be an interesting one that generates an article of its own? Maybe a thing that could be linked to at the bottom of posts to give context for readers who are confused about why the numbers are there and why they’re so low-confidence?)
All this makes a lot of sense, by the way, and I will take it on board.
I agree with these comments, and think the first one—“If you haven’t spent time on calibration training...”—makes especially useful points.
Readers of this thread may also be interested in a previous post of mine on Potential downsides of using explicit probabilities. (Though be warned that the post is less concise and well-structured than I’d aim for nowadays.) I ultimately conclude that post by saying:
There are some real downsides that can occur in practice when actual humans use explicit probabilities (or explicit probabilistic models, or maximising expected utility)
But some downsides that have been suggested (particularly causing overconfidence and understating the value of information) might actually be more pronounced for approaches other than using explicit probabilities
Some downsides (particularly relating to the optimizer’s curse, anchoring, and reputational issues) may be more pronounced when the probabilities one has (or could have) are less trustworthy
Other downsides (particularly excluding one’s intuitive knowledge) may be more pronounced when the probabilities one has (or could have) are more trustworthy
Only one downside (reputational issues) seems to provide any argument for even acting as if there’s a binary risk-uncertainty distinction
And even in that case the argument is quite unclear, and wouldn’t suggest we should use the idea of such a distinction in our own thinking
(That quote and post are obviously somewhat tangential to this thread, but also somewhat relevant. I lightly edited the quote to make it make more sense out of context.)
I will look at that OpenPhil thing! I did do a calibration exercise with GJP (and was, to my surprise, both quite good and underconfident!) but I’d love to have another go.
I doubt it will ever be a standard procedure in every opinion piece.
Meaning you think there is a 95% chance that within five years, it won’t be the case that The New York Times, The Atlantic, and The Washington Post will include a quantitative, testable forecast in at least one fifth of their collective articles?
...Just kidding. Thanks for the well-written and illuminating answer.
hahaha!
(Just want to mention that a recent Scott Alexander post contains an interesting discussion of the topic of forecasting by journalists/pundits, and so may be of interest to readers of this thread.)