Yeah, I think that the distinction between evaluation and forecasting is non-central. For example, these estimates can also be viewed as forecasts of what I would estimate if I spent 100x as much time on this, or as forecasts of what a really good system would output.
More to the point, if a project isn’t completed I could just estimate a distribution over its eventual quality, and the expected impact given each degree of quality (or do a simplified version of that).
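As a toy sketch of that kind of calculation, with entirely made-up numbers and quality buckets just to illustrate the shape of it:

```python
# Toy sketch, not a real estimate: expected impact of a not-yet-completed project,
# computed as a probability-weighted average over how good the finished project might be.

# Hypothetical distribution over eventual quality.
quality_probs = {"poor": 0.3, "decent": 0.5, "excellent": 0.2}

# Hypothetical expected impact (arbitrary units) conditional on each quality level.
impact_given_quality = {"poor": 0.0, "decent": 1.0, "excellent": 5.0}

# Law of total expectation: E[impact] = sum over q of P(quality = q) * E[impact | quality = q]
expected_impact = sum(p * impact_given_quality[q] for q, p in quality_probs.items())
print(expected_impact)  # 0.3*0.0 + 0.5*1.0 + 0.2*5.0 = 1.5
```

The simplified version would presumably just use fewer buckets, or collapse it to a single best-guess quality level.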
That said, I was thinking more about 2., though having a classification/lookup scheme would also be a way to produce explicit estimates.
For example, these estimates can also be viewed as forecasts of what I would estimate if I spent 100x as much time on this, or as forecasts of what a really good system would output.
Agreed, but that’s still different from forecasting the impact of a project that hasn’t happened yet, and the difference intuitively seems like it might be meaningful for our purposes. I.e., it’s not immediately obvious that methods and intuitions that work well for the sort of estimation/forecasting done in this post would also work well for forecasting the impact of a project that hasn’t happened yet.
One could likewise say that it’s not obvious that methods and intuitions that work well for forecasting how I’ll do in job applications would also work well for forecasting GDP growth in developing countries. So I guess my point was more fundamentally about the potential significance of the domain being different, rather than whether the thing can be seen as a type of forecasting or not.
So it sounds like you’re thinking that the sort of thing done in this post would be “a way to calibrate one’s intuitions/forecasts (with the hope being that there’ll be transfer between calibration when estimating the impact of past projects and calibration when forecasting the impact of future projects)”?
That does seem totally plausible to me; it just adds a step to the argument.
(I guess I’m also more generally interested in the question of how well forecasting accuracy and calibration transfer across domains—though at the same time I haven’t made the effort to look into it at all...)
Yes, I expect the intuitions and skills for estimation to generalize to, and help a great deal with, the forecasting step, though I agree that this might not be intuitively obvious. I understand that estimation and forecasting seem like different categories, but I don’t expect that to be a significant hurdle in practice.