Exciting stuff Ozzie! We really need people to specify models in their forecasts, and the fact that you can score those models directly, not numbers derived from them, is a great step forwards.
In your “Challenges for Forecasting Platforms”, you write “Writing functions is more complicated than submitting point probability or single distribution forecasts”. I’d go further and say that formulating forecasts as functions is pretty hard even for savvy programmers. Most forecasters vibe, and inside views (idiosyncratic) dominate base rates (things that look like functions).
Take your example “For any future year T (2025 to 2100) and country, predict the life expectancy...” Let’s say my view is a standard time series regression on the last 20 years, plus a discontinuity around 2030 when cancer is cured, at which point we’ll get a 3-year jump across the first world, and then a 1.5-year jump in the third world 2 years later.
Yes, I could express that as a fairly involved function, but isn’t the sentence I wrote above a better description of my view? How do “inside view” forecasts get translated to functions?
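For concreteness, here is a minimal Python sketch of what that sentence might look like as a function. It is only one possible reading of the view: the region labels, the `JUMPS` table, and the `life_expectancy` helper are hypothetical names invented for illustration, and "around 2030" is pinned to exactly 2030, which is itself an assumption.

```python
# Hypothetical encoding of the stated view: (jump year, jump size in years).
JUMPS = {
    "first_world": (2030, 3.0),
    "third_world": (2032, 1.5),  # assumed: exactly 2 years after the first jump
}

def life_expectancy(year, region, history):
    """Point forecast: least-squares linear trend on the last 20 observations,
    plus a one-off jump once the region's jump year has been reached.

    history: list of (year, life_expectancy) pairs for one country.
    region:  "first_world" or "third_world".
    """
    recent = sorted(history)[-20:]
    n = len(recent)
    mean_x = sum(x for x, _ in recent) / n
    mean_y = sum(y for _, y in recent) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in recent)
        / sum((x - mean_x) ** 2 for x, _ in recent)
    )
    intercept = mean_y - slope * mean_x
    jump_year, jump_size = JUMPS[region]
    jump = jump_size if year >= jump_year else 0.0
    return intercept + slope * year + jump
```

Written this way it is a dozen lines, but it also shows what the sentence leaves out: there is no uncertainty anywhere, and "around 2030" had to become a hard date.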
> Yes, I could express that as a fairly involved function, but isn’t the sentence I wrote above a better description of my view?
That sentence isn’t easily scorable, because it’s not very precise. It doesn’t describe its uncertainty or specify exactly what the jumps would be. It’s also hard to feed into an aggregator or similar, or to modify directly in other ways.
But say that these attributes were added. Then, we’d just want some way to formally specify this. We don’t have many options here, as few programming languages / formal specifications are made for this sort of thing. We’ve tried to make Squiggle as a decent fit between “simple to express views with uncertainty” and “runs in code”, but it’s not perfect, and people will want different things.
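As a rough illustration of "these attributes were added", here is a Python sketch (not Squiggle) that puts distributions on the jump's timing and size, then scores the resulting samples against an observed value. The specific distributions and their parameters are made up for illustration, and the sample-based CRPS used here is just one common scoring rule for distributional forecasts, not necessarily the one a platform would use.

```python
import random

def sample_jump(year, rng):
    """One sample of the life-expectancy jump (in years) in effect at `year`.

    Assumed for illustration only: the cure year is Normal(2030, 3) and the
    jump size is Normal(3.0, 1.0), floored at zero.
    """
    cure_year = rng.gauss(2030, 3)
    size = max(0.0, rng.gauss(3.0, 1.0))
    return size if year >= cure_year else 0.0

def crps_from_samples(samples, observed):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|. Lower is better."""
    n = len(samples)
    spread_to_obs = sum(abs(x - observed) for x in samples) / n
    spread_within = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return spread_to_obs - 0.5 * spread_within

rng = random.Random(0)
samples_2035 = [sample_jump(2035, rng) for _ in range(500)]
score = crps_from_samples(samples_2035, observed=2.0)  # 2.0 is a made-up outcome
```

Once the view is in this form, it can be scored, aggregated with other forecasts, or tweaked parameter by parameter, which is what the plain sentence doesn't allow.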
The easiest approach right now would be for someone to be in charge of converting this sentence into a formal specification or algorithm. This could be done with an LLM or similar.
This setup feels very similar to other prediction platforms. You could argue some people would feel, “Do I really need to say I’m 85% sure, instead of, ‘I’m really sure’?”.
Thanks! Some very quick points:
1. I think that discontinuities like that are rare, and I’d question this one. Basically, I think that you can get ~90% of the benefit here with just a linear or exponential model, with the right uncertainty.
2. When writing a function that expresses 100 things (effectively), but in 3x the time, you wouldn’t be expected to forecast those things as well as if you spent 100x the time. In other words, I’d expect that algo forecasters would begin with a lot of shortcuts and approximations. Their worse forecasts can still be calibrated, just not as high-resolution as we’d expect from point forecasts with a similar amount of effort. I think a lot of people get caught up here, by thinking, “I have this specific model in my head, and unless I can model every part of it, the entire thing is useless”, but this really isn’t the case!
3. I fed a modified version of your question straight to our GPT-Squiggle tool, and it came up with this, with (basically) no modification needed. Not perfect, but not terrible!
https://chatgpt.com/share/e0a65eda-77f3-45bb-bd55-d7dfec707c6f
Squiggle Link
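To make points 1 and 2 a bit more concrete, here is a minimal Python sketch (separate from the Squiggle model linked above) of a "linear model with the right uncertainty". Every number in it is a placeholder rather than an estimate, and `linear_forecast` and `interval` are hypothetical helper names. The idea is just that a crude trend can stay calibrated if its uncertainty widens with the forecast horizon, at the cost of resolution.

```python
from statistics import NormalDist

def linear_forecast(year, base_year=2024, base_value=80.0,
                    slope=0.2, sd_per_sqrt_year=0.5):
    """Normal forecast whose mean follows a linear trend and whose standard
    deviation grows with the square root of the horizon (all placeholders)."""
    horizon = max(year - base_year, 0)
    mean = base_value + slope * horizon
    sd = max(sd_per_sqrt_year * horizon ** 0.5, 0.1)  # small floor keeps sd > 0
    return NormalDist(mean, sd)

def interval(dist, level=0.9):
    """Central interval of the given level for a NormalDist."""
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

near_low, near_high = interval(linear_forecast(2030))
far_low, far_high = interval(linear_forecast(2100))
```

The 2100 interval comes out much wider than the 2030 one. That is the low-resolution-but-still-calibrated trade-off: a possible discontinuity is absorbed into the spread instead of being modelled explicitly.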