I’ve substantially revised my views on QURI’s research priorities over the past year, primarily driven by the rapid advancement in LLM capabilities.
Previously, our strategy centered on developing highly-structured numeric models with stable APIs, enabling:
Formal forecasting scoring mechanisms
Effective collaboration between human forecasting teams
Reusable parameterized world-models for downstream estimates
However, the progress in LLM capabilities has updated my view. I now believe we should focus on developing and encouraging superior AI reasoning and forecasting systems that can:
Generate high-quality forecasts on-demand, rather than relying on pre-computed forecasts for scoring
Produce context-specific mathematical models as needed, reducing the importance of maintaining generic mathematical frameworks (see the sketch below)
Leverage repositories of key insights, though likely not in the form of formal probabilistic mathematical models
This represents a pivot from scaling up traditional forecasting systems to exploring how we can enhance AI reasoning capabilities for forecasting tasks. The emphasis is now on dynamic, adaptive systems rather than static, pre-structured models.
(I rewrote this with Claude; I think it’s much more understandable now.)
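To make the “context-specific models as needed” bullet concrete, here’s a minimal Python sketch. Everything in it is hypothetical: `llm_complete` is a stand-in for a real LLM API call, and the canned revenue model it returns is purely illustrative, not a real estimate.

```python
import random

def llm_complete(prompt: str) -> str:
    """Stand-in for a real LLM API call. Returns a canned model here so
    the sketch runs end to end; in practice this would hit a provider."""
    return (
        "def simulate(rng):\n"
        "    users = rng.lognormvariate(13, 0.5)   # paying users (guess)\n"
        "    arpu = rng.uniform(80, 300)           # $/user/year (guess)\n"
        "    return users * arpu                   # annual revenue, $\n"
    )

def forecast(question: str, n: int = 10_000) -> list[float]:
    """Generate a question-specific model on demand, then sample it."""
    code = llm_complete(f"Write simulate(rng) for: {question}")
    namespace: dict = {}
    exec(code, namespace)  # sketch only: never exec untrusted output blindly
    rng = random.Random(0)
    return sorted(namespace["simulate"](rng) for _ in range(n))

samples = forecast("What will OpenAI's annual revenue be in 2026?")
print(f"median estimate: ${samples[len(samples) // 2]:,.0f}")
```

The point of the sketch is that nothing question-specific needs to exist before the question arrives; the model is written, run, and discarded on demand.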
A bit more on this part:
“Generate high-quality forecasts on-demand, rather than relying on pre-computed forecasts for scoring”
“Leverage repositories of key insights, though likely not in the form of formal probabilistic mathematical models”
To be clear, I think there’s a lot of batch intellectual work we can do before users ask for specific predictions. So “Generating high-quality forecasts on-demand” doesn’t mean “doing all the intellectual work on-demand.”
However, there’s a broad range of forms this batch intellectual work could take. I used to think it would produce a large set of connected mathematical models. Now I think we probably want something very compressed. If a certain mathematical model can easily be generated on-demand, there’s not much benefit to building and saving it ahead of time. That said, I’m sure there are many crucial insights that are both expensive to find and useful for many of the questions LLM users ask about.
So instead of searching for and saving math models, a system might do a bunch of intellectual work and save statements like, “When estimating the revenue of OpenAI, remember crucial considerations [A] and [B]. Also, a surprisingly good data source for this is Twitter user ai-gnosis-34.”
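A minimal sketch of what such a repository of compressed insights could look like, assuming a simple keyword lookup. The `Insight` record, the repository contents, and the matching logic are all hypothetical; a real system might use embeddings or an LLM for retrieval.

```python
from dataclasses import dataclass

@dataclass
class Insight:
    topic: str  # what questions this applies to, e.g. "OpenAI revenue"
    note: str   # the compressed finding itself

# Hypothetical repository: batch work saves short insights, not full models.
REPO = [
    Insight(
        topic="OpenAI revenue",
        note="Remember crucial considerations [A] and [B]. A surprisingly "
             "good data source for this is Twitter user ai-gnosis-34.",
    ),
]

def relevant_insights(question: str) -> list[Insight]:
    """Naive keyword match; a real system might use embeddings or an LLM."""
    return [i for i in REPO if i.topic.lower() in question.lower()]

for insight in relevant_insights("Estimate OpenAI revenue in 2026"):
    print(insight.note)
```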
A lot of the forecasts or replies served to users should basically be the “last mile” of intellectual work: all the key insights are already found; there just needs to be a bit of customization for the very specific questions someone has.
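For illustration, here’s a minimal sketch of that “last mile” step: the cached insights get injected into a prompt, and the only on-demand work is tailoring them to the user’s exact question. The function name and prompt wording are hypothetical.

```python
def last_mile_prompt(question: str, cached_insights: list[str]) -> str:
    """Compose the 'last mile' prompt: the expensive insights were found
    during batch work; the model only tailors them to the exact question."""
    bullets = "\n".join(f"- {note}" for note in cached_insights)
    return (
        f"Question: {question}\n"
        f"Cached insights from earlier batch research:\n{bullets}\n"
        "Using these, give a probability distribution and a short rationale."
    )

print(last_mile_prompt(
    "What will OpenAI's annual revenue be in 2026?",
    ["Remember crucial considerations [A] and [B].",
     "Twitter user ai-gnosis-34 is a surprisingly good data source."],
))
```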