Stripped of all AI-centred argumentation, the reply is left mostly empty. This suggests that judgmental forecasting, at least as exercised by FRI, should perhaps be thought of as a sub-domain of AI safety. In such a case, its impact would need to be evaluated in the portfolio context of all AI safety budgets, meaning a much higher hurdle rate would have to be cleared to justify its activities.
What applies more broadly to judgmental forecasting and online betting platforms (and is also the basis for many arguments in this defence of forecasting) is circular reasoning about the field’s importance, frequently repeated by those within the field and those adjacent to it. But, in contrast to the opinionated voices, the evidence is lacking. Merely stating that forecasting has informed some policy or that career decisions have been influenced is not sufficient. Similarly, whether its impact is positive or negative is taken at face value and never substantiated.
All this isn’t to say that judgmental forecasting research or its funding should be dispensed with. In fact, hybrids that combine quantitative predictive models with expert judgment are among the foundational tools of large organisations’ decision-making processes. However, I believe the field’s association with online betting (high time we called things what they are) as well as over-reliance on AI for its services is actually hurting it.
Whose job is it to identify EA questions which could benefit from better forecasts?
Consider two different hypotheses:
Forecasting is only helpful for AI
Forecasting is helpful outside of AI, but AI has captured much more forecasting interest than other cause areas
How much time are non-AI org leaders spending trying to think up decision-relevant forecasts related to their cause areas?
If leaders are not spending any time trying to think up such forecasts, maybe there is low-hanging fruit here. Maybe EA has latent forecasting capability which can be tapped to improve organizational decision-making. Or maybe such forecasting capability will free up in a few years if AI turns out to be a nothingburger.
If leaders have spent a lot of time trying to think up useful forecasts, and failed, maybe forecasting really is fairly useless outside of AI.
If I were leading a non-AI EA organization, and I had a forecast I really wanted to see the result of, who would I even talk to? Which forecasting organizations are actively soliciting ideas for EA-related forecast questions?
It seems to me that a lot of what EA does is implicit forecasting in some sense, e.g. if you give someone a grant, it’s an implicit forecast about the probability that they will be able to accomplish something with that grant. EA is often critiqued for neglecting “systemic change”. If you want to do systemic change, being able to forecast the effects of various systemic changes is really useful. If you take any action, there’s an implicit forecast that it will lead to a good outcome and not backfire somehow. Wouldn’t it be better to make this forecast explicit? All else equal, wouldn’t it be good to get some perspective from people outside of the organization, who are perhaps forecasting in their free time as a replacement for watching TV or other downtime activities?
My understanding of the original post’s intent is that it calls for evidence of the field’s impact, given the funding it receives. I don’t believe it critiques judgmental forecasting as an analytical method and neither do I think that I signal this in my comment.
I stand by my opinion, however, that the community is correct to ask for tangible proof, the burden of which rests on the organizations that receive the funding.
I regret if this doesn’t satisfy the questions in your comment.
“Stripped of all AI-centred argumentation, the reply is left mostly empty.”
The bulk of our funding has gone toward AI-focused forecasting projects (e.g. LEAP, AI-biorisk, economic effects of AI) or ‘automating forecasting research’-type work that has the ultimate goal of assisting decisionmakers (e.g. ForecastBench), so I think this is most of what FRI should be evaluated on.
“...meaning a much higher hurdle rate would have to be cleared to justify its activities.”
I’m not sure what comparison class people had in mind previously, but I agree it seems broadly correct to consider this work alongside other AI-related funding opportunities. As noted above, I’d argue that it is appropriate and valuable to have “AI measurement” as an important funding domain alongside areas like “AI governance,” “Technical AI safety research,” “AI field-building,” etc. It seems valuable for one part of the AI grantmaking portfolio to be generating evidence that can be used to sharpen views on AI timelines, to assess risk in various domains (bio, cyber, catastrophic risk), to assess magnitudes of benefits (for calibrating cost-benefit analyses on policies), and to predict the likelihood and impact of various policies (e.g. the effectiveness of DNA synthesis screening for biorisk), etc. This type of fundamental research can inform and support more effective action in the other domains.
I also think forecasting research can have direct impacts on AI governance via direct decision-making partnerships like I described above: i.e., directly partnering with and advising important government agencies and frontier AI companies, among others, on high-stakes decisions related to AI regulation, implementing effective safeguards to reduce AI-cyber risk, and more. We have already seen some early impacts along these lines, as previously mentioned.
“Merely stating that forecasting has informed some policy or that career decisions have been influenced is not sufficient. Similarly, whether its impact is positive or negative is taken at face value and never substantiated.”
I agree. Due to confidentiality, we have primarily shared details of our impact case studies with our funders and had them assess the value of the impact we are making; establishing evidence of impact publicly is more challenging for the same reason. But elsewhere in the thread people have mentioned citations as one reasonable metric for evidence of impact for research organizations that have more diffuse impacts. We have targets for growing our prominent citations over time to assess our impact, and I’ve shared examples of prominent citations to FRI research in my comment above. I also hope that over time, we can share more case studies publicly and provide more of the reasoning for why we believe we had an impact and whether it was positive. The benchmarks RFP case study described above is one example that can be discussed relatively publicly.
“All this isn’t to say that judgmental forecasting research or its funding should be dispensed with. In fact, hybrids that combine quantitative predictive models with expert judgment are among the foundational tools of large organisations’ decision-making processes. However, I believe the field’s association with online betting (high time we called things what they are) as well as over-reliance on AI for its services is actually hurting it.”
I broadly agree on these points. We are running longitudinal expert panels, partnering with important institutions to improve their decision-making, and automating forecasting research, so I see our work as distinct from online betting/forecasting platforms.