Is AI forecasting a waste of effort on the margin?
[Epistemic status: I am optimising for providing arguments worth considering, which means I’m not trying to make certain every argument is strictly valid. I’m not trying to be safe to defer to, so I’m not minimising false-positives. I am just trying to expand the range of available tools to explore this question with, so I’m minimising false-negatives.]
Prize questions
The Future Fund’s AI Worldview Prize will award up to $1.5M for work that substantially changes their probabilities on the following three propositions.
A. “P(misalignment x-risk|AGI)”: Conditional on AGI being developed by 2070, humanity will go extinct or drastically curtail its future potential due to loss of control of AGI.
B. AGI will be developed by January 1, 2043.
C. AGI will be developed by January 1, 2100.
See their post if you want the details. Here, I just want to offer some very broad and hasty arguments for why I’m hesitant about the value of AI forecasting in general, and this prize in particular.
Arguments
It’s worth stating explicitly that an actual working technical solution to the alignment problem, with a low alignment tax, would substantially update the panel’s beliefs about proposition A. So this isn’t necessarily a contest about forecasting arguments, even if it’s (misguidedly, imo) presented as one.
I think this matters because I don’t see forecasting as being a very targeted use of time for making the world better. In the worst case, this contest can make people less effective because it incentivises them to work on forecasting when they otherwise would have worked on generating solutions. If the latter is substantially more important on the margin, this contest is probably bad.
This seems especially true in a research community with high rates of intrinsic motivation.
I think a research community functions best when the people who have a well-formed opinion about the most effective thing for them to do end up actually doing that thing. This is especially true in a pre-paradigmatic field like alignment, where there’s no authority you can defer to that will safely ensure you end up working on something worthwhile.
As long as external incentives are spread approximately evenly across tasks, researchers’ intrinsic motivation to do what they think is most effective is more likely to win out over their competing motivations.
So I’d be reluctant to introduce imbalanced incentives, because they might displace motivation to act on individual prioritisation.
In general, it seems bad to introduce disproportionately strong incentives for a narrow subset of tasks in an area where task prioritisation is highly uncertain.
Amdahl’s law: “the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used.”
It’s like running Notepad, and only Notepad, on an RTX 3080 Ti GPU.
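To spell out the law (the formula is the standard statement of Amdahl’s law; the 10% figure below is purely an illustrative assumption, not an estimate of forecasting’s actual share): if $p$ is the fraction of the overall effort that the optimised part accounts for, and $s$ is the factor by which that part is improved, the overall improvement is

\[
S = \frac{1}{(1 - p) + \frac{p}{s}}.
\]

So if better forecasts only feed into, say, 10% of the work that actually reduces risk ($p = 0.1$), then even a tenfold improvement in forecasting ($s = 10$) improves the overall picture by a factor of $\frac{1}{0.9 + 0.01} \approx 1.10$, i.e. roughly 10%.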
“Forecasting” and “problem-solving” don’t have strict definitions, but I think of forecasting as spending optimisation power on predicting what will happen, and problem-solving as trying to change what will happen (by generating new ideas and solutions). Forecasting is differentiating between what’s already known; problem-solving is generating something that doesn’t exist yet.
Forecasters search broadly and exploit what’s already written, because that’s the more reliable way to form informed opinions. They analyse more arguments than they generate, because analysing is more cost-effective than generating in terms of value of information on the forecasting questions.
Problem-solvers explore unknown territory with uncertain (hits-based) payoff. If they’re part of a research community that can do parallel search, this is the optimal strategy for generating novel solutions that everyone can benefit from, but it’s not optimal for making the best forecasts.
Forecasting can be important for directing object-level work and choosing between different alignment strategies, and is therefore essential to the project. But it is a step removed from actually trying to develop a solution.
At the theoretical limit, you can be perfectly calibrated on predicting everything about what will happen related to AGI, even if you do no work that actually increases the chances that it ends up aligned.
It’s questionable to what extent prioritisation between alignment strategies is sensitive to differences in timeline forecasts, as opposed to mostly just being sensitive to their technical plausibility in the first place. If the latter, effort to prioritise between strategies is better spent researching their plausibility than figuring out where they fit on the timeline. But I expect different people will have wildly different takes on this.
Essentially, beware wasted motion.
For an individual, creative brainpower spent on forecasting doesn’t translate into problem-solving competence as readily as brainpower spent directly on problem-solving does.
Written work on forecasting does not inform work on problem-solving as readily as problem-solving work informs forecasting work.
What is more robustly usefwl across a range of the most plausible scenarios: work on forecasting or problem-solving? I think others are in a better position to answer this question than I am, but I’d nudge you to consider Amdahl’s law again.
Optimisation spent on answering worthwhile forecasting questions plausibly hits diminishing marginal returns much faster than optimisation spent on trying to generate solutions. I have no theoretical model to support this (though I feel like one might exist); I just expect it to be the case in practice.
I expect promising research avenues aimed at producing solutions to last longer than promising research avenues aimed at answering forecasting questions.
That is, if on the y-axis you plot the marginal value of information of spending one more hour researching a question, and on the x-axis you plot time already spent researching it, then I expect the curve for AI forecasting questions to be front-loaded, at least compared to AI problem-solving questions. (A toy formalisation of what I mean is sketched below.)
This front-loadedness relates to it being easier, in practice, to verify lines of thinking (e.g. while searching through existing literature as a foxy forecaster) than to generate them in the first place.
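As a toy formalisation (the exponential form is purely an assumption for illustration, not a claim about the actual shape of these curves): suppose the marginal value of information after $t$ hours of research on a question decays like

\[
v(t) = v_0 \, e^{-\lambda t}.
\]

Then “front-loaded” just means I expect the decay rate $\lambda$ to be larger for AI forecasting questions than for AI problem-solving questions: the forecasting curve exhausts most of its area early, while the problem-solving curve keeps paying out further along the x-axis, which is the same claim as saying promising solution-oriented research avenues last longer.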
Thanks to Mihnea Maftei for some helpfwl discussion on this.