My intuition is also that the discount for academia solving core alignment problems should be (much?) higher than assumed here. At the same time, I agree that some mainstream work (especially foundations) does help current AI alignment research significantly. I would expect (and hope for) more of this to still appear, but to become increasingly sparse relative to the total amount of work in AI.
I think it would be useful to have a contribution model that can distinguish (at least) between a) improving the wider area (including e.g. fundamental models, general tools, best practices, learnability) and b) working on the problem itself. Distinguishing past contribution from expected future contribution (resp. discount factor) may also help.
Why: Having a well-developed field is a big help in solving any particular problem X adjacent to it, and it seems reasonable to assign a part of the value of “X is solved” to work done on the field. However, field development alone is unlikely to solve X for a sufficiently hard X that is not among the field’s foci, and dedicated work on X is still needed. I imagine this applies to the field of ML/AI and long-termist AI alignment.
Model sketch: General work done on the field has diminishing returns towards the work remaining on the problem. As the field grows, it branches, its surface area grows with it, and progress in directions that are not foci slows accordingly. Extensive investment in the field would solve any problem eventually, but unfocused effort is increasingly inefficient. Main uncertainties: I am not sure how to model the field’s areas of focus and the faster progress in their vicinity, or how likely it is that some direction sufficiently close to AI alignment becomes a focus of AI research.
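(To make the sketch slightly more concrete, here is a minimal toy model in Python. All functional forms and parameters are my own illustrative assumptions, not part of the argument above: general field effort yields logarithmic returns toward X, divided by a “surface area” term that grows as the field branches and by the distance of X from the field’s foci, while dedicated work on X contributes roughly linearly.)

```python
import math

def progress_on_x(field_effort, focus_effort,
                  branching_rate=0.5, distance_from_focus=2.0):
    """Toy model of total progress on a problem X adjacent to a field.

    Illustrative assumptions (mine, not from the comment above):
    - dedicated work on X contributes linearly;
    - general field work has diminishing (logarithmic) returns toward X,
      further divided by the field's 'surface area' (which grows as the
      field branches) and by how far X sits from the field's foci.
    """
    surface_area = (1.0 + field_effort) ** branching_rate
    spillover = math.log1p(field_effort) / (surface_area * distance_from_focus)
    return focus_effort + spillover

# Marginal value of additional general field effort falls off quickly,
# while dedicated effort on X keeps contributing at full rate.
for effort in [10, 100, 1_000, 10_000]:
    print(effort, round(progress_on_x(field_effort=effort, focus_effort=0.0), 4))
```

With these (arbitrary) parameters the spillover term actually shrinks as the field grows, which is one way of capturing “unfocused effort is increasingly inefficient”; the qualitative point survives milder choices of branching_rate.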
Overall, this makes me expect that past work in AI and ML has contributed significantly towards AI alignment, but that the discount should increase in the future, unless alignment (or something close to it) becomes a focus of the field. When thinking about policy implications for focusing research effort (with the goal of solving AI alignment), I would expect the returns to general academia to diminish much faster than the returns to EA safety research.