I would like to highlight an aspect you mention in the “other caveats”: How much should you discount for Goodharting vs doing things for the right reasons? Or, relatedly, if you work on some relevant topic (say, Embedded Agency) without knowing that AI X-risk could be a thing, how much less useful will your work be? I am very uncertain about the size of this effect—maybe it is merely a 10% decrease in impact, but I wouldn’t be too surprised if it decreased the amount of useful work by 98% either.
Personally, I view this as the main potential argument against the usefulness of academia. However, even if the effect is large, the implication is not that we should ignore academia. Rather, it would suggest that we can get huge gains by increasing the degree to which academics do the research for the right reasons.
(Standard disclaimers apply: This can be done in various ways. Victoria Krakovna’s list of specification gaming examples is a good one. Screaming about how everybody is going to die tomorrow isn’t :P.)
Vojta