(This is not my area of expertise, although part of my relative disinterest in AI safety so far is that I'm convinced neither that AI safety work is doing much at all, nor that it's doing more good than harm. I'm more sympathetic to the cooperation side of AI safety, since there's a stronger argument for it from a suffering-focused perspective.)
I mentioned a few more risks in this comment, where I used AI safety work as an example of the problem of cluelessness for longtermist interventions:
For example, is the AI safety work we're doing now backfiring? It could backfire by:
creating a false sense of security,
publishing the results of the GPT models, which demonstrates AI capabilities and shows the world how much further they can already be pushed, thereby accelerating AI development, or
slowing AI development more in countries that care about safety than in those that don't, risking a much worse AGI takeover if it matters who builds it first.
Also from the same comment, and a concern for any work affecting extinction risks:
You still need to predict which of the attractors is ex ante ethically better, which again involves both arbitrary empirical weights and arbitrary ethical weights (moral uncertainty). You might find the choice to be sensitive to something arbitrary that could reasonably go either way. Is extinction actually bad, considering the possibility of s-risks?
Does some s-risk work (e.g. on AI safety or authoritarianism) reduce some extinction risks and thereby increase other s-risks, and how do we weigh those possibilities?