bgarfinkel comments on AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher

bgarfinkel 19 Jul 2020 13:55 UTC
5 points
I think the work is mainly useful for EA organizations making cause prioritization decisions (how much attention should they devote to AI risk relative to other cause areas?) and young/early-stage people deciding between different career paths. The idea is mostly to help clarify and communicate the state of arguments, so that more fully informed and well-calibrated decisions can be made.

A couple other possible positive impacts:
- Developing and shifting to improved AI risk arguments (and publicly acknowledging uncertainties/confusions) may, at least in the long run, cause other people to take the EA community and existential-risk-oriented AI safety communities more seriously. As one particular point, I think that a lot of vocal critics (e.g. Pinker) are mostly responding to the classic arguments. If the classic arguments actually have significant issues, then it’s good to acknowledge this; if other arguments (e.g. these) are more compelling, then it’s good to work them out more clearly and communicate them more widely. As another point, I think that sharing this kind of work might reduce perceptions that the EA community is more group-think-y/unreflective than it actually is. I know that people have sometimes pointed to my EAG talk from a couple years back, for example, in response to concerns that the EA community is too uncritical in its acceptance of AI risk arguments.
- I think that it’s probably useful for the AI safety community to have a richer and more broadly shared understanding of different possible “AI risk threat models”; presumably, this would feed into research agendas and individual prioritization decisions to some extent. I think that work that analyzes newer AI risk arguments, especially, would be useful here. For example, it seems important to develop a better understanding of the role that “mesa-optimization” plays in driving existential risk.
(There’s also the possibility of negative impact, of course: focusing too much on the weaknesses of various arguments might cause people to downweight or de-prioritize risks more than they actually should.)

I haven’t thought very much about the timelines on which this kind of work is useful, but I think it’s plausible that the delayed impact on prioritization and perception is more important than the immediate impact.