How much does work in AI safety help the world?

Using a method of Owen’s, we made an interactive tool to estimate the probability that joining the AI safety research community would actually avert existential catastrophe.

Though I’m posting this and it’s written from my point of view, most of the writing and reasoning here is Owen’s—I’ll take the blame for any misunderstandings or mis-statements of his argument :)

There’s been some discussion lately about whether we can estimate how likely efforts to mitigate existential risk from AI are to succeed, and about what reasonable estimates of that probability might be. In a recent conversation, I mentioned to Owen that I didn’t have a good way to estimate the probability that joining the AI safety research community would actually avert existential catastrophe. It would be hard to be certain about this probability, but it would be nice to have a principled back-of-the-envelope method for approximating it. Owen has a rough method, based on the one he used in his article Allocating risk mitigation across time, but he had never spelled it out.

We thought that this would be best presented interactively; since the EA Forum doesn’t allow JavaScript in posts, we put the tool on the Global Priorities Project website.

You can use the tool here to make your own estimate!

(Assuming that you’ve gone to the site and made your own estimate...)

So, what does this mean? Obviously this is a crude method, and some of the variables you have to estimate are themselves tricky to get a handle on, but we think they’re more approachable than trying to estimate the whole thing directly, and we expect the answer to be correct to within a few orders of magnitude.
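To make the shape of that decomposition concrete, here is a minimal sketch of what multiplying out factor estimates of this kind might look like. The factor names and numbers below are hypothetical placeholders, not the ones used in the tool; the point is only that each factor is easier to reason about on its own than the end-to-end probability.

```python
# A hypothetical back-of-the-envelope decomposition (illustrative numbers only;
# these are not the factors or values used in the actual tool).

p_ai_catastrophe = 0.10      # chance of existential catastrophe from AI, absent further effort
p_community_averts = 0.10    # chance the safety research community averts it, given the risk
your_marginal_share = 1e-3   # your marginal contribution as a fraction of the community's impact

p_you_avert = p_ai_catastrophe * p_community_averts * your_marginal_share
print(f"Rough chance your career averts the catastrophe: {p_you_avert:.0e}")
# With these placeholder numbers, about 1e-05.
```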

How big does the number have to be to imply that a career in AI safety research is one of the best things to do? One natural answer is to multiply it by the number of lives we expect the future could contain. I think this is understandable, and worth doing as a check on whether the whole thing is dominated by focusing on the present, but it’s not the end of the story. There are fewer than 10 billion people alive today, and collectively it seems we may have a large amount of influence over the future, so an equal share of that influence works out to roughly a 1 in 10 billion chance of changing how things go. Therefore if your estimate of the likelihood that a career in AI safety averts catastrophe looks much worse than 1 in 10 billion, it seems likely that there are more promising ways to productively use your share of that influence. These might eventually help by influencing AI safety in another way, or through a totally different mechanism. The method we’ve used here could also be modified to estimate the value of joining communities working on other existential risks, or of other interventions that change the eventual size or productivity of the AI safety research community, for example through outreach, funding, or field-steering work.
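As a rough illustration of that benchmark arithmetic (all numbers here are made up; plug in your own estimate from the tool):

```python
# Illustrative check against the 1-in-10-billion benchmark (placeholder numbers only).

p_you_avert = 1e-5            # hypothetical output from the tool for your career
expected_future_lives = 1e15  # hypothetical estimate of how many lives the future could contain
world_population = 1e10       # a bit under 10 billion people alive today, rounded up

# Check 1: multiply out by future lives to get an expected value in lives.
print(f"Expected future lives saved: {p_you_avert * expected_future_lives:.0e}")

# Check 2: compare against an equal per-person share of humanity's influence over the future.
per_person_share = 1 / world_population  # 1e-10
if p_you_avert >= per_person_share:
    print("Clears the 1-in-10-billion benchmark.")
else:
    print("Falls well below the benchmark; other uses of your influence may be more promising.")
```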