Saying ‘AI safety research is a Pascal’s Mugging’ isn’t a strong response

People should stop falling back on the argument that AI safety research is a ‘Pascal’s Mugging’, because it doesn’t address the actual reasons people in the field think the work is worth doing today.

Most people who work on AI safety think the chances of affecting the outcome are not infinitesimal, but rather entirely macroscopic, in the same way that voting in an election has a low but real chance of changing the result, that an extra researcher has a low but real chance of bringing a cure for malaria forward, and that an extra person working on Ebola containment makes a pandemic less likely.

For example, someone involved might believe:

i) There’s a 10% chance of humanity creating a ‘superintelligence’ within the next 100 years.

ii) There’s a 30% chance that the problem can be solved if we work on it harder and earlier.

iii) A research team of five suitable people starting work on safety today and continuing through their working lives would raise the odds of solving the problem by 1% of that (0.3 percentage points). (This passes a sanity check, as they would represent a 20% increase in the effort being made today.)

iv) Collectively they therefore have a 0.03% chance of making an AI significantly more aligned with human values in the next 100 years, so each individual involved has a 0.006 percentage point share. (The arithmetic is spelled out in the sketch below.)
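
To make the arithmetic explicit, here’s a minimal Python sketch that simply multiplies the illustrative numbers above together. Every figure in it is a hypothetical belief from the example, not an estimate of mine:

```python
# Sanity-checking the arithmetic in i)-iv) using the illustrative
# numbers from the example above (assumptions, not real estimates).

p_superintelligence = 0.10  # i) superintelligence within 100 years
p_solvable = 0.30           # ii) problem solvable with harder, earlier work
team_effect = 0.01          # iii) team raises the odds of solving it by 1% of ii)
team_size = 5

# iii) the team's boost to the odds of solving the problem,
# in absolute terms: 1% of 30% = 0.3 percentage points
team_boost = p_solvable * team_effect

# iii)'s sanity check: if 5 people are a 20% increase in effort,
# roughly 25 people must be working on this today
implied_current_workforce = team_size / 0.20

# iv) chance the team's work ends up mattering
p_team = p_superintelligence * team_boost

# ...and an equal share of that per team member
p_per_person = p_team / team_size

print(f"Team boost to odds of a solution: {team_boost:.3%}")          # 0.300%
print(f"Implied current workforce: {implied_current_workforce:.0f}")  # 25
print(f"Team's chance of mattering: {p_team:.3%}")                    # 0.030%
print(f"Per-person share: {p_per_person:.3%}")                        # 0.006%
```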

Note that the case presented here has nothing to do with some enormous and arbitrary payoff being available if you succeed, which is central to the weirdness of the Pascal’s Mugging scenario.

Do you think the numbers in this calculation are way over-optimistic? OK—that’s completely reasonable!

Do you think we can’t predict whether the sign of the work we do now will be positive or negative? Is it better to wait and work on the problem later? There are strong arguments for those views as well!

But those are the arguments that should be made and substantiated with evidence and analysis, not quick dismissals that people are falling for a ‘Pascal’s Mugging’, which they mostly are not.

Given this person’s beliefs, this is no more a Pascal’s Mugging than working on basic science research, campaigning for an outsider political candidate, or trying to reform a political institution. These all have unknown but probably very low chances of producing a breakthrough, but could nevertheless be completely reasonable things to try.

Here’s a similar thing I wrote years ago: If elections aren’t a Pascal’s Mugging, existential risk work shouldn’t be either.

Postscript

As far as I can see, all of these are open possibilities:

1) Solving the AI safety problem will turn out to be unnecessary, and our fears today are founded on misunderstandings about the problem.

2) Solving the AI safety problem will turn out to be relatively straightforward on the timeline available.

3) It will be a close call whether we manage to solve it in time—it will depend on how hard we work and when we start.

4) Solving the AI safety problem is almost impossible and we would have to be extremely lucky to do so before creating a superintelligent machine. We are therefore probably screwed.

We collectively haven’t put enough focussed work into the problem yet to have a good idea where we stand. But that’s hardly a compelling reason to assume 1), 2) or 4) and not work on it now.