And this proliferation of arguments is (weak) evidence against their quality: if the conclusions of a field remain the same but the reasons given for holding those conclusions change, that’s a warning sign for motivated cognition (especially when those beliefs are considered socially important).
I’m not sure these considerations should be too concerning in this case for a couple of reasons.
I agree that it’s concerning where “conclusions… remain the same but the reasons given for holding those conclusions change” in cases where people originally (putatively) believe p because of x, then x is shown to be a weak consideration and so they switch to citing y as a reason to believe p. But from your post it doesn’t seem like that’s necessarily what has happened, rather than a conclusion being overdetermined by multiple lines of evidence. Of course, particular people in the field may have switched between some of these reasons, having decided that some of them are not so compelling, but in the case of many of the reasons cited above, the differences between the positions seem sufficiently subtle that we should expect cases of people clarifying their own understanding by shifting to closely related positions (e.g. it seems plausible someone might reasonably switch from thinking that the main problem is knowing how to precisely describe what we value to thinking that the main problem is not knowing how to make an agent try to do that).
It also seems like a proliferation of arguments in favour of a position is not too concerning where there are plausible reasons why we should expect multiple of the considerations to apply simultaneously. For example, you might think that any kind of powerful agent typically presents a threat in multiple different ways, in which case it wouldn’t be suspicious if people cited multiple distinct considerations as to why such agents are dangerous.
I agree that it’s not too concerning, which is why I consider it weak evidence. Nevertheless, there are some changes which don’t fit the patterns you described. For example, it seems to me that newer AI safety researchers tend to consider intelligence explosions less likely, despite them being a key component of argument 1. For more details along these lines, check out the exchange between me and Wei Dai in the comments on the version of this post on the alignment forum.
Agreed. I think these reasons seem to fit fairly easily into the following schema: Each of A, B, C, and D is necessary for a good outcome. Different people focus on failures of A, failures of B, etc. depending on which necessary criterion seems to them most difficult to satisfy and most salient.