Regarding his estimate of the difference in probability we can make in favor of one state over its complement: it’s worth mentioning that this does not consider the possibility of doing more harm than good, e.g. AI safety work advancing AGI more than it aligns it. With the very low (but, in his view, extremely conservative) probabilities that he uses in his argument, the possibility of backfire effects outweighing them becomes more plausible.
Furthermore, the paper does not argue that we can effectively predict that any particular state is better than its complement: e.g. is extinction good or bad? How should we deal with moral uncertainty, especially around population ethics?
For these reasons, it may be difficult to justifiably identify robustly positive expected value longtermist interventions ahead of time, and the case for longtermism depends on being able to do so. I mean this even with subjective probabilities, since the subjective probabilities supporting longtermist interventions tend to be particularly poorly informed (largely for lack of good evidence) and so seem more prone to biases and whims, e.g. wishful thinking and the non-rational particulars of people’s brains and priors. This is just deep uncertainty and moral cluelessness.
For what it’s worth, I don’t think it would make much sense for the paper to address these issues in detail given its already considerable length, although they seem worth mentioning.
(Also, I read the paper a while ago, so maybe it did discuss these issues and I missed it.)
In line with your comment:
- I don’t recall the paper discussing the possibility that longtermist interventions could backfire with respect to their intended effects.
- The paper’s main working example is just about whether any intelligent civilization exists at all, and doesn’t get into what that civilization is doing or how valuable it is (so, for instance, it doesn’t discuss whether that civilization’s existence is better or worse than extinction).
But Tarsney does acknowledge roughly that second point in one place:
Additionally, there are other potential sources of epistemic resistance to longtermism besides Weak Attractors that this paper has not addressed. In particular, these include:
Neutral Attractors: To entertain small values of r [the rate of ENEs], we must assume that the state S targeted by a longtermist intervention, and its complement ¬S, are both at least to some extent “attractor” states: Once a system is in state S, or state ¬S, it is unlikely to leave that state any time soon. But to justify significant values of v_e and v_s, it must also be the case that the attractors we are able to target differ significantly in expected value. And it’s not clear that we can assume this. For instance, perhaps “large interstellar civilization exists in spatial region X” is an attractor state, but “large interstellar civilization exists in region X with healthy norms and institutions that generate a high level of value” is not. If civilizations tend to “wander” unpredictably between high-value and low-value states, it could be that despite their astronomical potential for value, the expected value of large interstellar civilizations is close to zero. In that case, we can have persistent effects on the far future, but not effects that matter (in expectation).
He says “low-value” rather than “negative value”, but I assume he actually meant negative value, because random wandering between high and low positive values wouldn’t produce an EV (for civilization existing rather than not existing) of close to 0.
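To spell out why (a minimal sketch with my own notation, not Tarsney’s): suppose a civilization wanders between a high-value state worth $v_h$ per period and a low-value state worth $v_l$ per period, spending a long-run fraction $q$ of its time in the high-value state. Then its expected per-period value is

$$\mathbb{E}[v] = q\,v_h + (1-q)\,v_l \ \ge\ \min(v_h, v_l),$$

which is bounded away from zero whenever both $v_h$ and $v_l$ are positive, no matter how unpredictably the civilization wanders. Getting $\mathbb{E}[v] \approx 0$ requires the low-value state to be negative (or zero), so that the bad stretches roughly cancel out the good ones.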