This is fair, and I don’t want to argue that optics don’t matter at all or that we shouldn’t try to think about them.
My argument is more that properly accounting for optics in your EV calculations is really hard, and that most naive attempts to do so can easily do more harm than good. I think people can easily underestimate the costs of caring less about truth or effectiveness or integrity, and overestimate the benefits of being legibly popular or safe from criticism. Generally, people have a strong desire to be popular and to fit in, and I think this can significantly bias thinking around optics! I particularly think this is the case with naive expected value calculations of the form “if there’s even a 0.1% chance of bad outcome X, we should not do this, because X would be super bad”, because it’s easy to anchor on some particularly salient example of X and miss a bunch of other tail-risk considerations.
The “annoying people by showing that we care more about style than substance” point was an example of a countervailing consideration: one that argues in the opposite direction and could also be super bad.
This argument is motivated by the same reasoning as the “don’t kill people to steal their organs, even if it seems like a really good idea at the time, and you’re confident no one will ever find out” argument.
Thanks, Neel. This is a very helpful comment. I now don’t think our views are too far apart.
Thanks! Glad to hear it. This classic Yudkowsky post is a significant motivator. Key quote:
But if you are running on corrupted hardware, then the reflective observation that it seems like a righteous and altruistic act to seize power for yourself—this seeming may not be much evidence for the proposition that seizing power is in fact the action that will most benefit the tribe.
By the power of naive realism, the corrupted hardware that you run on, and the corrupted seemings that it computes, will seem like the fabric of the very world itself—simply the way-things-are.
And so we have the bizarre-seeming rule: “For the good of the tribe, do not cheat to seize power even when it would provide a net benefit to the tribe.”