Yudkowsky has previously held short AI timeline views that turned out to be wrong
I don’t think we should update based on this, or, e.g., on the fact that we didn’t go extinct due to nanotechnology, because of anthropics / observer selection. (We should only update based on whether we think the reasons for those beliefs were bad.)
Almost all of this seems reasonable. But:
Suppose you’ve been captured by some terrorists and you’re tied up with your friend Eli. There is a device on the other side of the room that you can’t quite make out. Your friend Eli says that he can tell (he’s 99% sure) it is a bomb and that it is rigged to go off randomly. Every minute, he’s confident there’s a 50-50 chance it will explode, killing both of you. You wait a minute and it doesn’t explode. You wait 10. You wait 12 hours. Nothing. He starts eyeing the light fixture, and says he’s pretty sure there’s a bomb there too. You believe him?
No, my survival for 12 hours is evidence against Eli being correct about the bomb.
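(To put rough numbers on the size of that update -- a minimal sketch, taking the 99% prior and the 50%-per-minute explosion chance from the scenario, and assuming a non-bomb never explodes:)

```python
# Bayes' rule: posterior that the device is a bomb, given t quiet minutes.
prior_bomb = 0.99         # Eli's stated confidence that it's a bomb
p_quiet_per_min = 0.5     # if it is a bomb: chance it does NOT explode in a given minute

def posterior_bomb(minutes_survived):
    p_quiet_if_bomb = p_quiet_per_min ** minutes_survived
    p_quiet_if_not_bomb = 1.0  # assumption: a non-bomb stays quiet for sure
    return (prior_bomb * p_quiet_if_bomb) / (
        prior_bomb * p_quiet_if_bomb + (1 - prior_bomb) * p_quiet_if_not_bomb
    )

print(posterior_bomb(1))    # ~0.98   -- one quiet minute barely moves you
print(posterior_bomb(10))   # ~0.09   -- ten quiet minutes already flip the odds
print(posterior_bomb(720))  # ~2e-215 -- after 12 hours, "it's a bomb" is essentially ruled out
```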
So: oops, I think.
I’m still not totally comfortable. I think my confusion arose because I was considering the related question of whether I could use the fact that I know better than Eli to win money from bets (in expectation) -- I couldn’t, because Eli has no reason to bet on the bomb going off. More generally, Eliezer never had reason to bet (in the sense that he gets epistemic credit if he’s right) on nanotech-doom-by-2010, because in the worlds where he’s right we’re dead. It feels weird to update against Eliezer on the basis of beliefs that he wouldn’t have bet on; updating against him doesn’t seem to be incentive-compatible… but maybe that’s just the sacrifice inherent in the epistemic virtue of publicly sharing your belief in doom.
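(A minimal sketch of the “no reason to bet” point, under one assumption that isn’t in the scenario -- that a bet only pays out if both parties are alive to settle it -- and using Eli’s own credences:)

```python
# Eli's expected settled payoff from betting $1 at even odds that the bomb
# explodes in the next minute, assuming a bet only pays out if both parties
# survive to settle it.
p_bomb = 0.99            # Eli's credence that the device is a bomb
p_boom_if_bomb = 0.5     # per-minute chance it explodes, given it's a bomb
stake = 1.0

p_explodes = p_bomb * p_boom_if_bomb   # 0.495: Eli is "right", but no one is left to pay him
p_quiet = 1 - p_explodes               # 0.505: Eli is alive, and loses the stake

expected_settled_payoff = p_explodes * 0.0 - p_quiet * stake
print(expected_settled_payoff)         # -0.505: negative even by Eli's own lights
```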
I am willing to bite your bullet.
I had a comment here explaining my reasoning, but deleted it because I plan to make a post instead.
then it would be a violation of the law of the conservation of expected evidence for you to update your beliefs on observing the passage of a minute without the bomb’s exploding.

Interesting! I would think this sort of case just shows that the law of conservation of expected evidence is wrong, at least for this sort of application. I figure it might depend on how you think about evidence. If you think of the infinite void of non-existence as possibly constituting your evidence (albeit evidence you’re not in a position to appreciate, being dead and all), then that principle wouldn’t push you toward this sort of anthropic reasoning.
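(For what it’s worth, the bookkeeping does come out even if the “bomb went off” branch is counted as a possible observation -- which is exactly the move at issue. A minimal sketch with the numbers from the one-minute bomb case:)

```python
# Conservation of expected evidence, one-minute bomb case:
# the prior should equal the probability-weighted average of the posteriors,
# *if* "the bomb went off" counts as an observable outcome.
prior_bomb = 0.99
p_boom_if_bomb = 0.5

p_boom = prior_bomb * p_boom_if_bomb            # 0.495
p_quiet = 1 - p_boom                            # 0.505

posterior_if_boom = (prior_bomb * p_boom_if_bomb) / p_boom          # 1.0 (only a bomb explodes)
posterior_if_quiet = (prior_bomb * (1 - p_boom_if_bomb)) / p_quiet  # ~0.98

expected_posterior = p_boom * posterior_if_boom + p_quiet * posterior_if_quiet
print(expected_posterior, prior_bomb)           # both ~0.99 -- the law balances on paper
```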
I am curious, what do you make of the following case?
Suppose you’re touring Acme Bomb & Replica Bomb Co. with your friend Eli. ABRBC makes bombs and perfect replicas of bombs, but they’re sticklers for safety, so they alternate days for real bombs and replicas. You’re not sure which sort of day it is. You get to the point of the tour where they show off the finished product. As they pass around the latest model from the assembly line, Eli drops it, knocking the safety back and letting the bomb (replica?) land squarely on its ignition button. If it were a real bomb, it would kill everyone unless it happened to be the 1-in-a-million dud. You hold your breath for a second, but nothing happens. Whew. How much do you want to bet that it’s a replica day?
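(In case the numbers help: the straightforward, non-anthropic Bayes answer, assuming a 50-50 prior over which kind of day it is and taking the 1-in-a-million dud rate from the setup:)

```python
# Posterior that it's a replica day, given the silence, via Bayes' rule.
prior_replica = 0.5            # days alternate and you don't know which kind today is
p_dud = 1e-6                   # chance a real bomb fails to go off

p_silence_if_replica = 1.0     # a replica never explodes
p_silence_if_real = p_dud      # a real bomb stays silent only if it's a dud

posterior_replica = (prior_replica * p_silence_if_replica) / (
    prior_replica * p_silence_if_replica + (1 - prior_replica) * p_silence_if_real
)
print(posterior_replica)       # ~0.999999 -- bet heavily on "replica day"
```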