Ofer comments on A mesa-optimization perspective on AI valence and moral patienthood

Ofer 16 Sep 2021 9:27 UTC
3 points
(I don’t know/remember the details of AlphaGo, but if the setup involves a value network trained to predict the outcome of MCTS-guided gameplay, that seems to make it more likely that the value network is doing some sort of search during inference.)
- Steven Byrnes 16 Sep 2021 18:08 UTC
  2 points
  Hmm, yeah, I guess you’re right about that.