Seems like the forces that turn people crazy are the same ones that lead people to do anything good and interesting at all. At least for EA, a core function of orgs, elites, and high-status community members is to make the kind of signaling you describe highly correlated with actually doing good. Of course, it seems impossible to make them correlate perfectly, and that's why settings with super-high social optimization pressure (like FTX) are going to be bad regardless.
But (again for EA specifically) I suspect the forces you describe would actually be good to increase on the margin for people not living in Berkeley and/or in a group house, a group that is probably a majority of self-identified EAs but only a small minority of the person-hours OP interacts with in real life.
The "impossible to correlate perfectly" piece is analogous to AI alignment, where one could also argue that perfectly aligning a reward function with the "true" utility function is impossible.
Indeed, one might even argue that the joint cognition implemented by the EA/rationality/x-risk community as a whole is a form of "artificial" intelligence; let's call it "EI", so that we face an "EI alignment" problem. As EA becomes more powerful in the world, we get "ESI" (effective altruism superhuman intelligence) and the associated risks from misaligned ESI.
The obvious solution, in my opinion, is the same for AI and EI: don't maximize, since the metric you might aim to maximize is most likely imperfectly aligned with true utility. Rather, satisfice: be ambitious, but not infinitely so. After reaching an ambitious goal, check whether your reward function still makes sense before setting the next, more ambitious one. And have some human users constantly verify your reward function :-)
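The satisfice-then-verify loop can be sketched as a toy simulation. Everything here is hypothetical and invented for illustration: a proxy reward that tracks true utility at small scales but diverges at large ones (the Goodhart failure mode), and a sanity check standing in for "human users verify your reward function":

```python
def true_utility(x):
    # Ground truth the agent cannot observe directly: utility peaks
    # at moderate x and declines beyond it (Goodhart territory).
    return x - 0.08 * x * x

def proxy_reward(x):
    # The measurable proxy: well-aligned for small x, but it keeps
    # rewarding larger x even past the true optimum.
    return x

def proxy_still_sane(x, tolerance=0.5):
    # Stand-in for human verification of the reward function:
    # flag the proxy once it has drifted too far from true utility.
    return abs(proxy_reward(x) - true_utility(x)) <= tolerance

def satisfice(initial_goal=2.0, step=2.0, max_rounds=10):
    # Be ambitious, but boundedly so: pursue a finite goal, then
    # re-verify the reward function before raising the goal.
    x, goal = 0.0, initial_goal
    for _ in range(max_rounds):
        while proxy_reward(x) < goal:
            x += 0.5  # make progress toward the current bounded goal
        if not proxy_still_sane(x):
            break  # proxy has decoupled from true utility: stop here
        goal += step  # goal reached and proxy verified: aim higher
    return x

print(satisfice())  # → 4.0: stops once the proxy stops making sense
```

A pure maximizer of `proxy_reward` would push `x` upward forever, driving `true_utility` arbitrarily negative; the satisficer stops somewhere sub-optimal but avoids the catastrophic tail, which is the trade the comment is pointing at.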