Max_Daniel answers What are the coolest topics in AI safety, to a hopelessly pure mathematician?

Max_Daniel 7 May 2022 21:53 UTC
9 points
Some mathy AI safety pieces and related material, off the top of my head (in no particular order, not comprehensive, and not weighted by impact or influence):
- The Speed + Simplicity Prior is probably anti-deceptive
- Prediction can be Outer Aligned at Optimum
- Reinforcement Learning in Newcomblike Environments
- Commitment games with conditional information revelation
- Chris Olah’s older pieces on neural networks (under ‘Neural Networks (General)’ and below)