Leon Lang

Karma: 136

I’m a final-year PhD student at the University of Amsterdam working on AI Safety and Alignment, specifically on the safety risks of Reinforcement Learning from Human Feedback (RLHF). Previously, I worked on abstract multivariate information theory and equivariant deep learning.

Distribution Shifts and The Importance of AI Safety

Leon Lang
29 Sep 2022 22:38 UTC
7 points
0 comments
9 min read
EA link

Summaries: Alignment Fundamentals Curriculum

Leon Lang
19 Sep 2022 15:43 UTC
25 points
1 comment
1 min read
EA link
(docs.google.com)