Alignment Theory Series
Eleni_A · 8 Aug 2022 18:33 UTC
Distillation pieces for those who want to start from somewhere but don't know where.

- Deception as the optimal: mesa-optimizers and inner alignment — Eleni_A, 16 Aug 2022 3:45 UTC (19 points, 5 min read)
- Three scenarios of pseudo-alignment — Eleni_A, 5 Sep 2022 20:26 UTC (7 points, 3 min read)
- My summary of "Pragmatic AI Safety" — Eleni_A, 5 Nov 2022 14:47 UTC (14 points, 5 min read)