[Question] What is an example of recent, tangible progress in AI safety research?

Cross-posting a good question from Reddit. Answer there, here, or in both places; I’ll make sure the Reddit author knows about this post.

Eric Herboso’s answer on Reddit (the only one so far) includes these examples:

Scott Garrabrant on Finite Factored Sets (May)

Paul Christiano on his Research Methodology (March)

Rob Miles on Misaligned Mesa-Optimisers (part 1 in February, part 2 in May; both describe a paper from 2019)
