I enjoyed reading this post, and I think I agree with your assessment:
It is pretty clear that some of the main cruxes of current disagreements about AI alignment are beyond the limits of legible reasoning. (The current limits, anyway.)
(In addition to the Christiano-Yudkowsky example you give, one could also point to the Hanson-Yudkowsky AI-Foom Debate of 2008.)
In addition to “Epistemic Legibility” and “A Sketch of Good Communication,” which you mention, I’d recommend “Public beliefs vs. Private beliefs” (Tyre, 2022) to others who enjoyed this post – Tyre explores a somewhat related theme.