Thank you for the links. The concerning scenario I imagine is an AI performing something like reflective equilibrium and coming away with something singular and overly reductive, biting bullets we’d rather it not bite, all for the sake of coherence. I don’t think current LLM systems are doing this, but greater coherence seems generally useful, so I expect AI companies to pursue it. I will read these and see whether something like this is addressed.