David_Althaus comments on AI things that are perhaps as important as human-controlled AI

David_Althaus 13 Mar 2024 10:02 UTC
5 points
Thanks Anthony!

Regarding 2: I’m by no means an expert, but it seems to me that there are other ways of influencing the preferences/dispositions of AI, e.g., i) penalizing malevolent or fanatical reasoning/behavior/attitudes (e.g., by telling RLHF raters to look out specifically for such properties and penalize them), or ii) similarly amending the principles and rules of constitutional AI.