I think there’s something interesting to this argument, although I think it may be relying on a frame where AI systems are natural agents, in particular at this step:
a strategically and philosophically competent AI should seemingly have its own moral uncertainty and pursue its own “option value maximization” rather than blindly serve human interests/values/intent
It’s not clear to me why the key functions couldn’t be more separated, or whether the conflict you’re pointing to persists across such separation. For instance, we might have a mix of:
Systems which competently pursue philosophy research (but do not have a sense of self that they are acting with regard to)
Systems which are strategic (including drawing on the fruits of the philosophy research), on behalf of human institutions or individuals
Systems which are instruction-following tools (which don’t aspire to philosophical competence), rather than independent agents
I mean “not clear to me” very literally here—I think that perhaps some version of your conflict will pose a challenge to such setups. But I’m responding with this alternate frame in the hope that it will be useful in advancing the conversation.