Regarding your last point: I see. I thought this was an argument for “alignment via moral reasoning as an addition to alignment via control,” not “alignment via moral reasoning instead of alignment via control.” So you would hope that alignment via moral reasoning would eventually displace alignment via control.
In that case, your argument is plausible but… quite hopeful? I’m sure many people will pursue control methods regardless. I suppose you might argue that, if enough people buy your argument, then research on AI that is merely controlled will advance more slowly, while research on AI that does its own moral reasoning, and is therefore harder to misuse, will advance faster or at least in parallel. I would accept that this might reduce the chance of malevolent misuse, but it’s quite a hopeful scenario! In less hopeful scenarios, I’m unsure whether people concerned with malevolent misuse ought to pursue this kind of work, or whether they would be better off simply advocating for a pause or slowdown.
In short, I am not hoping for a specific outcome, and I can’t take into account every single scenario. If someone gives more credence to research on moral reasoning in AI after reading this, that’s already enough, considering that the topic doesn’t seem to be popular within AI alignment and was even more niche when I wrote this post.
Sure! And like I said, I do think this is valuable: it just seems more obviously valuable as a way to ensure the best outcome (aligned AI) than as a way to avoid the worst outcomes.