I agree that there may be reasons of moral cooperation and trade to compromise with other value systems. But when deciding how to cooperate, we should at least be explicitly guided by optimising for our own values, subject to constraints. It is far from obvious that aligning with the intent of the programmer is the best way to optimise for utilitarian values. Perhaps we should aim for utilitarian alignment first.