What if AI exploring moral uncertainty finds that there is provably no correct moral theory or right moral facts? It that case, there is no moral uncertainty between moral theories, as they are all false. Could it escape this obstacle just by aggregating human’s opinion about possible situations?
What if AI exploring moral uncertainty finds that there is provably no correct moral theory or right moral facts?
In that case it would be exploring traditional metaethics, not moral uncertainty.
But if moral uncertainty is used as a solution then we just bake in some high level criteria for the appropriateness of a moral theory, and the credences will necessarily sum to 1. This is little different from baking in coherent extrapolated volition. In either case the agent is directly motivated to do whatever it is that satisfies our designated criteria, and it will still want to do it regardless of what it thinks about moral realism.
Those criteria might be very vague and philosophical, or they might be very specific and physical (like ‘would a simulation of Bertrand Russell say “a-ha, that’s a good theory”?’), but either way they will be specified.
What if AI exploring moral uncertainty finds that there is provably no correct moral theory or right moral facts? It that case, there is no moral uncertainty between moral theories, as they are all false. Could it escape this obstacle just by aggregating human’s opinion about possible situations?
In that case it would be exploring traditional metaethics, not moral uncertainty.
But if moral uncertainty is used as a solution then we just bake in some high level criteria for the appropriateness of a moral theory, and the credences will necessarily sum to 1. This is little different from baking in coherent extrapolated volition. In either case the agent is directly motivated to do whatever it is that satisfies our designated criteria, and it will still want to do it regardless of what it thinks about moral realism.
Those criteria might be very vague and philosophical, or they might be very specific and physical (like ‘would a simulation of Bertrand Russell say “a-ha, that’s a good theory”?’), but either way they will be specified.