The key difference in my mind is that the AI system does not need to determine the relative authoritativeness of different pronouncements of human value, since the legal authoritativeness of e.g. caselaw is pretty formalized. But I agree that this is less of an issue if the primary route to alignment is just getting an AI to follow the instructions of its principal.
Yeah, I certainly feel better about learning law relative to learning the One True Set of Human Values That Shall Then Be Optimized Forevermore.