That sufficiently intelligent AIs will not ‘automatically’ be moral (see, e.g., the behaviour of un-RLHF’d models).