Larks answers "What predictions from theoretical AI Safety research have been confirmed by empirical work?"

Larks 30 Dec 2024 1:40 UTC
15 points
That sufficiently intelligent AIs will not "automatically" be moral (see, e.g., the behaviour of models that have not undergone RLHF).