[Question] What predictions from theoretical AI Safety research have been confirmed by empirical work?

I’m trying to better understand whether recent efforts to put AI Safety research on a firmer empirical footing have produced evidence that claims derived from theoretical AI Safety work have turned out to be correct.

This could make me update in favour of taking AI Safety concerns more seriously.

I have previously been skeptical of AI Safety arguments because many of their claims rest on theoretical reasoning rather than empirical evidence.
