Hi,
Is anyone aware of a reading list of mostly peer-reviewed journal articles and preprints on AI safety/alignment? I would like to start reading and citing more of this literature in my own papers.
Thanks in advance for any help :)
Zak
Hey, I’ve found this list really helpful, and the course that comes with it is great too. I’d suggest watching the course lecture video for a particular topic, then reading a few of the papers it lists. Adversarial robustness and Trojans were the topics I found most interesting: https://course.mlsafety.org/readings/