Risk:
The course presents possible solutions to these risks, and the students feel like they "understood" AI risk. In the future it will be harder to talk to these students about AI risk, since they feel like they already have an understanding, even though it is wrong.
I am specifically worried about this because when I try to imagine who would write the course and who would teach it, I have to ask: Will these people be able to point out the problems in the current approaches to alignment? Will they be able to "hold an argument" in class well enough to point out the holes in the solutions that students will suggest after thinking about the problem for five minutes?
I'm not saying this isn't solvable, just that it's a risk.