This course sounds cool! Unfortunately there doesn’t seem to be too much relevant material out there.
This is a stretch, but I think there’s probably some cool computational modeling to be done with human value datasets (e.g., 70,000 responses to variations on the trolley problem). What kinds of universal human values can we uncover? https://www.pnas.org/doi/10.1073/pnas.1911517117
For digestible content on technical AI safety, Robert Miles makes good videos. https://www.youtube.com/c/robertmilesai
Abby—good suggestions, thank you. I think I will assign some Robert Miles videos! And I’ll think about the human value datasets.