[Question] How strong is the evidence of unaligned AI systems causing harm?
A common argument that AI alignment is a pressing problem is that unaligned AI systems deployed in the real world have already harmed people and society.[1] However, many of the cited examples are disputed. For example, one paper finds no evidence that YouTube’s recommendation algorithm radicalizes users, but that study has been criticized for not examining how the algorithm interacts with individual, logged-in users.[2] What evidence is there that unaligned AI systems cause real-world harm, and how strong and consistent is that evidence?
I know that the Center for Humane Technology has been compiling a Ledger of Harms documenting the various harms allegedly caused by digital technologies, but I’m concerned that it selects for positive results (i.e., results confirming that a technology causes harm).[3] Ideally, I would like to see a formal meta-analysis that incorporates both positive and negative results.
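To be concrete about what I mean by a formal meta-analysis: pool the effect sizes from all relevant studies, null and negative results included, and weight each by its precision. As a toy illustration only (the effect sizes, variances, and study set below are hypothetical, not drawn from the Ledger or any real literature), here is a minimal random-effects meta-analysis using the DerSimonian-Laird estimator:

```python
import numpy as np

# Hypothetical standardized effect sizes (e.g., Fisher-z-transformed
# correlations) from studies of one alleged harm, deliberately mixing
# positive results with null and negative ones. Illustrative numbers only.
effects = np.array([0.30, 0.12, -0.05, 0.22, 0.01, -0.10])
variances = np.array([0.010, 0.020, 0.015, 0.008, 0.025, 0.018])

# Fixed-effect weights and Cochran's Q statistic for heterogeneity.
w = 1.0 / variances
fixed_mean = np.sum(w * effects) / np.sum(w)
q = np.sum(w * (effects - fixed_mean) ** 2)
df = len(effects) - 1

# DerSimonian-Laird estimate of the between-study variance tau^2.
c = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (q - df) / c)

# Random-effects pooled estimate and its standard error.
w_re = 1.0 / (variances + tau2)
pooled = np.sum(w_re * effects) / np.sum(w_re)
se = np.sqrt(1.0 / np.sum(w_re))

print(f"pooled effect = {pooled:.3f}, "
      f"95% CI = [{pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f}], "
      f"tau^2 = {tau2:.4f}")
```

If the pooled estimate stayed clearly positive with small between-study variance, that would be much stronger evidence of harm than any individual positive study; a near-zero pooled effect with large tau^2 would instead suggest the positive results are selected or highly context-dependent.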
[1] See Aligning Recommender Systems as Cause Area.
[2] Feuer, Will. “Critics slam study claiming YouTube’s algorithm doesn’t lead to radicalization.” CNBC, Dec. 30, 2019. Accessed July 21, 2020.
[3] According to the submission form, they only include studies that “someone could use … to create (or catalyze the creation of) a more humane version” of the technology in question.