Thanks for writing this! I’m curating it.
Some things I really appreciate about the post:
The claim (paraphrased), “it is pretty easy to get AI safety messaging wrong, but there are some useful things to communicate about AI safety” seems important (and right — I’ve also seen examples of people accidentally spreading the idea that “AI will be powerful”). I also think lots of people in the EA community should hear it — a good number of people are in fact working on “spreading the ideas of AI safety” (see a related topic page).
It’s very nice to have more content on things that ~everyone can help with.
“practically everyone can help with spreading messages at least some, via things like talking to friends; writing explanations of your own that will appeal to particular people; and, yes, posting to Facebook and Twitter and all of that. [...] I’d guess it can be a big deal: many extremely important AI-related ideas are understood by vanishingly small numbers of people, and a bit more awareness could snowball. Especially because these topics often feel too “weird” for people to feel comfortable talking about them! Engaging in credible, reasonable ways could contribute to an overall background sense that it’s OK to take these ideas seriously.”
The lists of kinds of messages that are risky or helpful are themselves useful:
Risky (presumably not an exhaustive list!):
messages that generically emphasize the importance and potential imminence of powerful AI systems
messages that emphasize that AI could be risky/dangerous to the world, without much effort to fill in how, or with an emphasis on easy-to-understand risks (where one of the risks is, “If people have a bad model of how and why AI could be risky/dangerous (missing key risks and difficulties), they might be too quick to later say things like ‘Oh, turns out this danger is less bad than I thought, let’s go full speed ahead!’”)
Helpful + right (This list is presumably also not exhaustive. I should also say that I’m least optimistic about iii (sort of) and v.)
[S] We should worry about conflict between misaligned AI and all humans
[S] AIs could behave deceptively, so “evidence of safety” might be misleading
[S] AI projects should establish and demonstrate safety (and potentially comply with safety standards) before deploying powerful systems
[S] Alignment research is prosocial and great
[S] It might be important for companies (and other institutions) to act in unusual ways
[S] We’re not ready for this
One question/disagreement/clarification I have about the statement, “I’m not excited about blasting around hyper-simplified messages.”
The word “simplified” is a bit vague; I think I disagree with some interpretations of the sentence. I agree that “it’s generally not good enough to spread the most broad/relatable/easy-to-agree-to version of each key idea,” but I think in some cases, “simplifying” could be really useful for spreading more accurate messages. In particular, “simplifying” could mean something like “dumbing down somewhat indiscriminately” — which is bad/risky — or it could mean something like “shortening and focusing on the key points, making technical points accessible to a more general audience, etc.” — something like distillation. The latter approach seems really useful here, in part because it might help overcome a big problem in AI safety messaging: that a lot of the key points about risk are difficult to understand, and that important texts are technical. This means that it’s easy to be shown cool demos of new AI systems, but not as easy to understand the arguments that explain why progress in AI might be dangerous. (So people trying to make the case in favor of safety might resort to deferring to experts, get the messages wrong in ways that make the listener unnecessarily skeptical of the overall case, etc.)
(More minor: I also think that the word “blast” has negative connotations which make it harder to correctly engage with the sentence. I think you mean “I’m not excited about sharing hyper-simplified messages in a way that reaches a ~random-but-large subset of people.” I think I agree — it seems better to target a particular audience — but the way it’s currently stated makes it harder to disagree; it’s harder to say, “no, I think we should in fact blast some messages” than it is to say, “I think there are some messages that appeal to a very wide range of audiences,” or to say “I think there are some messages we should promote extensively.”)
(I should say that the opinions I’m sharing here are mine, not CEA’s. I also think a lot of my opinions here are not very resilient.)
Whether it’s a knife, a car, social media, or artificial intelligence, technology is power.
There’s no reason why we shouldn’t draw on the familiar and mature culture and practices of car safety to improve the safety of AI (and other technologies).
This means user training (driver licenses), built-in safety features (e.g., seat belts, air bags), frequent public service announcements, independent and rigorous safety and reliability reviews, rules and regulations (traffic rules), enforcement (traffic police), insurance, development and testing in controlled environments, guards against deliberate or accidental misuse, guards against (large) advances with (large) uncertainties, and promoting safe attitudes and mutual accountability (e.g., rejecting road rage).
If we can’t educate the public, media, technologists, and politicians in simple, engaging terms, and inspire them to take action, then we’ll always be at risk.