Thanks for writing this up; it’s fantastic to get a variety of perspectives on how different messaging strategies work.
Do you have evidence, or a sense of whether, the people you’ve talked to have changed their actions as a result? I worry that the approach you use is so similar to what people already think that it doesn’t lead to shifts in behavior. (But we should take nudges where we can get them.)
I also worry about anchoring on small near-term problems leading to a moral-licensing-type effect for safety (and a false sense of security). It’s unclear how likely this is. For example, if people care about AI safety but lack the big picture, they might establish a safety team dedicated to, say, algorithmic bias. If the counterfactual is no safety team, this is likely good. If the counterfactual is a safety team focused on interpretability, this is likely bad. It could also be that “having a safety team” makes an org, or the people in it, feel more justified in taking risks or investing less in other elements of safety (which seems likely); this would be bad. To me, the cruxes here are something like: “What do people do after these conversations?”, “Are the safety things they work on relevant to the big problems?”, and “How does safety culture interact with moral licensing or a false sense of security?”
I hope this comment didn’t come off as aggressive. I’m super excited about this approach, particularly the way you meet people where they’re at, which is usually a much better strategy than how messaging on this topic typically comes across.
Yes, I have seen people become more actively interested in joining or promoting projects related to AI safety. More importantly, I think it creates an AI safety culture and mentality. I’ll have a lot more to say about all of this in my (hopefully) forthcoming post on why I think promoting near-term research is valuable.
Strongly agreed that working on the near-term applications of AI safety is underrated by most EAs. Nearly all of the AI safety discussion focuses on advanced RL agents that are not widely deployed in the world today, and it’s possible that such systems won’t reach commercial viability soon. Misaligned AI is causing real harm today, and solving those problems would be a great step towards building the technical tools and engineering culture necessary to scale up to aligning more advanced AI.
(That’s just a three-sentence explanation of a topic deserving much more detailed analysis, so I’m really looking forward to your post!)