I appreciate you drawing attention to the downside risks of public advocacy, and I broadly agree that they exist, but I also think the (admittedly) exaggerated framings here are doing a lot of work (basically just intuition pumping, for better or worse). The argument would be just as strong in the opposite direction if we swapped the valence and optimism/pessimism of the passages: what if, in scenario one, the AI safety community continues making incremental progress on specific topics in interpretability and scalable oversight, but achieves too little, too slowly, and fails to avert the risk of unforeseen emergent capabilities in large models driven by race dynamics, or, even worse, accelerates those dynamics by drawing more talent into capabilities work? And what if, in scenario two, the AI safety movement becomes similar to the environmental movement, using public advocacy to build coalitions among diverse interest groups, becoming a major focus of national legislation and international cooperation, moving hundreds of billions of dollars into clean tech research, and so on?
Don’t get me wrong — there’s a place for intuition pumps like this, and I use them often. But I also think that both technical and advocacy approaches could be productive or counterproductive, so it’s best to approach both cautiously and evaluate the risks and merits of specific proposals on their own. As for the things you mention driving bad outcomes for advocacy, I’m not sure I agree — feeling uncertain about paying for ChatGPT seems like a natural response for someone worried about OpenAI’s use of capital, and I haven’t seen evidence that Holly (in the post you link) is exaggerating any risks to whip up support. We could disagree about these things, but my main point is that actually getting into the details of those disagreements is probably more useful for avoiding the second scenario than just describing it in pessimistic terms.
Yepp, I agree that I am doing an intuition pump to convey my point. I think this is a reasonable approach to take because I actually think there’s much more disagreement on vibes and culture than there is on substance (I too would like AI development to go more slowly). E.g. AI safety researchers paying for ChatGPT obviously brings in a negligible amount of money for OpenAI, so when people think about that stuff the actual cognitive process is more like “what will my purchase signal, and how will it influence norms?” But that’s precisely the sort of thing that affects AI safety culture independently of whether people agree or disagree on specific policies—can you imagine hacker culture developing amongst people who were boycotting computers? Hence my takeaway at the end of the post is not “stop advocating for pauses” but rather “please consider how to have positive effects on community culture and epistemics, which might not happen by default”.
I would be keen to hear more fleshed-out versions of the passages with the valences swapped! I like the one you’ve done, although I’d note that you’re focusing on the outcomes achieved by those groups, whereas I’m also focusing on the psychologies of the people in those groups. I think the psychological part is important because, as they say, culture eats strategy for breakfast. I do think climate activists have done a good job of getting funding into renewables; but I think alignment research is much harder to accelerate (e.g. because the metrics are much less clear, funding is less of a bottleneck, and the target is moving much faster), and so trading off a culture focused on understanding the situation clearly for more success at activism may not be the right call here, even if it was the right call there.