This is great — thanks for writing it up! I think you’re spot-on that this is a big gap in the AI Safety ecosystem right now.
In fact, I recently stepped away from working on corporate campaigns at The Humane League to explore this very thing, so it feels very topical and is something I’ve been thinking about quite a bit. (As a side note, if anyone is thinking about or interested in working on this, I’d love to connect).
Anyway, just a couple of thoughts I want to add:
Negotiations and pressure campaigns have proven effective at driving corporate change across industries and movements.
One persistent concern I have is that this may only be true of industries and movements where the cost of a campaign can plausibly outweigh the costs of giving in to campaigners’ asks.
For context, during my time working on animal welfare campaigns, I became increasingly convinced that the decision of whether or not to give in to a campaign was a pretty simple financial equation for a corporate target. Something like the following:
Give in to (campaigners’ demands) if (estimated financial cost incurred from withstanding campaign) >= (estimated cost of giving in to campaigners’ demands)
This is an oversimplification, of course. Corporations are full of humans who act for many reasons beyond profit maximization, including just doing the right thing. Also, the cost incurred from a campaign is almost surely a very uncertain and complex estimation. [1]
But still, I think some simple equation like the one above explains the vast majority of variation in whether or not a target gives in.[2] Put simply, a campaign has to have enough firepower to incur costs sufficiently high that giving in becomes cheaper for the corporate target than withstanding the campaign.
So, here’s where I get concerned: The costs for a large food company switching to use cage-free eggs, for example, are not only relatively low, but more importantly, they are bounded. You can start sourcing cage-free eggs in a few weeks or months and pay a certain low-double-digit % more for a single ingredient in your supply chain. For a lot of food companies, it’s easy to see how a moderately-sized campaign can become more expensive than just sourcing cage-free eggs.[3]
But what about AI? When it comes to falling behind even slightly in a corporate arms race for a technology as transformative as this, it’s not clear to me that the costs are that low — in fact, it’s not clear to me that the costs are bounded at all. For example, Google was the classic example of an entrenched leader when it came to web search (>90% market share), and Bing rolling out Sydney was enough to put Google in full “code-red” mode.
So, if the potential financial benefits of leading on AI are as massive as these companies (along with me and most folks in AI safety) seem to believe they are, it implies that a campaign would need to cause a ridiculous amount of financial risk to move a company to actually implement meaningful safeguards.[4]
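To make the rule of thumb above concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `CampaignEstimate` class, the `gives_in` function, and all of the dollar figures are made up for illustration, and treating an unbounded cost as infinity is a modeling assumption of the sketch, not a claim about any real company.

```python
from dataclasses import dataclass


@dataclass
class CampaignEstimate:
    """A corporate target's own (hypothetical) estimates, in dollars."""
    cost_of_withstanding: float  # PR damage, strained partnerships, regulatory risk, etc.
    cost_of_giving_in: float     # cost of actually meeting the campaigners' demands


def gives_in(estimate: CampaignEstimate) -> bool:
    # The rule of thumb: concede when withstanding the campaign
    # costs at least as much as conceding to it.
    return estimate.cost_of_withstanding >= estimate.cost_of_giving_in


# Made-up numbers: a bounded, relatively low cost of switching to cage-free eggs.
cage_free = CampaignEstimate(cost_of_withstanding=5_000_000, cost_of_giving_in=2_000_000)

# Assumption: an unbounded cost of falling behind in an AI race, modeled as infinity.
ai_safeguards = CampaignEstimate(cost_of_withstanding=5_000_000, cost_of_giving_in=float("inf"))

print(gives_in(cage_free))      # True
print(gives_in(ai_safeguards))  # False
```

With a bounded cost of giving in, a big enough campaign can always tip the comparison. If the cost of giving in really is unbounded, no finite campaign cost does, which is the worry in a nutshell.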
Some orgs have started to spring up at the other end of the spectrum too, like the Campaign for AI Safety and PauseAI … Having organizations that use radical tactics seems to increase identification with and support for the more moderate groups.
One interesting thing I’m noticing — perhaps owing to the general disposition of people interested in AI safety — is that these groups are definitely radical in their asks (total training run moratoriums) but not so radical in tactics (their protests, as far as I can tell, have been less confrontational than many THL campaigns).
So I just think there is still a lot of implied space, further down the spectrum, for the sort of tactics that Just Stop Oil or Direct Action Everywhere are using. [5]
We have uncertainties about proposed governance asks… but some seem promising.
Another problem I’ve been running into is that, even where general categories of asks seem promising, there are very few specifics in place. For a company to commit to external auditing, for example, we have to know what the audits are, who conducts them, and which models they apply to. From the conversations I’ve had with folks in policy so far, it appears this is all still in the works. Or, as Scott Alexander says, “The Plan Is To Come Up With A Plan.”
Of course, you need specific language to make asks of a corporate campaign target. And, troublingly, vague language is just the kind of thing that I think companies love. Food businesses are happy to voluntarily make vague commitments (like “We are committed to animal welfare and will strive to make sure our animals can lead happy and healthy lives”) and much more reluctant to make concrete commitments that open them up to liability (like “We will meet the UEP certified cage-free guidelines”). I’m worried a lot of the commitments you could get from tech companies and AI labs right now look more like the former, including the recent one made in collaboration with the White House.
One gap in the AI Safety space that I think could mitigate this problem would be having a highly trusted third-party entity that serves as a meta-certifier that can certify different standards or auditing orgs or evals. For example, when animal groups were asking for slower-growing breeds to be used, they didn’t actually know what breeds were best, so they secured a bunch of commitments that said something like “We commit to, by 2024, using breeds approved by the certifier G.A.P. pending their forthcoming research with the University of Guelph”.
I wish the AI Safety space had some certifier such that tech companies could commit to testing all new frontier models on, and publicly reporting the results of, benchmarks approved by that certifier in the future. I think government bodies can often serve this role, but it seems like we don’t have that yet either, so we can’t ask for these sorts of specific-but-TBD commitments.
[1] By this I mean it relies on questions like how bad PR, employee satisfaction, relationships with corporate partners, future government regulation, etc. all impact future revenue. Also, on the other side of the equation, the costs of giving in might be simple in some industries (it’s easy to forecast how much it costs to transition to sourcing cage-free eggs), but there are also hard-to-measure benefits (using cage-free eggs is good for PR and marketing in its own right) and ambiguous lingering questions (will activists now think we are an easy target?) that probably complicate that side of the equation, too.
[2] This is the product of speculation from my experiences, rather than any actual statistical analysis or rigorous thought, so take it with a big grain of salt.
[3] I haven’t read much about the historical examples you’ve cited from the private sector (abortion services and fair-trade coffee). I’d be curious to see if the financial incentives seem to be driving these too. But I think part of why loads of bad PR has failed to significantly slow the fossil fuel industry, for example, is that the benefits of selling more oil often just vastly exceed the costs of bad PR from activists.
[4] This assumes, of course, that meaningful safeguards are costly. If they weren’t, hopefully the inside-game collaborative stuff would be enough.
[5] By this I only mean that, descriptively, I don’t see anyone currently using radical tactics in AI Safety, at least compared with other major social movements. I’m not making any normative claims about whether such tactics are, or ever will be, useful or justified. Also, I hope it goes without saying, but I’m not talking about violence against people, which I take to never be justified.
Cool! Exciting that you’re working on this, and thanks for your thoughts.
One persistent concern I have is that this may only be true of industries and movements where the cost of a campaign can plausibly outweigh the costs of giving in to campaigners’ asks.
I think the bar for “disrupting supply / business as usual” is lower. A couple of the other social movement examples I cited were just this. I haven’t thought much about what that might look like in the context of AI safety, but it might be comparable to forcing a localised ‘pause’ on (some aspects of) frontier AGI development, which might be good.
If you’re going beyond that though, and trying to encourage meaningful corporate change, then I think maybe you just want to do some brainstorming about what sorts of orgs or asks might be more promising.
For example, I found evidence that “boycotts of specific companies across their entire product range may be a more promising tactic for disrupting the supply of a product than boycotts of a specific product type across all companies.” (anti-abortion). So maybe e.g. Microsoft might be more vulnerable to pressure campaigns (across their entire product range or company) for failures relating to BingChat than more specialised companies like OpenAI would be for failures relating to ChatGPT.
There are different kinds of costs you could try to impose on non-complying companies: immediate revenue costs, PR costs, risks of harsh regulation, wasted company time, etc. It might just be about matching the cost to the company and the campaign.
So something you highlight as a downside for this kind of campaign in AI safety could be used as an asset in your arsenal:
When it comes to falling behind even slightly in a corporate arms race for a technology as transformative as this, it’s not clear to me that the costs are that low — in fact, it’s not clear to me that the costs are bounded at all. For example, Google was the classic example of an entrenched leader when it came to web search (>90% market share), and Bing rolling out Sydney was enough to put Google in full “code-red” mode. So, if the potential financial benefits of leading on AI are as massive as these companies (and me, and most folks in AI safety) seem to believe they are, it implies that a campaign would need to cause a ridiculous amount of financial risk to move a company to actually implement meaningful safeguards.
You could use these as pressure points, as opposed to being things that the company needs to shoulder in order to cave to the campaign. E.g. going back to the ‘disruption’ idea, maybe this perspective means that something that risks slowing the company down by just a few months is a surprisingly powerful tool/threat against them.
For a company to commit to external auditing, for example, we have to know what the audits are and who conducts them and what models they apply to… I’m worried a lot of the commitments you could get from tech companies and AI labs right now look more like the former, including the recent one made in collaboration with the White House.
Are you highlighting this as just something like ‘here’s a risk corporate campaigns against AI labs/companies would need to look out for’, or ‘here’s something that makes these kinds of campaigns much less promising’? I agree with the former but not the latter.
I wish the AI Safety space had some certifier such that tech companies could commit to testing all new frontier models on, and publicly reporting the results of, benchmarks approved by that certifier in the future.
(I intuitively agree these things would ideally be done by governments, or government-funded bodies. But I don’t know much about the precedent from regulation of other industries.)
Thank you for responding and sorry for the delayed reply.
I’m not totally sure what the distinction is between disrupting business as usual and encouraging meaningful corporate change — in my mind, corporate campaigns do both, the former in service of the latter. Maybe I’m misunderstanding the distinction there.
That being said, I am much less certain than I was a few weeks ago about the “no costs from disrupted business can be sufficiently high to trigger action on AI safety” take, primarily because of what you pointed out: the corporate race dynamics here might make small disruptions much more costly, rather than less. In fact, the higher the financial upside is, the more costly it could be to lose even a tiny edge on the competition. So even if the costs of meaningful safeguards go up in competitive markets, so too do the costs of PR damage or the other setbacks you mention. I hadn’t thought of this when I wrote my comment but it seems pretty obvious to me now, so thanks for pointing it out.
I’m hoping to think more rigorously about why corporate campaigns work in the upcoming weeks, and might follow up here with additional thoughts.
Are you highlighting this as just something like ‘here’s a risk corporate campaigns against AI labs/companies would need to look out for’, or ‘here’s something that makes these kinds of campaigns much less promising’? I agree with the former but not the latter.
Both, I think. I’m still working on this because I’m optimistic that meaningful + robust policies with really granular detail will be developed, but if they aren’t, it would make campaigns less promising in my mind. Maybe what’s going on is something like the Collingridge dilemma, where it takes time for meaningful safeguards to be identified, but time also makes it harder to implement those safeguards.
Curious to hear why you think campaigns are just as promising even if there aren’t detailed asks to make of labs, if I’m understanding you correctly.
Alignment Research Center evals? Apollo Research evals? Maybe you mean something more specific and I’m just not following the distinction you’re making.
Yeah, in my mind, the animal welfare to AI safety analogy is something like this, where (???) is the missing entity that I wish existed:
G.A.P : Cooks Venture :: (???) : ARC/Apollo
This is to say that ARC and Apollo are developing eval regimes in the same way Cooks Venture develops slower-growing breeds. But a lab would probably be very reluctant to commit to auditing with a single partner in perpetuity, regardless of how demanding the audits are, in the same way a food company wouldn’t want to commit to exclusively sourcing breeds developed by Cooks Venture. And activists, too, would have reason to be concerned about an arrangement like this, since the chicken breed (or model eval) developer’s standards could drop in the future.
So I wish there were some nonprofit or govt committee with a high degree of trust and few COIs that was tasked with certifying the eval regimes developed by ARC and Apollo (or those developed by academics, or even by labs themselves), which is why I refer to it as a sort of meta-certifier. Then a lab could commit to something like “all future models will undergo evaluation approved by (meta-certifying body) and the results will be publicly shared,” even if many of the specifics this would entail don’t exist today.
On reflection, though, I really don’t know enough about the AI safety landscape to say with confidence how useful this would be. So take it with a big grain of salt.
Hello!
Did you ever do this research on why corporate campaigns work? And if so, would you share it? Thanks!