Book a 1:1 with me: https://cal.com/tylerjohnston/book
Share anonymous feedback with me: https://www.admonymous.co/tylerjohnston
Being mindful of the incentives created by pressure campaigns
I’ve spent the past few months trying to think about the whys and hows of large-scale public pressure campaigns (especially those targeting companies — of the sort that have been successful in animal advocacy).
A high-level view of these campaigns is that they use public awareness and corporate reputation as a lever to adjust corporate incentives. But making sure that you are adjusting the right incentives is more challenging than it seems. Ironically, I think this is closely connected to specification gaming: it’s often easy to accidentally incentivize companies to do more to look better, rather than doing more to be better.
For example, an AI-focused campaign calling out RSPs recently began running ads that single out AI labs for speaking openly about existential risk (quoting leaders acknowledging that things could go catastrophically wrong). I can see why this is a “juicy” lever — most of the public would be pretty astonished/outraged to learn some of the beliefs that are held by AI researchers. But I’m not sure if pulling this lever is really incentivizing the right thing.
As far as I can tell, AI leaders speaking openly about existential risk is good. It won’t solve anything in and of itself, but it’s a start — it encourages legislators and the public to take the issue seriously. In general, I think it’s worth praising this when it happens. I think the same is true of implementing safety policies like RSPs, whether or not such policies are sufficient in and of themselves.
If these things are used as ammunition to try to squeeze out stronger concessions, it might just incentivize the company to stop doing the good-but-inadequate thing (i.e. CEOs are less inclined to speak about the dangers of their product when it will be used as a soundbite in a campaign, and labs are probably less inclined to release good-but-inadequate safety policies when doing so creates more public backlash than they were facing before releasing the policy). It also risks directing public and legislative scrutiny to actors who actually do things like speak openly about (or simply believe in) existential risks, as opposed to those who don’t.
So, what do you do when companies are making progress, but not enough? I’m not sure, but it seems like a careful balance of carrots and sticks.
For example, animal welfare campaigns are full of press releases like this: Mercy for Animals “commends” Popeyes for making a commitment to broiler welfare reforms. Spoiler alert: it probably wasn’t written by someone who thought that Popeyes had totally absolved itself of animal abuse with a single commitment. Rather, it served as a strategic signal to the company and to its competitors (basically, “If you lead relative to your competitors on animal welfare, we’ll give you carrots. If you don’t, we’ll give you the stick.”). If they had instead reacted by demanding more (which in my heart I may feel is appropriate), it would have sent a very different message: “We’ll punish you even if you make progress.” Even when that response is justified [1], the incentives it creates can leave everybody worse off.
There are lots of other ways that I think campaigns can warp incentives in the wrong ways, but this one feels topical.
Popeyes probably still does, in fact, have animal abuse in its supply chain
I’ve been thinking about Emre’s comment since I read it — and given this event on the Forum, I eventually decided to go and read Marcus Rediker’s biography of Lay. I recommend it for anyone interested in learning more about him as a historical figure.
To share some thoughts on the questions you posed, my feeling is that his extreme protests weren’t based on any strategic thinking about social change, and I definitely don’t think he’d be an incrementalist if he were alive today. Rather, I think his actions were driven by his extremely firm, passionately felt, and often spiritually-derived moral convictions — the same ones that convinced him to live in a cave and practice radical self-sufficiency. Actually, it seems like he had what we might describe as an excessive degree of “soldier mindset.” From the Rediker text:
He was loving to his friends, but he could be a holy terror to those who did not agree with him. He was aggressive and disruptive. He was stubborn, never inclined to admit a mistake. His direct antinomian connection to God made him self-righteous and at times intolerant. The more resistance he encountered, or, as he understood it, the more God tested his faith, the more certain he was that he was right. He had reasons both sacred and self-serving for being the way he was. He was sure that these traits were essential to defeat the profound evil of slavery.
I don’t know if the EA community would be wrong to exclude him today. He turned out to be ahead of his time in so many ways, and probably did meaningfully influence the eventual abolition of slavery, but this is so much easier to celebrate ex post. What does it actually feel like from the inside, to have extreme personal convictions that society doesn’t share, and how do you know (1) that history will prove you right; and (2) that you are actually making a difference? I really worry that what it feels like to be Benjamin Lay, from the inside, isn’t so dissimilar from what it feels like to be a Westboro Baptist Church member today.
I do think the role of radicalism in driving social change is underrated in this community, and I think it played a big role not only in the slavery abolition movement but also in the women’s suffrage movement, the civil rights movement, the gay rights movement, etc. It’s worth looking into the radical flank effect or Cass Sunstein’s writing on social change if you are curious about this. Maybe one thing I’d like to believe is that the world is antifragile and can tolerate radicals ranging across the moral spectrum, and those who are on the right side of history will eventually win out, making radicalism a sort of asymmetric weapon that’s stronger when you are ahead of your time on the moral arc of history. But that’s a very convenient theory, and I think it’s hard to know with any confidence; the success of so many fascist and hateful ideologies in relatively recent history probably suggests otherwise.
In any case, I really admire Lay for his conviction and his empathy and his total dedication to living a principled life. But I also really admire communities like this one for their commitment to open debate and the scout mindset and earnest attempts to hear each other out and question our own assumptions. So I expect, and hope, that the EA community would ban Benjamin Lay from our events. But I also hope we wouldn’t laugh at him like so many Quakers did. I hope we would look at him, scowling at us through the glass, and ask ourselves with total sincerity, “What if he has a point?”
I appreciate you drawing attention to the downside risks of public advocacy, and I broadly agree that they exist, but I also think the (admittedly) exaggerated framings here are doing a lot of work (basically just intuition pumping, for better or worse). The argument would be just as strong in the opposite direction if we swapped the valence and optimism/pessimism of the passages: what if, in scenario one, the AI safety community continues making incremental progress on specific topics in interpretability and scalable oversight but achieves too little too slowly, failing to avert the risk of unforeseen emergent capabilities in large models driven by race dynamics, or, even worse, accelerating those dynamics by drawing more talent to capabilities work? Whereas in scenario two, what if the AI safety movement becomes similar to the environmental movement by using public advocacy to build coalitions among diverse interest groups, becoming a major focus of national legislation and international cooperation, moving hundreds of billions of dollars into clean-tech research, etc.?
Don’t get me wrong — there’s a place for intuition pumps like this, and I use them often. But I also think that both technical and advocacy approaches could be productive or counterproductive, and so it’s best for us to cautiously approach both and evaluate the risks and merits of specific proposals on their own. In terms of the things you mention driving bad outcomes for advocacy, I’m not sure if I agree — feeling uncertain about paying for ChatGPT seems like a natural response for someone worried about OpenAI’s use of capital, and I haven’t seen evidence that Holly (in the post you link) is exaggerating any risks to whip up support. We could disagree about these things, but my main point is that actually getting into the details of those disagreements is probably more useful in service of avoiding the second scenario than just describing it in pessimistic terms.
Given the already existing support of the public for going slowly and deliberately, there seems to be a decent case that instead of trying to build public support, we should directly target the policymakers.
I think “public support” is ambiguous, and by some definitions, it isn’t there yet.
One definition is something like “Does the public care about this when they are asked directly?” and this type of support definitely exists, per data like the YouGov poll showing majority support for AI pause.
But there are also polls showing that almost half of U.S. adults “support a ban on factory farming.” I think the correct takeaway from those polls is that there’s a gap between vaguely agreeing with an idea when asked vs. actually supporting specific, meaningful policies in a proactive way.
So I think the definition of “public support” that could help the safety situation, and which is missing right now, is something like “How does this issue rank when the public is asked what causes will inform their voting decisions in the next election cycle?”
I’m very sympathetic to some of the signalling benefits of being (or at least appearing to be) frugal.
I just graduated from a uni with a large EA presence, and most of my very-motivated do-gooder friends were outside of EA (affiliated with a homeless shelter I worked at, grad student union organizing, or various social justice causes on campus). Most of them were seemingly convinced that the EAs on campus weren’t actually interested in doing good, because there was money being spent on flying students abroad for conferences, hosting discussion groups, opening an office/hangout space in our insanely expensive city, etc. Which, to be fair, was a far cry from how the campus homeless shelter I worked at was spending money — we cherished small donations from our fundraising drives, spending them almost exclusively on programs benefitting the guests we served, often just getting together basic bits of clothing and hygiene products.
I tried to explain to my friends the EA argument for spendy-ness (I still believe it is the best way to do good, deep down), but I just couldn’t seem to convince them that it wasn’t a ruse of motivated reasoning. Looking back on it, I’m bummed that some of my most passionate and talented friends, who were already choosing careers based on serving others, were turned off by this. I wish my friends’ first image of EA had been more similar to mine — things like the GWWC pledge and Singer’s “Famine, Affluence, and Morality” — as I think that would have sold them on EA in the same way it first sold me on it. But they just saw social gatherings and professional development, and for students who were skipping out on studying and social events to do on-the-ground organizing and volunteering and public service, EA just didn’t seem all that selfless to begin with. I couldn’t really convince them otherwise, and I’m sad about that.
This is great — thanks for writing it up! I think you’re spot-on that this is a big gap in the AI Safety ecosystem right now.
In fact, I recently stepped away from working on corporate campaigns at The Humane League to explore this very thing, so it feels very topical and is something I’ve been thinking about quite a bit. (As a side note, if anyone is thinking about or interested in working on this, I’d love to connect).
Anyway, just a couple of thoughts I want to add:
Negotiations and pressure campaigns have proven effective at driving corporate change across industries and movements.
One persistent concern I have is that this may only be true of industries and movements where the cost of a campaign can plausibly outweigh the costs of giving in to campaigners’ asks.
For context, during my time working on animal welfare campaigns, I became increasingly convinced that the decision of whether or not to give in to a campaign was a pretty simple financial equation for a corporate target. Something like the following:
Give in to campaigners’ demands if (estimated financial cost of withstanding the campaign) >= (estimated cost of giving in to those demands)
This is an oversimplification, of course. Corporations are full of humans who act for many reasons beyond profit maximization, including just doing the right thing. Also, the cost incurred from a campaign is almost surely a very uncertain and complex estimation. [1]
But still, I think some simple equation like the one above explains the vast majority of variation in whether or not a target gives in.[2] Put simply, a campaign has to have enough firepower to incur costs sufficiently high that giving in becomes cheaper for the corporate target than withstanding the campaign.
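To make that threshold logic concrete, here is a minimal sketch in Python. All the figures are hypothetical placeholders I made up for illustration, not estimates; the point is just to show how the same decision rule flips between the two cases discussed below (bounded, modest compliance costs vs. potentially unbounded costs of falling behind).

```python
# A minimal sketch of the threshold model above. All figures are
# hypothetical placeholders, not estimates: real-world campaign costs
# (bad PR, lost sales, regulatory risk, etc.) are deeply uncertain.

def gives_in(cost_of_withstanding: float, cost_of_giving_in: float) -> bool:
    """Predict whether a corporate target concedes to a campaign."""
    return cost_of_withstanding >= cost_of_giving_in

# Cage-free-style case: the compliance cost is bounded and modest,
# so a moderately sized campaign can tip the balance.
print(gives_in(cost_of_withstanding=5_000_000,
               cost_of_giving_in=2_000_000))            # True

# Frontier-AI-style case: falling behind in the race is perceived as
# enormously (perhaps unboundedly) costly, so even a large campaign
# may not tip the balance.
print(gives_in(cost_of_withstanding=50_000_000,
               cost_of_giving_in=10_000_000_000))        # False
```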
So, here’s where I get concerned: The costs for a large food company switching to use cage-free eggs, for example, are not only relatively low, but more importantly, they are bounded. You can start sourcing cage-free eggs in a few weeks or months and pay a certain low-double-digit % more for a single ingredient in your supply chain. For a lot of food companies, it’s easy to see how a moderately-sized campaign can become more expensive than just sourcing cage-free eggs.[3]
But what about AI? When it comes to falling behind even slightly in a corporate arms race for a technology as transformative as this, it’s not clear to me that the costs are that low — in fact, it’s not clear to me that the costs are bounded at all. For example, Google was the classic example of an entrenched leader when it came to web search (>90% market share), and Bing rolling out Sydney was enough to put Google in full “code-red” mode.
So, if the potential financial benefits of leading on AI are as massive as these companies (and I, and most folks in AI safety) seem to believe they are, it implies that a campaign would need to create a ridiculous amount of financial risk to move a company to actually implement meaningful safeguards. [4]
Some orgs have started to spring up at the other end of the spectrum too, like the Campaign for AI Safety and PauseAI … Having organizations that use radical tactics seems to increase identification with and support for the more moderate groups.
One interesting thing I’m noticing — perhaps owing to the general disposition of people interested in AI safety — is that these groups are definitely radical in their asks (total training run moratoriums) but not so radical in tactics (their protests, as far as I can tell, have been less confrontational than many THL campaigns).
So I just think there is still a lot of implied space, further down the spectrum, for the sort of tactics that Just Stop Oil or Direct Action Everywhere are using. [5]
We have uncertainties about proposed governance asks… but some seem promising.
Another problem I’ve been running into is that, even where general categories of asks seem promising, there are very few specifics in place. For a company to commit to external auditing, for example, we have to know what the audits are, who conducts them, and what models they apply to. From the conversations I’ve had with folks in policy so far, it appears this is all still in the works. Or, as Scott Alexander says, “The Plan Is To Come Up With A Plan.”
Of course, you need specific language to make asks of a corporate campaign target. And, troublingly, vague language is just the kind of thing that I think companies love. Food businesses are happy to voluntarily make vague commitments (like “We are committed to animal welfare and will strive to make sure our animals can lead happy and healthy lives”) and much more reluctant to make concrete commitments that open them up to liability (like “We will meet the UEP certified cage-free guidelines.”). I’m worried that a lot of the commitments you could get from tech companies and AI labs right now look more like the former, including the recent one made in collaboration with the White House.
One gap in the AI Safety space that I think could mitigate this problem would be having a highly trusted third-party entity serving as a meta-certifier that can certify different standards, auditing orgs, or evals. For example, when animal groups were asking for slower-growing breeds to be used, they didn’t actually know which breeds were best, so they secured a bunch of commitments that said something like “We commit to, by 2024, using breeds approved by the certifier G.A.P. pending their forthcoming research with the University of Guelph.”
I wish the AI Safety space had some certifier such that tech companies could commit to testing all new frontier models on, and publicly reporting the results of, benchmarks approved by that certifier in the future. I think government bodies can often serve this role, but it seems like we don’t have that yet either, so we can’t ask for these sorts of specific-but-TBD commitments.
By this I mean it relies on questions like how bad PR, employee satisfaction, relationships with corporate partners, future government regulation, etc. all impact future revenue. Also, on the other side of the equation, the costs of giving in might be simple in some industries (it’s easy to forecast how much it costs to transition to sourcing cage-free eggs), but there are also hard-to-measure benefits (using cage-free eggs is good for PR and marketing in its own right) and ambiguous lingering questions (will activists now think we are an easy target?) that probably complicate that side of the equation, too.
This is the product of speculation from my experiences, rather than any actual statistical analysis or rigorous thought, so take it with a big grain of salt.
I haven’t read much about the historical examples you’ve cited from the private sector (abortion services and fair-trade coffee). I’d be curious to see whether financial incentives seem to be driving those too. But I think part of why loads of bad PR has failed to significantly slow the fossil fuel industry, for example, is that the benefits of selling more oil often just vastly exceed the costs of bad PR from activists.
This assumes, of course, that meaningful safeguards are costly. If they weren’t, hopefully the inside-game collaborative stuff would be enough.
By this I only mean that, descriptively, I don’t see anyone currently using radical tactics in AI Safety — at least compared with other major social movements. I’m not making any normative claims about whether such tactics are, or ever will be, useful or justified. Also, I hope it goes without saying, but I’m not talking about violence against people, which I take to never be justified.
To some extent, I agree with this, but I also think it overlooks an important component of how defamation law is used in practice — which is not to hold people to high epistemic norms but instead to scare them out of harming your reputation regardless of the truth. This is something folks who work on corporate campaigns for farmed animal welfare run into all the time. And, because our legal system is imperfect, it often works. Brian Martin has a good write-up on the flaws in our legal system that contribute to this:
Cost: If you are sued for defamation, you could end up paying tens of thousands of dollars in legal fees, even if you win. If you lose, you could face a massive pay-out on top of the fees.
The large costs, due especially to the cost of legal advice, mean that most people never sue for defamation. If you don’t have much money, you don’t have much chance against a rich opponent, whether you are suing them or they are suing you. Cases can go on for years. Judgements can be appealed. The costs become enormous. Only those with deep pockets can pursue such cases to the end.
The result is that defamation law is often used by the rich and powerful to deter criticisms. It is seldom helpful to ordinary people whose reputations are attacked unfairly.
Unpredictability: People say and write defamatory things all the time, but only a very few are threatened with defamation. Sometimes gross libels pass unchallenged while comparatively innocuous comments lead to major court actions. This unpredictability has a chilling effect on free speech. Writers, worried about defamation, cut out anything that might offend. Publishers, knowing how much it can cost to lose a case, have lawyers go through articles to cut out anything that might lead to a legal action. The result is a tremendous inhibition of free speech.
Complexity: Defamation law is so complex that most writers and publishers prefer to be safe than sorry, and do not publish things that are quite safe because they’re not sure. Judges and lawyers have excessive power because outsiders cannot understand how the law will be applied. Those who might desire to defend against a defamation suit without a lawyer are deterred by the complexities.
Slowness: Sometimes defamation cases are launched months after the statement in question. Cases often take years to resolve. This causes anxiety, especially for those sued, and deters free speech in the meantime. As the old saying goes, “Justice delayed is justice denied.”
I’m not saying this is what’s happening here — I have no idea about the details of any of these allegations. But what if someone did have additional private information about Nonlinear or the folks involved? Unless they are rich or have a sophisticated understanding of the law, the scary lawyer talk from Nonlinear here might deter them from talking about it at all, and I think that’s a really bad epistemic norm. This isn’t to say “the EA Forum should be completely insulated from defamation law” or anything, but in a high-trust community where people will respond to alternatives like publicly sharing counterevidence, threatening lawsuits seems like it might hinder, rather than help, epistemics.
Some exciting news from the animal welfare world: this morning, in a very ideologically-diverse 5-4 ruling, the US Supreme Court upheld California’s Proposition 12, one of the strongest animal welfare laws in the world!
For what it’s worth, I have no affiliation with CE, yet I disagree with some of the empirical claims you make — I’ve never gotten the sense that CE has a bad reputation among animal advocacy researchers, nor is it clear to me that the charities you mentioned were bad ideas prior to launching.
Then again, I might just not be in the know. But that’s why I really wish this post was pointing at specific reasoning for these claims rather than just saying it’s what other people think. If it’s true that other people think it, I’d love to know why they think it! If there are factual errors in CE’s research, it seems really important to flag them publicly. You even mention that the status quo for giving in the animal space (CE excepted) is “very bad already,” which is huge if true given the amount of money at stake, and definitely worth sharing examples of what exactly has gone wrong.
My understanding is that screwworm eradication in North America has been treated by wild animal welfare researchers as a sort of paradigmatic example of what wild animal welfare interventions could look like, so I think it is on folks’ radar. And, as Kevin mentions, it looks like Uruguay is working on this now with hopes of turning it into a regional campaign across South America.
I’m guessing one of the main reasons there hasn’t been more uptake in promoting this idea is general uncertainty — both about the knock-on effects of something so large-scale, and about whether saving the lives of animals who would have died from screwworm really results in higher net welfare for those animals (in many cases it’s probably trading an excruciating death now for a painful death later, with added months or years of life in between that may themselves be net-negative). So I do think it’s a big overstatement for the guest to suggest that eradicating screwworm would be two orders of magnitude better than preventing the next 100 years of factory farming, which basically assumes that the wild animal lives saved directly trade off (positively) against the (negative) lives of farmed animals.
@saulius might know more about this. One quote from a recent post of his: “To my surprise, most WAW researchers that I talked to agreed that we’re unlikely to find WAW interventions that could be as cost-effective as farmed animal welfare interventions within the next few years.”
I broadly want to +1 this. A lot of the evidence you are asking for probably just doesn’t exist, and in light of that, most people should have a lot of uncertainty about the true effects of any Overton-window-pushing behavior.
That being said, I think there’s some non-anecdotal social science research that might make us more likely to support it. In the case of policy work:
Anchoring effects, one of the classic Kahneman/Tversky biases, have been studied quite a bit, and at least one article calls anchoring “the best-replicated finding in social psychology.” To the extent there’s controversy about it, it’s often related to “incidental” or “subliminal” anchoring, which isn’t relevant here. The market also seems to favor a lot of anchoring strategies (like how basically everything on Amazon is “on sale” from an inflated MSRP), which should be a point of evidence that this genuinely just works.
In cases where there is widespread “preference falsification,” Overton-shifting behavior might increase people’s willingness to publicly adopt views that were previously outside the window. Cass Sunstein has a good argument that being a “norm entrepreneur,” that is, proposing something controversial, might create chain-reaction social cascades. A lot of the evidence for this is historical, but there are also polling techniques that can reveal preference falsification, and a lot of experimental research showing a (sometimes comically strong) bias toward social conformity, so I suspect something like this is true. Could there be preference falsification among lawmakers surrounding AI issues? Seems possible.
Also, in the case of public advocacy, there’s some empirical research (summarized here) suggesting a “radical flank effect,” whereby Overton-window-shifting activism increases popular support for moderate demands. There’s also some evidence pointing in the other direction. Still, I think the supporting evidence is stronger right now.
P.S. Matt Yglesias (as usual) has a good piece that touches on your point. His takeaway is something like: don’t engage in sloppy Overton-window-pushing for its own sake — especially not in place of rigorously argued, robustly good ideas.
It’s not obvious to me that message precision is more important for public activism than in other contexts. I think it might be less important, in fact. Here’s why:
My guess is that the distinction between “X company’s frontier AI models are unsafe” vs. “X company’s policy on frontier models is unsafe” isn’t actually registered by the vast majority of the public (many such cases!). Instead, both messages basically amount to a mental model that is something like “X company’s AI work = bad.” And that’s really all the nuance you need to create public pressure for X company to do something. Then, in more strategic contexts like legislative work and corporate outreach, message precision becomes more important. (When I worked in animal advocacy, we had a lot of success campaigning for nuanced policies with protests that had much vaguer messaging.)
Also, I don’t think the news media is “likely” going to twist an activist’s words. It’s always a risk, but in general, the media seems to have a really healthy appetite for criticizing tech companies and isn’t trying to work against activists here. If anything, not mentioning the dangers of the current models (which do exist) might lead to media backlash of the “X-risk is a distraction” sort. So I really don’t think Holly saying “Meta’s frontier AI models are fundamentally unsafe” is evidence of a lack of careful consideration re: messaging here.
I do agree with the Open Source issue though. In that case, it seems like the message isn’t just imprecise, but instead pointing in the wrong direction altogether.
For me, one of the main takeaways of the FTX debacle was a reminder of the fact we have something to lose. That a load of money isn’t just a number or a means to personal enrichment, but rather its value is weighed in the absolutely mind-boggling number of people and animals that our efforts today could impact.
So, in a strange way, I’m really glad that I’m surrounded by people who care enough for this to have hurt, and for it to have hurt for the right reasons.
It’s a reminder that this community is largely composed of people who are remarkably driven to make the world a better place, even long after they’re no longer in it. It helps me recalibrate to see this as a bump in the road and focus on the next steps, knowing there’s a lot of talent and a lot of motivation and a lot of care around me.
So, thanks to you all! I appreciate you.
Wow. This is great.
I’ve been looking for a write-up like this for a long time. And thanks for formatting it so well (sections, subsections, effect sizes, hyperlinks, 250+ references).
It’s a bit depressing that so many of the effect sizes for interventions with a strong base of evidence are relatively small. I guess there’s part of me that wants a silver bullet, but I know well enough that no such thing exists — at least not broadly across the population. Nonetheless, I’m guessing people could get a lot out of experimenting with and implementing many of these.
I’m looking forward to digging more into this!
I’m also heartened by recent polling, and I spend a lot of time these days thinking about how to argue for the importance of existential risks from artificial intelligence.
I’m guessing the main difference in our perspective here is that you see including existing harms in public messaging as “hiding under the banner” of another issue. In my mind, (1) existing harms are closely related to the threat models for existential risks (i.e. how do we get these systems to do the things we want and not do the other things); and (2) I think it’s just really important for advocates to try to build coalitions between different interest groups with shared instrumental goals (e.g. building voter support for AI regulation). I’ve seen a lot of social movements devolve into factionalism, and I see the early stages of that happening in AI safety, which I think is a real shame.
Like, one thing that would really help the safety situation is if frontier models were treated like nuclear power plants and couldn’t just be deployed at a single company’s whim without meeting a laundry list of safety criteria (both because of the direct effects of the safety criteria, and because such criteria literally just buys us some time). If it is the case that X-risk interest groups can build power and increase the chance of passing legislation by allying with others who want to include (totally legitimate) harms like respecting intellectual property in that list of criteria, I don’t see that as hiding under another’s banner. I see it as building strategic partnerships.
Anyway, this all goes a bit further than the point I was making in my initial comment, which is that I think the public isn’t very sensitive to subtle differences in messaging — and that’s okay because those subtle differences are much more important when you are drafting legislation compared to generally building public pressure.
OPP was making grants in the Global Health and Wellbeing space (which includes animal welfare) long before this.
The data are available via their grants database [1] — it doesn’t look to me like there was any shift away from longtermism that coincided with SBF/FTX entering the space (if anything, it looks like the opposite could be true in 2022).
I think just letting the public know about AI lab leaders’ p(doom)s makes sense—in fact, I think most AI researchers are on board with that too (they wouldn’t say these things on podcasts or live on stage if not).
It seems to me this campaign isn’t just meant to raise awareness of X-risk though — it’s meant to punish a particular AI lab for releasing what they see as an inadequate safety policy, and to generate public/legislative opposition to that policy.
I think the public should know about X-risk, but I worry that using soundbites of it to generate reputational harms and counter labs’ safety agendas might make it less likely they speak about it in the future. It’s kind of like a repeated game: if the behavior you want in the coming years is safety-oriented, you should cooperate when your opponent exhibits that behavior. Only when they don’t should you defect.
Man, this interview really broke my heart. I think I used to look up to Sam a lot, as a billionaire whose self-attested sole priority was doing as much as possible to help the most marginalized + in need, today and in the future.
But damn… “I had to be good [at talking about ethics]… it’s what reputations are made of.”
Just unbelievable.
I hope this is a strange, pathological reaction to the immense stress of the past week for him, and not a genuine unfiltered version of the true views he’s held all along. It all just makes me quite sad, to be honest.