All the disagreements on worldview can be phrased correctly. Currently people use the word “marginal” to sneak in specific values and assumptions about what is effective.
No, it’s literally about what the word marginal means
I think people like the “labs” language because it makes it easier to work with them and all the reasons you state, which is why I generally say “AI companies”. I do find it hard, however, to make myself understood sometimes in an EA context when I don’t use it.
I do feel called to address the big story (that’s also usually what makes me sad and worn out), but, like you, what really brings me back is little stuff like a beautiful flower or seeing a hummingbird.
EAs forgot what “marginal” means
I agree that having access to what the labs are doing, and the ability to blow the whistle, would be super useful. But I've recently updated hugely toward taking seriously the risk of value drift from having people embedded in the labs. We imagine cheaply having access to the labs, but through these people the labs and their values will also have access back to us and our mission-alignment.
I think EA should be setting an example to a more confused public of how dangerous this technology is, and being mixed up in making it makes that very difficult.
See my other comments. “Access” to do what? At what cost?
I didn’t mean “there is no benefit to technical safety work”; I meant more like “there is only benefit to the labs in emphasizing technical safety work to the exclusion of other things”, as in it benefits them and doesn’t cost them to do this.
Yeah good point. I thought Ebenezer was referring to more run-of-the-mill community members.
I think you, and this community, have no idea how difficult it is to resist value/mission drift in these situations. This is not a friend-to-friend exchange. It’s a small community of nonprofits and individuals up against the most valuable companies in the world. They aren’t just gonna pick up the values of a few researchers by osmosis.
From your other comment it seems like you have already been affected by the labs’ influence via the technical research community. The emphasis on technical solutions only benefits them, and it just so happens that to work on the big models you have to work with them. This is not an open exchange where they have been just as influenced by us. Sam and Dario sure want you and the US government to think they are the right safety approach, though.
Here’s our crux:
My subjective sense is there’s a good chance we lose because all the necessary insights to build aligned AI were lying around, they just didn’t get sufficiently developed or implemented.
For both theoretical and empirical reasons, I would assign a probability as low as 5% to there being alignment insights just lying around that could protect us at the superintelligence capability level and that don’t require us to slow or stop development in order to implement in time.
I don’t see a lot of technical safety people engaging in advocacy, either? It’s not like they tried advocacy first and then decided on technical safety. Maybe you should question their epistemology.
What you write there makes sense, but it’s not free to have people in those positions, as I said. I did a lot of thinking about this when I was working on wild animal welfare. It seems superficially like you could get the right kind of WAW-sympathetic person into agencies like FWS and the EPA, and they would be there to, say, nudge the agency to help animals, in ways no one else cared about, when the time came. I did some interviews, looked into some historical cases, and concluded this is not a good idea.
The risk of being captured by the values and motivations of the org where they spend most of their daily lives before they have the chance to provide that marginal difference is high. Then that person is lost to the Safety cause or converted into a further problem. I predict that you’ll get one successful Safety sleeper agent in, generously, 10 researchers who go to work at a lab. In that case your strategy is just feeding the labs talent and poisoning the ability of their circles to oppose them.
Even if it’s harmless, planting an ideological sleeper agent in firms is generally not the best counterfactual use of the person because their influence in a large org is low. Even relatively high-ranking people frequently have almost no discretion about what happens in the end. AI labs probably have more flexibility than US agencies, but I doubt the principle is that different.
Therefore I think trying to influence the values and safety of the labs by working there is a bad idea that is unlikely to be pulled off.
Connect the rest of the dots for me—how does that researcher’s access become community knowledge? How does the community do anything productive with this knowledge? How do you think people working at the labs detracts from other strategies?
There should be protests against them (PauseAI US will be protesting them in SF 2/28) and we should all consider them evil for building superintelligence when it is not safe! Dario is now openly calling for recursive self-improvement. They are the villains—this is not hard. The fact that you would think Zach’s post with “maybe” in the title is scrutiny is evidence of the problem.
If the supposed justification for taking these jobs is so that they can be close to what’s going on, and then they never tell and (I predict) get no influence on what the company does, how could this possibly be the right altruistic move?
What you seem to be hinting at, essentially espionage, may honestly be the best reason to work in a lab. But of course those people need to be willing to break NDAs and there are better ways to get that info than getting a technical safety job.
(Edited to add context for bringing up “espionage” and to elaborate on the implications.)
Great piece: a great prompt to rethink things and a good digest of the implications.
If you agree that mass movement building is a priority, check out PauseAI-US.org (I am executive director), or donate here: https://www.zeffy.com/donation-form/donate-to-help-pause-ai
One implication I strongly disagree with is that people should be getting jobs in AI labs. I don’t see you connecting that to actual safety impact, and I sincerely doubt working as a researcher gives you any influence on safety at this point (if it ever did). There is a definite cost to working at a lab: capture and NDA-walling. Already so many EAs work at Anthropic that it is shielded from scrutiny within EA, and the attachment to “our player” Anthropic has made it hard for many EAs to do the obvious thing and support PauseAI. Put simply: I see no meaningful path to impact on safety working as an AI lab researcher, and I see serious risks to individual and community effectiveness and mission focus.
Yes, you detect correctly that I have some functionalist assumptions in the above. They aren’t strongly held, but I had hoped then that we could simply avoid building conscious systems by pausing generally. Even if it seems less likely now that we can avoid making sentient systems at all, I still think it’s better to stop advancing the frontier. I agree there could in principle be a small animal problem with that, but overwhelmingly I think the right move re: digital sentience is pausing, because of the benefits of more time, of creating fewer possibly sentient models before we learn more about how their architecture corresponds to their experience, and of pushing a legible story about why it is important to stop without getting into confusing paradoxical effects like the small animal problem. (I formed this opinion in the context of animal welfare: people get the motives behind vegetarianism; they do not get why you would eat certain wild-caught fish and not chickens, so you miss out on the power of persuasion and norm-setting.)
If you are so inclined, consider donating: individual donors can make a big difference to PauseAI US as well (more here: https://forum.effectivealtruism.org/posts/YWyntpDpZx6HoaXGT/please-vote-for-pauseai-us-in-the-donation-election)
We’re the highest voted AI risk contender in the donation election, so vote for us while there’s still time!
Yeah, because then it would be a clear conversation. The tradeoffs that are currently obscured would be out in the open, and the speculation would be unmasked.