While Anthropic’s plan is a terrible one, so is PauseAI’s. We have no good plans. And we mustn’t fight amongst ourselves.
Who’s “ourselves”? Anthropic doesn’t have “a terrible plan” for AI Safety—they are the AI danger.
What happens because of these papers? Do they influence Anthropic to stop developing powerful AI? Evidently not.
I agree with this descriptively, but at this moment in time the way EA evolved to basically require all these things makes me sad, because it isolates the idea from the broader world and keeps EAs from pursuing interventions outside their norm, like big-tent mass movement building (which I believe is the way forward with AI Safety, but which EAs seem to consider anti-Scout-mindset or something).
Agree. One of the things I most appreciate about old school EA is that it took things that used to feel like above-and-beyond altruism in my personal life and made me see that I actually enjoyed those things selfishly. Local charitable giving or going out of my way to help a friend of a friend became less of a burden once I was “off the hook” because of giving money more effectively, and I realized that the reason I didn’t want to give that stuff up was that it made me feel good and improved my life.
PauseAI US thanks you for your donation, Bruce! Anyone else who wants to make us the beneficiary of a wager is highly encouraged :)
Yeah, because then it would be a clear conversation. The tradeoffs that are currently obscured wouldn’t be hidden and the speculation would be unmasked.
All the disagreements on worldview can be phrased correctly. Currently people use the word “marginal” to sneak in specific values and assumptions about what is effective.
No, it’s literally about what the word “marginal” means.
I think people like the “labs” language because it makes it easier to work with them and all the reasons you state, which is why I generally say “AI companies”. I do find it hard, however, to make myself understood sometimes in an EA context when I don’t use it.
I do feel called to address the big story (that’s also usually what makes me sad and worn out), but, like you, what really brings me back is little stuff like a beautiful flower or seeing a hummingbird.
I agree having access to what the labs are doing and having the ability to blow the whistle would be super useful. I’ve just recently updated hugely in the direction of respecting the risk of value drift from having people embedded in the labs. We’re imagining cheaply having access to the labs, but the labs and their values will also have access back to us and our mission alignment through these people.
I think EA should be setting an example to a more confused public of how dangerous this technology is, and being mixed up in making it makes that very difficult.
See my other comments. “Access” to do what? At what cost?
I didn’t mean “there is no benefit to technical safety work”; I meant more like “there is only benefit to labs in emphasizing technical safety work to the exclusion of other things”, as in it benefits them and doesn’t cost them to do this.
Yeah good point. I thought Ebenezer was referring to more run-of-the-mill community members.
I think you, and this community, have no idea how difficult it is to resist value/mission drift in these situations. This is not a friend-to-friend exchange; it’s a small community of nonprofits and individuals up against the most valuable companies in the world. They aren’t just gonna pick up the values of a few researchers by osmosis.
From your other comment it seems like you have already been affected by the labs’ influence via the technical research community. The emphasis on technical solutions only benefits them, and it just so happens that to work on the big models you have to work with them. This is not an open exchange where they have been just as influenced by us. Sam and Dario sure want you and the US government to think they are the right safety approach, though.
Here’s our crux:
My subjective sense is there’s a good chance we lose because all the necessary insights to build aligned AI were lying around, they just didn’t get sufficiently developed or implemented.
For both theoretical and empirical reasons, I would assign a probability as low as 5% to there being alignment insights just lying around that could protect us at the superintelligence capabilities level and that don’t require us to slow or stop development to be implemented in time.
I don’t see a lot of technical safety people engaging in advocacy, either? It’s not like they tried advocacy first and then decided on technical safety. Maybe you should question their epistemology.
What you write there makes sense, but it’s not free to have people in those positions, as I said. I did a lot of thinking about this when I was working on wild animal welfare. It seems superficially like you could get the right kind of WAW-sympathetic person into agencies like FWS and the EPA, and they would be there to, say, nudge the agency toward helping animals in a way no one else cared about when the time came. I did some interviews and looked into some historical cases, and I concluded this is not a good idea.
The risk of being captured by the values and motivations of the org where they spend most of their daily lives before they have the chance to provide that marginal difference is high. Then that person is lost to the Safety cause, or converted into a further problem. I predict that you’ll get one successful Safety sleeper agent out of, generously, every 10 researchers who go to work at a lab. In that case your strategy is just feeding the labs talent and poisoning the ability of their circles to oppose them.
Even if it’s harmless, planting an ideological sleeper agent in firms is generally not the best counterfactual use of the person because their influence in a large org is low. Even relatively high-ranking people frequently have almost no discretion about what happens in the end. AI labs probably have more flexibility than US agencies, but I doubt the principle is that different.
Therefore I think trying to influence the values and safety of labs by working there is a bad idea that is unlikely to be pulled off.
Connect the rest of the dots for me—how does that researcher’s access become community knowledge? How does the community do anything productive with this knowledge? How do you think people working at the labs detracts from other strategies?
lol “great post, but it fails to engage with what I think about when I think of PauseAI”