Feels like your true objection here is that frontier AI development just isn’t that dangerous? Otherwise I don’t know how you could be more concerned about the few piddling “inaccuracies and misleading statements that I won’t fully enumerate” than about nobody doing CAIP’s work to get the beginnings of safeguards in place.
(Makes much more sense if you were talking about unilateral pauses! The PauseAI pause is international, so that’s just how I think of Pause.)
Then there should be future legislation? Why is it on CAIP and this legislation to foresee the entire future? That’s a prohibitively high bar for regulation.
I love to see people coming to this simple and elegant case in their own way and from their own perspective; this is excellent for spreading the message and helps to keep it grounded. Was very happy to see this on the Forum :)
As for whether Pause is the right policy (I’m a founder of PauseAI and ED of PauseAI US), we can quibble about types of pauses or possible implementations but I think “Pause NOW” is the strongest and clearest message. I think anything about delaying a pause or timing it perfectly is the unrealistic thing that makes it harder to achieve consensus and to have the effect we want, and Carl should know better. I’m still very surprised he said it given how much he seems to get the issue, but I think it comes down to trying to “balance the benefits and the risks”. Imo the best we can do for now is slam the brakes and not drive off the cliff, and we can worry about benefits after.
When we treat international cooperation or a moratorium as unrealistic, we weaken our position and make that more true. So, at least when you go to the bargaining table, if not here, we need to ask for fully what we want without pre-surrendering. “Pause AI!”, not “I know it’s not realistic to pause, but maybe you could tap the brakes?” What’s realistic is to some extent what the public says is realistic.
Short answer: I think trying to time this is too galaxy-brained. I think getting the meme of Pause out there ASAP is good because it pushes the Overton window and gives people longer to chew on it. If and when warning shots occur, they will mainly advance Pause if people already had the idea, before the warning shots happened, that Pause would combat exactly that kind of thing.
I think takes that rely on saving up some kind of political capital and deploying it at the perfect time are generally wrong. PauseAI will gain more capital with more time and conversation, not use it up.
You can even take this further and question why the person on their deathbed doesn’t feel proud of the work they chose to do. Maybe they feel ashamed of their actual preferences and don’t need to. Or maybe they aren’t taking to heart the tradeoff in interests between the experiencing self and the remembering self.
To say someone is not “truthseeking” in Berkeley is like a righteous excommunication. It gets to be an epistemic issue.
Imo the biggest reason not to do this is that it’s labeling the person or getting at their character. There’s a threat implied that they will be dismissed out of hand bc they are categorically in bad faith. It can be weaponized.
As someone trying to start a social movement (PauseAI), I wish EAs were more understanding and forgiving that there isn’t a great literature I can just follow. I feel confident that jumping in and finding my way was a good thing to do because advocacy and activism were neglected angles to a very important problem.
Most of my thinking and decision-making with PauseAI US is based on my world model, not specific beliefs about the efficacy of different practices or philosophies in other social movements. I expect local conditions, factors specific to the topic and landscape of AI, and organizational factors like my leadership style to be more important than which approach is “best” ceteris paribus.
This is totally spitballing, but doing anything that encourages modularity in the circuits (or perhaps at another level?) of the AIs and the ability to swap mind modules would be really good for interpretability.
Ever since this project, I’ve had a vague sense that genome architecture has something interesting to teach us about interpreting/predicting NNs, but I’ve never had a particularly useful insight from it. Love this book on it by Michael Lynch if anyone’s interested.
I’ve heard this idea of AI group selection floated a few times but people used to say it was too computationally intensive. Now who knows?
The closest biology the idea brings to mind is this paper showing that selecting chickens as groups leads to better overall yields (in factory farming :( ) for the reasons you predict: they aren’t as aggressive or stressed by crowding as the chickens that are individually selected for the biggest yields.
The examples of gradient hackers with positive effects seem like they could be following the pattern of “here’s a sub-system doing something bad (e.g. transposons copying themselves incessantly), which the system needs to defend against, so the system finds a defense (e.g. introns) that carries other (maybe greater) benefits which wouldn’t have been found otherwise.” Does that seem like it explains things?
Yes, this is broadly accurate from my knowledge of positive examples (for the organism) of drive. They either contribute more scratch (TEs) or they drive through a nifty innovation (homing endonucleases for mating-type switching in yeast, V(D)J recombination in immune cells) that can be coopted. It’s possible there are other positive contributions that we don’t know about, of course.
I knew it! I’ve been wondering about this for literally years, thanks for confirming that this is a thing that happens.
The coolest example is Cupressus dupreziana, the androgenetic cypress. It’s hard to observe a history of extinctions from meiotic drive, bc it’s not a cause of death that fossilizes, but this one we’re seeing right before it completes. When I learned about this, there were only 28 individuals left in this species. Genome exclusion is covered in chapter 10 of Burt & Trivers.
Re: analogies to recombination, I did think as I was preparing these old notes to post that possibly I should see the cost function, or the task being trained on, as somewhat analogous, in the sense that they are sort of templates against which performance is being checked. It’s a very tenuous thought and I can’t quite make the analogy work, but maybe you or someone else can do something with it.
I have to admit, I wouldn’t have taken it to heart much if these studies hadn’t found much effect (nor if they had found a huge effect). And I feel exposed here bc I know that looks bad, like I’m resisting actual evidence in favor of my vibes, but I really think my model is better and the evidence in these studies should only tweak it.
I’m just not that hopeful that you can control enough of the variables with the few historical examples we have to really know that through this kind of analysis. I also think the defining of aims and impacts is too narrow—Overton window pushing can manifest in many, many ways and still be contributing to the desired solution.
I’m comfortable with pursuing protests through PauseAI US because they were a missing mood in the AI Safety discussion. They are a form of discussion and persuasion, and I approach them similarly to how I decide to address AI danger in writing or in interviews. They are also a form of showing up in force for the cause, in a way that genuinely signals commitment bc it is very hard to get people to do it, which is important to movement building even when the numbers are small. The point of a protest is not only to get whatever that protest’s theme demanded (the theme of our protests is either shutting down your company or getting an international treaty lol); protests feed back into the whole movement and community, which can have many unanticipated but directionally desirable impacts.
I don’t think my approach to protests is perfect by any means, and I may have emphasized them too much and failed to do things I should have done to grow them. But I make my calls about doing protests based on many considerations for how they will affect the rhetorical and emotional environment of the space. I wish there were studies that could tell me how to do this better, but there aren’t, just like there aren’t studies that tell me exactly what to write to change people’s minds on AI danger in the right way. (Actually, a good comparison here would be “does persuasive writing work?” bc there we all have personal experiences of knowing it worked, but as a whole the evidence might be thin for it achieving its aims.)
What is the upshot of this? Is this for new audiences to read? It seems like the most straightforward application of it is futures betting, not positively influencing the future.
Perhaps you’re indicating that the money will run out if frontier AI doesn’t become self-sustaining by 2030? Maybe we can do something to make that more likely?
Because I do struggle to see how this helps.
When I learned more about eyestalk ablation reviewing the Rethink Priorities report, I was surprised how little it seemed to bother the shrimp, and I did downgrade my concern about welfare from that particular practice. However, I think what people are reacting to is more the barbarity of it than the level or amount of harm. (After all, they already knew the shrimp get killed at the end.) I think it’s just so bizarre and gross and exploitative-feeling that it shocks them out of complacency in how they view the shrimp. I think they helplessly imagine themselves losing their own eye and they empathize with the shrimp in a powerful, gut-level way, and that this is why it has been impactful to talk about.
Why not attack them? They defected. They did a really bad thing.
(FYI I’m the ED of PauseAI US and we have our own website pauseai-us.org)
1. It is on every actor morally to do the right thing by not advancing dangerous capabilities, regardless of whether everyone else does it, even though everyone pausing and then agreeing to safe development standards is the ideal solution. That’s what that language refers to. I’m very careful about taking positions as an org, but, personally, I also think unilateral pauses would make the world safer compared to no pauses by slowing worldwide development. In particular, if the US were to pause capabilities development, our competitors wouldn’t have our frontier research to follow/imitate, and it would take other countries longer to generate those insights themselves.
2. “PauseAI NOW” is not just the simplest and best message to coordinate around, it’s also an assertion that we are ALREADY in too much danger. You pause FIRST, then sort out the technical details.