Wei Dai comments on Better Futures Discussion Thread: With Fin Moorhouse

Wei Dai 2 May 2026 18:38 UTC
6 points
0 ∶ 0

In each of the examples you give, I’m thinking that the pause would be significantly more beneficial (plausibly by 10x) if we pause when AI is already capable enough that it can significantly help us solve the issue.

Why assume that there can only be one pause? Pausing now could make a later pause both more likely and more useful, by building the infrastructure and precedent for pausing, and by making subsequent AIs more aligned and differentially more productive in areas that we care about. If we end the first pause only after we’ve solved the problem of building aligned AIs that are philosophically and strategically competent, that would seemingly make subsequent pauses much easier.

I wonder if you’re thinking that we won’t be able to pause long enough to make significant progress on these problems? I can see that if we only have the “willpower” for a single short pause, then it becomes unclear when to best use it.

In general, they seem like the kinds of issues where AI could massively accelerate progress.

I have been warning for several years that AI could be differentially bad at philosophy and long-horizon strategy (due in part to AI training requiring massive amounts of training data and/or fast and cheap feedback loops, which are lacking for these fields, and in part to lack of understanding of e.g. metaphilosophy). So if we don’t pause now (and use the time to fix this issue) then by the time we do pause, we’ll likely have AIs that can accelerate other fields (such as math/coding/science/tech and manipulating humans) much more than the fields that are crucial for Better Futures.

Worse, we may end up with AIs that decelerate (in an absolute sense) hard-to-verify fields like philosophy and long-horizon strategy, because these AIs are better at coming up with plausible-sounding ideas and arguments and convincing humans of their truth, or at persuading humans that their own bad ideas are actually good (which is already being reported under “AI psychosis” and “sycophancy”), than at making real progress in these fields.
- Tom_Davidson 15 May 2026 13:12 UTC
  2 points
  0 ∶ 0
  Sorry for the slow reply!
  Thanks, this is a helpful perspective.
  I’ve normally thought from a frame of “we’ve got limited chips to spend on pausing, when is it best to spend them”. I think this frame is reasonable if you’re worried about irresponsible developers catching up or tradeoffs with the current gen’s desire to survive.
  But it is true that a pause today might make a pause in the future more likely.
  Otoh, it could also make it less likely if people perceive that nothing concretely useful comes out of it, which is my worry with pausing today. Like, I think ~nothing useful would have come from pausing shortly after GPT-4 was released.
  If we end the first pause only after we’ve solved the problem of building aligned AIs that are philosophically and strategically competent
  Do you think this is possible with today’s AI capabilities? I’d have thought you can’t match human philosophy and strategy yet, but we are def getting closer.
  Also, how do you think about whether to slow down vs pause, holding fixed the total delay relative to ‘full speed ahead’? I’d have thought slow down is better re iterating on alignment as problems arise and re building philosophically competent AIs.
  So if we don’t pause now (and use the time to fix this issue) then by the time we do pause, we’ll likely have AIs that can accelerate other fields (such as math/coding/science/tech and manipulating humans) much more than the fields that are crucial for Better Futures.
  Interesting. I normally expect AI to accelerate philosophy and strategy less than math/coding but more than science/tech. Science/tech rely on experimental bottlenecks, whereas for philosophy the only input is cognitive labour. But you’re right, if AI can’t do philosophy/strategy properly, it won’t speed it up at all! So far, AI systems have been pretty good at these skills though?