Wei Dai comments on Better Futures Discussion Thread: With Fin Moorhouse

Wei Dai 25 Apr 2026 20:05 UTC
21 points
1 ∶ 0
“For example, in What We Owe The Future, Will said he thought that the expected value of the future, given survival, was less than 1% of what it might be. After being exposed to some of the arguments in this essay, he revised his views closer to 10%; after analysing them in more depth, that percentage dropped a little bit, to 5%-10%.”

[...]

“However, it’s unlikely to me that companies will in fact produce morally uncertain AIs that are motivated by doing good de dicto. They probably won’t have thought about this issue, and won’t be motivated by trying to improve scenarios in which humanity is disempowered.”

Given this combination of views, I’m surprised that Will doesn’t support what @Holly Elmore ⏸️ 🔸 calls “Pause NOW” and instead wants to see a pause later (after we have human-level AI). I’m curious if your own views are similar or how they differ from Will’s. (My own “expected value of the future, given survival” I would say is similarly pessimistic, but I’m reluctant to put it into numbers due to being very unsure how to quantify it.)

Aside from what Holly said in the linked comment, which I agree with, another argument more relevant to the current discussion is that many opportunities for making the future better seem to exist during the AI transition, including the early parts of it, so by not pausing ASAP (and currently having few resources for such interventions), we’re permanently giving up these opportunities. Conversely, by pausing NOW, we buy more time to think and strategize about how to better intervene on these opportunities, or otherwise lay the groundwork for them.

For example, during the pause, we could:
1. Try to solve metaphilosophy, or otherwise think about how to improve AI philosophical competence or moral epistemology.
2. Try to get AI companies to “think about this issue” (of morally uncertain AIs that are motivated by doing good de dicto).
3. Research ways to make such AIs safer from our (human) perspective so that there’s less of a tradeoff between safety and Better Futures.
4. Spread the idea of Better Futures generally so that when AI development resumes, there will be more people aware of and working on these issues.
Such interventions could mean the difference between the first human-level AIs being competent and critical moral/philosophical advisors, or independent moral (and safe) agents, vs uncritically doing what humans seem to want and/or giving bad/incompetent/sycophantic “advice” (when humans think to ask for it), which seemingly can make a big difference to how well the future goes.

What do you think about this argument, and overall about pause now vs later?
- Tom_Davidson 30 Apr 2026 16:30 UTC
  4 points
  0 ∶ 1
  Thanks for this.
  
  In each of the examples you give, I’m thinking that the pause would be significantly more beneficial (plausibly by 10x) if we pause when AI is already capable enough that it can significantly help us solve the issue. In general, they seem like the kinds of issues where AI could massively accelerate progress.
  
  So if I’m choosing between an international pause now vs an international pause in 2 years, I choose the latter. (I assume we’re talking about international pauses here rather than just the U.S. but lmk if you also support a unilateral pause now!)
  
  I do find Holly’s point persuasive that it might be damaging to quibble about exactly when we pause if that reduces the chance of a pause happening at all. And today we are very far from a pause actually happening, and one may well be needed in two years’ time, so I def support efforts to get us closer to a pause!
  
  I’m hesitant about saying “pause now” because I actually think a different policy might be much more effective. But I think a world where we were about to do an international pause would be better than the actual world.
  
  (I want to think more about this topic and all of this is v tentative.)
  - Wei Dai 2 May 2026 18:38 UTC
    6 points
    0 ∶ 0
    
    “In each of the examples you give, I’m thinking that the pause would be significantly more beneficial (plausibly by 10x) if we pause when AI is already capable enough that it can significantly help us solve the issue.”
    
    Why assume that there can only be one pause? Pausing now could make a later pause both more likely and more useful, by building the infrastructure and precedent for pausing, and by making subsequent AIs more aligned and differentially more productive in areas that we care about. If we end the first pause only after we’ve solved the problem of building aligned AIs that are philosophically and strategically competent, that would seemingly make subsequent pauses much easier.
    
    I wonder if you’re thinking that we won’t be able to pause long enough to make significant progress on these problems? I can see that if we only have the “willpower” for a single short pause, then it becomes unclear when to best use it.
    
    “In general, they seem like the kinds of issues where AI could massively accelerate progress.”
    
    I have been warning for several years that AI could be differentially bad at philosophy and long-horizon strategy (due in part to AI training requiring massive amounts of training data and/or fast and cheap feedback loops, which are lacking for these fields, and in part to a lack of understanding of e.g. metaphilosophy). So if we don’t pause now (and use the time to fix this issue), then by the time we do pause, we’ll likely have AIs that can accelerate other fields (such as math/coding/science/tech and manipulating humans) much more than the fields that are crucial for Better Futures.
    
    Worse, we may end up with AIs that decelerate (in an absolute sense) hard-to-verify fields like philosophy and long-horizon strategy, because these AIs are better at coming up with plausible-sounding ideas and arguments, and convincing humans of their truth, or persuading humans that their own bad ideas are actually good (which is already being reported under “AI psychosis” and “sycophancy”), than at making real progress in these fields.
    - Tom_Davidson 15 May 2026 13:12 UTC
      2 points
      0 ∶ 0
      Sorry for the slow reply!
      Thanks, this is a helpful perspective.
      I’ve normally thought from a frame of “we’ve got limited chips to spend on pausing, when is it best to spend them”. I think this frame is reasonable if you’re worried about irresponsible developers catching up or tradeoffs with the current gen’s desire to survive.
      But it is true that a pause today might make a pause in the future more likely.
      Otoh, it could also make it less likely if ppl perceive that nothing concretely useful comes out of it, which is my worry with pausing today. Like, I think ~nothing useful would have come from pausing shortly after GPT-4 was released.
      “If we end the first pause only after we’ve solved the problem of building aligned AIs that are philosophically and strategically competent”
      Do you think this is possible with today’s AI capabilities? I’d have thought you can’t match human philosophy and strategy yet, but we are def getting closer.
      Also, how do you think about whether to slow down vs pause, holding fixed the total delay relative to ‘full speed ahead’? I’d have thought slow down is better re iterating on alignment as problems arise and re building philosophically competent AIs.
      “So if we don’t pause now (and use the time to fix this issue) then by the time we do pause, we’ll likely have AIs that can accelerate other fields (such as math/coding/science/tech and manipulating humans) much more than the fields that are crucial for Better Futures.”
      Interesting. I normally expect AI to accelerate philosophy and strategy less than math/coding but more than science/tech. Science/tech rely on experimental bottlenecks, whereas for philosophy the only input is cognitive labour. But you’re right, if AI can’t do philosophy/strategy properly, it won’t speed it up at all! So far, AI systems have been pretty good at these skills though?