There’s been some neat work on building AI agent forecasters. Some of these seem to reach pretty decent accuracy compared with certain groups of human forecasters.
And yet, very little of this seems to be used in the wild, from what I can tell.
It’s one thing to show some promising results in a limited study. But ultimately, we want these tools to be used by real people.
I assume some obvious todos would be:
1. Websites where you can easily ask one or multiple AI forecasters questions.
2. Competing services that package “AI forecasting” tools in different ways, focusing on optimizing (positive) engagement.
3. I assume that many AI forecasters should really be racking up good scores on Metaculus/Manifold by now. The limitation seems mainly to be effort; neither platform offers significant incentives yet.
Optimizing AI forecasting bots, but only in experimental settings, seems akin to optimizing cameras, but only in experimental settings. I’d expect you’d wind up with things that are technically impressive but highly unusable. We might learn a lot about a few technical challenges, but little about what real use would look like or what the key bottlenecks will be.
I haven’t been following this area closely, but why aren’t they making a lot of money on Polymarket?
I’m sure some people are using custom AI tools for Polymarket, but I don’t expect that to be very public.
I was focusing on Metaculus/Manifold, where I don’t think there’s much AI bot engagement yet. (Metaculus does have a dedicated AI benchmarking tournament, but that’s separate from the main site, I believe.)
Also, what are the main or best open-source projects in the space? And if someone wanted to actually use LLMs for forecasting, what’s better than just asking o3 to produce a forecast?
There’s some relevant discussion here:
https://forum.effectivealtruism.org/posts/TG2zCDCozMcDLgoJ5/metaculus-q4-ai-benchmarking-bots-are-closing-the-gap?commentId=TvwwuKB6rNASzMNoo
Basically, it seems like people haven’t outperformed the Metaculus template bot much, which IMO is fairly underwhelming, but it is what it is.
There are simple tricks you can use, though, like running it a few times and averaging the results.
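A minimal sketch of that run-and-average trick, assuming the OpenAI Python client. The model name, the prompt wording, and the "Probability: 0.XX" output format are placeholder choices of mine, not anything from this thread:

```python
# Sketch: query a model several times with the same forecasting question,
# parse a probability out of each answer, and average the samples.
import re
from statistics import mean

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

QUESTION = "Will <event> happen before 2026-01-01?"  # hypothetical question

def one_forecast(question: str, model: str = "o3") -> float:
    """Ask the model once for a probability in [0, 1]."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": (
                "You are a careful forecaster. Reason briefly, then end "
                "your answer with a line of the form 'Probability: 0.XX'.\n\n"
                f"Question: {question}"
            ),
        }],
    )
    text = resp.choices[0].message.content
    match = re.search(r"Probability:\s*([01](?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"No probability found in model output: {text!r}")
    return float(match.group(1))

# Run a handful of times and average the samples.
samples = [one_forecast(QUESTION) for _ in range(5)]
print(f"samples={samples}  mean={mean(samples):.3f}")
```

A median or trimmed mean is a common alternative to the plain mean here, since a single outlier run can drag the average around.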