It seems to me that the limiting step here is the ability to act like an agent. If we already have AI that can reason and answer questions at a PhD level, why would we need reasoning and question answering to be any better?
Yes, I basically agree that’s the biggest limiting factor at this point.
However, a better base model can improve agency via, e.g., better perception (which is still weak).
And although reasoning models are good at science and math, they still make dumb mistakes when reasoning about other domains, and agents need very high reliability. So I expect better reasoning models will also help with agency quite a bit.