Thanks for the detailed reply, I understand your point clearly now I think!
But $20,000 for *all* of the OpenBSD bugs (not just the published ones) doesn't sound like that much to spend on inference compute to me. If AISLE could have spent the same and made an equally impressive announcement, unearthing enough bugs at once that government ministers around the world start issuing statements about it, then shouldn't they have been able to find the investors to fund that? That would have been incredible publicity for them.
The crux for me seems to be whether they have made equally impressive announcements, as you suggest they might have done. Maybe they're just worse at marketing. I don't know enough to evaluate that claim properly, but that does seem the relevant question here: have Anthropic been able to use Mythos to go significantly beyond what the best harnesses could already achieve with existing models for the same inference spend? I thought the answer was a clear yes, and I didn't find the original linked AISLE writeup very convincing at all. Your comment has made me more uncertain, but has still not convinced me, and I'd be really interested to read something more in depth on that question. (Maybe we also would disagree about what the word "significantly" means here, since I guess you are acknowledging it probably represents some improvement).
(Also, I'd push back a bit on your characterization of AI progress. I agree the scaffolding is extremely important, but in my experience the "paradigm shifts" in capability over the last two and a half years I've been working with them have come from the models)
(And extra comment: the fact that cybersecurity capabilities might not imply imminent superintelligence takeoff seems an entirely independent point that I don't necessarily disagree with)
That makes a lot of sense, thanks.
I'm sorry you've said you regret your engagement, since I've found your comments helpful (the link to AISLE's OpenSSL zero days has shifted my view on this a fair bit).
I guess this whole discussion does just feel like a classic example of "All debates are bravery debates".