Exponential growth in time horizon with a ~4mo doubling time has been confirmed by other organizations on very different distributions (1, 2). Furthermore, it correlates very well with the Epoch Capabilities Index.
The blog post by the Australian AI safety organization says, “We apply METR’s time-horizon methodology…” How would this address the criticisms raised of METR’s methodology?
At a glance, the FutureTech pre-print makes some interesting choices, e.g., task quality is only scored up to above-average and above-average gets a perfect score, and acknowledges some of the limitations with their methodology, e.g., all tasks used for this experiment must contain all relevant information in the LLM prompt. (Is that realistic for most work tasks?) I wonder if this pre-print will be submitted for publication in a journal? FutureTech seems to be one of those weird MIT hybrids between an academic research group and a management consultancy. I’m not sure if they’ve ever published a peer-reviewed paper.
[Edit on 2026-05-14 at 18:56 UTC: After reading Peter Slattery’s comment below, I spent a few more minutes looking into it, and I’m still not sure what FutureTech is or what kind of stuff they publish. If someone knows and can explain it, that would be helpful. I could spend more time and get to the bottom of it, but I don’t want to spend more time on it right now.
Please also note the EA Forum team has limited my ability to reply to comments, so I can’t reply further. But if you want to continue the discussion, I’m reachable here.]
Someone could take the time to do a deep dive into the FutureTech pre-print and write a review, but I wonder if that’s a good use of anyone’s time? Is there a reason to think this group publishes high-quality research that is worth getting into?
If someone thinks it’s worthwhile, and they also think the pre-print is unlikely to be submitted for peer review, one option would be to ask the EA organization called The Unjournal to commission a review by an external expert.
Are you sure you are thinking of the correct organization when you say:
FutureTech seems to be one of those weird MIT hybrids between an academic research group and a management consultancy. I’m not sure if they’ve ever published a peer-reviewed paper.
I say that because the lab has many publications, including in top peer-reviewed journals like Science. For more context, here is the publications page and here is the bio for Neil Thompson, the head of the lab:
Dr. Thompson’s work has over 3000 citations with an h-index of 21 across his publication portfolio, including such well known and renowned papers as Expertise, The Computational Limits of Deep Learning, and There’s plenty of room at the Top: What will drive computer performance after Moore’s law? Dr. Thompson has been invited to present his work and recommendations to Congressional Staffers (House and Senate), the US Federal Reserve, the Pentagon, National Security Staff, the Department of Commerce, the Department of Energy, Brookings Institute, and most recently presented at a World Summit on the same program as the Prime Minister of India and Former Prime Ministers of England and Australia. With experience in 80+ countries, Dr. Thompson’s research and impact is on a global scale.
Oh, and the preprint will almost certainly be submitted for peer review, but it might take 1-2 years before it is published.
How would this not? It doesn’t use the same tasks nor does it use the same human baseliner panel as the HCAST dataset.