I think using made-up-ish numbers can be better than not using numbers at all, as long as there's no impression of rigor. We don't have to claim rigor! How else would we decide? I agree there are some cases where it's better not to use numbers, but where we have some real-ish baseline figures, it's better than having nothing to hang our hats on. In the case of AI 2027 there were some real numbers, like past compute scaling patterns and METR progress, so it wasn't unreasonable to make predictions based on these, I think. But we all have different lines for rigor...
But I 100% agree there was far too much of a "research-based" and "these are the top experts" framing around AI 2027, which was proved absurd within months when the prediction was pushed back to 2029. There was at least a little too much confidence and bluster in the way it was presented. To their credit, though, they pushed it back, which is a tick on the "good epistemics" ledger...
In my experience, here's how these things go ~100% of the time:
1. Authors make up some numbers, and they include about a dozen caveats about the limitations of their model.
2. Readers ignore all the caveats and accuse them of claiming to be rigorous, even though they claimed no such thing.
AI 2027 is a great example of this.
Hi Michael. That is fair. On the other hand, what readers ignore depends on how the results are communicated. It would be harder to ignore uncertainty about AI timelines if AI 2027 had been called, say, "AI 2027-2047", and even that would still undercommunicate the uncertainty. The difference between the 90th and 10th percentile dates of artificial superintelligence (ASI), as defined below, is more than 100 years for Daniel Kokotajlo and Eli Lifland, the two main forecasters of the AI Futures Model (which superseded AI 2027).
Agreed on the narrow point: anchoring on real data is better than pure vibes, when there is real data.
First, my main complaint about AI 2027 is that they extrapolate from METR data to fit a model while mostly ignoring the heavy caveats that the METR team attached to their graph. (This is not unique to AI 2027; Situational Awareness did something similar, and many people extrapolate a lot from benchmarks when this is not warranted or endorsed by the benchmarks' creators.)
This is an example of what I see as a broad problem in EA/rationality circles, where someone says "a bad model is better than no model" and then uses numbers that are not "empirical with huge error bars" but completely made up.
More on made-up numbers: psychological anchoring makes people say 1% instead of 10^-5 for implausible claims, just because percentages are the typical way of expressing probabilities.
More generally, on community epistemics and why I’m picking on this particular example.
80k made a dramatized video out of AI 2027 for a mass audience. I showed this video to some people in my circle, and their reaction was to dismiss 80k's channel as one more source of AI hype/doom content. This is similar to my own first reaction when I encountered 80k, well before I learned anything about EA.
They even admitted that they chose AI 2027 in part because “it’s a story, so people are compelled to keep watching”.
They also said they received criticism for being "too speculative", but I haven't seen them engage with the substance of that criticism, at least not in their retrospective. Please correct me if I'm wrong on this last part.
Apologies for my earlier claim that 80k admitted a more argument-based video would have depended on preexisting trust. That claim was AI-generated, and I was sloppy about checking it (it came from a comment on their retrospective, not from 80k themselves). My trust in AI as a search engine has gone down accordingly.
Hey @Clara Torres Latorre 🔸 your point isn't bad, but some of this feels heavily AI-written to me and I don't love it. I could be wrong again (it would not be the first time).
The second and last paragraphs were AI-written. I wrote the rest directly, though I used AI to search and double-checked the sources (not well enough, because it hallucinated a bunch of stuff).
Now it's 100% written by me. I don't know if it was worth my time, but I hate AI slop, so be the change you want to see in the world, etc.
What seems AI-written about it? (I'm conscious I received a similar flag from you a while back too, hah!)
Read above; she changed it.
- Excessive colons.
- "It's not x, it's y" constructions.
- Some language that was technically correct but seemed hollow.

But I'll find it hard to describe exactly why sometimes.