I agree with parts of this. EA has produced genuinely good work, and the forecasting culture is a real epistemic virtue compared to most advocacy communities.
However:
I want to flag an EA vice: take some made-up numbers, put them in a simple model, get a scary output, present it as a forecast. The AI 2027 timelines model is a good example, and 80k chose it as the first video for their new channel, framed as "research-based." The authors' own defense when critiqued was "bad model better than no model." I'd argue a bad quantitative model is often worse than no model, because it creates an impression of rigor that pure qualitative reasoning wouldn't.
On longtermism being "correct" as evidence of good epistemics: this is backwards. Longtermism is a values commitment. Pointing to it as vindication of EA's epistemic practices is precisely the move bad epistemics looks like: start with the conclusion, find reasons to believe you were right all along.
I think using made-up-ish numbers can be better than not doing it at all, as long as there's no impression of rigor. We don't have to claim rigor! How else would we decide? I agree there are some cases where it's better not to use numbers, but where we have some real-ish baseline figures, it's better than having nothing to hang our hats on. In the case of AI 2027 there were some real numbers, like past compute scaling patterns and METR progress, so it wasn't unreasonable to make predictions based on these, I think at least. But we all have different lines for rigor...
But I 100% agree there was far too much of a "research-based" and "these are the top experts" framing on AI 2027, which was proved absurd within months when the prediction had been pushed back to 2029. There was at least a little too much confidence/bluster about the way it was presented. To their credit, though, they pushed it back, which is a tick on the "good epistemics" ledger...
In my experience, here's how these things go ~100% of the time:
Authors make up some numbers, and they include about a dozen caveats about the limitations of their model.
Readers ignore all the caveats and accuse them of claiming to be rigorous, even though they claimed no such thing.
AI 2027 is a great example of this.
Hi Michael. That is fair. On the other hand, what readers ignore depends on how the results are communicated. It would be harder to ignore uncertainty about AI timelines if AI 2027 had been called, e.g., AI 2027-2047, and even this would undercommunicate the uncertainty. The difference between the 90th and 10th percentile dates of artificial superintelligence (ASI), as defined below, is more than 100 years for Daniel Kokotajlo and Eli Lifland, the two main forecasters of the AI Futures Model (which superseded AI 2027).
Agreed on the narrow point: anchoring on real data is better than pure vibes, when there is real data.
First, my main complaint about AI 2027 is that they extrapolate from METR data to fit a model while mostly ignoring the heavy caveats the METR people attached to their graph. (This is not unique to AI 2027: Situational Awareness did something similar, and many people extrapolate a lot from benchmarks when this is not warranted or endorsed by the benchmarks' creators.)
This is an example of what I see as a broad problem in EA/rationality circles: someone says "bad model better than no model" and then uses numbers that are not "empirical with huge error bars" but completely made up.
More on made-up numbers: psychological anchoring makes people say 1% instead of 10^-5 for implausible claims, just because percentages are a typical way of expressing probabilities.
More generally, on community epistemics and why I'm picking on this particular example.
80k made a dramatized video out of AI 2027 for a mass audience. I showed this video to some people in my circle, and their reaction was to dismiss 80k's channel as one more piece of AI hype/doom content. This is similar to what I remember being my own first reaction when I encountered 80k, well before learning anything about EA.
They even admitted that they chose AI 2027 in part because "it's a story, so people are compelled to keep watching".
They also said they received criticism for being "too speculative", but I haven't seen them engage with the substance of it, at least in their retrospective. Please correct me if I'm wrong on this last part.
Apologies for my earlier claim that 80k admitted a more argument-based video would have depended on preexisting trust. That was AI-generated and I was sloppy in checking it (it appeared in a comment on their retrospective, not from 80k themselves). My trust in AI as a search engine has gone down accordingly.
Hey @Clara Torres Latorre, your point isn't bad, but some of this feels heavily AI-written to me and I don't love it. I could be wrong again (would not be the first time).
The second and last paragraphs were AI-written. The rest I wrote directly myself; for those parts I used AI to search, but I double-checked the sources (though not well enough), because it hallucinated a bunch of stuff.
Now it's 100% written by me. I don't know if it was worth my time, but I hate AI slop, so be the change that you want to see in the world, etc.
What seems AI-written about it? (I'm conscious I received a similar flag from you a while back too, hah!)
Read above, she changed it.
excessive colons
"it's not x, it's y" constructions
some language which was technically correct but seemed hollow.
but I'll find it hard to describe exactly why sometimes