So have you done anything or do you just have the high-level idea?
Clara Torres Latorre 🔸
If something looks like ChatGPT-generated LinkedIn slop, does it get a pass because of draft amnesty? Trying to calibrate.
Let’s say I (or anyone else) read this post and am convinced, but am below the 10k/year bar to talk to meta charity funders.
Is there somewhere I could find out where to donate, and how much room for funding at what multiplier they expect? I’m happy to put some time into the analysis, but I don’t even know where to start.
Quoting their Giving Multiplier calculations doc:
https://docs.google.com/document/d/1jmq18Ud4RQx1sPdnTHv5zD_DYoVx8PmTk3aW0d9T04M/edit?tab=t.0#heading=h.jjl75aeojvn
4. Evidence of Behavioural Influence
In December 2025, we added a live impact calculator to our donation form and compared giving between donors who saw it and those who didn’t. Our A/B test of impact-focused messaging demonstrated:
Control group (standard donation form): 172 donations, AUD 69,469 total (AUD 403.89 average)
Impact Calculator group: 196 donations, AUD 262,804 total (AUD 1,340.84 average)
Effect: 3.4x increase in average donation size, 3.8x increase in total value
Donors who saw the impact calculator gave AUD 936.95 more per donor than those who didn’t, representing 70% of their total donation value. This validates that our impact-focused communications measurably change donor behaviour, providing evidence that we actively influence giving decisions rather than passively capture donations.
While this experiment represents only a small subset of total donations, it provides a clear demonstration of causal impact, supporting our understanding of how our communications contribute to above-baseline growth in 2025, when we intensified focus on impact messaging across all donor channels.
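As a quick sanity check, the quoted averages and ratios can be recomputed from the reported donation counts and totals (all figures come from the quoted doc; this is just arithmetic, not new data):

```python
# Reported figures from the quoted Giving Multiplier doc (AUD).
control_n, control_total = 172, 69_469
treat_n, treat_total = 196, 262_804

avg_control = control_total / control_n  # average donation, control group
avg_treat = treat_total / treat_n        # average donation, calculator group

print(f"Control average:   {avg_control:.2f}")               # 403.89
print(f"Treatment average: {avg_treat:.2f}")                 # 1340.84
print(f"Extra per donor:   {avg_treat - avg_control:.2f}")   # 936.95
print(f"Total-value ratio: {treat_total / control_total:.2f}x")  # 3.78x
```

The per-donor difference (AUD 936.95) and the ~3.8x total-value ratio check out exactly. The quoted “3.4x increase in average donation size” comes out slightly lower with these numbers (1340.84 / 403.89 ≈ 3.32), so that figure may be rounded or computed on a slightly different basis.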
In my view, these are massive gains, and other effective giving organizations should try to replicate them to see whether the results hold up. Are you planning on writing more about this specifically?
I think the first sentence in your post is great.
But right now this depends on the individual will of the OP, and what I’m suggesting is something structural.
Imagine, for instance, that when you write something you can select the level of AI involvement from a dropdown, and it appears somewhere.
Yes, this is a mistake. We’ll fix it ASAP.
Should be fixed now. Thank you for flagging it.
I would love to have a discreet, mandatory way to disclose the level of AI use on the forum. I’m not sure how it could look in practice, but I’m in favour of normalizing AI use in writing while also being honest about how much AI went into the text.
How Are EAs Really Doing? Help Us Find Out (~15 min survey)
The second and last paragraphs were AI-written. The rest I wrote directly, using AI only to search, and I double-checked the sources (but not well enough, because it hallucinated a bunch of stuff).
Now it’s 100% written by me. I don’t know if it was worth my time, but I hate AI slop, so be the change that you want to see in the world, etc.
Agreed on the narrow point: anchoring on real data is better than pure vibes, when there is real data.
First, my main complaint about AI 2027 is that they extrapolate from METR data to fit a model while mostly ignoring the heavy caveats that the METR people attached to their graph. (This isn’t unique to AI 2027: Situational Awareness did something similar, and many people extrapolate a lot from benchmarks when this isn’t warranted or endorsed by the benchmarks’ creators.)
This is an example of what I see as a broad problem in EA/rationality circles: someone says “a bad model is better than no model” and then uses numbers that are not “empirical with huge error bars” but completely made up.
More on made-up numbers: psychological anchoring makes people say 1% instead of 10^-5 for implausible claims, just because percentages are a typical way of expressing probabilities.
More generally, on community epistemics, and why I’m picking on this particular example:
80k made a dramatized video out of AI 2027 for a mass audience. I showed this video to some people in my circle, and their reaction was to dismiss 80k’s channel as one more piece of AI hype/doom content. This is similar to what I remember being my first reaction when I encountered 80k, well before learning anything about EA.
They even admitted that they chose AI 2027 in part because “it’s a story, so people are compelled to keep watching”.
They also said they received criticism for being “too speculative”, but I haven’t seen them engage with the substance of it, at least in their retrospective. Please correct me if I’m wrong on this last part.
Apologies for the previous claim that 80k admitted a more argument-based video would have depended on preexisting trust. That claim was AI-generated and I was sloppy in checking it (it came from a comment on their retrospective, not from 80k themselves). My trust in AI as a search engine has gone down accordingly.
I agree with parts of this. EA has produced genuinely good work, and the forecasting culture is a real epistemic virtue compared to most advocacy communities.
However:
I want to flag an EA vice: take some made-up numbers, put them in a simple model, get a scary output, and present it as a forecast. The AI 2027 timelines model is a good example, and 80k chose it as the first video for their new channel, framed as “research-based.” The authors’ own defense when critiqued was “a bad model is better than no model.” I’d argue a bad quantitative model is often worse than no model, because it creates an impression of rigor that purely qualitative reasoning wouldn’t.
On longtermism being “correct” as evidence of good epistemics: this is backwards. Longtermism is a values commitment. Pointing to it as vindication of EA’s epistemic practices is precisely the move bad epistemics looks like: start with the conclusion, find reasons to believe you were right all along.
“a policy issue”
I think the most important thing here is what, specifically, the project is about.
Can you expand on what kind of funding and what kind of project you have in mind?
I agree with Nick here. On the substance: the ideas are interesting, but the claims are too bold for the evidence supporting them, a typical feature of LLM-written text.
Cool (:
I’m specifically interested in automating the filtering of EA-related opportunities and events to write our weekly announcements.
I think that, with a bit of tweaking, this would be a public good for EA community building and could be reused by many groups.
Hey, cool toy model (:
I bet there isn’t enough data in METR’s work on how messy the tasks are to include that here, but I would expect messiness to have real-world consequences and to tug in the direction of agents being less viable outside well-defined domains.
Very interesting critique. I’ve seen these kinds of comments in academic circles doing evals work, and there have been attempts to improve the situation, such as the General Scales Framework:
https://arxiv.org/abs/2503.06378
Think of it as passing an IQ test instead of a school exam: more predictive power. It’s not perfect, of course, but thankfully some people are really taking this seriously.
I think allowing this debate to happen would be a fantastic opportunity to put our money where our mouth is regarding not ignoring systemic issues:
https://80000hours.org/2020/08/misconceptions-effective-altruism/#misconception-3-effective-altruism-ignores-systemic-change
On the other hand, deciding that democratic backsliding is off limits, and not even trying to have a conversation about it, could (rightfully, in my view) be treated as evidence of EA being in an ivory tower and disconnected from the real world.
Strongly downvoted because, while it points to some plausible failure modes of LLMs, this is unnecessarily long, hard to read, and unclear about what is being tested or how.