Winners of the EA Criticism and Red Teaming Contest

Lizka, Oct 1, 2022, 1:50 AM

We’re excited to announce the winners of the EA Criticism and Red Teaming Contest. We had 341 submissions and are awarding $120,000 in prizes to our top 31 entries.

We^[1] set out with the primary goals of identifying errors in existing work in effective altruism, stress-testing important ideas, raising the average quality of criticism (in part to create examples for future work), and supporting a culture of openness and critical thinking. We’re pleased about the progress submissions to this contest made, though there’s certainly still lots of work to be done. We think the winners of the contest are valuable both in their own right as criticisms and as helpful examples of different types of critique.

We had a large judging panel. Not all panelists read every piece (even among the winners), and some pieces have won prizes despite being read by relatively few people or having some controversy over their value. Particularly when looking at challenges to the basic frameworks of effective altruism, there can be cases where there is significant uncertainty about whether a contribution is ultimately helpful. But if it is, it’s often very important, so we didn’t want to exclude cases like this from winning prizes when they had some strong advocates.^[2] You can read about our process and overall thoughts on the contest at the end of this post. Prize distribution logistics are also discussed at the end of this post.

An overview of the winners

Top prizes
1. A critical review of GiveWell’s 2022 cost-effectiveness model and Methods for improving uncertainty analysis in EA cost-effectiveness models by Alex Bates (Froolow) ($25,000 total)
2. Biological Anchors external review by Jennifer Lin ($20,000)
3. Population Ethics without Axiology: A Framework by Lukas Gloor ($20,000)
Second prizes (runners up) — $5,000 each
1. Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg by Haydn Belfield
2. Against Anthropic Shadow by Toby Crisford
3. An Evaluation of Animal Charity Evaluators by eaanonymous1234
4. Red Teaming CEA’s Community Building Work by AnonymousEAForumAccount
5. A Critical Review of Open Philanthropy’s Bet On Criminal Justice Reform by Nuño Sempere
6. Effective altruism in the garden of ends by Tyler Alterman
7. Notes on effective altruism by Michael Nielsen
Honorable mentions — $1,000 for each of the 20 in this category

Top prizes

A critical review of GiveWell’s 2022 cost-effectiveness model and Methods for improving uncertainty analysis in EA cost-effectiveness models by Alex Bates (Froolow^[3]) ($25,000 prize in total)

We’re awarding a total of $25,000 for these two submissions by the same author covering similar ground. A critical review of GiveWell’s 2022 cost-effectiveness model is a deep dive into the strengths and weaknesses of GiveWell’s analysis, and how it might be improved. Methods for improving uncertainty analysis in EA cost-effectiveness models extracts some more generalizable lessons.^[4]

Summary of A critical review of GiveWell’s 2022 cost-effectiveness model: The submission replicates GiveWell’s cost-effectiveness models, critiques their design and structure, notes some minor errors, and suggests some broader takeaways for GiveWell and effective altruism. The author emphasizes GiveWell’s lack of uncertainty analysis as a weakness, notes issues with the models’ architectures (external data sources appear as inputs on many different levels of the model, elements from a given level in the model “grab” from others on that level, etc.), and discusses ways in which communication of the models is confusing. Overall, though, the author seems impressed with GiveWell’s work.

You can also see the author’s own picture-based summary of their findings:

Summary of Methods for improving uncertainty analysis in EA cost-effectiveness models: This post argues that the EA community seriously undervalues uncertainty analysis in economic modelling. It says that a common attitude in EA is to simply plug in “best we can do” numbers and move on, whereas state-of-the-art models in health economics use specific tools for uncertainty analysis. The submission explains specific methods that should be applied in different cases, proposes that developing better tools for uncertainty modelling could be useful,^[5] notes that a deep-dive into GiveWell’s model suggests that resolving “moral” disagreements is more impactful than resolving empirical disagreements, and generally shares a lot of expertise about different tools for uncertainty analysis. You can see the author’s summary of these tools here:

What we liked in these posts: The author was rigorous in trying to understand what is actually important (for GiveWell and for effective altruism more broadly).^[6] They rebuilt GiveWell’s model, allowing them to understand critical inputs and features of the model and to suggest many concrete improvements.^[7] These suggestions seem to focus on key weaknesses, not minor sidetracks that are unlikely to be decision-relevant. The author also made these posts an easy and enlightening experience for readers. The inclusion of clear and information-packed diagrams, and the addition of a second post explaining how people can approach such analyses in general, were both exemplary. We also appreciated that they drew out broader conclusions for effective altruism.

When we asked an external expert reviewer to assess these posts, they wrote:

It’s hard to overstate what a gift the author of this post has provided to GiveWell – replicating a complex model is tedious, time-consuming, and often thankless work.

(In case you’re considering contributing similar work, note that GiveWell recently announced the “Change Our Mind Contest” to encourage (more) critiques of their cost-effectiveness analyses.)

What we didn’t like: It’s not clear that GiveWell’s charity recommendations should change based on the submission’s findings (although the broader lessons seem important). Given this, some panelists worried that classifying the lack of uncertainty analysis as an “extremely severe” error is overstating things. And within EA, GiveWell is a strong example of the type of work that can most benefit from this type of analysis; it is unclear whether the lessons are meaningfully generalizable to areas in which highly sophisticated measurement and quantification are less central.
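To illustrate the contrast the second post draws between plugging in “best we can do” numbers and running an uncertainty analysis, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy model, the function names (`cost_per_outcome_averted`, `point_estimate`, `monte_carlo`), the parameter values, and the choice of distributions are all hypothetical. They are not taken from GiveWell’s models or from the winning submissions, and simple Monte Carlo sampling is only one of several possible approaches.

```python
import math
import random
import statistics


def cost_per_outcome_averted(cost_per_person, baseline_risk, risk_reduction):
    """Toy model (hypothetical): dollars spent per bad outcome averted."""
    averted_per_person = baseline_risk * risk_reduction
    return cost_per_person / averted_per_person


def point_estimate():
    """Plug in a single best-guess value for each input."""
    return cost_per_outcome_averted(5.0, 0.01, 0.5)


def monte_carlo(n_draws=10_000, seed=0):
    """Sample each input from an assumed distribution and summarize the output."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = []
    for _ in range(n_draws):
        # Assumed distributions, chosen only for illustration.
        cost = rng.lognormvariate(math.log(5.0), 0.2)  # median of $5
        risk = rng.triangular(0.005, 0.02, 0.01)       # low, high, mode
        reduction = rng.uniform(0.3, 0.7)
        draws.append(cost_per_outcome_averted(cost, risk, reduction))
    draws.sort()
    return {
        "p05": draws[int(0.05 * n_draws)],
        "median": statistics.median(draws),
        "p95": draws[int(0.95 * n_draws)],
    }


if __name__ == "__main__":
    print("point estimate:", point_estimate())
    print("with uncertainty:", monte_carlo())
```

The point estimate returns a single figure of about $1,000 per outcome averted, while the sampled version reports a 5th to 95th percentile range around the median. That range is the kind of information a single number hides when two options are being compared.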

Biological Anchors external review by Jennifer Lin ($20,000)

Summary: This is a summary and critical review of Ajeya Cotra’s biological anchors report on AI timelines.^[8] It provides an easy-to-understand overview of the main methodology of Cotra’s report. It then examines and challenges central assumptions of the modelling in Cotra’s report. First, the review looks at reasons why we might not expect 2022 architectures to scale to AGI. Second, it raises the point that we don’t know how to specify a space of algorithmic architectures that contains something that could scale to AGI and can be efficiently searched through (inability to specify this could undermine the ability to take the evolutionary anchors from the report as a bound on timelines).

What we liked: AI timelines are important in a great deal of strategic thinking in EA, especially in the longtermist context, and Cotra’s report has become a standard reference, so reviewing it is engaging with central and sophisticated thinking in EA. Lin’s submission is clearly written despite being on a difficult technical topic, and we appreciated its straightforwardness about its own uncertainties and confusions. The critiques that it raises seem to be of central importance rather than just nitpicking or missing the point — if the biological anchors forecasts are misleading or uninformative, this may well be why. And the review provides suggestions for further research that could improve our understanding of these key uncertainties.

What we didn’t like: A lot of this review was focused on understanding which of Cotra’s arguments could be made most legible and strong. While this outlined important gaps in the arguments in Cotra’s report (explaining why readers are not compelled to adopt certain views on timelines), the submission didn’t really discuss how we could reasonably form certain important views on AI timelines (like the likelihood that small extensions of current architectures could scale to AGI, or how probable it is that the first few evolutionary approaches might contain something that works). While we admire the project of putting arguments on as firm foundations as possible (and attempting to undermine foundations others have proposed), we think this review could have done more to position itself to be helpful to people needing to make decisions in light of the current lack of fully-supported arguments.

Population Ethics without Axiology: A Framework by Lukas Gloor ($20,000)

Summary: Population ethics is relevant for decisions that could affect large groups of people, some of whom might or might not exist depending on our actions. It is usually discussed with axiologies – accounts of the objective good. Gloor introduces an alternative framework that considers population ethics from the perspective of individual decision-makers. He introduces the notion of minimal morality (“don’t be a jerk”) that all moral agents should follow, and proposes that what to do beyond that is up to the individual in important ways.

What we liked: This is an ambitious reconceptualization of a key field of ethics. It makes space to capture the powerful intuitions people often have that standard ethical discourse doesn’t allow room for. It proposes a conceptualization of person-affecting views which don’t have the same problems that these views are commonly understood to have (in axiological frameworks). And it provides space for scope-sensitive effective altruism to have moral force without that creating an overwhelming moral pressure to optimize.

What we didn’t like: This is very abstract and at times speculative. Although there are a few practical suggestions for how EA (especially longtermist EA) might present its ideas, these suggestions seem predicated on the ideas of the post being at least plausibly correct. It would have been nice to see a discussion of the likelihood that this framing would actually help people relate to EA ideas in healthier and more valuable ways.

Second prizes (runners up — $5,000 each)

Note: We didn’t deliberately select winners in different categories, but we wanted to structure this announcement post a bit (rather than just listing all the runners up in one large section). So for this post, we’re organizing them by their broad subject.

Critiques of specific concepts and assumptions

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg by Haydn Belfield

This post tells a compelling story of how even smart people with the best intentions can end up in a harmful arms-race dynamic, and extends this lesson to effective altruism and AI safety: “I am concerned that at some point in the next few decades, well-meaning and smart people who work on AGI research and development, alignment and governance will become convinced they are in an existential race with an unsafe and misuse-prone opponent.” (You can see a brief summary that goes into more detail here.)

We think many people might do well to internalize this point more strongly, and think the post helps by providing a clear example. We also loved the concrete description of a dynamic that could plausibly lead to an existential catastrophe and the fact that this description was grounded in a relevant historical analogy. On the other hand, we thought the post’s suggestions for the effective altruism community were vague, and it’s unclear to what extent its implicit recommendations are already understood.

Against Anthropic Shadow by Toby Crisford

This post digs into some toy examples to challenge the concept of anthropic shadow. It argues that anthropics is not an important factor to adjust for in assessing the world we find ourselves in, and claims that people have fallen into the (oh-so-easy) trap of confusing themselves with arguments about the idea.

We liked how the post went step-by-step through the thinking to help the reader build intuitions about what’s going on, and we’d love to be able to throw away an unnecessary concept (which might plausibly be the right response here). On the other hand, the idea of anthropic shadow has always been a niche one, and we suspect it has a limited impact on actual prioritization decisions.

Criticisms of (work by) specific organizations

A note: This category is for critiques people have made of (important) organizations working in the EA space. We think outside scrutiny can be valuable. It can help alert organizations to their blindspots, and it can also help the broader community to understand the strengths and weaknesses of those organizations. At the same time, we’re aware that it’s a hard game to play since there’s often a lot of context outsiders aren’t aware of, so it can be unusually easy for them to miss the point. (In several of these cases one of the panelists had inside context which altered their perception of the criticism. However, they recognized a potential conflict of interest and recused themselves. For example, multiple panelists are affiliated with CEA.)

An Evaluation of Animal Charity Evaluators by eaanonymous1234

This post argues that Animal Charity Evaluators (ACE) communicates poorly about which charities it recommends, insufficiently evaluates charities across different types of interventions (thus losing information about big differences in impact), underrates the effectiveness of animal welfare reforms, and tries to fill too many roles at the same time. It also suggests some concrete improvements, like changes to the language on ACE’s website and a recommendation for ACE to pivot towards having more emphasis on producing original research (leaning into the “evaluator” side of its role) relative to functioning as a fund.

We were impressed by the level of detail in the post, its focus on big problems in ACE’s work and explanations of which issues it felt were more important, and the amount of constructive work done by the author (e.g. diving into research into animal welfare reforms and outlining how this compares to other interventions ACE recommends). On the other hand, we’re not sure how much of the post is novel. We’re also aware of minor errors in the post, like suggesting that an organization that disbanded in 2022 is able to take on some of the roles that ACE is currently holding.

Red Teaming CEA’s Community Building Work by AnonymousEAForumAccount

This post lists a number of issues found in different projects run by the Centre for Effective Altruism (CEA), like understaffing, a shortage of project evaluations, and poor public communications.

We were impressed by the thoroughness of the post, the fact that it was critically reviewing an influential organization, and the fair and constructive spirit of the post. We think it’s helpful to give the community a historical perspective on these issues, so they can better assess present and historical work. On the other hand, the recommendations err on the side of being generic, and since many of the issues it lists are from a past period of turbulence at CEA (which the post acknowledges), it’s unclear how actionable these elements of the critique still are.

A Critical Review of Open Philanthropy’s Bet On Criminal Justice Reform by Nuño Sempere

This post analyzes $200M in grants spent by Open Philanthropy on criminal justice reform, and estimates that these grants were worse than its other grants in global health and development. It then draws conclusions about Open Philanthropy’s policies and procedures and suggests some improvements.

We appreciated that the post built a rough (but useful) model for estimating the effectiveness of interventions for criminal justice reform, that it listed different reasons why Open Philanthropy might have made the grants in the first place (like value of information), that the errors it focuses on are major (cost-effectiveness of big grants), and that it targeted an extremely important entity in effective altruism — it’s a good example of hitting up. On the other hand, we think it’s possible that second-order effects might be especially important in US criminal justice reform (disagreeing with the author, who claims that they’re comparable to those of malaria prevention), and were disappointed that these were not considered in the analysis. At least one panelist felt that the arguments and conclusions in the section on why Open Philanthropy donated to criminal justice were overly speculative and insinuated more than was warranted.

Broad criticism of effective altruism as a phenomenon

Effective altruism in the garden of ends by Tyler Alterman

This post tells the story of the author’s journey through EA, and in particular into (and later out of) a totalizing consequentialist attitude. Even hardcore consequentialists will want to do lots of things that we regard as everyday goods because they are instrumentally useful. But the post warns us that approaching these things with an attitude of “because they’re useful” can lead to them being less useful, as we engage with them less wholeheartedly. (“By analogy, imagine (a) reading your favorite book for its own sake vs (b) reading the same book only to get an A+ on a test. You can feel what’s different about these experiences. What is it?”)

We liked the rich texture of the author’s account, and the conceptual underpinning offered for avoiding totalizing attitudes. The post is a clear account of how a highly involved person in EA became disillusioned. It also gives the historical example of JS Mill’s breakdown, which should be widely known since it prefigures a great deal of “EA disillusionment”. On the other hand, we weren’t very compelled by the alternative vision the author offers of how EA should work and felt that many of the recommendations were vague.

Notes on effective altruism by Michael Nielsen

This post cautions against new moral systems, describes a phenomenon of “EA judo” (whereby people in effective altruism respond to valid outside criticism by saying the critic has simply made the case for EA even stronger, by making EA more effective), and suggests ways for people in effective altruism to avoid “misery traps” — high levels of stress due to worries that one is living wrongly.^[9]

We felt that this was a very strong holistic critique of effective altruism that approached EA with curiosity, knowledge, and a focus on significant problems (a number of which seemed not to have been discussed before). Several panelists have referred to “EA judo” multiple times since reading the post, and feel that it is putting words to an important phenomenon (that is relevant for discussions of criticism). We thought that the lack of suggestion of an alternative to EA (a lack the author acknowledges) made the post weaker. Several panelists also believe that some of the author’s points in the “Summing up” section were not well justified.

Honorable mentions ($1,000 for each of these 20 submissions)

Disclaimer: Some submissions below only got two scores from the panel. This means that many of the panelists have not vetted each submission, and this should not be viewed as a strong endorsement of the claims made (even more than for the winners above). These are organized by the type of criticism they represent (although we didn’t deliberately select winners in different categories — these all got rankings that made them honorable mentions).

Philosophical work

Better vaguely right than precisely wrong in effective altruism: the problem of marginalism by Nicolas Côté and Bastian Steuwer
- We liked: the construction of a clear model that identifies interventions for which a classic approach of estimating value per additional unit of resources fails (because they have discontinuous benefits).
- We didn’t like: that this submission seems to conflate effective altruism with global health and development (or perhaps with GiveWell), and we thought that a number of points made were not very original (or were misguided, by criticizing something that wasn’t quite true).
Wheeling and dealing: An internal bargaining approach to moral uncertainty by Michael Plant
- We liked: how it explored the intuitions for how moral bargaining might work while avoiding some of the traps people worry about.
- We didn’t like: that the post took a long time to get to its points and didn’t make it clear how its proposal differed from the parliamentary approaches that have been discussed a number of times before.
Longtermism, risk, and extinction by Richard Pettigrew
- We liked: that the post introduced and clearly summarized important arguments against expected utility theory (in particular, by incorporating risk-aversion), which is very relevant to work inspired by longtermism.
- We didn’t like: that some of the premises seemed unconvincing and that the takeaways weren’t very actionable (although future work might build on this paper).
Existential risk pessimism and the time of perils by David Thorstad
- We liked: that it developed a concrete model for the value of the future, and drew implications for positions one would have to adopt to reach certain conclusions.
- We didn’t like: the lack of engagement with object-level reasons to find the assumptions reasonable or not. In fact, we thought the top comment on the post did a fantastic job of making these counterpoints, and we decided to award it an extra $1,000 prize (although the comment was not formally submitted to the contest, and the funds for this come from the Forum team’s budget).
Prioritizing x-risks may require caring about future people by Eli Lifland
- We liked: that the post addressed a narrative that has been growing in influence — that the case for mitigating existential risks should be introduced without discussing the idea that future people have moral value — and pointed out serious mistakes (or misleading oversimplifications) in this idea.
- We didn’t like: that the numbers cited in the post were extremely rough — we felt like the post made its points too strongly given how fragile the numbers it used were.
Critique of MacAskill’s “Is It Good to Make Happy People?” by Magnus Vinding
- We liked: that the post critiques an influential work and theory in effective altruism, and uses specific evidence against the theory.
- We didn’t like: as a comment points out, that some of the evidence is incredibly sensitive to different framings of a study. Some of the arguments presented were also not very new.
Tensions between moral anti-realism and effective altruism by Spencer Greenberg
- We liked: that the post addresses a particular combination of beliefs that a group of people holds (and is explicit about addressing the post to this group of people), and that it draws out a potential contradiction while exploring other possible explanations.
- We didn’t like: that the conclusion is a little vague.

Climate change and energy issues

The most important climate change uncertainty by cwa
- We liked: that it pushes back against the tendency in EA to focus mostly on extreme-warming scenarios (>6°C), outlining reasons for why most of the uncertainty comes from poor models of the effects of much more probable medium-warming scenarios. We also liked that the post notes potential points of disagreement and how they might affect readers’ conclusions.
- We didn’t like: that we’re more unsure than usual about whether the conclusions are accurate. For instance, a comment from John Halstead points to literature from climate economics that suggests that harms from medium-warming scenarios can be constrained (although this is disputed in further comments). We also appreciate this comment.
The great energy descent (short version) - An important thing EA might have missed by Corentin Biteau
- We liked: that this makes the case for a really-big-if-true feature of the world that could impact a lot of EA prioritization.
- We didn’t like: that it seemed weak in its engagement with arguments about how the world might adapt to avoid the worst outcomes described (we recommend the top comments for more discussion of these issues).

Critiques of work by specific organizations

Quantifying Uncertainty in GiveWell’s GiveDirectly Cost-Effectiveness Analysis by Sam Nolan (Hazelfire^[10])
- We liked: listing specific reasons for performing uncertainty analyses, developing one for GiveWell’s cost-effectiveness estimates for GiveDirectly, the use of this work to support future similar projects by demonstrating Squiggle’s capabilities, and the inclusion of some concrete takeaways (like the fact that GiveDirectly seems to perform better if a different utility measure is used).
- We didn’t like: that we’re not sure how decision-relevant the specific criticisms were (although the broader takeaways seemed useful), and, as a comment points out, that the analysis used some parameters that were out of distribution (numbers from the UK instead of numbers from LMICs, which we might expect to be very different).
A philosophical review of Open Philanthropy’s Cause Prioritisation Framework by Michael Plant
- We liked: that it explicitly called out possible implicit assumptions and spelled out their implications; that it made concrete recommendations for possible alternate approaches.
- We didn’t like: that it said “I don’t understand what OP means by worldview so I’ll assume it means a set of philosophical assumptions” when this is at odds with how OP describes it (the interpretation is explicitly disavowed in the top comment).
Deworming and decay: replicating GiveWell’s cost-effectiveness analysis by Joel McGuire, Samuel Dupret, and Michael Plant
- We liked: that this submission points out real issues; Alex Cohen responded on behalf of GiveWell, agreeing that incorporating decay more into their model would reduce the cost-effectiveness of deworming (they plan on conducting more research into this), and noting that making this change earlier would have redirected $2-8M in grants. (The comment notes that the submission will likely change some future funding recommendations and improve GiveWell’s decision-making, which seems like a strong positive signal.) We also liked the post’s discussion on reasoning transparency.
- We didn’t like: As the comment linked above notes, it’s unclear that the authors’ approach to incorporating decay is accurate (it might overestimate the effect of decay due to issues with measurement) — it seems like this could lead to systematic underestimation of the effectiveness of programs with long-term benefits (due to increased uncertainty).
[A private submission] by Nuño Sempere
- We have discussed internally and with Nuño the fact that this submission is private, and are sufficiently compelled by the claim that keeping it private will lead to a better outcome than a public submission. The author has shared it directly with the organization in question.
- We have also encouraged Nuño to share how his critique has been addressed within 12 months, or to make his original piece public if its core claims have not been addressed by then.

Critiques of real-world dynamics in effective altruism

Critiques of EA that I want to read by Abraham Rowe
- We liked: that this post lists many issues that could be quite serious and deserve more consideration, and that it inspired more discussion.
- We didn’t like: that the critiques are quite vague (which is natural, given that this is a list), that some were unoriginal, and that the critiques’ relative importance isn’t discussed in the post (i.e. the list is basically flat — it’s not clear what should be prioritized).
EA Criticism: Vegan Nutrition by Elizabeth Van Nostrand
- We liked: that the submission identifies an extremely specific issue that seems potentially incredibly important (and like a bad sign about the community’s epistemics), and suggests a number of changes and projects that could help address the issue.
- We didn’t like: that it didn’t have an estimate of the potential harm from the lack of guidance for vegan nutrition in EA, and we’re not sure about some of the factual claims.
Ways money can make things worse by Jan Kulveit
- We liked: that this is a pretty comprehensive list of issues that funders should potentially pay attention to.
- We didn’t like: that the post didn’t explain how some of the phenomena it lists could become real issues (it generally relied on “stylized examples”), or how likely it is that they’re happening.
Senior EA ‘ops’ roles: if you want to undo the bottleneck, hire differently by AnonymousThrowAway
- We liked: that the post had extremely specific and actionable takeaways in an area that’s important for effective altruism (hiring).
- We didn’t like: that the takeaways from this post didn’t seem as novel or critical as some of the other winners.
Criticism of EA Criticism Contest by Zvi
- We liked: that the submission criticizes a concrete thing (this competition) as a way to get at broader, often unspoken assumptions in the EA community. We particularly liked the list of 21 implicit assumptions as a jumping-off point for discussion.
- We didn’t like: that the post is long, a bit convoluted, and doesn’t make concrete recommendations that many in the EA community are likely to find actionable. And many on our panel disagreed with the object-level claims about criticism.^[11]
Leaning into EA Disillusionment by Helen
- We liked: that this post describes (and names) a real and under-discussed phenomenon, and suggests actions members of the community could take to improve (e.g. maintaining non-EA connections, viewing EA-the-community as only one of the possible paths to impact, noticing disagreements and areas where you’re uncomfortable with something).
- We didn’t like: that large parts of this post were pretty vague, and that some of the ideas were not very new.
Aesthetics as Epistemic Humility by Étienne Fortier-Dubois
- We liked: that it identified a potentially important blindspot for EA (aesthetics), noted that it’s a symmetric weapon, and suggested a number of specific reasons that aesthetics could be important (and flagged pitfalls for those in EA who might start paying attention to aesthetics).
- We didn’t like: that a number of claims seemed unjustified or too strong and some key arguments seemed incorrect.

Notes on the judging process

We got more submissions than we were expecting — 341 submissions (105 of which were submissions via the form). A few of them were entirely private (we’re awarding one private prize, listed above). Some submissions (about 45) were disqualified, usually for being written earlier than our March cutoff. Approximately 60 submissions became finalists.^[12] Panelists cast nearly 800 votes across the 341 submissions (all submissions got at least 2 votes, and top and second prizes got at least 4). When we were out of our depth, we tried to reach out to experts in relevant fields. We also tried to discuss disagreements on the panel as much as possible, but a number of disagreements about which submissions we should reward were unresolved, so the fact that we’re awarding a prize does not mean that everyone on the panel endorses the submission.

After the original announcement, one panelist (Nicole Ross) had to step down from the panel due to other commitments. Because of this and the volume of submissions we got, we invited two other people to join the panel: Aaron Gertler and Bruce Tsai. I’m extremely grateful to everyone who spent time and energy making this happen, and I want to give a special shout-out to Gavin Leech and Bruce Tsai, who collectively gave more than a quarter of the total panelist scores.

How winners will get their prizes

We’ll be emailing or messaging all winners (and referrers, when relevant) about how they should claim their prizes. If you haven’t been contacted by October 8 and you think you should have gotten a prize, please email forum@centreforeffectivealtruism.org.

We don’t plan to reach out to people who did not get prizes.

Closing thoughts

It was fascinating to see the wide range of criticisms that people submitted to this contest. There were many submissions that we felt could have been finalists, and multiple panelists remarked that they learned something important from a post that didn’t end up getting a prize. And although we tried to be critical in our reviews, we were impressed by the quality of the winning submissions. It’s been touching to see people who clearly care deeply about effective altruism put so much into suggesting how it could be even better.

Although we didn’t ask for submissions in particular categories, we did find that there was some natural clustering into similar types of work.^[13] We are slightly more confident in our comparisons within clusters than between clusters.

We also want to note that we deeply appreciate a lot of the projects (ranging from research to the work that organizations in effective altruism do) that got criticized by the winning submissions. We hope that readers don’t come away with the sense that everything that got criticized by a winning entry in this contest is bad — we think everything criticized here has some flaws, but to a first approximation everything has flaws, and when things are valuable enough it’s worth taking the time to identify and learn from those flaws (in practice that sometimes means fixing the issues, and sometimes throwing things away and starting again). Moreover, there are some laudable qualities that a certain project can have that make it more amenable to (especially useful) criticism, like reasoning transparency and epistemic legibility. (We think that GiveWell is a good example of this.)

Of course, the existence of criticism (and the selection of some especially high-quality criticism) doesn’t solve all our problems. Most importantly, criticisms are only useful if they lead to actual changes and improvements.

So we would be excited to see changes in response to these criticisms. (These could be changes suggested by submissions to the contest, changes that are prompted by submissions’ identification of certain issues — even if the people running the relevant projects disagree with the recommendations made by submissions, or other kinds of changes entirely.) Changes could include:

- Corrections of concrete errors listed in the winning submissions (e.g. faults in models, confusing language, incorrect research conclusions)
- Shifts in mindset of people in EA
- Shifts in prioritization decisions (by organizations and by individuals)
- Development of more uncertainty analyses for important research in EA
- Better feedback loops^[14]

Finally, one of our goals was to accelerate meaningful discussion in these areas, and (of course) awarding the critiques funding does not mean we think they are entirely correct or the final word on the subject. So we strongly encourage further discussion on the topics brought up by the criticisms we’re highlighting here.

Thank you all so much!

^
This announcement was written by Lizka, with significant help from Owen Cotton-Barratt and more help from Bruce Tsai and Fin Moorhouse. Other panelists got the chance to review it and some shared feedback, but most didn’t have time to read it carefully. In general, views stated here do not represent views of everyone on the panel.
^
In particular, we wanted to avoid vetoes, or rewarding submissions that satisfied everyone, which would bias us towards uncontroversial submissions.
^
These were published under the name “Froolow,” and we learned the author’s real name after scoring was over.
^
We think that each post was individually very valuable, and having both adds to this — but is less than twice as valuable, as the posts cover similar ground. We have therefore decided to award a prize to this pair equal to the sum of a first ($20,000) and second ($5,000) prize.
^
Like the new programming language for probabilistic estimation, Squiggle.
^
We were also impressed by the fact that the author — a self-described relative outsider to EA — first posted a question on the Forum to make sure that they’d avoid straw-manning the “state of the art” in EA cost-effectiveness analysis and check that GiveWell’s models are the best thing to critique. We think this is a brilliant example to follow.
^
These suggestions included adding an uncertainty analysis, developing a system for prioritizing key inputs in their models (which would make it clearer which data is most important to check more carefully) [1], and re-organizing the presentation of their data, e.g. by fixing inconsistent markup in important sheets [2].
^
Here’s a summary of the original report: Forecasting transformative AI: the “biological anchors” method in a nutshell
^
Readers who are interested in this submission might also benefit from listening to this podcast episode: A podcast episode exploring critiques of effective altruism (with Michael Nielsen and Ajeya Cotra)
^
This was published under the name “Hazelfire,” and we learned the author’s real name after scoring was over.
^
Many members of the panel don’t agree with a number of the claims the post makes (though opinions were divided), including the list of 21 assumptions. For an alternative take on criticisms, see Criticism Of Criticism Of Criticism. While we recused panelists with a conflict of interest for other posts, we were unable to do that in this case (since all panelists had a conflict of interest, by definition), but at least one panelist did recuse themselves.
^
We don’t plan on publishing a full list of finalists, as we haven’t vetted these submissions enough for us to feel comfortable highlighting them so prominently. However, I (Lizka) encourage panelists to share any submissions they particularly liked in the comments of this post.
You can also see a lot of the submissions here.
^
One panelist speculates that there was a negative correlation between how much pieces criticized the foundational assumptions of EA, and how much they made crisp or actionable recommendations. They thought this may be because there is a lot of work required for deriving clear actions as well as for laying new foundations, so it’s rare to see a piece do both.
^
Including for this contest; we’d love to hear general feedback, and are also interested in hearing about any cases where a submission (or our reviews) changed your mind or actions. You might also want to tell the author(s) of the submission if this happens.
^
This is a link to a public Google Document.
