EAs underestimate uncertainty in cause prioritisation
TL;DR:
EA cause prioritisation frameworks use probabilities derived from belief which aren’t based on empirical evidence. As a result, EA cause prioritisation is highly uncertain and imprecise, which means that the optimal distribution of career focuses for engaged EAs should be less concentrated amongst a small number of “top” cause areas.
Post:
80000 Hours expresses uncertainty over the prioritisation of causes in their lists of the most pressing problems:
“We begin this page with some categories of especially pressing world problems we’ve identified so far, which we then put in a roughly prioritised list.”
“We then give a longer list of global issues that also seem promising to work on (and which could well be more promising than some of our priority problems), but which we haven’t investigated much yet.”
Their list is developed in part using this framework and this definition of impact, with a list of example scores from 2017 at https://80000hours.org/articles/cause-selection/.
Importantly, the framework uses probabilities derived from belief which aren’t based on empirical evidence, which makes the probabilities highly uncertain and susceptible to irrationality. Even if 80000 Hours’ own staff were to separately come up with some of the relevant probabilities, I would expect very little agreement between them. When the inputs to the framework are so uncertain and susceptible to irrationality, the outputs should be seen as very uncertain.
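To make this sensitivity concrete, here is a minimal sketch of an ITN-style scoring calculation. Every number in it is an invented stand-in rather than one of 80000 Hours’ actual inputs; the point is only that shifting a single belief-based probability within a range that different reasonable assessors might hold can flip the ranking.

```python
# Minimal sketch of an ITN-style score: scale x tractability x neglectedness.
# All numbers are invented for illustration; they are not 80000 Hours' estimates.

def cause_score(scale, tractability, neglectedness):
    """Crude cost-effectiveness proxy: good done per extra unit of resources."""
    return scale * tractability * neglectedness

def hypothetical_scores(p_catastrophe):
    """Scores for two made-up causes; one depends on a belief-based probability."""
    return {
        # scale here is p(catastrophe this century) * badness if it happens
        "hypothetical risk A": cause_score(p_catastrophe * 1e9, 0.03, 1 / 1e3),
        # a cause whose scale is empirically measurable
        "hypothetical health cause B": cause_score(2e5, 0.3, 1 / 1e2),
    }

for p in (0.001, 0.05):  # two probabilities a reasonable assessor might hold
    scores = hypothetical_scores(p)
    ranking = sorted(scores, key=scores.get, reverse=True)
    print(f"p(catastrophe) = {p}: ranking = {ranking}")
```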
80000 Hours do express uncertainty with their example scores:
“Please take these scores with a big pinch of salt. Some of the scores were last updated in 2016. We think some of them could easily be wrong by a couple of points, and we think the scores may be too spread out.”
Taken literally, the concern that “some of them could easily be wrong by a couple of points” would mean that factory farming could easily be on par with AI, or that land use reform could easily be more pressing than biosecurity.
Meanwhile, the 2020 EA survey tells us about the cause priorities of highly engaged EAs, but it doesn’t tell us what highly engaged EAs are focusing their careers on.
I don’t think the distribution of causes that engaged EAs are focusing their careers on reflects the uncertainty we should have around cause prioritisation.
I’m mostly drawing my sense of what engaged EAs are focusing their careers on from reading Swapcard profiles for EAG London 2022, where careers seemed to be focused largely on EA movement building, randomista development, animal welfare and biosecurity (it’s plausible that people working in AI are less likely to be living in the UK and attending EAG London).
I think if EAs better appreciated uncertainty when prioritising causes, people’s careers would span a wider range of cause areas.
I think 80000 Hours could emphasise uncertainty more, but also that the EA community as a whole just needs to be more conscious of uncertainty in cause prioritisation.
Work in the randomista development wing of EA, and prioritisation between interventions in this area, is highly empirical: it can draw on high-quality evidence and is unusually resistant to irrationality. Since this is the wing of EA that initially draws many EAs to the movement, I think it can give them the misconception that cause prioritisation in EA is also highly empirical and unusually resistant to irrationality, when this is not true.
A useful thought experiment is to imagine 100 different timelines where effective altruism emerged. How consistent do you think the movement’s cause priorities (and rankings of them) would be across these 100 different timelines?
I believe the largest sources of irrationality are likely to be the not-based-on-empirical-evidence probabilities that are used in cause prioritisation, but other potential sources are:
Founder effects—if an individual was a strong advocate for prioritising a particular cause area early in the history of EA, chances are that this cause area is larger now than it ideally should be
Cultural biases—69% of EAs live in the US, the UK, Germany, Australia, and Canada. This may create blind spots and unknown unknowns, and may make the goal of being truly impartial more difficult to achieve, compared to a scenario where more EAs lived in other countries.
Gender biases—there are gender differences in stated priorities by EAs. The more rational EA cause prioritisation is, the smaller I’d expect these differences to be.
- EA & LW Forums Weekly Summary (21 Aug − 27 Aug 22’) by 30 Aug 2022 1:37 UTC; 144 points) (
- A libertarian socialist’s view on how EA can improve by 30 Dec 2022 13:07 UTC; 143 points) (
- Taking prioritisation within ‘EA’ seriously by 18 Aug 2023 17:50 UTC; 102 points) (
- EA & LW Forums Weekly Summary (21 Aug − 27 Aug 22′) by 30 Aug 2022 1:42 UTC; 57 points) (LessWrong;
- 4 Aug 2023 17:45 UTC; 26 points) 's comment on University EA Groups Need Fixing by (
- Should I force myself to work on AGI alignment? by 24 Aug 2022 17:25 UTC; 19 points) (
- Monthly Overload of EA—September 2022 by 1 Sep 2022 13:43 UTC; 15 points) (
I have been thinking about this a lot recently, and it seems like a pretty perennial topic on the forum. This post raises good points—I’m especially interested in the idea that EA might be sitting at something like a local maximum in terms of cause prioritization, such that if you “reran” EA you’d likely end up somewhere else—but there are many ways to come at this issue. The general sentiment, as I understand it, is that the EA cause prioritization paradigm seems insufficient given how core it is.
For anyone who’s landed here, here’s a few very relevant posts, in reverse chronological order:
The “Meta Cause”—Aug ’22
The Case Against “Cause Areas”—July ’21
Why “cause area” as the unit of analysis?—Jan ’21
The Case of the Missing Cause Prioritization Research—Aug ‘20
The ITN framework, cost-effectiveness, and cause prioritisation—Oct ‘19
On “causes”—June ‘14
Paul Christiano on Cause Prioritization research—March ’14
The case for cause prioritization as the best cause
The apparent lack of a well-organized, public “cause ontology,” or a world model which tries to integrate, systematically and extensibly, the main EA theories of change, seems like a research gap, and I’ve been unable to find any writing which satisfactorily resolves this line of inquiry for me.
Note that it’s not just “cause prioritization” per se that this is relevant to, but really any sort of cause ontologies or frameworks for integrating/comparing theories of change. It has to do with the bigger question of how EA ought to structure its epistemic systems, and is also thus relevant to e.g. metascience and collective behavior “studies,” to name a few preparadigmatic proto-disciplines.
Would also recommend this post on the part of Michael Plant’s thesis that discusses some philosophical issues in cause prioritisation: https://forum.effectivealtruism.org/posts/bZBzvJfgShwF9LuLH/doing-good-badly-michael-plant-s-thesis-chapters-5-6-on
And my post titled “EA cause areas are just areas where great interventions should be easier to find”, inspired by the ideas from Michael Plant’s thesis: https://forum.effectivealtruism.org/posts/Gch5fAqY3L66eAbDP/ea-cause-areas-are-just-areas-where-great-interventions
I was going to write this post, so I definitely agree :)
In general, EA recommendations produce suboptimal herding behavior. This is because individuals can’t choose a whole distribution over career paths, only a single career path. Let’s say our best guess at the best areas for people to work in is that there’s a 30% chance it’s AI, a 20% chance it’s biosecurity, a 20% chance it’s animal welfare, a 20% chance it’s global development, and a 10% chance it’s something else. Then that would also be the ideal distribution of careers (ignoring personal fit concerns for the moment). But even if every single person had this estimate, all of them would be optimizing by choosing to work in AI, which is not the optimal distribution. Every person optimizing their social impact actually leads to a suboptimal outcome!
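A toy sketch of that coordination failure, using the credences above (the “everyone individually optimises” rule is the assumption being illustrated):

```python
# Toy model of the herding problem: everyone shares the same credences over
# which area is best, but each individual picks the single highest-credence area.
credences = {"AI": 0.30, "biosecurity": 0.20, "animal welfare": 0.20,
             "global development": 0.20, "other": 0.10}
n_people = 1000

# Ideal allocation (ignoring personal fit): proportional to the credences.
ideal = {area: int(p * n_people) for area, p in credences.items()}

# What happens if every individual maximises their own expected impact.
best_area = max(credences, key=credences.get)
actual = {area: (n_people if area == best_area else 0) for area in credences}

print("ideal: ", ideal)   # {'AI': 300, 'biosecurity': 200, ...}
print("actual:", actual)  # everyone in AI, nobody anywhere else
```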
The main countervailing force is personal fit. People do not just optimize for the expected impact of a career path, they select into the career paths where they think they would be most impactful. Insofar as people’s aptitudes are more evenly distributed, this evens out the distribution of career paths that people choose and brings it closer to the uncertainty-adjusted optimal distribution.
But this is not a guaranteed outcome. It depends on what kind of people EA attracts. If EA attracts primarily people with CS/software aptitudes, then we would see disproportionate selection into AI relative to other areas. So I think another source of irrationality in EA prioritization is the disproportionate attraction of people with some aptitudes rather than others.
This is a useful exercise. I think that in many of these timelines, EA fails to take AI risk seriously (in our timeline, this only happened in the last few years) and this is a big loss. Also probably in a lot of timelines, the relative influence of rationality, transhumanism, philosophy, philanthropy, policy, etc. as well as existing popular movements like animal rights, social justice, etc. is pretty different. This would be good to the extent these movements bring in more knowledge, good epistemics, operational competence, and bad to the extent they either (a) bring in bad epistemics, or (b) cause EAs to fail to maximize due to preconceptions.
My model is something like this: to rank animal welfare as important, you have to have enough either utilitarian philosophers or animal rights activists to get “factory farming might be a moral atrocity” into the EA information bubble, and then it’s up to the epistemics of decision-makers and individuals making career decisions. A successful movement should be able to compensate for some founder effects, cultural biases, etc. just by thinking well enough (to the extent that these challenges are epistemic challenges rather than values differences between people).
I do feel a bit weird about saying “where effective altruism emerged” as it sort of implies communities called “effective altruism” are the important ones, whereas I think the ones that focus on doing good and have large epistemic advantages over the rest of civilization are the important ones.
The more time I’ve put into giving career advice and trying to work out my own career planning, the more I’m drawn to Holden’s aptitude advice. In addition to feeling more actionable, identifying aptitudes you could excel at (and then testing those) is more robust to this cause area uncertainty.
I agree. In my advice giving, especially to college students and recent grads, I lean the same way. I find that people can develop a sense of the aptitudes they align with through experiences in a variety of realms (through non EA-related activities, clubs, jobs, school work), which increases the opportunities for data input and experimentation.
I mostly agree with this post, but I take issue with the summary:
As a Bayesian, you have to assign some subjective probabilities to things, and sometimes there just isn’t empirical evidence. To argue that e.g. 80k doesn’t have enough uncertainty (even if you have reasons to believe that high uncertainty is warranted in general), it’s necessary to argue that their methodology is not only subject to irrationality but also biased in the direction you argue (overconfidence rather than underconfidence in top causes).
E.g. if their numbers for I, T, and 1/N are based on point estimates for each number rather than expected values, then they underrate causes that have a small chance of being big / highly tractable. (I don’t know whether 80k actually has this bias.)
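As a purely hypothetical illustration of that failure mode (invented numbers, and no claim that 80k actually does this): if a cause’s scale is dominated by a small chance of being enormous, a modal point estimate can understate its expected value by orders of magnitude.

```python
# Hypothetical cause: 90% chance its scale is modest, 10% chance it is enormous.
outcomes = [(0.9, 1e3), (0.1, 1e8)]  # (probability, scale); invented numbers

modal_point_estimate = max(outcomes, key=lambda po: po[0])[1]  # most likely scale: 1e3
expected_scale = sum(p * scale for p, scale in outcomes)       # ~1.0e7

print(modal_point_estimate, expected_scale)  # the expected value is ~10,000x larger
```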
The main body of the post does address this briefly, but I would want more evidence, and I think the summary does not have valid standalone reasoning.
minor point, but land use reform wouldn’t be more important (in the sense of higher scale) than biosecurity, since they differ by 3 points (30x) in overall score but 6 points (1000x) in scale.
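For reference, the per-point conversion implied by those figures (each point corresponds to half an order of magnitude):

$$1 \text{ point} \approx 10^{0.5} \approx 3.2\times, \qquad 3 \text{ points} \approx 10^{1.5} \approx 30\times, \qquad 6 \text{ points} = 10^{3} = 1000\times.$$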
Hmm, I’m not sure you need to make an argument about the direction of the bias. Maybe I should be specifically mentioning imprecision due to potential irrationality?
The way I’m thinking about it, is that 80K have used some frameworks to come up with quantitative scores for how pressing each cause area is, and then ranked the cause areas by the point estimates.
But our imagined confidence intervals around the point estimates should be very large and presumably overlap for a large number of causes, so we should take seriously the idea that the ranking of causes would be different in a better model.
This means we need to take more seriously the idea that the true top causes are different to those suggested by 80K’s model.
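A quick Monte Carlo makes this concrete. The point estimates and the noise level below are stand-ins I’ve made up, not 80K’s actual figures, but with intervals of roughly “a couple of points” the identity of the top cause is far from settled:

```python
import random

# Invented "pressingness" scores for four causes, plus noise of roughly
# "a couple of points", echoing 80K's own caveat. None of this is 80K's data.
point_estimates = {"cause A": 12, "cause B": 11, "cause C": 10, "cause D": 9}
noise_sd = 1.5  # assumed standard deviation, in score points

n_draws = 10_000
top_counts = {cause: 0 for cause in point_estimates}
for _ in range(n_draws):
    sampled = {c: s + random.gauss(0, noise_sd) for c, s in point_estimates.items()}
    top_counts[max(sampled, key=sampled.get)] += 1

for cause, count in top_counts.items():
    print(f"{cause}: ranked #1 in {100 * count / n_draws:.0f}% of draws")
```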
Also sorry I accidentally used the word important instead of pressing! Will correct this.
Agree that this methodology of point estimates can be overconfident in what the top causes are, but I’m not sure if that’s their methodology or if they’re using expected values where they should. Probably someone from 80k should clarify, if 80k still believes in their ranking enough to think anyone should use it?
Also agree with this sentence.
My issue is that the summary claims “probabilities derived from belief which aren’t based on empirical evidence [...] means that the optimal distribution of career focuses for engaged EAs should be less concentrated amongst a small number of “top” cause areas.” This is a claim that we should be less confident than 80k’s cause prio.
When someone has a model, you can’t always say we should be less confident than their model without knowing their methodology, even if their model is “probabilities derived from belief which aren’t based on empirical evidence”. Otherwise you can build a model where their model is right 80% of the time, and things are different in some random way 20% of the time, and then someone else takes your model and does the same thing, and this continues infinitely until your beliefs are just the uniform distribution over everything. So I maintain that the summary should mention something about using point estimates inappropriately, or missing some kinds of uncertainty; otherwise it’s saying something that’s not true in general.
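To spell out that regress: if each step keeps the previous model $p_0$ with probability 0.8 and otherwise replaces it with something “random” (modelled here, purely for illustration, as a uniform distribution $u$), then after $k$ rounds of deference the beliefs are

$$p_k = 0.8^{\,k} p_0 + \left(1 - 0.8^{\,k}\right) u \;\longrightarrow\; u \quad \text{as } k \to \infty,$$

i.e. the chain eventually washes out everything in the original model, which is why some argument about the direction or structure of the error is needed rather than a blanket “be less confident”.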
I think you’re interpreting my summary as:
“80K have a cause prioritisation model with wide confidence intervals around point estimates, but as individual EAs, our personal cause prioritisation models should have wider confidence intervals around point estimates than in 80K’s model.”
What I meant to communicate is:
“80K have a cause prioritisation model with wide confidence intervals around point estimates, and individual EAs should 1) pay more attention to the wide confidence intervals in 80K’s model than they are currently and 2) have wide confidence intervals in their personal cause prioritisation model too.”
Nice post.
All in all, I think it is still the case that one should maximise expected value, and terminate deliberation based on resilience, not certainty.
Hi, thanks for writing this. As others have pointed out, I am a bit confused about how the conclusion (more diversification in EA careers etc.) follows from the assumption (high uncertainty about cause prioritisation).
You might think that we should be risk averse with respect to our difference-making, i.e. that we should want the EA community to do some good in many worlds. See here for a summary post from me which collects the arguments against the “risk averse difference-making” view. One might still justify increased diversification for instrumental reasons (e.g. the welcomingness of the community), but I don’t think that’s what you explicitly argue for.
You might think that updating towards more uncertainty means that we are more likely to change our minds about causes in the future. If we change our minds about priorities in e.g. 2 or 10 years, it is really advantageous if X members of the community are already working in the relevant cause area. Hence, we should spread out.
However, I don’t think that this argument works. First, more uncertainty now might also mean more uncertainty later—hence it is unclear that I should update towards it being more likely that we will change our minds.
Secondly, if you think that we can resolve that uncertainty and update in the future, then I think this is a reason for people to work as cause prioritisation researchers and not a reason to spread out among more cause areas.
Thanks for writing this!
I think you make some reasonable points in your post, but I don’t think that you make a strong argument for what appears to be your central point, that more uncertainty would and should lead to greater diversity in cause areas.
I think I’d like to see your models for the following points to buy your conclusion:
How much EA uncertainty does the current amount of diversity imply, and is this less than you think we ‘should’ have? My sense is that you’re going more off a vibe that we should have more causes.
Why does more diversity fall out of more uncertainty? This seems to be mostly assumed, but I think the only argument made here was the timelines thing, which feels like the wrong way to think about this (at least to me).
A few concrete claims about uncertainty over some crux and why you think that means we are missing [specific cause area].
(You do point to a few reasons why the diversity of causes may be lacking, which I think is very helpful—although I probably disagree with the object-level takes.)
Thanks for your comment!
By diversity do you mean diversity of cultural origin and gender, or diversity of career focus, or both?
I’d say my view is that the optimal distribution of career focuses would have more people working on less popular EA causes (eg—democracy / IDM, climate change, supervolcanoes, other stuff on 80K’s lists) than we have now. I don’t have a particular cause area in mind which EA is entirely ignoring.
No, I mean roughly the total number of cause areas.
It’s a bit different to the total number of causes, as each cause area adds more diversity if it is uncorrelated with other cause areas. Maybe a better operationalisation is ‘total amount of the cause area space covered’.
I’ve got a strong intuition that this is wrong, so I’m trying to think it through.
To argue that EAs underestimate uncertainty, you need to directly observe their uncertainty estimates (and have knowledge of the correct level of uncertainty to have). For example, if the community was homogeneous and everyone assigned a 1% chance to cause X being the most important issue (I’m deliberately trying not to deal with how to measure this) and a 99% chance to cause Y being the most important issue, then all individuals would choose to work on cause Y. If the probabilities were 5% X and 95% Y, you’d get the same outcome. This is because individuals are making single choices.
Now, if there was a central body coordinating everyone’s efforts, in the first scenario it still wouldn’t follow that 1% of people would get allocated to cause X. Optimal allocation strategy aside, there isn’t a clean relationship between uncertainty and decision rules.
I think 80k is already very conscious of this (based on my general sense of 80k materials). Global priorities research is one of their 4 highest-priority areas, and it’s precisely about having more confidence about what the top priority is.
I think something that would help me understand better where you are coming from is to hear more about what you think the decision rules are for most individuals, how they are taking their uncertainty into account and more about precisely how gender/culture interacts with cause area uncertainty in creating decisions.
From one of my other comments:
“The way I’m thinking about it, is that 80K have used some frameworks to come up with quantitative scores for how pressing each cause area is, and then ranked the cause areas by the point estimates.
But our imagined confidence intervals around the point estimates should be very large and presumably overlap for a large number of causes, so we should take seriously the idea that the ranking of causes would be different in a better model.
This means we need to take more seriously the idea that the true top causes are different to those suggested by 80K’s model.”
So I think EAs should approach the uncertainty about what the top cause is by spending more time individually thinking about cause prioritisation, and by paying more attention to personal fit in career choices. I think this would produce a distribution of career focuses which is less concentrated in randomista development, animal welfare, meta-EA and biosecurity.
With gender, the 2020 EA Survey shows that male EAs are less likely to prioritise near-term causes than female EAs. So it seems likely that if EA was 75% female instead of 75% male, the distribution of career focuses of EAs would be different, which indicates some kind of model error to me.
With culture, I mentioned that I expect unknown unknowns here, but another useful thought experiment would be—how similar would EA’s cause priorities and rankings of cause priorities be if it emerged in India, or Brazil, or Nigeria, instead of USA / UK? For example, it seems plausible to me that we value animal welfare less than an EA movement with more Hindu / Buddhist cultural influences would, or that we prioritise promoting liberal democracy less than an imagined EA movement with more influence from people in less democratic countries. Also, maybe we value improving balance and harmony less than an EA movement that originated in Japan would, which could affect cause prioritisation.
Thanks for clarifying.
So I’m an example of someone in that position (I’m trying to work out how to contribute via direct work to a cause area) so I appreciate the opportunity to discuss the topic.
Upon reflection, maybe the crux of my disagreement here is that I just don’t agree that the uncertainty is wide enough to affect the rankings (except within each tier) or to make the direct-work decision rule robust to personal fit.
I think that x-risks have non-overlapping confidence intervals with non-x-risks because of the scale of the problem, and I don’t feel like this changes from a near-term perspective. Even small chances of major catastrophic events this century seem to dwarf other problems.
80k’s second-highest priority areas are nuclear security, extreme climate change, and improving institutional decision-making. The first two seem to be associated with major catastrophes (maybe not x-risks), which also might be considered not to overlap with the next set of issues (factory farming / global health).
With respect to concerns that demographics might be heavily affecting cause prioritisation, I think it would be helpful to have specific examples of causes you think are under-estimated and the biases associated with them.
For example, I’ve heard lots of different arguments that x-risks are concerning even if you don’t buy into long-termism. To a similar end, I can’t think of any causes that would be under-valued because of not caring adequately about balance/harmony.
If you agree with the astronomical waste argument for longtermism, then this is true. But within x-risks, for example, I imagine that the confidence intervals for different x-risks probably overlap.
So as an imaginary example, I think it’d be suboptimal if all EAs worked on AI safety, or only on AI safety and engineered pandemics, and no EAs were working on nuclear war, supervolcanoes or climate change.
And then back to the real world, (without data on this), I’d guess that we have fewer EAs than is optimal currently working on supervolcanoes.
I think there are unknown unknowns here, but a concrete example which I offered above:
“For example, it seems plausible to me that we value animal welfare less than an EA movement with more Hindu / Buddhist cultural influences would.”
If you don’t buy longtermism, you probably still care about x-risks, but your rejection of longtermism massively affects the relative importance of x-risks compared to nearterm problems, which affects cause prioritisation.
Similarly, I don’t expect diversity of thought to introduce entirely new causes to EA or lead to current causes being entirely abandoned, but I do expect it to affect cause prioritisation.
I don’t entirely understand what East Asian cultures mean by balance / harmony, so I can’t tell how it would affect cause prioritisation; I just think there would be an effect.
Sorry for the slow reply.
Talking about allocation of EA’s to cause areas.
I agree that confidence intervals between x-risks are more likely to overlap. I haven’t really looked into supervolcanoes or asteroids, and I think that’s because what I know about them currently doesn’t lead me to believe they’re worth working on over AI or biosecurity.
Possibly, a suitable algorithm would be to defer to/check with prominent EA organisations like 80k to see if they are allocating 1 in every 100 or every 1000 EAs to rare but possibly important x-risks. Without a coordinated effort by a central body, I don’t see how you’d calibrate adequately (use a random number generator and if the number is less than some number, work on a neglected but possibly important cause?).
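A minimal sketch of that random-number rule, just to make it concrete (the credences are invented and this isn’t a recommendation): if each person independently samples a cause in proportion to the community’s credences, the aggregate allocation approximates those credences without any central allocator.

```python
import random
from collections import Counter

# Invented credences over which area most needs marginal people.
credences = {"AI": 0.30, "biosecurity": 0.25, "nuclear": 0.15,
             "supervolcanoes": 0.05, "other": 0.25}

def pick_cause():
    """Each individual independently samples one cause, weighted by the credences."""
    return random.choices(list(credences), weights=list(credences.values()))[0]

allocation = Counter(pick_cause() for _ in range(1000))
print(allocation)  # roughly 300 / 250 / 150 / 50 / 250, tracking the credences
```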
My thoughts on EA allocation to cause areas have evolved quite a bit recently (partly due to talking with 80k and others, mainly in biosecurity). I’ll probably write a post with my thoughts, but the bottom line is that, basically, the sentiment expressed here is correct, and it’s socially easier to have humility in the form of saying you have high uncertainty.
Responding to the spirit of the original post, my general sense is that plenty of people are not highly uncertain about AI-related x-risk—you might have gotten that email from 80k titled “A huge update to our problem profile — why we care so much about AI risk”. That being said, they’re still using phrases like “we’re very uncertain”. Maybe the lack of uncertainty about some relevant facts is lower than their decision rule. For example, in the problem profile, they write:
Different Views under Near-Termism
This seems tempting to believe, but I think we should substantiate it. Which current x-risks are not ranked higher than non-x-risk causes (or how much smaller is their lead) from a near-term perspective?
I think this post proposes a somewhat detailed summary of how your views may change under a transformation from long-termist to near-termist. Scott says:
His arguments here are convincing because I find an AGI event this century likely. If you didn’t, then you would disagree. Still, I think that even were AI not to have short timelines, other existential risks like engineered pandemics, super-volcanoes or asteroids might have milder, merely catastrophic variants, which near-termists would equally prioritise, leading to little practical variation in what people work on.
Talking about different cultures and EA
Can you reason out how “there would be an effect”?