Why effective altruism used to be like evidence-based medicine. But isn’t anymore
On August 8th, Robert Wiblin, owner of probably the most intellectually stimulating Facebook wall of all time, asked “What past social movement is effective altruism most similar to?” This is a good question and there were some interesting answers. In the end, the most liked answer (well, actually the second most liked, after Kony 2012) was ‘evidence-based medicine’. I think effective altruism used to have a lot in common with evidence-based medicine but is increasingly moving in the opposite direction.
What is it that makes them similar? Obviously a focus on evidence. “Effective altruism is a philosophy and social movement that applies evidence and reason to determine the most effective ways to improve the world.” (Wikipedia)
The trouble is, evidence and reason aren’t the same thing.
Reason, in effective altruism, often seems to be equated with maximising expected utility. It is characterised by organisations like the Future of Humanity Institute and often ends up prioritising things for which we have almost no evidence, like protection against deathbots.
Evidence is very different. It’s about ambiguity aversion, not maximising expected utility. It is much more modest in its aims and is characterised by organisations like GiveWell, which prioritise charities like AMF or SCI, for which we have a decent idea of the size of their effects.
I place myself in the evidence camp. One of the strengths of evidence-based medicine in my view is that it realises the limits of our rationality. It realises that, actually, we are VERY bad at working out how to maximise expected utility through abstract reasoning so we should actually test stuff empirically to find out what works. It also allows consensus-building by decreasing uncertainty.
I’m not saying there isn’t room for both. There should definitely be people in the world who think about existential risk and there should definitely be people in the world providing evidence on the effectiveness of charitable interventions. I’m just not sure they should be the same people.
There are also similarities between the two camps. They’re both motivated by altruism and they’re both explicitly consequentialist, or at least act like it. The trouble is, they also both claim to be doing the most good and so in a way they disagree. Maybe I shouldn’t be worried about this. After all, healthy debate within social movements is a good thing. On the other hand, the two camps often seem to have such fundamentally different approaches to the question of how to do the most good that it is difficult to know if they can be reconciled.
In any case, I think it can only be a good thing that this difference is explicitly recognised.
Thanks for this very interesting and clearly articulated post. A comment specifically on the “camps” thing.
Among the people actually working on existential risk/the far future, my impression is that this ‘competition’ mindset doesn’t exist to nearly the same extent (I imagine the same is true in the ‘evidence’ causes, to borrow your framing). So it’s a little alarming, at least to me, to see competitive camps forming in the broader EA community, and to hear (for example) reports of people who value xrisk research ‘dismissing’ global poverty work.
Toby Ord, for example, is heavily involved in both global poverty/disease and far future work with FHI. In my own case, I spread my bets by working on existential risk while my donations (other than unclaimed expenses) go to AMF and SCI. This is because I have a lot of uncertainty on the matter, and frankly I think it’s unrealistic not to have a lot of uncertainty on it. I think this line (“There should definitely be people in the world who think about existential risk and there should definitely be people in the world providing evidence on the effectiveness of charitable interventions.”) more accurately sums up the views of most researchers I know working on existential risk.
I realise that this might be seen as going against the EA ‘ethos’ to a certain extent: a lot of the aim is to be able to rank things clearly and objectively, and choose the best causes. But this gets very difficult when you start to include the speculative causes. It’s the nature of existential risk research to be wrong a lot of the time: a lot of the work surrounds high-impact, low-probability risks that may not come to pass, many of the interventions may not have an effect until much further in the future, and it is hard to predict whether it is our work that makes the crucial difference. All of this makes it difficult to measure.
I’m happy to say existential risk (and global catastrophic risk) are important areas of work. I think there are strong, evidence-based arguments that they have been under-served and underfunded globally to date, for reasons well articulated elsewhere. I think there are also strong arguments that e.g. global poverty is under-served and underfunded, for a different set of reasons. I’m happy to say I consider these both to be great causes, with strong reasons to fund them. But reducing “donate to AMF vs donate to CSER” to e.g. lives saved in the present versus speculative lives saved in the future involves so much gross simplification, and so many assumptions that could be wrong by orders of magnitude, that I’m not comfortable doing it. Add to this moral uncertainty over the value of present lives versus the value of speculative future lives, the value of animal lives, and so on, and it gets even more difficult.
I don’t know how to resolve this fully within the EA framing. My personal ‘dodge’ has been to prioritise raising funds from non-EA sources for FHI and CSER (>95%, if one excludes Musk, >80% if one includes him). I would be a hypocrite to recommend to someone to stop funding AMF in favour of CSER, given that I’m not doing that myself. But I do appreciate that an EA still has to decide what to do with her funds between xrisk, global poverty, animal altruism, and other causes. I think we will learn from continuing excellent work by ‘meta’ groups like GiveWell/OPP and others. But to a certain extent, I think we will have to recognise, and respect, that at some point there are moral and empirical uncertainties that are hard to reduce away.
Perhaps for now the best we can say is “There are a number of very good causes that are globally under-served. There are significant uncertainties that make it difficult to ‘rank’ between them, and it will partly depend on a person’s moral beliefs and appetite for ‘long shots’ vs ‘safe bets’, as well as near-term opportunities for making a clear difference in a particular area. But we can agree that there are solid reasons to support this set of causes over others”.
You’re quite right that there are people like Toby (and clearly yourself) who are genuinely and deeply concerned by causes like global poverty while also working on very different causes like x-risk, and are not dismissive of either. The approach you describe seems very sensible, and it would be great to keep (or make?) room for it in the EA ethos. If people felt that EA committed them to open battle until the one best cause emerged victorious atop a pile of bones… well, that could cause problems. One thing which would help avoid it (and might be a worthwhile thing to do overall) would be to work out and establish a set of norms for potentially divisive or dismissive discussions of different EA causes.
That said, I am uncertain as to whether the different parts of EA will naturally separate, and whether this would be good or bad. I’m inclined to think that it would be bad, partly because right now everyone benefits from the greater chance at critical mass that we can achieve together, and partly because broad EA makes for a more intellectually interesting movement and this helps draw people in. But I can see the advantages of a robustly evidenced, empiricist, GiveWell/GWWC Classic-type movement. I’ve devoted a certain amount of time to that myself, including helping out Joey and Katherine Savoie’s endeavours along these lines at Charity Science.
This also seems like a good time to reiterate that I agree that “there should definitely be people in the world who think about existential risk”, that I don’t want to be dismissive of them either, and that my defending the more ‘empiricist’, poverty-focused part of EA doesn’t mean that I automatically subscribe to every x-risk sceptic attitude that you can find out there.
Besides evidence-based medicine, there is of course also evidence-based policy, which is explicitly modelled on evidence-based medicine. It also focuses heavily on RCTs and other kinds of rigorous evidence.
I founded a network for evidence-based policy in Sweden earlier this year. Here’s a post where I argue that evidence-based policy is basically EA minus its moral views.
Interesting suggestion. I don’t think anyone’s advocating for using reasoning without evidence (so-called ‘a priori reasoning’), nor does anyone think that we should only ever reimplement interventions performed in particular studies without extrapolating at all. People like the Future of Humanity Institute, in particular, are seeking to generalise from evidence in a principled way. So the question is really: of the ways to use reason to generalise from existing evidence, which is best? It seems counterproductive to try to divide people who are all fundamentally trying to answer this same question into different camps.
Bayesian rationality is one compelling answer to the question ‘how do we apply reason to evidence?’, and it has some advantages (sketched more formally below):
it allows quantification of beliefs,
it allows quantification of strength of evidence,
it’s unexploitable in betting
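For reference, the machinery behind the first two bullet points is just Bayes’ theorem and its odds form: beliefs are quantified as probabilities, and the strength of a piece of evidence is quantified by the Bayes factor. (The third point is the Dutch book argument, which doesn’t reduce to a single formula.) Nothing here is specific to EA; this is just the standard textbook statement:

```latex
\[
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)}
\qquad \text{(Bayes' theorem: the belief } P(H) \text{ updated on evidence } E\text{)}
\]
\[
\underbrace{\frac{P(H \mid E)}{P(\lnot H \mid E)}}_{\text{posterior odds}}
\;=\;
\underbrace{\frac{P(E \mid H)}{P(E \mid \lnot H)}}_{\text{Bayes factor (strength of evidence)}}
\times
\underbrace{\frac{P(H)}{P(\lnot H)}}_{\text{prior odds}}
\]
```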
Bayesian stats is not the panacea of logic it is often held out to be; I say this as someone who practices statistics for the purpose of social betterment (see e.g. https://projects.propublica.org/surgeons/ for an example of what I get up to)
First, my experience is that quantification is really, really hard. Here are a few reasons why.
I have seen few discussions, within EA, of the logistics of data collection in developing countries, which is a HUGE problem. For example, how do you get people to talk to you? How do you know if they’re telling you the truth? These folks have often talked to wave after wave of well-meaning foreigners over their lives and would rather ignore or lie to you and your careful survey. The people I know who actually collect data in the field have all sorts of nasty things to say about the realities of working in fluid environments.
Even worse: for a great many outcomes there just ISN’T a way to get good indicator data. Consider the problem of attribution of outcomes to interventions. We can’t even reliably solve the problem of attributing a purchase to an ad in the digital advertising industry, where all actions are online and therefore recorded somewhere. How then do we solve attribution at the social intervention level? The answers revolve around things like theories of change and qualitative indicators, neither of which the EA community takes seriously. But often this is the ONLY type of evidence we can get.
Second, Bayesian stats is built entirely on a single equation that follows from the axioms of probability. All of this updating, learning, and rationality stuff is an interpretation we put on top of it. Andrew Gelman and Cosma Shalizi have the clearest exposition of this, in “Philosophy and the Practice of Bayesian Statistics”:
“A substantial school in the philosophy of science identifies Bayesian inference with inductive inference and even rationality as such, and seems to be strengthened by the rise and practical success of Bayesian statistics. We argue that the most successful forms of Bayesian statistics do not actually support that particular philosophy but rather accord much better with sophisticated forms of hypothetico-deductivism. We examine the actual role played by prior distributions in Bayesian models, and the crucial aspects of model checking and model revision, which fall outside the scope of Bayesian confirmation theory. We draw on the literature on the consistency of Bayesian updating and also on our experience of applied work in social science.”
Bayesianism is not rationality. It’s a particular mathematical model of rationality. I like to analogize it to propositional logic: it captures some important features of successful thinking, but it’s clearly far short of the whole story.
We need much more sophisticated frameworks for analytical thinking. This is my favorite general-purpose approach; it applies to mixed quantitative/qualitative evidence and was developed at the CIA through explicit consideration of cognitive biases:
https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/art11.html
But of course this isn’t rationality either. It’s never been codified completely, and probably cannot be.
I think your distinction rests on an overly simplistic description of ‘evidence based medicine’, and that to divide effective altruism into camps is likewise a false dichotomy.
(TLDR: EBM doesn’t equal total reliance on meta-analyses. Evidence based medicine still requires reason. EBM without reason is just as dangerous as reason without evidence)
As most people here know, in EBM the highest standard of evidence is a meta-analysis of well-conducted randomised, controlled, double-blind trials. Unfortunately, decisions that can be supported by evidence of such quality are relatively few. There are many reasons for this: trials are difficult to design and run in such a way that they actually answer the question correctly; when designed well they are expensive to run; not all trial data are reported; and there is a long lag time between identifying a question, conceiving and conducting the trials, analysing and reporting the data, and considering how this changes the weight of available evidence.
In general, trials are most often run for investigations or interventions that can make somebody money, or where regulation demands them for something novel. Drugs and devices need trials to be licensed, but once licensed they can be sold and used ‘off license’ for other conditions without further RCTs. It is considered unethical to withdraw what is currently considered standard best practice and replace it with a placebo in a trial. For these and other reasons, a huge amount of medical decision-making remains unsupported by the highest quality evidence.
However, we still need to act. We can’t put off our patients and ask them to come back with their perforated bowel once somebody has done a controlled trial of operating on a burst colon vs placebo. In the absence of the highest quality evidence, medical professionals still practice in a way that considers and respects evidence. We consider lesser forms of evidence, we weigh the likelihoods based on biological theories, and we update our beliefs as new evidence comes to light. It all sounds a bit Bayesian, doesn’t it?
In fact, the physicians I know who are most committed to guiding their practice with evidence and reason are prepared to act against the results of randomised clinical trials. As a recent example: a US-based RCT showed that EGDT (early goal-directed therapy, which targets the treatment of septic patients, i.e. those very sick from infection, at particular numbers) was superior to usual care (i.e. being guided by the treating doctor). In fact, it demonstrated a massive 16% absolute decrease in 30-day mortality. UK, European and US centres set about trying to replicate it, but given what was at stake this required huge and co-ordinated efforts. It took over a decade, and all three multicentre studies showed… absolutely no benefit.
http://blogs.nejm.org/now/index.php/the-final-nail-in-early-goal-directed-therapys-coffin/2015/03/24/
In the meantime, what should the evidence-based practitioner have done? A shallow answer would be to do what the evidence said: immediately change practice to EGDT, and only update when a further RCT countered the result. But that would have been a costly mistake that subjected patients to unnecessary invasive monitoring. The results of the first trial were counterintuitive to many experts, especially those who have seen fashions for ‘treating the numbers’ (rather than the patient) arise and be discredited over many years. Most senior ED/ITU doctors did not follow EGDT in the decade between results because the trial data was not enough to cause them to update their practice.
It later came out that in the original trial there were several systematic ways in which the intervention group differed from the usual care group, but this was not fully captured in the study report. Further, the mortality in the first trial was much higher than we see in the NHS. The overall ‘story’ is probably that treating by the numbers is not better than getting the basics right, which on the whole we do.
Reason and evidence aren’t separate camps. They are both fundamentally important when you act in the real world.
DOI (declaration of interest): studying evidence-based medicine on and off since 1999.
Bernadette,
Thank you for your very informative response. I must admit that my knowledge of EBM is much more limited than yours and is primarily Wikipedia-based.
The lines which particularly led me to believe that EBM favoured formal approaches rather than doctors’ intuitions were:
“Although all medicine based on science has some degree of empirical support, EBM goes further, classifying evidence by its epistemologic strength and requiring that only the strongest types (coming from meta-analyses, systematic reviews, and randomized controlled trials) can yield strong recommendations; weaker types (such as from case-control studies) can yield only weak recommendations”
“Whether applied to medical education, decisions about individuals, guidelines and policies applied to populations, or administration of health services in general, evidence-based medicine advocates that to the greatest extent possible, decisions and policies should be based on evidence, not just the beliefs of practitioners, experts, or administrators.”
Criticism of EBM: “Research tends to focus on populations, but individual persons can vary substantially from population norms, meaning that extrapolation of lessons learned may founder. Thus EBM applies to groups of people, but this should not preclude clinicians from using their personal experience in deciding how to treat each patient.”
Perhaps the disagreement comes from my unintentional implication that the two camps were diametrically opposed to each other.
I agree that they are “both fundamentally important when you act in the real world” and that evidence based giving / evidence based medicine are not the last word on the matter and need to be supplemented by reason. At the same time though, I think there is an important distinction between maximising expected utility and being averse to ambiguity.
For example, to the best of my knowledge, the tradeoff between donating to SCI ($1.23 per treatment) and the Deworm the World Initiative ($0.50 per treatment) is that DWI has demonstrated higher cost-effectiveness but with a wider confidence interval (less of a track record). Interestingly, this actually sounds similar to your EGDT example. I therefore donate to SCI because I prefer to be confident in the effect. I think this distinction also applies to XRisk vs. development.
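To make that tradeoff concrete, here is a minimal sketch in Python. The $1.23 and $0.50 cost-per-treatment figures are the ones quoted above; the uncertainty ranges are entirely hypothetical, chosen only to illustrate how an expected-value maximiser and a more ambiguity-averse donor can rank the same two options differently.

```python
# Toy comparison: a cheaper point estimate with wide uncertainty (DWI-style)
# versus a pricier point estimate with narrow uncertainty (SCI-style).
# The spreads below are made up for illustration, not real GiveWell figures.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Hypothetical cost-per-treatment distributions, centred on the quoted figures.
sci_cost = rng.lognormal(mean=np.log(1.23), sigma=0.1, size=N)  # narrow interval
dwi_cost = rng.lognormal(mean=np.log(0.50), sigma=0.8, size=N)  # wide interval

for name, cost in [("SCI", sci_cost), ("DWI", dwi_cost)]:
    treatments = 1000 / cost  # treatments delivered per $1,000 donated
    print(f"{name}: mean = {treatments.mean():6.0f} treatments per $1,000, "
          f"5th percentile = {np.percentile(treatments, 5):6.0f}")

# With these (made-up) spreads, DWI wins on the mean while SCI wins on the
# pessimistic 5th-percentile criterion: same evidence, two different rankings.
```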
Sorry for being slow to reply James.
The methods of EBM do absolutely favour formal approaches and concrete results. However (and partly because of some of the pitfalls you describe) it’s relatively common to find that you have no high-quality evidence that specifically applies to your decision. It is also relatively common to find poor-quality evidence (such as a badly constructed trial, or very confounded cohort studies). If those constitute the best available evidence, a strict reading of the phrase ‘to the greatest extent possible, decisions and policies should be based on evidence’ would imply that decisions should be founded on that dubious evidence. However, in practice I think most doctors who are committed to EBM would not change their practice on the basis of a bad trial.
Regarding tradeoffs between maximising expected good and certainty of results (which I guess is maximising the minimum you achieve), I agree that’s a point where people come down on different sides. I don’t think it strictly divides causes (because, as you say, one can lean towards maximising expected utility within the global poverty cause), though the overlap between those who favour maximising expectation and those who think existential risk is the best cause to focus on is probably high. I think this is actually going to be a topic of panel discussion at EA Global Oxford, if you’re going?
Not to imply that you were implying otherwise, but I don’t think that the ‘evidence camp’ generally sees itself as maximising the minimum you achieve, or as disagreeing with maximising expected good. Instead it often disagrees with specific claims about what does the most good, particularly ones based on a certain sort of expected value calculation.
(In a way this only underscores your point that there isn’t that sharp a divide between the two approaches, and that we need to take into account all the evidence and reasons that we have. As you say, we often don’t have RCTs to settle things, leaving everyone with the tricky job of weighting different forms of evidence. There will be disagreements about that, but they won’t look like a sharp, binary division into two opposed ‘camps’. Describing what actually happens in medicine seems very helpful to understanding this.)
Thanks for the post! I place myself in the evidence camp, though I wouldn’t say it’s a matter of “ambiguity aversion”, and saying it’s “not [about] maximising expected utility” risks being misunderstood. As you said, there are explicit consequentialists in both camps. (Non-consequentialists can be found in both too.)
Both camps are diverse so I wouldn’t want to try giving a complete characterisation of either, and someone in the rationalist camp would likely do a better job of describing their group’s most common views. My personal impression is that the evidence/empiricism camp is typically not so much against maximising expected utility, but is sceptical of certain sorts of explicit expected value calculations. For example, a majority of those found on the back of envelopes, and an even larger number of those found on the back of napkins. Here are some relevant posts:
Why We Can’t Take Expected Value Estimates Literally (Even When They’re Unbiased) - Holden Karnofsky
Sequence Thinking vs. Cluster Thinking—Holden Karnofsky
Quantification as a Lamppost in the Dark—Adam Casey (h/t Sasha Cooper here)
Why I’m Skeptical About Unproven Causes (And You Should Be Too) - Peter Hurford
Thanks both for thoughtful replies and links.
I agree that it may be counterproductive to divide people who are answering the same questions into different camps and, on re-reading, that is how my post may come across. My more limited intention was to provide a (crude) framework through which we might be able to understand the disagreement.
I guess I had always interpreted EA (perhaps wrongly) as making a stronger claim than ‘we should be more reasonable when deciding how to do good’. In particular, I feel that there used to be more of a focus on ‘hard’ rather than ‘soft’ evidence. This helps explain why EA used to advocate charitable giving over advocacy work / systemic change, for which hard evidence is necessarily more limited. It seems EA is now a broader church, and this is probably for the better, but in departing from a preference for hard evidence/RCTs it has lost its claim to being like evidence-based medicine.
The strength of this evolution is that EA seems to have absorbed thoughtful critiques such as that of Acemoglu (http://bostonreview.net/forum/logic-effective-altruism/daron-acemoglu-response-effective-altruism), although I imagine it must have been quite annoying to be told that “if X offers some prospect of doing good, then EAs will do it” when we weren’t doing so at the time. Perhaps EA is growing so broad that the only real opponents it has left are the anti-rationalists like John Gray (although the more opponents he has, the better).
Where is the evidence for this claim? This all seems like reason and words :P
I think most philosophers would say that evidence and reason are different because even if practical rationality requires that you maximize expected utility in one way or another (just as theoretical rationality requires that you conditionalize on your evidence), neither thing tells you that MORE evidence is better. You can be a perfectly rational, perfectly ignorant agent. That more evidence is better than less is a separate kind of epistemological principle from the one that tells you to conditionalize on whatever you’ve managed to get. (1)
Another way to put it: more evidence is better from a first-person point of view: if you can get more evidence before you decide to act, you should do it! But from the third person point of view, you shouldn’t criticize people who maximize expected utility on the basis of bad or scarce evidence.
Here’s a quote from James Joyce (a causal decision theorist):
“CDT [causal decision theory] is committed to two principles that jointly entail that initial opinions should fix actions most of the time, but not [always]...
CURRENT EVALUATION. If Prob_t(.) characterizes your beliefs at t, then at t you should evaluate each act by its causal expected utility using Prob_t(.).
FULL INFORMATION. You should act on your time-t utility assessment only if those assessments are based on beliefs that incorporate all the evidence that is both freely available to you at t and relevant to the question about what your acts are likely to cause.” (Joyce, “Regret and Instability in Causal Decision Theory,” 126-127)
...only the first principle is determined by the utility-maximizing equation that’s at the mathematical core of causal decision theory. Anyway, that’s my nerdy philosophical lit contribution to the issue ;).
(1) In an extreme case, suppose you have NO evidence—you are in the “a priori position” mentioned by RyanCarey. Then reason is like an empty stomach, with no evidence to digest. But still it would contribute the tautologies of pure logic—those are propositions that are true no matter what you conditionalize on, indeed whether you conditionalize on anything at all.
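For readers who prefer symbols, the two norms being contrasted in this comment can be written out as follows. This is a standard textbook formulation rather than Joyce’s own notation, and it uses the simpler evidential form of expected utility (causal decision theory replaces the conditional probability with a causal one):

```latex
\[
\text{Conditionalization: on learning } E, \qquad
P_{\text{new}}(H) \;=\; P_{\text{old}}(H \mid E)
\]
\[
\text{Expected utility: choose the act } A \text{ that maximises} \qquad
EU(A) \;=\; \sum_{s} P(s \mid A)\, u(A, s)
\]
```

Neither formula, on its own, says anything about going out and gathering more evidence before you act, which is exactly the gap the FULL INFORMATION principle quoted above is meant to fill.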
This is a really important distinction I had never thought of.
The way I would describe this or a similar difference in personal preference is along the lines of repeatability:
Some people feel drawn toward maximizing expected value by addressing hypothetical events that are so bad that we wouldn’t be here anymore to think about them if they happened repeatedly or had happened at all, while
other people feel drawn toward maximizing expected value by addressing events that are far less bad but happen frequently.
These repeating events have the advantage that you can test the effectiveness of your intervention on them (say, with RCTs) and improve it incrementally. Hence there is a lot more solid evidence to draw on. But on that consideration alone, neither preference is better, just as social scientists like to say that physics is so much farther along only because it’s so much easier to study.
Thank you all for some great responses and apologies for my VERY late reply. This post was intended to ‘test an idea/provoke a response’ and there’s some really good discussion here.