Biorisk researcher Gregory Lewis’s criticism of EA’s initial response to covid-19

Wait, what? Why a post now about the EA community’s initial response to covid-19, based on an 80,000 Hours Podcast interview from April 2020? That will become clearer as you read on. There is a bigger lesson here.

Context: Gregory Lewis is a biorisk researcher, formerly at the Future of Humanity Institute at Oxford, with a background in medicine and public health. He describes himself as “heavily involved in Effective Altruism”. (He’s not a stranger here: his EA Forum account was created in 2014 and he has 21 posts and 6000 karma.)

The interview

Lewis was interviewed on the 80,000 Hours Podcast in an episode released on April 17, 2020. He has some harsh words for how the EA community initially responded to the pandemic.

But first, he starts off with a compliment:

If we were to give a fair accounting of all EA has done in and around this pandemic, I think this would overall end up reasonably strongly to its credit. For a few reasons. The first is that a lot of EAs I know were, excuse the term, comfortably ahead of the curve compared to most other people, especially most non-experts in recognizing this at the time: that emerging infectious disease could be a major threat to people’s health worldwide. And insofar as their responses to this were typically either going above and beyond in terms of being good citizens or trying to raise the alarm, these seem like all prosocial, good citizen things which reflect well on the community as a whole.

He also pays a compliment to a few people in the EA community who have brainstormed interesting ideas about how to respond to the pandemic and who (as of April 2020) were working on some interesting projects. But he continues (my emphasis added):

But unfortunately I’ve got more to say.

So, putting things politely, a lot of the EA discussion, activity, whatever you want to call it, has been shrouded in this miasma of obnoxious stupidity, and it’s been sufficiently aggravating for someone like me that I sort of want to consider whether I can start calling myself EA adjacent rather than EA, or find some way of distancing myself from the community as a whole. Now the thing I want to stress before I go on to explain why I feel this way is that unfortunately I’m not alone in having these sorts of reactions.

… But at least I have a few people who talk to me now, who, similar to me, have relevant knowledge, background and skills. And also, similar to me, have found this community so infuriating they need to take a break from their social media or want to rage quit the community as a whole. … So I think this pattern, whereby discussion around this has been very repulsive to people who know a lot about the subject, is, I think, a cause for grave concern.

That EA’s approval rating seems to fall dramatically with increasing knowledge is not the pattern you typically take as a good sign from the outside view.

Lewis elaborates (my emphasis added again):

And this general sense of just playing very fast and loose is pretty frustrating. I have experienced a few times of someone recommending X, then I go into the literature, find it’s not a very good idea, then I briefly comment going, “Hey, this thing here, that seems to be mostly ignored”, then I get some pretty facile reply and I give up and go home. And that’s happened to other people as well. So I guess given all these things, it seems like bits of the EA response were somewhat less than optimal.

And I think the ways it could have been improved were mostly in the modesty direction. So, for example, I think several EAs have independently discovered for themselves things like right censoring or imperfect ascertainment or other bits of epidemiology which inform how you, for example, assess the case fatality ratio. And that’s great, but all of that was in most textbooks and maybe it’d have saved time had those been consulted first rather than doing something else instead.
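To make the right-censoring point concrete, here is a minimal sketch of my own (not from the interview), with invented numbers, showing why a naive case fatality ratio computed as deaths divided by confirmed cases understates fatality while an outbreak is still growing, and how restricting the denominator to resolved cases is one crude correction:

```python
# Illustrative sketch (my own, not Lewis's): why right censoring biases a
# naive case fatality ratio (CFR) estimate downward early in an outbreak.
# All numbers below are invented for illustration.

confirmed_cases = 10_000  # cumulative confirmed cases to date
deaths = 200              # cumulative deaths to date
recovered = 3_800         # cases known to have recovered so far

# Naive CFR divides by all confirmed cases, including recent cases whose
# outcomes are not yet known, so it understates fatality while the epidemic
# is still growing.
naive_cfr = deaths / confirmed_cases

# A crude censoring-aware estimate: restrict the denominator to cases whose
# outcome is already resolved (death or recovery). This removes the downward
# bias from unresolved cases, though it can overcorrect if deaths are
# observed faster than recoveries.
resolved_cfr = deaths / (deaths + recovered)

print(f"Naive CFR:    {naive_cfr:.1%}")    # 2.0%
print(f"Resolved CFR: {resolved_cfr:.1%}") # 5.0%
```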

More on consulting textbooks:

But typically for most fields of human endeavor, we have a reasonably good way which is probably reasonably efficient in terms of picking up the relevant level of knowledge and expertise. Now, it’s less efficient if you just target it, if you know in advance what you want to know ahead. But unfortunately, this area tends to be one where it’s a background tacit knowledge thing. It’s hard to, as it were, rapier-like just stab all the things, in particular, facts you need. And if you miss some then it can be a bit tricky in terms of having good ideas thereafter.

What’s worse than inefficiency:

The other problems are people often just having some fairly bad takes on lots of things. And it’s not always bad in terms of getting the wrong answer. I think some of the interventions do seem pretty ill-advised and could be known to be ill-advised if one had maybe done one’s homework slightly better. These are complicated topics generally: something you thought about for 30 minutes and wrote a Medium post about may not actually be really hitting the cutting edge.

An example of a bad take:

So I think President Trump at the moment is suggesting that, as it were, the cure is worse than the disease with respect to suppression. … But suppose we’re clairvoyant and we see in two years’ time, we actually see that was right. … I think very few people would be willing to, well, maybe a few people listening to this podcast can give Trump a lot of credit for calling it well. Because they would probably say, “Well yeah, maybe that was the right decision but he chose it for the wrong reasons or the wrong epistemic qualities”. And I sort of feel like a similar thing sort of often applies here.

So, for example, a lot of EAs are very happy to castigate the UK government when it was more going for mitigation rather than suppression, but for reasons which just didn’t seem to indicate they really attended to any of the relevant issues which you want to be wrestling with. And yes, they got it right, but they got it right in the way that stopped clocks are right if you look at them at the right time of day. I think it’s more like an adverse rather than a positive indicator. So that’s the second thing.

On bad epistemic norms:

And the third thing is when you don’t have much knowledge of your, perhaps, limitations and you’re willing to confidently pronounce on various things. This is, I think, somewhat annoying for people like me who maybe know slightly more as I’m probably expressing from the last five minutes of ranting at you. But moreover, it doesn’t necessarily set a good model for the rest of the EA community either. Because things I thought we were about were things like, it’s really important to think things through very carefully before doing things. A lot of your actions can have unforeseen consequences. You should really carefully weigh things up and try and make sure you understand all the relevant information before making a recommendation or making a decision.

And it still feels we’re not really doing that as much as we should be. And I was sort of hoping that EA, in an environment where there’s a lot of misinformation, lots of outrage on various social media outlets, there’s also castigation of various figures, I was hoping EA could strike a different tone from all of this and be more measured, more careful and just more better I guess, roughly speaking.

More on EA criticism of the UK government:

Well, I think this is twofold. So one is, if you look at SAGE, which is the Scientific Advisory Group for Emergencies, who released what they had two weeks ago in terms of advice that they were giving the government, which is well worth a read. And my reading of it was essentially they were essentially weeks ahead of EA discourse in terms of all the considerations they should be weighing up. So obviously being worse than the expert group tasked to manage this is not a huge rap in terms of, “Well you’re doing worse than the leading experts in the country.” That’s fair enough. But they’re still overconfident in like, “Oh, don’t you guys realize that people might die if hospital services get overwhelmed, therefore your policy is wrong.” It seems like just a very facile way of looking at it.

But maybe the thing is first like, not having a very good view. The second would be being way too overconfident that you actually knew the right answer and they didn’t. So much that you’re willing to offer a diagnosis, for example, “Maybe the Chief Medical Officer doesn’t understand how case ascertainment works or something”. And it’s like this guy was a professor of public health in a past life. I think he probably has got that memo by now. And so on and so forth.

On cloth masks:

I think also the sort of ideas which I’ve seen thrown around are at least pretty dicey. So one, in particular, is the use of cloth masks; we should all be making cloth masks and wearing them.

And I’m not sure that’s false. I know the received view in EA land is that medical masks are pretty good for the general population which I’ll just about lean in favor of, although all of these things are uncertain. But cloth masks seem particularly risky insofar as if people aren’t sterilizing them regularly which you expect they won’t: a common thing about the public that you care about is actual use rather than perfect use. And you have this moist cloth pad which you repeatedly contaminate and apply to your face which may in fact increase your risk and may in fact even increase the risk of transmission. It’s mostly based on contact rather than based on direct droplet spreads. And now it’s not like lots of people were touting this. But lots on Twitter were saying this. They cite all the things. They seem not to highlight the RCT which cluster randomized healthcare workers to medical masks, control, and cloth masks, and found cloth masks did worse than the control.

Then you would point out, per protocol, that most people in the control arm were using medical masks anyway or many of them were, so it’s hard to tell whether cloth masks were bad or medical masks were good. But it’s enough to cause concern. People who write the reviews on this are also similarly circumspect and I think they’ve actually read the literature where I think most of the EAs confidently pronouncing it’s a good idea generally haven’t. So there’s this general risk of having risky policy proposals which you could derisk, in expectation, by a lot, by carefully, as it were, checking the tape.

More on cloth masks:

And I still think if you’re going to do this, or you’re going to make your recommendations based on expectation, you should be checking very carefully to make sure your expectation is as accurate as it could be, especially if there’s like a credible risk of causing harm and that’s hard to do for anyone, for anything. I mean cf. the history of GiveWell, for example, amongst all its careful evaluation. And we’re sort of at the other end of the scale here. And I think that could be improved. If it was someone like, “Oh, I did my assessment review of mask use and here’s my interpretation. I talked to these authors about these things or whatever else”, then I’d be more inclined to be happy. But where there’s dozens of ideas being pinged around… Many of them are at least dubious, if not downright worrying, then I’m not sure I’m really seeing EA live out its values and be a beacon of light in the darkness of irrationality.

Lewis’ concrete recommendations for EA:

The direction I would be keen for EAs to go in is essentially paying closer attention to available evidence such as it is. And there are some things out there which can often be looked at or looked up, or existing knowledge one can get better acquainted with to help inform what you think might be good or bad ideas. And I think, also, maybe there’s a possibility that places like 80K could have a comparative advantage in terms of elicitation or distillation of this in a fast moving environment, but maybe it’s better done by, as it were, relying on what people who do this all day long, and who have a relevant background, are saying about this.

So yeah, maybe Marc Lipsitch wants to come on the 80K podcast, maybe someone like Adam Kucharski would like to come on. Or like Rosalind Eggo or other people like this. Maybe they’d welcome a chance of being able to set the record straight given like two hours to talk about their thing rather than like a 15 minute media segment. And it seems like that might be a better way of generally improving the epistemic waterline of EA discussions, rather than lots of people pandemic blogging, roughly speaking, at a very rapid, high turnaround. By necessity, there’s like limited time to gather relevant facts and information.

More on EA setting a bad example:

...one of the things I’m worried about is that a lot of people are going to look at COVID-19 and want to get involved in GCBRs [global catastrophic biological risks]. And we sort of want all these people to be cautious, circumspect, with lots of discretion and stuff like that. I don’t think 80K’s activity on this has really modeled a lot of that to them. Rob [Wiblin], in particular, but not alone. So having a pile of that does not fill me with great amounts of joy or anticipation but rather some degree of worry.

I think that does actually apply even in first order terms to the COVID-19 pandemic, where I can imagine a slightly more circumspect or cautious version of 80K, or 80K staff or whatever, would have perhaps had maybe less activity on COVID, but maybe slightly higher quality activity on COVID and that might’ve been better.

On epistemic caution:

I mean people like me are very hesitant to talk very much on COVID for fear of being wrong or making mistakes. And I think that fear should be more widespread and maybe more severe for folks who don’t have the relevant background who’re trying to navigate the issue as well.

The lesson

Lewis twice mentions an EA Forum post he wrote about epistemic modesty, which sounds like it would be a relevant read here. I haven’t read the whole thing yet, but I adore this bon mot in the section “Rationalist/EA exceptionalism”:

Our collective ego is writing checks our epistemic performance (or, in candour, performance generally) cannot cash; general ignorance, rather than particular knowledge, may explain our self-regard.

Another bon mot a little further down, which is music to my weary ears:

If the EA and rationalist communities comprised a bunch of highly overconfident and eccentric people buzzing around bumping their pet theories together, I may worry about overall judgement and how much novel work gets done, but I would at least grant this looks like fertile ground for new ideas to be developed.

Alas, not so much. What occurs instead is agreement approaching fawning obeisance to a small set of people the community anoints as ‘thought leaders’, and so centralizing on one particular eccentric and overconfident view. So although we may preach immodesty on behalf of the wider community, our practice within it is much more deferential.

This is so brilliantly written, and so tightly compressed, that I nearly despair, because I fear my efforts to articulate similar ideas will never approach this masterful expression.[1]

The philosopher David Thorstad corroborates Lewis’ point here in a section of a Reflective Altruism blog post about “EA celebrities”.

I’m not an expert on AI, and there is so much fundamental uncertainty about the future of AI and the nature of intelligence, and so much fundamental disagreement, that it would be hard if not impossible to meaningfully discern the majority views of expert communities on AGI in anything like the way you can for fields like epidemiology, virology, or public health. So, covid-19 and AGI are just fundamentally incomparable in some important way.

But I do know enough about AI — things that are not hard for anyone to Google to confirm — to know that people in EA routinely make elementary mistakes, ask the wrong questions, and confidently hold views that the majority of experts disagree with.

Elementary mistakes include: getting the definitions of key terms in machine learning wrong; not realizing that Waymos can only drive with remote assistance.

Asking the wrong questions includes: failing to critically appraise whether performance on benchmark tasks actually translates into real world capabilities on tasks in the same domain (i.e. does the benchmark have measurement validity if its intended use is to measure general intelligence, or human-like intelligence, or even just real world performance or competence?); failing to wonder what (some, many, most) experts say are the specific obstacles to AGI.

Confidently holding views that the majority of experts disagree with includes: there is widespread, extreme confidence about LLMs scaling to AGI, but a survey of AI experts earlier this year found that 76% think current AI techniques are unlikely or very unlikely to scale to AGI.

The situation with AI is more forgivable than with covid because there’s no CDC or SAGE for AGI. There’s no research literature — or barely any, especially compared to any established field — and there are no textbooks. But, still, there is general critical thinking and skepticism that can be applied to AGI. There are commonsense techniques and methods for understanding the issue better.

In a sense, I think the best analogy for assessing the plausibility of claims about near-term AGI is investigating claims that someone possesses a supernatural power, such as a psychic who claims to be able to solve crimes via extrasensory perception, or investigating an account of a religious miracle, like a holy relic healing someone’s disease or injury. Or maybe an even better analogy is assessing the plausibility of the hypothesis that a near-Earth object is alien technology, as I discussed here. There is no science of extrasensory perception or religious miracles. There is no science of alien technology (besides maybe a few highly speculative papers). Yet there are general principles of scientific epistemology, scientific skepticism, and critical thinking we can apply to these questions.

I resonate completely with Gregory Lewis’ dismay at how differently the EA community does research and forms opinions today compared to how GiveWell evaluates charities. I feel, like Lewis seems to, that the way it is now is a betrayal of EA’s founding values.

AGI is an easy topic for me to point out the mistakes in a clear way. A similar conversation could be had about longtermism, in terms of applying general critical thinking and skepticism. To wit: if longtermism is supposedly the most important thing in the world, then, after eight years of developing the idea, after multiple books published and many more papers, why haven’t we seen a single promising longtermist intervention yet, other than those that long predate the term “longtermism”? (More on this here.)

Other overconfident, iconoclastic opinions in the EA community are less prominent, but you’ll often see people who think they can outsmart decades of expert study of an issue with a little effort in their spare time. You see opinions of this sort in areas including policy, science, philosophy, and finance/economics. It would be impossible for me to know enough about all these topics to point out the elementary mistakes that are probably made in all of them. But even in a few cases where I know just a little, or suddenly feel curious and care to Google, I have been able to notice some elementary mistakes.

To distill the lesson of the covid-19 example into two parts:

  • In cases where there is an established science or academic field or mainstream expert community, the default stance of people in EA should be nearly complete deference to expert opinion, with deference moderately decreasing only when people become properly educated (i.e., via formal education or a process approximating formal education) or credentialed in a subject.

  • In cases where there is no established science or academic field or mainstream expert community, such as AGI, longtermism, or alien technology, the appropriate approach is scientific skepticism, epistemic caution, uncertainty, common sense, and critical thinking.

  1. ^

    My best attempt so far was my post “Disciplined iconoclasm”.