Curious to know: how many of these papers was TERRA already aware of before they were uncovered by the algorithm?
I’ve always wondered about the “first N Google results” strategy. Even in the absence of a file-drawer effect, isn’t this more likely to turn up papers making positive claims (on the assumption that e.g. rejections of the null are more likely to be cited than inconclusive results)?
Thank you so much for writing this. This is one of my central areas of interest, and I’ve been puzzled by the comparative lack of resources expended by the EA community on institutional decision-making given the apparently high degree of importance accorded to it by many of us.
This is a great guide. I agree that the central question here is whether or not deliberative democracy leads to better outcomes. If it does, or even if it probably does, it seems that it’s easily one of the highest-value potential cause areas, since the levers that influence many other cause areas are within reach of democratic polities.
With that in mind, it seems clear to me that the primary way in which deliberation is EA-relevant is as a large-scale decision-making mechanism. So relatively small-scale uses seem not very important to us, and information about these small-scale successes may not be useful, since instituting these mechanisms at a large scale is likely to present problems that differ in kind, not just in degree. I’d love to hear your thoughts on that.
I have a few other thoughts about this review, and I’d like to hear your responses if you have the time.
• Basically all of the cross-country comparisons in this review suffer from reverse causation. Countries that have lots of deliberation and good outcomes don’t necessarily have the former causing the latter; the former could rather be just another instance of the latter. As enthused as I am about deliberative democracy, this scenario seems just as likely as the causal one. Is there any reason to view these correlations as suggestive of a causal effect?
• It seems like this review contains relatively little research supporting the null hypothesis that deliberation does not improve decision making (or, for that matter, the alternative hypothesis that it actually worsens it). Were you unable to find studies taking this position? If so, how worried are you about the file-drawer effect here?
• Based on your reading of all this evidence, I’d love to hear your subjective first impressions: what do you personally feel is the “best bet” for enacting deliberative democracy on a large scale somewhere besides China? How far do you think this could feasibly go, and how long would you expect such a change to take? Very wide confidence intervals on these estimates are fine, of course.
I think this is a great and really sensible way to think about things. It’s really natural, and the physics analogy provides some intuition behind why that is. A question: have you thought about how this way of thinking is in some sense “baked into” certain moral frameworks? I’m thinking specifically here of rule utilitarianism: rules can apply at different scales. It seems to me that at the personal level rule utilitarianism is basically instantiated as virtue ethics.
I haven’t read this book and I’m also not an expert, so my confidence on this comment is low.
Although nuclear weapons seem to have at best a quite limited substantive impact on actual historical events, they have had a tremendous influence on our agonies and obsessions, inspiring desperate rhetoric, extravagant theorizing, wasteful expenditure, and frenetic diplomatic posturing
Not only have nuclear weapons failed to be of much value in military conflicts, they also do not seem to have helped a nuclear country to swing its weight or “dominate” an area
Wars are not caused by weapons or arms races, and the quest to control nuclear weapons has mostly been an exercise in irrelevance
As a relative layman, I find claims like these puzzling. This is primarily because the “agonies and obsessions … desperate rhetoric, extravagant theorizing, wasteful expenditure, and frenetic diplomatic posturing” that Mueller apparently dismisses drove the course of history for the half-century following the Second World War.
It’s hard to imagine that the Cold War would have occurred at all in the absence of nuclear weapons. While it’s true that the first nukes didn’t pose much more serious a threat than a large-scale firebombing, it was barely more than a decade after the war that much more destructive weapons were being built. A successful conventional Soviet assault on the U.S. mainland was, as far as I know, never a serious possibility. It seems clear that the terror of that period was driven by the nuclear threat, and that the nuclear threat drove U.S. and Soviet strategic posture, which also influenced foreign aid, trade policy, etc. Even if their danger is exaggerated, perception of their danger (in my view an unavoidable perception—even the Joint Chiefs were prepared to nuke Cuba during the missile crisis despite knowing that the strategic situation had not appreciably changed) had serious effects.
Also, and again, not an expert (and I’d like to know if Mueller addresses this specific case), but of course Israel has been a nuclear power since at least 1979. Before that date, Israel fought several major wars and dozens of smaller engagements with its neighbors. Since then, virtually all of Israel’s military conflicts have been essentially counterinsurgency or against state proxies such as Hezbollah. It’s often argued that Israel’s status as a nuclear power has driven Iran’s efforts in that arena, which has in turn influenced Saudi belligerence; this conflict has affected oil prices, domestic politics in both countries, the ongoing war in Yemen, etc. This is kind of a long DAG, but I feel like there are other examples like this, and I find it hard to accept the position that the simple existence of nuclear weapons hasn’t been immensely consequential.
I nominate Raj Chetty’s Who Becomes an Inventor in America? The Importance of Exposure to Innovation, which builds on his and his collaborators’ impressive other work using administrative data to estimate intergenerational economic mobility.
Chetty’s recent work is methodologically ahead of the curve, and I hope to see many more economists using large-scale administrative data to address the big questions. But the paper I’ve nominated—the “Lost Einsteins” paper—is exceptionally interesting, and I think that within a few years it will start to be seen as really important.
This is, first, because it very palpably demonstrates that concerns about inequality, economic efficiency, and long-run growth are inextricably linked. If you accept endogenous growth theory as a plausible account, then the Lost Einsteins paper suggests (actually, states explicitly) that various kinds of inequality can slow innovation and therefore growth.
Second, I think that this is a fairly EA-relevant paper. It’s clear that individual inventors or small groups of innovators (Haber/Bosch, Borlaug, Tesla, Robert Noyce) can alter the course of history in a meaningful way. It’s impossible to estimate the lost social value of the lost Einsteins, but I think it’s plausible to suggest that it could be significant.
I’ve been following this series and I’m really enjoying it. I’m curious if you’ve thought about Fermi-like paradoxes in a general way and if you have any thoughts on extending your analysis here to other domains. You are probably familiar with Sandberg et al.’s proposed resolution of the Fermi paradox, but your framing of the issue has got me thinking about other similar (though perhaps less mystifying) paradoxes out there. The lenses you apply here (e.g. humaneness/treachery) seem like they could apply equally well in other domains. A couple of other examples:
• It seems like far-right terrorism in the U.S. is relatively rare despite the (again, relative) prevalence of militant views and easy access to firearms.
• I often wonder why bookstores don’t burn down more often, since arsonists and pyromaniacs exist (and arson is fairly common) and bookstores are among the easiest pickings.
Thanks for writing this! I take the broader point and I think you provide good reasons to think that international trade deserves more attention as an effective intervention.
I may be missing something, but I’m really not sure what to make of that $200k number. It seems low intuitively, but a little examination makes it seem even stranger. In 2018, about $3.5 billion was spent on lobbying. In the 115th Congress, 2017–2019, 443 bills were passed, as in, actually became law. So it seems reasonable to say that about 200 bills became law in 2018. That’s almost twenty million dollars per bill. And that’s in a weird idealized scenario where spending on lobbying gets the bill passed and where all lobbying money is being spent on lobbying-for (not lobbying-against) and where the money is evenly divided across bills.
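For what it’s worth, here’s the back-of-envelope arithmetic spelled out. These are just the rough figures above, and the variable names and rounding are mine, so treat it as an illustration rather than a real estimate:

```python
# Rough figures cited above; illustrative only, not real data analysis.
total_lobbying_2018 = 3.5e9   # ~$3.5B spent on lobbying in 2018
bills_enacted_2018 = 200      # ~443 laws over the two-year 115th Congress, so ~200/year

# Naive average lobbying spend per enacted bill, under the idealized
# assumptions that all spending was lobbying-for and split evenly.
avg_spend_per_bill = total_lobbying_2018 / bills_enacted_2018
print(f"${avg_spend_per_bill / 1e6:.1f}M per enacted bill")  # prints $17.5M per enacted bill
```

So the naive average is nearly two orders of magnitude above the $200k figure.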
We have no idea what the distribution of effectiveness looks like, and I totally buy the idea that some bills can be passed with only $200k in lobbying funds, but that would be true at the tails of the distribution, not in expectation.
Thanks for responding. I’ve now reread your post (twice) and I feel comfortable in saying that I twisted myself up reading it the first time around. I don’t think my comment is directly relevant to the point you’re making, and I’ve retracted it. The point is well-taken, and I think it holds up.
I imagine that there is a large fraction of EAs who expect to be more productive in direct work than in an ETG role. But I’m not too clear why we should believe that.
I think that for some of us this is a basic assumption. I can only speak to this personally, so please ignore me if this isn’t a common sentiment.
First, direct roles are (in principle) high-leverage positions. If you work, for example, as a grantmaker at an EA org, a 1% increase in your productivity or aptitude could translate into tens of thousands of dollars more in funds for effective causes. In many ETG positions, a 1% increase in productivity is unlikely to result in any measurable impact on your earnings, and even an earnings impact proportional to the productivity gain would be negligible in absolute terms. So I tend to feel like, all other things being equal, my value is higher in a direct role.
But I don’t think all other things are even equal. There seems to be an assumption underlying the ETG conversation that most EA-capable people are also capable of performing comparably well in ETG roles. In a movement with many STEM-oriented individuals, this may be true on average, but it’s not clear to me that it’s true in general. Though it’s obviously important to be intelligent, analytical, rational, etc. in many high-impact EA roles, the skills required to get and keep a job as, say, a senior software engineer are highly specific. They require a significant investment of time and energy to acquire, and the highest-earning positions are as competitive as (or more competitive than) top EA jobs. For EAs without STEM backgrounds, this is a very long road, and being very smart isn’t necessarily enough to make it all the way.
Some EAs seem capable of making these investments solely for the sake of ETG and the opportunity for an intellectual challenge. Others find it difficult to stay motivated to make these investments when they feel they have already made significant personal investments in building skills that would be uniquely useful in a direct role and might not have the same utility in an ETG role. Familiarity with the development literature, for example, is relatively hard-won and not particularly well-compensated outside EA.
I recognize that there’s a sort of collective action problem here: there simply cannot be a direct EA role for every philosophy MA or social scientist. But I wanted to argue here that the apparent EA preference for direct roles makes some good amount of sense.
I myself have split the difference, working as a data scientist at a socially-minded organization that I hope to make more “EA-aware” and giving away a fixed percentage of my earnings. I make less than I would in a more competitive role, but I believe there is some possibility of making a positive impact through the work itself. This is my way of dealing with career uncertainty and I’m curious to hear everyone’s thoughts on it.
Hey Wyatt, this is impressive! Your writing is very clear and the document overall is very digestible (I mean that as a genuine compliment). “Life stewardship” seems a reasonable enough lens with which to view these issues. I know you’re still writing, so this may be premature, but I think it’s probably possible to significantly pare down this document without sacrificing meaning, perhaps by more than half.
It might help us to know who the target audience is for this work. I think EAs will find these concepts familiar and may appreciate your framing; your thoughts may or may not resonate with or convince them. There is probably also some segment of the general public that will find this interesting.
As a work of political philosophy, I think the book is a little bit hamstrung by a lack of engagement with other work in the field. Without speaking to your specific arguments, I feel confident in saying that this will probably create some resistance among readers who have a serious interest in philosophy. Political and moral philosophers have, of course, been struggling with some of these issues for centuries, and I think it’s vital to build on, respond to, rebut, and otherwise integrate the large body of existing literature that you’re making a good-faith effort to contribute to.
Some very interesting thoughts here. I think your final points are excellent, particularly #2. It does seem that experts in some fields have a hard-won humility about the ability of data to answer the central questions in their fields, and that perhaps we should use this as a sort of prior guideline for distributing future research resources.
I just want to note that I think the focus on sample size here is somewhat misplaced. N = 200 is by no means a crazily small sample for an RCT, particularly when units are villages, administrative units, etc. As you note, suitably large effect sizes are reliably statistically distinguishable from zero in this context. This is true even with considerably smaller samples—even N = 20! Randomizations even of small samples are relatively unlikely to be badly unbalanced on confounders, and the p-values yielded by now-common methods like randomization inference capture exactly how often chance assignment alone would produce an effect as large as the one observed. To me—and I mean this exclusively in the context of rigorously designed and executed RCTs—this concern can be addressed by greater attention to the actual size of the resulting p-values: our threshold for accepting the non-null finding of a high-variance, small-sample RCT should perhaps be set much lower.
It is true that when there is high variance across units, statistically significant effects are necessarily large; this can obviously lead to some misleading results. Your point is well-taken in this context: if, for example, there are only 20 administrative units in country X, and we are able to randomize some educational intervention across units that could plausibly increase graduation rates only by 1%, but the variance in graduation rates across units is 5%, well, we’re unlikely to find anything useful. But it remains statistically possible to do so given a strong enough effect!
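To make the small-sample point concrete, here is a minimal sketch of randomization inference on a hypothetical 20-village RCT. All of the numbers (graduation rates, effect size, noise) and the function name are invented for illustration, and I’m using plain numpy rather than any particular trial-analysis package:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_inference_pvalue(outcomes, treated, n_perm=5000, rng=rng):
    """Two-sided randomization-inference p-value for a difference in means.

    Under the sharp null of no effect for any unit, outcomes are fixed and
    only the treatment assignment varies, so we re-randomize the labels and
    ask how often a reshuffled assignment produces a difference at least as
    large as the one observed.
    """
    observed = outcomes[treated].mean() - outcomes[~treated].mean()
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(treated)
        stat = outcomes[perm].mean() - outcomes[~perm].mean()
        if abs(stat) >= abs(observed):
            count += 1
    return count / n_perm

# Hypothetical data: 20 villages, 10 treated, graduation rates in percent,
# with a large true effect (~5 points) relative to unit-level noise (sd ~2).
control = rng.normal(70, 2, size=10)
treatment = rng.normal(75, 2, size=10)
y = np.concatenate([control, treatment])
d = np.array([False] * 10 + [True] * 10)

p = randomization_inference_pvalue(y, d)
print(p)  # small p-value despite N = 20
```

Flip the numbers around (a 1-point effect against 5-point noise, as in your example) and the same procedure will correctly return a large p-value; the point is only that the method itself is valid at small N.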
Thanks for writing this. I want to emphasize a point you make implicitly here, which is that it’s not always clear when ITN is being used as an informal heuristic and when it’s being used for actual or abstract calculation. I think arguments made previously by Rob Wiblin and John Halstead about the conceptual and practical difficulties of this approach make it clear that it is not a suitable method for rigorously ranking causes.
Still, I think it remains a valuable heuristic and a guide for more exhaustive calculations. Though neglectedness may be the wobbliest aspect, it’s a (generally) good proxy for the potential value of additional resources when in-depth information on marginal returns to a candidate cause area isn’t immediately available.