Currently Research Director at Founders Pledge, but posts and comments represent my own opinions, not FP’s, unless otherwise noted.
I worked previously as a data scientist and as a journalist.
Some very interesting thoughts here. I think your final points are excellent, particularly #2. It does seem that experts in some fields have a hard-won humility about the ability of data to answer the central questions in their fields, and that perhaps we should use this as a sort of prior guideline for distributing future research resources.
I just want to note that I think the focus on sample size here is somewhat misplaced. N = 200 is by no means a crazily small sample size for an RCT, particularly when units are villages, administrative units, etc. As you note, suitably large effect sizes are reliably statistically distinguishable from zero in this context. This is true even with considerably smaller samples—even N = 20! Randomizations even of small samples are relatively unlikely to be unbalanced on confounders, and the p-values yielded by now-common methods like randomization inference express exactly this likelihood. To me—and I mean this exclusively in the context of rigorously designed and executed RCTs—this concern can be addressed by greater attention to the actual size of resulting p-values: our threshold for accepting the non-null finding of a high-variance, small-sample RCT should perhaps be some very much lower value.
It is true that when there is high variance across units, statistically significant effects are necessarily large; this can obviously lead to some misleading results. Your point is well-taken in this context: if, for example, there are only 20 administrative units in country X, and we are able to randomize some educational intervention across units that could plausibly increase graduation rates only by 1%, but the variance in graduation rates across units is 5%, well, we’re unlikely to find anything useful. But it remains statistically possible to do so given a strong enough effect!
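To make the randomization-inference point concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a description of any particular study: the function names, the baseline graduation rate of 70, and the reading of "variance of 5%" as a standard deviation of 5 percentage points are all hypothetical choices. The p-value is simply the share of re-drawn placebo assignments that produce a difference in means at least as large as the one observed.

```python
import random
from statistics import mean


def randomization_p_value(outcomes, treated, n_draws=5000, seed=0):
    """Two-sided randomization-inference p-value for a difference in means.

    Re-draws the treatment assignment many times (keeping the number of
    treated units fixed) and counts how often a placebo assignment yields
    a difference at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    n_treated = sum(treated)
    units = list(range(len(outcomes)))

    def diff_in_means(assignment):
        t = [y for y, a in zip(outcomes, assignment) if a]
        c = [y for y, a in zip(outcomes, assignment) if not a]
        return mean(t) - mean(c)

    observed = diff_in_means(treated)
    extreme = 0
    for _ in range(n_draws):
        chosen = set(rng.sample(units, n_treated))
        placebo = [i in chosen for i in units]
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(placebo)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_draws


def simulate_trial(effect, sd=5.0, n_units=20, seed=1):
    """Hypothetical data: n_units administrative units, half treated.

    Assumptions (not from any real study): baseline rate of 70,
    unit-level noise with standard deviation `sd`, constant additive effect.
    """
    rng = random.Random(seed)
    chosen = set(rng.sample(range(n_units), n_units // 2))
    treated = [i in chosen for i in range(n_units)]
    outcomes = [rng.gauss(70.0, sd) + (effect if t else 0.0) for t in treated]
    return outcomes, treated


if __name__ == "__main__":
    for effect in (1.0, 25.0):  # a 1-point effect vs. a very large one
        y, t = simulate_trial(effect)
        obs, p = randomization_p_value(y, t)
        print(f"true effect={effect:5.1f}  observed diff={obs:6.2f}  p={p:.4f}")
```

Running the two cases side by side should show the pattern described above: with 20 units, only a very strong effect is reliably distinguishable from zero, and the size of the p-value itself carries the information about how surprising the result is.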
Hey Wyatt, this is impressive! Your writing is very clear and the document overall is very digestible (I mean that as a genuine compliment). “Life stewardship” seems a reasonable enough lens with which to view these issues. I know you’re still writing, so this may be premature, but I think it’s probably possible to significantly pare down this document without sacrificing meaning, perhaps by more than half.
It might help us to know who the target audience is for this work. I think EAs will find these concepts familiar and may appreciate your framing; your thoughts may or may not resonate/convince. There is probably also some segment of the general public that will find this interesting.
As a work of political philosophy, I think the book is a little bit hamstrung by a lack of engagement with other work in the field. Without speaking to your specific arguments, I feel confident in saying that this will probably create some resistance among readers who have a serious interest in philosophy. Political and moral philosophers have, of course, been struggling with some of these issues for centuries, and I think it’s vital to build on, respond to, rebut, and otherwise integrate the large body of existing literature that you’re making a good-faith effort to contribute to.
I imagine that there a large fraction of EAs who expect to be more productive in direct work than in an ETG role. But I’m not too clear why we should believe that.
I think that for some of us this is a basic assumption. I can only speak to this personally, so please ignore me if this isn’t a common sentiment.
First, direct roles are (in principle) high-leverage positions. If you work, for example, as a grantmaker at an EA org, a 1% increase in your productivity or aptitude could translate into tens of thousands of dollars more in funds for effective causes. In many ETG positions, a 1% increase in productivity is unlikely to result in any measurable impact on your earnings, and even an earnings impact proportional to the productivity gain would be negligible in absolute terms. So I tend to feel like, all other things being equal, my value is higher in a direct role.
But I don’t think all other things are even equal. There seems to be an assumption underlying the ETG conversation that most EA-capable people are also capable of performing comparably well in ETG roles. In a movement with many STEM-oriented individuals, this may be a statistical truth, but it’s not clear to me that it’s necessarily true. Though it’s obviously important to be intelligent, analytical, rational, etc. in many high-impact EA roles, the skills required to get and keep a job as, say, a senior software engineer, are highly specific. They require a significant investment of time and energy to acquire, and the highest-earning positions are as competitive as (or more competitive than) top EA jobs. For EAs without STEM backgrounds, this is a very long road, and being very smart isn’t necessarily enough to make it all the way.
Some EAs seem capable of making these investments solely for the sake of ETG and the opportunity for an intellectual challenge. Others find it difficult to stay motivated to make these investments when we feel we have already made significant personal investments in building skills that would be uniquely useful in a direct role and might not have the same utility in an ETG role. Familiarity with the development literature, for example, is relatively hard-won and not particularly well-compensated outside EA.
I recognize that there’s a sort of collective action problem here: there simply cannot be a direct EA role for every philosophy MA or social scientist. But I wanted to argue here that the apparent EA preference for direct roles makes some good amount of sense.
I myself have split the difference, working as a data scientist at a socially-minded organization that I hope to make more “EA-aware” and giving away a fixed percentage of my earnings. I make less than I would in a more competitive role, but I believe there is some possibility of making a positive impact through the work itself. This is my way of dealing with career uncertainty and I’m curious to hear everyone’s thoughts on it.
Thanks for responding. I’ve now reread your post (twice) and I feel comfortable in saying that I twisted myself up reading it the first time around. I don’t think my comment is directly relevant to the point you’re making, and I’ve retracted it. The point is well-taken, and I think it holds up.
Thanks for writing this! I take the broader point and I think you provide good reasons to think that international trade deserves more attention as an effective intervention.
I may be missing something, but I’m really not sure what to make of that $200k number. It seems low intuitively, but a little examination makes it seem even stranger. In 2018, about $3.5 billion was spent on lobbying. In the 115th Congress (2017-2019), 443 bills were passed, as in, actually became law. So it seems reasonable to say that about 200 bills became law in 2018. That’s almost twenty million dollars per bill. And that’s in a weird idealized scenario where spending on lobbying gets the bill passed and where all lobbying money is being spent on lobbying-for (not lobbying-against) and where the money is evenly divided across bills.
We have no idea what the distribution of effectiveness looks like, and I totally buy the idea that some bills can be passed with only $200k in lobbying funds, but that would be true at the tails of the distribution, not in expectation.
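For anyone who wants to check the back-of-the-envelope arithmetic, here is a small Python sketch. The inputs are the rough figures quoted in this comment ($3.5 billion in 2018 lobbying spending, about 200 laws in 2018, and the $200k estimate under discussion); the function name is just for illustration, and the even-split assumption is the same idealization described above.

```python
def spend_per_law(total_spend, n_laws):
    """Average lobbying dollars per enacted law, assuming an even split."""
    return total_spend / n_laws


TOTAL_LOBBYING_2018 = 3.5e9  # rough figure quoted in the comment
LAWS_IN_2018 = 200           # rough half of the 443 laws of the 115th Congress
CLAIMED_COST = 200_000       # the $200k estimate being questioned

average = spend_per_law(TOTAL_LOBBYING_2018, LAWS_IN_2018)
ratio = average / CLAIMED_COST

print(f"Average spend per law: ${average:,.0f}")
print(f"That is {ratio:.1f}x the $200k figure")
```

Under these assumptions the naive average comes out well over an order of magnitude above $200k, which is why that figure looks like a tail outcome rather than an expectation.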
I’ve been following this series and I’m really enjoying it. I’m curious if you’ve thought about Fermi-like paradoxes in a general way and if you have any thoughts on extending your analysis here to other domains. You are probably familiar with Sandberg et al.’s proposed resolution of the Fermi paradox, but your framing of the issue has got me thinking about other similar (though perhaps less mystifying) paradoxes out there. The lenses you apply here (e.g. humaneness/treachery) seem like they could equally be applicable in other domains. A couple other examples:
• It seems like far-right terrorism in the U.S. is relatively rare despite the (again, relative) prevalence of militant views and easy access to firearms
• I often wonder why bookstores don’t burn down more often, since arsonists and pyromaniacs exist (and arson is fairly common) and bookstores are among the easiest pickings.
I haven’t read this book and I’m also not an expert, so my confidence on this comment is low.
But-
Although nuclear weapons seem to have at best a quite limited substantive impact on actual historical events, they have had a tremendous influence on our agonies and obsessions, inspiring desperate rhetoric, extravagant theorizing, wasteful expenditure, and frenetic diplomatic posturing
Not only have nuclear weapons failed to be of much value in military conflicts, they also do not seem to have helped a nuclear country to swing its weight or “dominate” an area
Wars are not caused by weapons or arms races, and the quest to control nuclear weapons has mostly been an exercise in irrelevance
As a relative layman, I find claims like these puzzling. This is primarily because the “agonies and obsessions … desperate rhetoric, extravagant theorizing, wasteful expenditure, and frenetic diplomatic posturing” that Mueller apparently dismisses drove the course of history for the half-century following the Second World War.
It’s hard to imagine that the Cold War would have occurred at all in the absence of nuclear weapons. While it’s true that the first nukes didn’t pose much more serious a threat than a large-scale firebombing, it was barely more than a decade after the war that much more destructive weapons were being built. A successful conventional Soviet assault on the U.S. mainland was, as far as I know, never a serious possibility. It seems clear that the terror of that period was driven by the nuclear threat, and that the nuclear threat drove U.S. and Soviet strategic posture, which also influenced foreign aid, trade policy, etc. Even if their danger is exaggerated, perception of their danger (in my view an unavoidable perception—even the Joint Chiefs were prepared to nuke Cuba during the missile crisis despite knowing that the strategic situation had not appreciably changed) had serious effects.
Also, and again, not an expert (and I’d like to know if Mueller addresses this specific case) but of course Israel has been a nuclear power since as early as 1979. Before that date, Israel fought three major wars and dozens of smaller engagements with its neighbors. Since then, virtually all of Israel’s military conflicts have been essentially counterinsurgency or against state proxies such as Hezbollah. It’s often argued that Israel’s status as a nuclear power has driven Iran’s efforts in that arena, which has also influenced Saudi belligerence; this conflict has affected oil prices, domestic politics in both countries, the ongoing war in Yemen, etc. This is kind of a long DAG, but I feel like there are other examples like this, and I find it sort of hard to accept the position that the simple existence of nuclear weapons hasn’t been immensely consequential.
I think this is a great and really sensible way to think about things. It’s really natural, and the physics analogy provides some intuition behind why that is. A question: have you thought about how this way of thinking is in some sense “baked into” certain moral frameworks? I’m thinking specifically here of rule utilitarianism: rules can apply at different scales. It seems to me that at the personal level rule utilitarianism is basically instantiated as virtue ethics.
Thank you so much for writing this. This is one of my central areas of interest, and I’ve been puzzled by the comparative lack of resources expended by the EA community on institutional decision-making given the apparently high degree of importance accorded to it by many of us.
This is a great guide. I agree that the central question here is whether or not deliberative democracy leads to better outcomes. If it does, or even if it probably does, it seems that it’s easily one of the highest-value potential cause areas, since the levers that influence many other cause areas are within reach of democratic polities.
With that in mind, it seems clear to me that the primary way in which deliberation is EA-relevant is as a large-scale decision making mechanism. So it seems like relatively small-scale uses are not very important to us, and it also seems like information about these successes may not be useful, given that instituting these mechanisms at a large scale is likely to present problems that differ in kind, not just in degree. I’d love to hear your thoughts on that.
I have a few other thoughts about this review, and I’d like to hear your responses if you have the time.
• Basically all of the cross-country comparisons in this review suffer from reverse causation. Countries that have lots of deliberation and good outcomes don’t necessarily have the former causing the latter; the former could rather be just another instance of the latter. As enthused as I am about deliberative democracy, this scenario seems just as likely as the causal one. Is there any reason to view these correlations as suggestive of a causal effect?
• It seems like this review contains a relative paucity of research supporting the null hypothesis that deliberation does not improve decision making (or, for that matter, the alternative hypothesis that it actually worsens decision making). Were you unable to find studies taking this position? If so, how worried are you about the file-drawer effect here?
• Based on your reading of all this evidence, I’d love to hear your subjective first impressions- what do you personally feel is the “best bet” for enacting deliberative democracy on a large scale somewhere besides China? How far do you think this could feasibly go and how long would you expect such a change to take? Very wide confidence bands on these estimates are fine, of course.
I’ve always wondered about the “first N Google results” strategy. Even in the absence of a file-drawer effect, isn’t this more likely to turn up papers making positive claims (on the assumption that e.g. rejections of the null are more likely to be cited than inconclusive results)?
Thanks!
Curious to know: how many of these papers was TERRA already aware of before they were uncovered by the algo?
Hey Aaron! Thanks for posting this. I am likely going to include CES in my giving this year as a result of some of the points you’ve made here.
I’ve been researching lobbying recently and I’m curious about this passage:
We will lobby legislators. Because of approval voting’s simplicity, there are opportunities for lobbying elected officials. Normally, this isn’t an option because of the conflict of interest with those elected. But the opportunity presents itself when the party in power suffers because of vote splitting yet wants to avoid implementing a complex method. There are places where RCV is stalled out where we have opportunities. These are typically higher risk but very high reward since they don’t require the same resources as a campaign. Our estimate is that they can be one sixth the expected cost per citizen compared to ballot measures when factoring in their relative probability of success. This also requires funding for a 501(c)4 to do this effectively at scale.
I’m not particularly skeptical about this one-sixth estimate, but I haven’t been able to find anything like it in my lit review! Do you have some background on this research?
I’ll post a summary lit review here on the forum when I’m done with my research. Spoiler alert: political scientists don’t have a great idea of how/why/whether lobbying works and research on its effectiveness is almost strictly limited to trade policy and large publicly traded firms. So you get expressions of effects like “$140 in additional shareholder value for every $1 spent on lobbying.” Interesting, but not particularly generalizable.
It seems like CES’s strategy so far has been to start small, which makes obvious sense. I’m curious to know when/if you make the decision to withdraw from a local advocacy effort that seems like it’s not paying off. It’s not obvious to me that public support is monotonically increasing in dollars spent on advocacy— what’s your stopping rule?
This part of the discussion really rang true to me, and I want to hear more serious discussion on this topic. To many people outside the community it’s not at all clear what AI research, animal welfare, and global poverty have in common. Whatever corner of the movement they encounter first will guide their perception of EA; this obviously affects their likelihood of participation and the chances of their giving to an effective cause.
We all mostly recognize that EA is a question and not an answer, but the question that ties these topics together itself requires substantial context and explanation for the uninitiated (people who are relatively unused to thinking in a certain way). In addition, entertaining counterintuitive notions is a central part of lots of EA discourse, but many people simply do not accept counterintuitive conclusions as a matter of habit and worldview.
The way the movement is structured now, I fear that large swaths of the population are basically excluded by these obstacles. I think we have a tendency to write these people off. But in the “network” sense, many of these people probably have a lot to contribute in the way of skills, money, and ideas. There’s a lot of value—real value of the kind we like to quantify when we think about big cause areas—lost in failing to include them.
I recognize that EA movement building is an accepted cause area. But I’d like to see our conception of that cause area broaden by a lot— even the EA label is enough to turn people off, and strategies for communication of the EA message to the wider world have severely lagged the professionalization of discourse within the “community.”
The EA movement is disproportionately composed of highly logical, analytically minded individuals, often with explicitly quantitative backgrounds. The intuitive-seeming folk explanation for this phenomenon is that EA, with its focus on rigor and quantification, appeals to people with a certain mindset, and that the relative lack of diversity of thinking styles in the movement is a function of personality type.
I want to reframe this in a way that I think makes a little more sense: the case for an EA perspective is really only made in an analytic, quantitative way. In this sense, having a quantitative mindset is actually a soft prerequisite for “getting” EA, and therefore for getting involved.
I don’t mean to say that only quantitative people can understand the movement, or that there’s something intellectually very special about EAs.
Rather, very few people would disagree that charity should be effective. Even non-utilitarians readily agree that in most contexts we should help as many people as we can. But the essential concepts for understanding the EA perspective are highly unfamiliar to most people:
• Expected value
• Cost-benefit analysis
• Probability
• An awareness of the abilities and limitations of social science
You don’t need to be an expert in any of these areas to “get” EA. You just need to be vaguely comfortable with them in the way that people who have studied microeconomics or analytic philosophy or mathematics are, and most other people aren’t.
This may be a distinction without a difference, but I want to raise the perspective that the composition of the EA movement is less about personality types and more about intellectual preparation.
Thanks for your thoughts. I wasn’t thinking about the submerged part of the EA iceberg (e.g. GWWC membership), and I do feel somewhat less confident in my initial thoughts.
Still, I wonder if you’d countenance a broader version of my initial point- that there is a way of thinking that is not itself explicitly quantitative, but that is nonetheless very common among quantitative types. I’m tempted to call this ‘rationality,’ but it’s not obvious to me that this thinking style is as all-encompassing as what LW-ers, for example, mean when they talk about rationality.
The examples you give of commonsensical versions of expected value and probability are what I’m thinking about here- perhaps the intuitive, informal versions of these concepts are soft prerequisites. This thinking style is not restricted to the formally trained, but it is more common among them (because it’s trained into them). So in my (revised) telling, the thinking style is a prerequisite and explicitly quantitative types are overrepresented in EA simply because they’re more likely to have been exposed to these concepts in either a formal or informal setting.
The reason I think this might be important is that I occasionally have conversations in which these concepts—in the informal sense—seem unfamiliar. “Do what has the best chance of working out” is, in my experience, a surprisingly rare way of conducting everyday business in the world, and some people seem to find it strange and new to think in that fashion. The possible takeaway is that some basic informal groundwork might need to be done to maximize the efficacy of different EA messages.
Thanks for the writeup!
If the recent Bill Gates documentary on Netflix is to be believed, then Gates first became seriously aware of the problem of diarrhea in the developing world thanks to a 1998 column by Nicholas Kristof. It’s hard to assess the counterfactual here (would Gates have encountered the issue in a different context? Would he have taken the steps he ultimately did after reading the Kristof piece?) but it seems plausible that Kristof’s article constitutes a cost-effective intervention in its own right (if a not particularly targeted one).
I bring this up because I’m intrigued by the viral coverage of your clean energy research. It’s not possible to quantify the impact of an article like this in any realistic way, but perhaps we can agree that a plausible distribution of beliefs about its value is close to strictly positive.
Future Perfect being what it is, it’s obviously the case that Vox constitutes an unusually receptive channel for EA-adjacent research. But I’m curious if you consider the wide propagation of your research in the news media a “risky and very effective” project, and if your research products have been intentionally structured toward this end. If you have some takeaways from your big success so far, it could be very helpful to post them here- widely taken-up tweaks to make research propagate more effectively through the media are marginal improvements with potentially very high value.
Given standard models of rational voter ignorance (and rational irrationality, etc.), this shouldn’t be surprising. Oversimplifying for a moment, the electorate’s middle are in all likelihood systematically mistaken about the sort of policies that would advance their interests; and when you pair these voters with political leaders who are incentivized to pander, we have a recipe for occasional disaster. I see no reason why this wouldn’t occur in a system with approval voting in the same way that it occurs in our current system.
I can think of one reason: rational ignorance is partially a consequence of the voting procedure used. People have less of an incentive to be ignorant when their votes matter more, as they would with approval voting. I don’t have a strong stance on this, but I think it’s important to recognize that studies about voter ignorance are not yielding evidence of an immutable characteristic of citizens; the situation is actually heavily contingent.
In the first few pages of The Myth of the Rational Voter, Bryan Caplan makes (implicitly) the case that voter ignorance isn’t a huge deal as long as errors are symmetric: ignorant voters on both sides of an issue will cancel each other out, and the election will be decided by informed voters who should be on the “right” side, in expectation. Caplan claims that systematic bias across the population results in “wrong” answers.
My point in bringing this up is just that the existence of large numbers of ignorant voters doesn’t have to be a major issue: large elections are decided by relatively small groups. Different voting procedures have very different ramifications for the composition of these small groups.
Fantastic work! In your post introducing this initiative you wrote that the base rate for passage of ballot initiatives was 11%. A conservative reading of the data here (taking the low value of $20m for development funding raised) seems to indicate a 100:1 return on investment. Taking the base rate into account, this implies roughly $10 in effective development aid for every $1 spent on advocacy (in expectation). If the development aid is effectively spent, the implication here is that money spent on an initiative like this might be ten times as effective in expectation as money donated directly to a top-rated charity. This assumes, of course, that the base rate is accurate.
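The expected-value step here can be written out in a few lines of Python. This is only a sketch of the arithmetic in this comment: the 100:1 return and the 11% base rate are the figures quoted above, the function names are hypothetical, and the break-even calculation assumes (as the comment does implicitly) that a dollar given directly buys one dollar of equally effective aid.

```python
def expected_aid_per_dollar(return_if_passed, pass_probability):
    """Expected aid dollars generated per advocacy dollar."""
    return return_if_passed * pass_probability


def break_even_probability(return_if_passed):
    """Passage probability at which advocacy merely matches direct giving.

    Assumes a directly donated dollar yields exactly one dollar of
    equally effective aid (an illustrative simplification).
    """
    return 1.0 / return_if_passed


RETURN_IF_PASSED = 100.0  # 100:1 return if the initiative passes
BASE_RATE = 0.11          # quoted base rate for ballot-initiative passage

expected = expected_aid_per_dollar(RETURN_IF_PASSED, BASE_RATE)
threshold = break_even_probability(RETURN_IF_PASSED)

print(f"Expected aid per advocacy dollar: ${expected:.2f}")
print(f"Break-even passage probability: {threshold:.1%}")
```

Multiplying through gives a figure slightly above the round "ten times" used in the text, and the break-even line shows how far the base rate could fall before the advantage over direct giving disappears under these assumptions.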
In that initial post, you had an exchange with Stefan Schubert about the relevance of your assumed base rate. You discussed the importance of polling at that point but it’s not clear to me where you left off.
This success really seems to highlight the importance of public opinion polling here. The value of information in this domain is very high, since you’re trying to identify the avenue which will provide the greatest leverage. Choosing the wrong avenue has no value, and potentially even minor reputational costs for your organization or for EA in general. Choosing the right avenue has huge upsides.
Public opinion polling seems crucial to this end. In this scenario, prior polling might have allowed you to identify a reasonable figure beforehand (avoiding the $87 million overreach). More importantly, though (if I understand the procedure correctly), it might have enabled you to avoid the counterproposal process and to pinpoint an optimal figure to ask for—perhaps one higher than the one you ultimately got.
I don’t want to diminish the achievement here, which I think is huge; I just want to point out that extremely useful information for this effort can be retrieved from the public at relatively low cost. In the future, this information can be used to reduce the uncertainty around efforts to fund ballot proposals and increase the expected value of these efforts by lowering the probability of failure in expectation.
Thanks for writing this. I want to emphasize a point you make implicitly here, which is that it’s not always clear when ITN is being used as an informal heuristic and when it’s being used for actual or abstract calculation. I think arguments made previously by Rob Wiblin and John Halstead about the conceptual and practical difficulties of this approach make it clear that it is not a suitable method for rigorously ranking causes.
Still, I think it remains a valuable heuristic and a guide for more exhaustive calculations. Though neglectedness may be the wobbliest aspect, it’s a (generally) good approximation of the possibility for additional value when in-depth information on possible marginal returns to a candidate cause area is not immediately available.