Late response, but it may still be of interest: some colleagues and I spent some time surveying the existing literature on China x AI issues, and the resource list we produced includes a section on "Key actors and their views on AI risks". In general, I'd recommend the Concordia AI Safety newsletter for regular news of Chinese actors commenting on AI safety (and, more or less directly, on related x-risks).
Thanks for reiterating the distinction; it seems quite helpful for thinking about the topic (on first consideration; I'll have to mull it over a bit more over the next few days to really understand how the distinction fits into, and may shift, my thinking)!
I partially (largely?) agree with your comments. It seems right that in specific decision situations, it will often not be relevant to consider how prior influences account for (and take away from the individual impact of) my own actions or the actions of a person I'm trying to influence. But I do think it's useful to remain aware that our actions are so heavily influenced by others, and especially that our actions will in turn help shape the behaviour (and thoughts, and attitudes) of many other people. Remaining aware of that fact pushes against evaluating actions only on the basis of how much counterfactual impact one can expect from a single isolated action. Since all actions result from many preceding actions, each of which (often) contributes only a small part to shaping the outcome, it becomes conceivable that actions with low direct counterfactual impact can still be quite important and justifiable when considered from a perspective of behavioural heuristics or collective rationality (both of which recognise that some hugely important outcomes can only ever be attained if many people decide to take actions that have low expected impact on their own).
Thanks for your comment, very happy to hear that my post struck you as clear and thorough (I’m never sure how well I do on clarity in my philosophical writing, since I usually retain a bit of confusion and uncertainty even in my own mind).
I agree that many dangers of internalizing the "heavy-tailed impact" perspective in the wrong way stem from misguided inference rather than from any strictly necessary implication of the perspective itself.
Not least thanks to input from several comments below, I am back to reconsidering my stance on the claims made in the essay around empirical reality and around appropriate conceptual frameworks. I have tangentially encountered Shapley values before but not yet really tried to understand the concept, so if you think they could be useful for the contents of this post, I’ll try to find the time to read the article you linked; thanks for the input!
I share the wariness you mention re arguments of the form "even if X is true, believing/saying it has bad consequences, so we shouldn't believe/say X." At the same time, I don't think these arguments are always completely groundless (at least the arguments about refraining from saying something; I'm much more inclined to agree that we should never believe something just for the sake of the supposedly better consequences of believing it). I also tend to be more sympathetic to these arguments when X is very hard to know ("we don't really have means to tell whether X is true, and since believing X might well have bad side-effects, we should not claim that X, and maybe we should even make an effort to debunk the certainty with which others claim that X"). But yes, I agree that wariness (though maybe not unconditional rejection) around arguments of this form is generally warranted, to avoid misguided dogmatism in a flawed attempt to prevent (supposed) information hazards.
Thanks for mentioning this! I definitely agree that Sinocism is an interesting newsletter to follow if one wants to stay up-to-date on China-related events (from a US perspective). I think it falls outside of what we are considering for this list, since Sinocism only occasionally mentions tech- or AI-related issues and rarely discusses them in-depth (afaik). But I will add this to the list of “potential items to include”, which we (the authors of this list) will discuss at regular, but quite spaced-out, intervals (tentative plan, subject to revision if more updates seem needed: annual or maybe semi-annual).
Yikes, I linked to the wrong Todd article there, apologies! Meat consumption is mentioned in Todd 2021(2023):
We’ve basically found this pattern wherever data is available.
I’ll add the source to that part of the essay, thanks for the alert!
Appreciate the attempt to make headway on the disagreement!
I feel pretty lost when trying to quantify impact at these percentiles. Taking concerns about naive attribution of impact into consideration, I don't even really know where to start when trying to come up with numbers here. I just notice that I have a strong intuition, backed up by something that seems to me like a plausible claim: given that myriad actors always contribute to any outcome, it is hard to imagine that there is one individual (or a very few) who does all of the heavy lifting...
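To illustrate just how underdetermined these numbers are, here is a minimal sketch (in Python, with invented tail exponents) of how strongly percentile ratios depend on the assumed shape of a power law, which is exactly the parameter I see no principled way to pin down:

```python
# Percentile ratios for a Pareto distribution with minimum 1 and shape alpha.
# The quantile function is Q(p) = (1 - p) ** (-1 / alpha).
def pareto_quantile(p, alpha):
    return (1 - p) ** (-1 / alpha)

for alpha in [0.8, 1.1, 2.0, 3.0]:  # illustrative tail exponents (my assumption)
    median = pareto_quantile(0.5, alpha)
    p999 = pareto_quantile(0.999, alpha)
    print(f"alpha={alpha}: 99.9th percentile is {p999 / median:,.0f}x the median")
```

A modest change in the assumed tail exponent moves the 99.9th-percentile-to-median ratio by orders of magnitude, which is part of why I feel lost trying to quantify impact at these percentiles.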
"And how much spread do we need to get here in order to justify a lot of attention going into looking for tail-upsides?" is also a good question. I think my answer would be: it depends on the situation and on how large the up- or downsides of looking for tail-upsides are. If we're cautious about the possible adverse effects of impact-maximizing mindsets, I agree that it's often sensible to look for tail-upsides even if they would "only" allow us to double impact. Then there are some situations/problems where I believe the more appropriate guide is the collective rationality mindset, which asks "how should I and my fellows behave in order to succeed as a community?" rather than "how should I act now to maximize the impact I can have as a relatively direct/traceable outcome of my own action?"
This is a good question. If we assume everything else is equal (neither got the money by causing harm; both were influenced by roughly the same number of actors to be able and willing to donate), then I think I agree that the altruistic impact of the first is 100x that of the second.
I am not entirely sure what that implies for my own thinking on the topic. On the face of it, it clearly contradicts the conclusion in my Empirical problem section. But it does so without, as far as I can tell, addressing the subpoints I mention in that section. Does that mean the subpoints are not relevant to the empirical claim I make? They seem relevant to me, and that seems clear in examples other than the one you presented. I’m confused, and I imagine I’ll need at least a few more days to figure out how the example you gave changes my thinking.
Update: I am currently working on a Dialogue post with JWS to discuss their responses to the essay above and my reflections since publishing it. I imagine/hope that this will help streamline my thinking on some of the issues raised in comments (as well as some of the uncertainties I had while writing the essay). For that reason, I’ll hold off on comment responses here and on updates to the original essay until work on the Dialogue post has progressed a bit further, hoping to come back to this in a few days (max 1-1.5 weeks?) with a clearer take on whether & how comments such as this one by Jeff shift my thinking. Thanks again to all critical (and supportive) commenters for kicking off these further reflections!
I think counterfactual analysis as a guide to making decisions is sometimes (!) a useful approach (especially if it is done with appropriate epistemic humility in light of the empirical difficulties).
But, tentatively, I don't think that it is a valid method for calculating the impact an individual has had (or can be expected to have, if you calculate ex ante). I struggle a bit to put my thinking on this into words, but here's an attempt: If I say "Alec [a random individual] has saved 1,000 lives", I think what I mean is "1,000 people now live because of Alec alone". But if Alec was only able to save those lives with the help of twenty other people, and the 1,000 people would now be dead were it not for those twenty helpers, then it seems wrong to me to claim that the 1,000 survivors are alive only because of Alec, even if Alec played a vital role in the endeavour and could not have been replaced by some other random individual. And just because any one of the twenty helpers was easily replaceable, I don't think they all suddenly count for nothing/very little in the impact evaluation; the fact remains that Alec could not have had any impact without twenty other people to help him... So it seems like an individual impact evaluation would need to include some sharing of credit between Alec and the twenty helpers, wouldn't it?
Correct me if you (anyone reading this) think I’m misguided, but I believe the crux here is that I’m using a different definition of “impact” than the one that underlies counterfactual analysis. I agree that the impact definition underlying counterfactual analysis can sometimes be useful for making individual decisions, but I would argue that the definition I use can be helpful when talking about efforts to do good as a community and when trying to build a strategy for how to live one’s life over the long term (because it looks at what is needed for positive change in the aggregate).
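For anyone who, like me, hadn't engaged with Shapley values before: here is a minimal sketch (in Python, with a hypothetical all-or-nothing rescue and invented numbers) of the kind of credit sharing I'm gesturing at, scaled down to Alec plus three helpers so the full computation stays small:

```python
from itertools import permutations

# Toy coalition game: 1,000 lives are saved only if Alec AND all helpers
# take part; any missing member means zero lives saved (hypothetical numbers).
players = ["Alec", "helper_1", "helper_2", "helper_3"]

def value(coalition):
    return 1000 if set(players) <= set(coalition) else 0

# Shapley value: average marginal contribution over all join orders.
shapley = {p: 0.0 for p in players}
orders = list(permutations(players))
for order in orders:
    so_far = []
    for p in order:
        before = value(so_far)
        so_far.append(p)
        shapley[p] += (value(so_far) - before) / len(orders)

for p, v in shapley.items():
    print(f"{p}: credited with saving {v:.0f} lives")
# Each of the 4 players is credited with 250 lives; with Alec plus twenty
# helpers, the same logic would credit each with 1000/21, roughly 48 lives.
```

In this toy game the credit splits equally because every member is essential; the same procedure would split it unequally if some members contributed more than others.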
Wasn’t quite sure where best to respond in this thread, hope here makes decent sense.
I did actually seek to convey the claim that individuals do not differ massively in impact ex post (as well as ex ante, which I agree is the weaker and more easily defensible version of my claim). I was hoping to make that clear in this bullet point in the summary: “I claim that there are no massive differences in impact between individual interventions, individual organisations, and individual people, because impact is dispersed across [many actions]”. So, I do want to claim that: if we tried to apportion the impact of these consequences across contributing actions ex post, then no one individual action is massively higher in impact than the average action (with the caveat that net-negative actions and neutral actions are excluded; we only look at actions that have some substantial positive impact).
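To make the mechanism behind that claim concrete, here is a minimal simulation sketch (in Python, with parameters invented purely for illustration) in which outcome sizes are power-law distributed but, because bigger outcomes are assumed to involve more contributing actions, per-action credit ends up far less extreme:

```python
import random

random.seed(0)

# Invented setup: 10,000 outcomes whose sizes follow a power law
# (Pareto with shape alpha = 1.1, minimum size 1).
outcomes = [random.paretovariate(1.1) for _ in range(10_000)]

# Assumption (mine, for illustration): the number of contributing actions
# scales roughly with outcome size, with a floor of 10 contributors.
def per_action_credit(size):
    contributors = max(10, round(size))
    return size / contributors

sizes = sorted(outcomes)
credits = sorted(per_action_credit(s) for s in outcomes)

def top_to_median(xs):
    return xs[int(len(xs) * 0.999)] / xs[len(xs) // 2]

print(f"outcome sizes:     top 0.1% / median = {top_to_median(sizes):,.1f}x")
print(f"per-action credit: top 0.1% / median = {top_to_median(credits):,.1f}x")
```

Under these (contestable) assumptions the heavy tail in outcomes survives while the tail in per-action credit largely disappears; whether the real world works like this is of course exactly what is at issue.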
That said, I can see how my chosen title may be flawed because a) it leaves out large parts of what the post is about (adverse effects, conceptual debate); and b) it is stronger than my actual claim (the more truthful title would then need to be something like “There are probably no massive differences in impact between individuals (excluding individuals who have a net-negative or no significant impact on the world)”).
I am not sure if I agree that the current title is actively misleading and click-baity, but I take seriously the concern that it could be. I'll mull this over some more and might change the title if I conclude that it is indeed inappropriate. [EDIT: Concluded that changing the title seems sensible and appropriate. I hope that the new title is better able to communicate fully what my post is about.]
I’m obviously not super happy about the downvote, but I appreciate that you left the comment to explain and push me to reconsider, so thank you for that.
Thanks for that thoughtful comment!
Agree that the adverse effects to which I dedicate a large part of the post do not speak to the question of whether impact actually follows a power-law distribution; they are just arguments against thinking about impact in that way. I think I acknowledge that repeatedly in the post, but I can see now that the title makes it sound like I focus mainly on the "Empirical problem".
“I think that after you sort through this kind of consideration you would be able to recover some version of the power law claim basically intact”—I wonder if our disagreement on that is traceable and resolvable, or whether it stems from some pretty fundamental intuitions which it’s hard to argue about sensibly?
ex ante vs. ex post: Interesting that you raise that! I’ve talked to a few people about the ideas in the essay, and I think something like your argument here was the most common response. I think I remain more persuaded by the claim that impact is not power law distributed at all, even ex post and not just because we don’t have the means to predict ex ante. But I agree that the case for a power law distribution is harder to defend ex ante (because of all the uncertainty) than ex post, and my confidence in doubting the claim is stronger for ex ante impact evaluations than it is for ex post evaluations.
True and good point that I basically ignored the benefits of power-law thinking. I’ll consider whether I think my thoughts on these benefits can fit somewhere in the essay, and will update it accordingly if I find an appropriate fit. Thanks for pointing this out!
Your conclusion sounds largely agreeable to me (though I imagine we would disagree when asked to specify how large the “tail-upsides” are that people should look for in a cautious manner).
I appreciate the sentiment and agree that preventing clickbaity titles from becoming more common on the EA forum is a valid goal!
I'd sincerely regret it if my title does indeed fall into the "does not convey what the post is about" category. But as Jeff Kaufman already wrote, I'm not sure I understand in which sense the top-level claim is untrue to the main argument in the post. Is it because only part of the post is primarily about the empirical claim that impact does not differ massively between individuals?
Critique of the notion that impact follows a power-law distribution
China x AI Reference List
Thanks for sharing your reflections here! I was rather taken aback when reading the headline and remain so after reading your “What I think” section:
You write that you’re “glad the BWC exists,” but also that you’re “weakly against the BWC.” Did I understand correctly that you mean that you’re glad that the legal agreement exists but are weakly against supporting or engaging in work to uphold the convention (at least the kind of work that is currently undertaken with that aim)? If so, could you explain whether you think that the durability and strength of the agreement is largely unaffected by such work; or that such work has value but less value than alternative efforts; or that such work has value but should be left to “others” (maybe: “non-EA-minded people” who are less focused on reducing unconventional risks)?
Whatever the answer to the above, I'm somewhat concerned by the normative effects of the phrasing you chose, and would be curious to hear your thoughts on those concerns. What I worry about is that your title and parts of the write-up suggest that the BWC and its surrounding structure are bad, are a waste of resources, and should thus be dismantled, or at least neglected even more than they already are. I share your sentiment that it's good to have the BWC as some backstop against state development of biological weapons programmes; and I believe that this backstop works mostly through normative means (rather than some material enforcement mechanism). I further believe that such normative measures consist of how the BWC is spoken about (whether it is taken seriously, appreciated as important, etc) and of dedicated events that bring together relevant stakeholders (i.e., those people who might contribute to upholding and further developing the convention). My concern is that articles like yours contribute to undermining both.
More coordinated civil society action on reducing nuclear risk
Thanks a lot for taking the time to read & comment, Chris!
Main points
I want to take this opportunity to steelman your case: If the lower neglectedness and higher tractability of civil movement / policy in denuclearisation to less than 300 nuclear weapons (approximate number for not causing a nuclear winter) > higher neglectedness and lower tractability of physical intervention (resilience food and supply chain resilience plan), you might be correct!
I (honestly!) appreciate your willingness to steelman a case that (somewhat?) challenges your own views. However, I don't think I endorse/buy the steelmanned argument as you put it here. It seems to me like the kind of simplified evaluation/description that is not well-suited for assessing strategies to tackle societal problems. More specifically, I think the simple relation you outline wrongly ignores second- and third-order effects (incl. the adverse effects outlined in the post), which I believe are both extremely important and hard to simplify/formalize.
In a similar vein, I worry about the simplification in your comment on assessing the tractability of denuclearization efforts. I don't think it's appropriate to assess the impact of prior denuclearization efforts based simply on observed correlations, for two main reasons. First, there are numerous relevant factors aside from civil society's denuclearization efforts, and the evidence we have access to consists of a fairly small sample of observations that are not independent of each other, which makes identifying causal impact reliably challenging if not impossible. Second, this is likely a "threshold phenomenon" (not sure what the official term would be), where observable cause-and-effect relations are not linear but occur in jumps; in other words, it seems likely that civil society activism needs to build up to a certain level before it produces clearly visible denuclearization effects (and the level of mobilisation required at any given time depends on other circumstances, such as geopolitical and economic events). I don't think civil society activism for denuclearization is meaningless as long as it remains below that level, both because it potentially has beneficial side-effects (on norms, culture, nuclear doctrine, decision-makers' inhibitions against nuclear use, etc) and because we will never get above the threshold if we consider sub-threshold efforts pointless and not worth pursuing; but I do think that its visible effects as revealed by the evidence may well appear meaningless, simply because of this non-linear causal relationship.
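A minimal simulation sketch (in Python, with numbers invented purely for illustration) of what I mean: if the true relationship has a threshold and the historical record sits mostly below it, a naive correlation will read as roughly zero even though activism above the threshold would have a large effect:

```python
import random

random.seed(1)

# Hypothetical threshold model: denuclearization only responds once
# activism intensity exceeds a critical level; below it, outcomes are
# dominated by noise from geopolitics, economics, etc.
THRESHOLD = 0.8

def denuclearization(activism):
    effect = 5.0 if activism > THRESHOLD else 0.0
    return effect + random.gauss(0, 1)  # other, noisy influences

# Analogue of the historical record: activism mostly below threshold.
xs = [random.uniform(0, 0.7) for _ in range(40)]
ys = [denuclearization(x) for x in xs]

# Naive linear correlation over the observed (sub-threshold) range:
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
sx = (sum((x - mx) ** 2 for x in xs) / n) ** 0.5
sy = (sum((y - my) ** 2 for y in ys) / n) ** 0.5
print(f"correlation below threshold: {cov / (sx * sy):.2f}")  # near zero
print(f"effect just above threshold: {denuclearization(0.9):.1f}")  # large jump
```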
squeezing the last 10% is an extremely hard uphill battle if not impossible as countries continue to look out for their interests
I completely agree that denuclearization is an extremely hard uphill battle, and I would argue that this is true even before the last 10% is reached. But I don't think we have the evidence to say that it's an impossible battle, and since I'm not convinced by the alternatives on offer (interventions "to the right of boom", or simple nihilism/giving up/focusing on other things), I think it's worthwhile (vital, actually) to keep fighting that uphill battle.
Some further side-notes
But note that at least half of the nuclear weapons deployed are in the hands of authoritarian countries [Russia: 3859, China: 410, North Korea: 30], which do not have a good track record of listening to civil society. While you could argue that Russia drastically reduced its stockpile at the end of the Cold War, many non-aligned countries [non-NATO, non-Russia bloc] have only increased their stockpiles in absolute terms.
A short comment on the point about authoritarian states: at least for Russia and China, I think civil society/public opinion is far from unimportant (dictators tend to care about public approval at least to some extent), but I agree that the situation differs from liberal democracies, which means that assessing the potential for denuclearization advocacy would require separate considerations for the different settings. On a different note, I think there is at least some case for claiming that changing attitudes/posture in the US has some influence on shifting attitudes and debate in other nuclear-weapons states, especially over a longer timeframe (just as examples: there are arguments that Putin's current bellicosity is partially informed by continued US hostility and militarism, esp. during the 2000s; and China justifies its arms build-up mainly by arguing that the huge gap between its arsenal and that of the US is unacceptable). All of this would require a much larger discussion (which would probably lead to no clear conclusion, because uncertainty is immense), so I wouldn't be surprised, nor would I blame you, if the snippets of an argument presented above don't change your mind on this particular point ^^
(And a side-note to the side-note: I think it's worth pointing out that the biggest reduction in Russia's stockpiles occurred before the end of the Cold War, when the Soviet Union still seriously considered itself a superpower.)
Your argument reminds me of a perspective in animal welfare. If we improve the current conditions of the billions of suffering animals, we have more of an excuse to slaughter them, in turn empowering the meat companies, and thus impeding our transition towards a cruelty-free world.
The analogy makes sense to me, since both some of my claims and the animal advocates' claim you mention seem to fall into the moral hazards category. Without having looked closely at the animal case, I don't think I strongly share their concern there (or at least, I probably wouldn't make the tradeoff of giving up on interventions to reduce suffering).
Again, thanks a lot for your comment and thoughts! Looking forward to hearing if you have any further thoughts on the answers given above.
"Our descendants are unlikely to have values that are both different from ours in a very significant way and predictable. Either they have values similar to ours or they have values we can't predict. Therefore, trying to predict their values is a waste of time and resources."
I’m strongly drawn to that response. I remain so after reading this initial post, but am glad that you, by writing this sequence, are offering the opportunity for someone like me to engage with the arguments/ideas a bit more! Looking forward to upcoming installments!
We interviewed 15 China-focused researchers on how to do good research
Thanks for your comment and for adding to Aron’s response to my post!
Before reacting point-by-point, one overarching warning/clarification/observation: my views on the disvalue of numerical reasoning and the use of BOTECs in deeply uncertain situations are quite unusual within the EA community (though not unheard of; see for instance this EA Forum post on "Potential downsides of using explicit probabilities" and this GiveWell blog post on "Why we can't take expected value estimates literally (even when they're unbiased)", which acknowledge some of the concerns that motivate my skeptical stance). I can imagine that this is a major crux between us and that it makes convergence on more concrete questions (esp. through a forum comments discussion) rather difficult. This is not at all meant to discourage engagement or to suggest I find your comments unhelpful (quite the contrary); I note it only in an attempt to avoid us arguing past each other.
On moral hazards:
In general, my deep-seated worries about moral hazard and other normative adverse effects feel somewhat inaccessible to numerical/empirical reasoning (at least until we come up with much better empirical research strategies for studying complex situations). To be completely honest, I can't really imagine arguments or evidence that would substantially dissolve the worries I have. That is not because I'm consciously dogmatic and unwilling to budge from my conclusions, but because I don't think we have the means to know empirically to what extent these adverse effects actually occur. It thus seems that we are forced to rely on fundamental worldview-level beliefs (or intuitions) when deciding how much weight to give them. This is a very frustrating situation, but I just don't find attempts to escape it (through relatively arbitrary BOTECs or plausibility arguments) convincing; they usually strike me as elaborate cognitive schemes for dispelling a level of deep empirical uncertainty that simply cannot be dispelled (given the structure of the world and the research methods we know of).
To illustrate my thinking, here’s my response to your example:
I don’t think that we really know anything about the moral hazard effects that interventions to prepare for nuclear winter would have had on nuclear policy and outcomes in the Cold War era.
I don’t think we have a sufficiently strong reason to assign the 20% reduction in nuclear weapons to the difference in perceived costs of nuclear escalation after research on nuclear winter surfaced.
I don’t think we have any defensible basis for making a guess about how this reduction in weapons stocks would have been different had there been efforts to prepare for nuclear winter in the 1980s.
I don’t think it is legitimate to simply claim that fear of nuclear-winter-type events has no plausible effect on decision-making in crisis situations (either consciously or sub-consciously, through normative effects such as those of the nuclear taboo). At the same time, I don’t think we have a defensible basis for guessing the expected strength of this effect of fear (or “taking expected costs seriously”) on decision-making, nor for expected changes in the level of fear given interventions to prepare for the worst case.
In short, I don’t think it is anywhere close to feasible or useful to attempt to calculate “the moral hazard term of loss in net effectiveness of the [nuclear winter preparation] interventions”.
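To illustrate the shape of my skepticism: a minimal Monte Carlo sketch (in Python, with all input ranges invented purely for illustration) of how a moral-hazard term built from deeply uncertain inputs spans orders of magnitude, so the output mostly reflects the arbitrary choice of input ranges rather than knowledge about the world:

```python
import random

random.seed(0)

# Hypothetical BOTEC: moral_hazard_loss = P(preparation weakens deterrence)
#                      * escalation_increase * expected_harm_per_escalation.
# Each input is only "known" to within wide bounds (invented for illustration),
# sampled log-uniformly across its range.
def sample_loss():
    p_weakening = 10 ** random.uniform(-3, -0.3)        # 0.001 .. 0.5
    escalation_increase = 10 ** random.uniform(-3, -1)  # 0.1% .. 10%
    harm = 10 ** random.uniform(6, 9)                   # arbitrary harm units
    return p_weakening * escalation_increase * harm

samples = sorted(sample_loss() for _ in range(100_000))
p5, p50, p95 = (samples[int(len(samples) * q)] for q in (0.05, 0.5, 0.95))
print(f"5th pct: {p5:,.0f}   median: {p50:,.0f}   95th pct: {p95:,.0f}")
print(f"95th/5th ratio: {p95 / p5:,.0f}x")
```

With inputs this uncertain, the 90% interval of the "moral hazard term" spans several orders of magnitude, which is why I don't think the point estimate carries real information.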
On the cost-benefit analysis and tractability of food resilience interventions:
As a general reaction, I’m quite wary of cost-effectiveness analyses for interventions into complex systems. That is because such analyses require that we identify all relevant consequences (and assign value and probability estimates to each), which I believe is extremely hard once you take indirect/second-order effects seriously. (In addition, I’m worried that cost-effectiveness analyses distract analysts and readers from the difficult task of mapping out consequences comprehensively, instead focusing their attention on the quantification of a narrow set of direct consequences.)
That said, I think there sometimes is informational value in cost-effectiveness analyses in such situations, if their results are very stark and robust to changes in the numbers used. I think the article you link is an example of such a case, and accept this as an argument in favor of food resilience interventions.
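As a sketch of what "robust to changes in the numbers" means to me in practice (in Python, with placeholder numbers that are not taken from the linked article): vary each input across a pessimistic-to-optimistic range and check whether the qualitative conclusion ever flips:

```python
from itertools import product

# Placeholder model (numbers invented): lives saved per dollar for a
# resilience intervention, compared against a benchmark alternative.
cost_per_unit = [0.5, 2.0]       # $ per unit, optimistic vs pessimistic
lives_per_unit = [0.001, 0.05]   # lives saved per unit, pessimistic vs optimistic
benchmark = 0.0004               # lives saved per $ for the alternative

verdicts = []
for cost, lives in product(cost_per_unit, lives_per_unit):
    lives_per_dollar = lives / cost
    verdicts.append(lives_per_dollar > benchmark)
    print(f"cost={cost}, lives/unit={lives}: {lives_per_dollar:.4f} lives/$")

if all(verdicts) or not any(verdicts):
    print("conclusion holds across the whole input range -> informative")
else:
    print("conclusion flips with the inputs -> little informational value")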
I also accept your case for the tractability of food resilience interventions (in the US) as sound.
As far as the core argument in my post is concerned, my concern is that your response ignores the majority of post-nuclear-war conditions. That is, if we have sound reasons to think that we can cost-effectively/tractably prepare for post-nuclear-war food shortages, but no good reasons to think that we know how to cost-effectively/tractably prepare for most of the other plausible consequences of nuclear deployment (many of which we may have failed to identify in the first place), then I would still argue that the tractability of preparing for a post-nuclear-war world is concerningly low. I would thus continue to maintain that preventing nuclear deployment should be the primary priority (in other words: your arguments in favor of preparation interventions don't address the challenge of preparing for the full range of possible consequences, which is why I still think avoiding the consequences ought to be the first priority).
I think I agree that the perspective I describe is far less relevant/valuable when 95% of actors already hold and act in accordance with it. In those cases, it is relatively harmless to ignore the collective actions required for the common good, because one can safely assume that the others (the 95%) will take care of those collective efforts by themselves. But when it comes to "the world's most pressing problems," I don't have the sense that we have those 95% of people to rely on for the collective action problems. And even if the situation were such that 95% of other people take care of collective efforts, thus leaving room for 5% to choose actions unconstrained by those responsibilities, it would remain useful and important to keep the collective rationality perspective in mind, to remember how much one relies on that large mass of people doing relatively mundane but societally essential tasks.
I strongly sympathise with the concern of EA (or anyone) being pulled away from a drive to take action informed by robust data! I think especially for fields like Global Health (where we do have robust data for several, though not all, important questions), my response would be to insist that data-driven attempts to find particularly good actions as measured by their relatively direct, individual, counterfactual impact can, to some extent, coexist with and be complemented by a collective rationality perspective.
The way I imagine decision-making based on both perspectives is something like this: an action can be worth taking because it has an exceptionally large expected counterfactual impact (e.g., donations to AMF); or it can be worth taking because a certain collective problem will not be solved unless many people take that kind of action, even though any one person taking it has an impact that is negligible or impossible to measure. Examples of the second kind: donations to an org that works to dismantle colonial-era stereotypes and negative self-images in a localised setting within a formerly colonised country [please take the worldview-dependent beliefs underlying this, such as that internalised racism is super harmful to development and flourishing, as a given for the sake of the example]; or, easier to do complementarily: working for a global health org and being very transparent and honest about the results of one's interventions, refraining from manipulative donation ads even if that approach is expected to decrease donations at least in the short run [again, I'd suggest putting aside the question of whether the approach would in fact decrease donation volumes overall].
I don't have a good answer for how to decide between these two buckets of action, especially when faced with a choice between two actions that must be traded off against one another (say, donations to two different orgs); my own current approach is to diversify somewhat arbitrarily, without a very clear distribution rule. But I would still argue that considering actions from both buckets as potentially worthwhile is the right way to go here. Curious to hear if that sparks any thoughts in response (and whether you think it makes basic sense in the first place)!