The fact that all actions are always the result of many preceding actions, where (often) every single preceding action contributes only a small part to shaping the resulting action, makes it conceivable that actions with low direct counterfactual impact can still be quite important and justifiable when considered from a perspective of behavioural heuristics or collective rationality (both of which recognise that some hugely important outcomes can only ever be attained if many people decide to take actions that have low expected impact on their own).
I can’t exactly put my finger on why I think this, but I suspect that EA impact analyses missing this sort of potential impact is—as a practical matter—relatively less important where only a small proportion of activity/funding in a cause area is EA-aligned than where a significant proportion is so aligned. If 95%+ of the funding/actors in a cause area are fairly attuned to theories of change like that described above, then it seems less likely that there are many stellar opportunities for that kind of impact that the remaining 5% are leaving on the table.
In those circumstances, largely discounting this kind of impact may make sense in many cases. And from a global health perspective, trying to make decisions based on estimates of this kind of impact would pull EA global health away from its data-driven roots.
I think I agree that the perspective I describe is far less relevant/valuable when 95% of actors hold and act in accordance with that perspective already. In those cases, it is relatively harmless to ignore the collective actions that would be required for the common good because one can safely assume that the others (the 95%) will take care of those collective efforts by themselves. But when it comes to “the world’s most pressing problems,” I don’t have the sense that we have those 95% of people to rely on to deal with the collective action problems. And I think that, even if the situation is such that 95% of other people take care of collective efforts, thus leaving room for 5% to choose actions unconstrained by responsibilities for those collective action needs, it remains useful and important to keep the collective rationality perspective in mind, to remember how much one relies on that large mass of people doing relatively mundane, but societally essential, tasks.
I strongly sympathise with the concern about EA (or anyone) being pulled away from a drive to take action informed by robust data! Especially for fields like Global Health (where we do have robust data for several, though not all, important questions), my response would be to insist that data-driven attempts to find particularly good actions as measured by their relatively direct, individual, counterfactual impact can, to some extent, coexist with and be complemented by a collective rationality perspective.
The way I imagine decision-making based on both perspectives is something like this: an action can be worth taking either because it has an exceptionally large expected counterfactual impact (e.g., donations to AMF); or because a certain collective problem will not be solved unless many people take that kind of action, where any one person taking the action has an impact that is negligible or impossible to measure at all. An example of the latter would be donations to an org that somehow works to dismantle colonial-era stereotypes and negative self-images in a localised setting within a formerly colonised country (please take the worldview-dependent beliefs underlying this, such as that internalised racism is super harmful to development and flourishing, as a given for the sake of the example). An example that is easier to pursue complementarily: working for a global health org and being very transparent and honest about the results of one’s interventions, and refraining from manipulative donation ads, even if that approach is expected to decrease donations at least in the short run (again, I’d suggest putting aside the question of whether the approach would in fact decrease donation volumes overall).
I don’t have a good answer for how to decide between these two buckets of action, especially when faced with a decision between two actions that need to be traded off against one another, such as donations to two different orgs (my own current approach is to diversify somewhat arbitrarily, without a very clear distribution rule). But I would still argue that considering actions from both buckets as potentially worthwhile is the right way to go here. Curious to hear whether that sparks any thoughts in response (and whether you think it makes basic sense in the first place)!
But when it comes to “the world’s most pressing problems,” I don’t have the sense that we have those 95% of people to rely on to deal with the collective action problems.
I had global health in mind—the vast majority of the funding and people on board are not EAs or conducting EA-type analyses (although many are at least considering cost-effectiveness).
Even in global health, I can see some circumstances in which EA’s actions could be important for collective-action purposes. Tom Drake at CGD shared some insightful commentary about funding of global-health work last year that would require widespread cooperation from funders to execute. I noted my view in a response: that even if we agreed with the plan, there is no clear action for EA at this time, because the big fish (national governments and Gates) would need to tentatively commit to doing the same if a supermajority of funders made similar commitments.
The rest of this is going to be fairly abstract because we’re looking at the 100,000-foot level.
If I understand the collective-action issue you raise correctly, it reminds me a little of the “For want of a nail” proverb, in which the loss of a single nail results in the loss of an entire kingdom. It’s a good theory of change in certain circumstances.
The modern version might read: For want of a competently designed ballot, the US invaded Iraq. Or a combination of very small vote-changing efforts (like a few local efforts to drive voters to the polls) could have changed history. It’s at least possible to estimate the expected impact of switching 100 votes in a swing state in a given election given what other actors are expected to do. It’s not easy, and the error bars are considerable, but it can be done.
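The kind of estimate gestured at above can be sketched in a few lines. This is only an illustrative toy model, not anything from the thread: the normal-distribution assumption for the final vote margin, and every number below (expected margin, spread, the value gap between outcomes), are made up for demonstration.

```python
from math import erf, sqrt

def prob_margin_within(shift, margin_mean, margin_sd):
    """P(final vote margin lands within `shift` votes of a tie),
    i.e. the probability that shifting `shift` votes could flip
    the outcome, under a rough normal model of the margin."""
    def cdf(x):  # standard normal CDF
        return 0.5 * (1 + erf(x / sqrt(2)))
    return cdf((shift - margin_mean) / margin_sd) - cdf((-shift - margin_mean) / margin_sd)

# Hypothetical inputs: expected margin of 5,000 votes with a standard
# deviation of 20,000, and an assumed value gap between the two outcomes.
p_decisive = prob_margin_within(100, margin_mean=5_000, margin_sd=20_000)
value_difference = 10**9  # purely illustrative, arbitrary units
expected_impact = p_decisive * value_difference
```

The point of the sketch is not the numbers but the structure: the expected impact is a small probability of being decisive multiplied by a large value difference, and both factors carry the considerable error bars acknowledged above.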
Admittedly, that example has short feedback loops compared to many other problems. Although determining which collective-action problems are worth committing time and energy to is difficult, I think there are some guideposts. First, is there a coherent and plausible theory of change behind the proposed efforts? For many different forms of activism, I sense that the efforts of many actors in the space are much too affected by virtue signaling, what feels good, and maintaining ideological purity. Under those circumstances, I am pretty skeptical about securing collective-action wins through my involvement. The other actors in the space need to be working in a manner consistent with a coherent and plausible theory of change for me to give much weight to possible synergistic, tipping-point, etc. effects.
Second, if the theory of change is contingent on reaching a tipping point, does the evidence suggest we are close to a tipping point? Suppose you’re absolutely convinced (random example) that legalizing opioids would save tens of thousands of lives in the US alone, every year. If there is already significant public opinion in support of your position, devoting resources could be fairly effective under a collective-action theory. If public support is 1% and has been for several years, the marginal impact of your additional support isn’t likely to change anything. It’s true that great social-reform movements start with a small passionate minority . . . but if your theory of change will almost certainly take several generations to have a chance at results, the odds of another solution being found (e.g., a vaccine) or of society having changed in a way that makes the issue less important/moot are rather high.
donations to an org that somehow works to dismantle colonial-era stereotypes and negative self-images in a localised setting within a formerly colonised country
I don’t see anything about this hypothetical intervention that renders it incapable of empirical analysis. If one can determine how effective the organization is at dismantling stereotypes and self-images per dollar spent, then a donor can adjudicate the tradeoff between donating to it and donating to AMF based on how super harmful they think internalized racism is vs. how bad they think toddlers dying of malaria is.
[Sarah] But when it comes to “the world’s most pressing problems,” I don’t have the sense that we have those 95% of people to rely on to deal with the collective action problems.
[Jason] I had global health in mind—the vast majority of the funding and people on board are not EAs or conducting EA-type analyses (although many are at least considering cost-effectiveness).
Quick point on this: I didn’t mean to suggest that EAs constitute vastly more than 5% of people working on pressing problems. Completely agree that “the vast majority of the funding and people on board [in global health] are not EAs or conducting EA-type analyses”, but I still think that relatively few of those people (EA or not) approach problems with a collective rationality mindset, which would mean asking themselves: “how do I need to act if I want to be part of the collectively most rational solution?” rather than: “how do I need to act if I want to maximise the (counterfactual) impact from my next action?” or, as many non-EA people in global health perhaps do: “how should I act given my intuitive motivations and the (funding) opportunities available to me?”. I think—based on anecdotal evidence and observation—that the first of these questions is not asked enough, inside EA and outside of it.
I can see some circumstances in which EA’s actions could be important for collective-action purposes. [...] It’s at least possible to estimate the expected impact of switching 100 votes in a swing state in a given election given what other actors are expected to do. It’s not easy, and the error bars are considerable, but it can be done.
I think it’s correct that some collective action problems can be addressed by individuals or small groups deciding to take action based on their counterfactual impact (and I thank you for the paper and proverb references; I found it helpful to read these related ideas expressed in different terms!). In practice, I think (and you seem to acknowledge) that estimating that counterfactual impact for interventions aimed at disrupting collective action problems (by convincing lots of other people to behave collectively rationally) is extremely hard, and I thus doubt whether counterfactual impact calculations are the best (most practicable) tool for deciding whether and when to take such actions (I think the rather unintuitive analysis by 80,000 Hours on voting demonstrates the impracticability of these considerations for everyday decisions relatively well). But I can see how this might sometimes be a tenable and useful way to go. I find your reflections on how to do this interesting (checking for a plausible theory of change; checking for closeness to a required tipping point); my quick response (because this comment is already awfully long) would be that they seem useful but limited heuristics (what exactly makes a theory of change in deeply uncertain and empirically-poor domains “plausible”?; and for the tipping point, you mentioned my counterpoint already: if everybody always waited until a tipping point was fairly within reach, many large social changes would never have happened).
But the approach that I gesture at when I talk about “we should often act guided by principles of collective rationality” is different from guesstimating the counterfactual impact of an action that tries to break the collective action dilemma. I think what the collective rationality approach (in my mind) comes down to is an acceptance that sometimes we should take an action that has a low counterfactual impact, because our community (local, national, all of humanity) depends on many people taking such actions to add up to a huge impact. The very point of the collective action problem is that, counterfactually considered, my impact from taking that action will be low, because one individual taking or not taking the action is usually either completely or largely pointless. An example of that would be “making an effort to engage in dinner-table conversations on societally-important issues.” If (and I acknowledge this may be a controversial if) we believe that a vibrant and functioning democracy would be one where most citizens have such conversations every once in a while, then it would be collectively rational for me to engage in these conversations. But this will only really become an impactful and useful action (for our country’s democracy, ignoring benefits to myself) if many other citizens in my country do the same thing. And if many other citizens in my country do do the same thing, then paradoxically it doesn’t really matter that much anymore whether I do it; because it’s the mass doing it that counts, and any one individual that is added or subtracted from that mass has little effect. I think such dynamics can be captured by counterfactual impact reasoning only relatively unintuitively and in ways that are often empirically intractable in practice.
I don’t see anything about this hypothetical intervention that renders it incapable of empirical analysis. If one can determine how effective the organization is at dismantling stereotypes and self-images per dollar spent, then a donor can adjudicate the tradeoff between donating to it and donating to AMF based on how super harmful they think internalized racism is vs. how bad they think toddlers dying of malaria is.
Weakly agree that there can be some empirical analysis to estimate part of the effectiveness of the hypothetical stereotypes-intervention (though I do want to note that such estimates run a large risk of missing important, longer-running effects that only surface after long-term engagement and/or are not super easy to observe at all). I think the main point I was trying to make here is that the empirical question of “how bad internalized racism is”, i.e. how much it decreases development and flourishing, is one that seems hard if not impossible to address via quantitative empirical analysis. I could imagine your response being that we can run some correlational studies on communities or individuals with less vs. more internalized racism and then go from there; I don’t think this will give us meaningful causal knowledge given the many hidden variables that will differ between the groups we analyze and given the long-running effects we seek to find.
my quick response (because this comment is already awfully long) would be that they seem useful but limited heuristics (what exactly makes a theory of change in deeply uncertain and empirically-poor domains “plausible”? [. . . .]
I think that’s right. But if I understand correctly, a collective rationality approach would commend thousands of actions to us, more than we can do even if we went 100% with that approach. So there seemingly has to be some way to triage candidate actions.
More broadly, I worry a lot about what might fill the vacuum if we significantly move away from the current guardrails created by cost-effectiveness analysis (at least in neartermism). I think it is awfully easy for factors like strength of emotional attachment to an issue, social prestige, ease of getting funding, and so forth to infect charitable efforts. Ideally, our theories about impact should be testable, such that we can tell when we misjudged an initiative as too promising and redirect our energies elsewhere. I worry that many initiatives suggested by a collective rationality approach are not “falsifiable” in that way; the converse is that it could also be hard to tell if we were underinvesting in them. So, at EA’s current size/influence level, I may be willing to give up on the potential for working toward certain types of impact because I think maintaining the benefits of the guardrails is more important.
Incidentally, one drawback of longtermist cause areas in general for me is the paucity of feedback loops, often hazy theories of change, and so on. The sought-after ends for longtermism are so important (e.g., the continuation of humanity, avoidance of billions of deaths from nuclear war) that one can reasonably choose to overlook many methodological issues. But—while remaining open to specific proposals—I worry that many collective-rationality-influenced approaches might carry many of the methodological downsides of current longtermist cause areas while often not delivering potential benefits at the same order of magnitude as AI safety or nuclear safety.
To the extent that we’re talking about EAs not doing things that are commonly done (like taking the time to cast an intelligent vote), I am admittedly uneasy about suggesting EAs not “do their part” and free-ride off of everyone else’s community-sustaining efforts. At the same time, I wouldn’t have begrudged Anthony Fauci for not voting during the recent public health emergency!
Most collective-action results do allow for some degree of free-riding; measles vaccination, for instance, works at roughly 95% coverage, so we can exempt those with relative medical contraindications and (in some places/cases) sincere religious objections and still get the benefits. Declaring oneself worthy of one of the free-riding slots can be problematic, though! I think I’d need to consider this in more specific contexts, rather than at the 100K-foot view, to refine my approach beyond simply recognizing the tension.
In practice, we might not be that far apart in approach to some things, although we may get there by somewhat different means. I posit that living life in “EA mode” 24/7 is not feasible—at least not for long—and will result in various maladies that are inimical to impact even on the traditional EA model. So for activities like “doing one’s part as a citizen of one’s country,” there may be fewer practical differences and trade-off decisions here than one might think at the 100K-foot level.
I think the main point I was trying to make here is that the empirical question of “how bad internalized racism is”, i.e. how much it decreases development and flourishing, is one that seems hard if not impossible to address via quantitative empirical analysis.
I’m not actually troubled by that; this may be because I am less utilitarian than the average EA. Without suggesting that all possible ~penultimate or ultimate goals are equally valid, I think “how desirable would X ~penultimate goal be” is significantly less amenable to quantitative empirical analysis than “can we / how can we effectively reach X goal.” But I could be in the minority on that point.