The EA community aims to make a positive difference using two very different approaches. One of them is much harder than the other.
As I see it, there are two main ways people in the EA community today aim to make a positive difference in the world: (1) identifying existing, high-performing altruistic programs and providing additional resources to support them; and (2) designing and executing new altruistic programs. I think people use both approaches—though in varying proportions—in each of the major cause areas that people inspired by EA ideas tend to focus on.
In this post, I’ll call approach (1) the evaluation-and-support approach, and I’ll call approach (2) the design-and-execution approach.
I consider GiveWell’s work to be the best example of the evaluation-and-support approach.[1] Most policy advocacy efforts, technical engineering efforts, and community-building projects are examples of the design-and-execution approach.[2]
Both of these approaches are difficult to do well, but I think design-and-execution is much more difficult than evaluation-and-support. (In fact, recognizing and taking seriously how difficult and rare it is for a well-intentioned altruistic program to actually be designed and executed effectively is one of the central insights that I find distinctive and valuable about EA’s evaluation-and-support approach.)
I also think design-and-execution—with its long feedback loops and scarcity of up-front empirical evidence—carries a much higher risk of accidentally causing harm than evaluation-and-support, and so depends much more heavily on effective risk-management and error-correction processes to have a positive impact on the world.[3] I think the riskiness of design-and-execution approaches makes it unclear whether it’s virtuous to be especially ambitious when pursuing these approaches, since ambitious efforts can involve pushing the limits of one’s ability to perform competently and avoid unintended consequences.
Finally, I think the two approaches require very different sets of skills. My guess is that there are many more people in the EA community today (which skews young and quantitatively inclined) whose skills are a good fit for evaluation-and-support than there are people whose skills are an equally good fit for design-and-execution. I worry that this skills gap might increase the risk that people in the EA community will accidentally cause harm while attempting the design-and-execution approach.
Cause prioritization looks like evaluation-and-support, but when there are no existing, tested interventions in a priority area, we’re still stuck attempting design-and-execution.
I think the causes people in the EA community tend to prioritize—managing risks from emerging technologies, alleviating the burdens of extreme poverty, reducing animal suffering, spreading good moral values—are unusually important and neglected. There are evidence-based reasons to think each of these causes is a big deal, including empirical data about the number of individuals affected and historical examples of cases in which people have made a positive difference by working on similar topics. In that sense, the EA community’s cause prioritization efforts look like a successful application of the evaluation-and-support approach, one that seems to be a good fit for the skills of many people in the EA community.
But even if we think the EA community has identified important and neglected cause areas, I don’t think it necessarily follows that taking action in those cause areas is among the best ways that people inspired by EA ideas can help others. When cause prioritization points toward an area in which there aren’t many (or any) existing programs with strong track records of success, people in the EA community are stuck attempting to set up new programs using the design-and-execution approach, for which their skills might not be such a good fit. And I think it’s possible that many new programs in a promising cause area would have no effect or even cause significant harm in expectation, which could mean that “trying the best things we can think of” might be a poor strategy.
Design-and-execution interventions have the potential to cause significant harm.
For many design-and-execution-style interventions inspired by EA ideas, I think it’s ambiguous whether the expected effect of the intervention is net-beneficial or net-harmful. I think there are serious risks associated with working on each of the topics that people in the EA community tend to prioritize, and the extent to which the benefits of any given project outweigh the risks of causing harm seems uncertain and fact-sensitive.
For example:
Technical AI alignment efforts might unintentionally contribute to deceptive alignment of powerful systems that is impossible to detect using existing methods. Or they might unintentionally help bring about a situation in which private companies rapidly build and align powerful systems to their own private objectives, massively increasing their power at the expense of governments and other democratic institutions.
AI governance efforts might unintentionally exacerbate race dynamics or contribute to harmful differential technological progress.
Advocacy for better biosecurity programs might unintentionally help popularize the idea that modern synthetic biology could be used to make powerful weapons.
Discouraging certain kinds of virology research might unintentionally decrease society’s preparedness to respond to future pandemic pathogens.
Efforts to accelerate the careers of people interested in EA ideas might unintentionally empower people to reach important, influential positions for which they are not well-prepared, or they might unintentionally empower people who turn out to have poor judgment and harm others as a result.
I think many people in the EA community are aware of these risks and try to mitigate them, but my impression is that many people aren’t thinking seriously enough about the possibility that these risks might systematically outweigh the benefits of new interventions that seem net-beneficial in principle.
I think it’s particularly important to err on the side of caution when evaluating the risk of what I’ve called design-and-execution-style interventions, since we don’t have the benefit of evidence about the proposed intervention’s historical performance to help us spot potential problems.
Finally, because these downside risks could outweigh the benefits of any design-and-execution program, I think accurately assessing and managing risks of new programs and correcting errors along the way are some of the most important aptitudes for any person, institution, or community aiming to make a significant positive difference in the world through design-and-execution-style interventions. Because I think the EA community doesn’t have especially strong risk-management or error-correction capabilities today, I would like to see the EA community significantly strengthen its capabilities in these areas.
—
(I’m considering turning these ideas into a top-level post. I’d welcome your feedback on whether it might make sense to do that, along with anything you’d recommend revising before re-posting.)
[1] I don’t think this approach is limited to organizations that primarily direct cash donations; I could also imagine an organization in this category that recommends that people take jobs to support specific projects based on rigorous impact evaluations of those projects. (I think 80,000 Hours occasionally advertises jobs in this category, but I think in practice most 80,000 Hours recommendations are directed toward building career capital or supporting projects that I think fit within the design-and-execution approach.)
[2] Two marginal notes on how I think about cases that might fall somewhere in between the two approaches:
(A) For very high levels of “support” relative to existing program size, I think the evaluation-and-support approach starts to blur into design-and-execution. For example, deciding to join a two-person research nonprofit as its third full-time employee might significantly change the scope of the nonprofit’s work in ways that make it more like design-and-execution. Deciding to donate enough money to double a global health nonprofit’s operating budget might raise similar considerations.
(B) I think most people who are “earning to give” without changing jobs (or by switching into a conventional, non-entrepreneurial-but-high-paying job like quantitative trading) are essentially following the evaluation-and-support approach, often by explicitly deferring to GiveWell or another charity evaluator about where to direct their donations. But I consider entrepreneurial approaches to earning to give to fit within the design-and-execution approach in that they depend for their success on the design and implementation of a new program—the for-profit business—in addition to the effective allocation of the proceeds to other existing programs.
[3] My view that risk management is particularly important when attempting design-and-execution approaches is one of the reasons I’ve advocated for more public conversations about actions people in the EA community can take to reduce the risk of unintended harm.
I liked this and would encourage you to publish it as a top-level post.
I think it has potential!
The paragraph on the skills gap (“Finally, I think the two approaches require very different sets of skills...”) is a critical component of the argument as presently stated. However, I don’t see much more than a mere assertion that (1) certain skills needed for design-and-execution (D&E) are generally missing and (2) the absence of those skills increases the risk of accidental harm. In a full post, I’d suggest explaining this more fully.
My own intuition is that a larger driver of increased harm in D&E models (vs. evaluation-and-support, E&S) may be inherent to working in a novel and neglected subject area like AI safety. In an E&S model, the startup efforts incubated independently of EA are more likely to be pretty small-scale. Even if a number of them end up being net-harmful, the risk is limited by how small they are. But in a D&E model, EA resources may be poured into an organization earlier in its life cycle, increasing the risk of significant harm if it turns out the organization was ultimately not well-conceived.
As far as mitigations go, I think a presumption toward “start small, go slow” in an underdeveloped cause area for which a heavily D&E-focused approach is necessary might be appropriate in many cases, for the reason described in the paragraph above. E.g., in some cases, the objective should be to develop the ecosystem in that cause area so that heavier work can begin in 7-10 years, vs. pouring in a ton of resources early and trying to get results ASAP. I think I’d like to see more ideas like that in a full post, as the suggestion to develop better “risk-management or error-correction capabilities” (while correct in my view) is also rather abstract.