A “Red Team” to rigorously explore possible futures and advocate against interventions that threaten to backfire
Research That Can Help Us Improve, Effective Altruism, Epistemic Institutions, Values and Reflective Processes
Motivation. There are a lot of proposals here. There are additional proposals on the Future Fund website, and there are more on various lists I have collected. Many EA charities are already implementing ambitious interventions. But really we’re quite clueless about what the future will bring.
This week alone I’ve discussed with friends and acquaintances three decisions, in completely different contexts, that might make the difference between paradise and hell for all sentient life: not just in the abstract sense in which cluelessness forces us to assign some probability to almost any outcome, but in the sense where we could point to concrete mechanisms along which the failure might occur. Yet we had to decide. I imagine that people in more influential positions than mine have to make similar decisions almost daily, and with hardly any more information.
As a result, the robustness of an intervention has been my key criterion for prioritization for the past six years now. It’s something like the number and breadth of scenarios along which the intervention may have any effect, especially unintended effects, that trusted, impartial people have thought through, divided by the number of bad failure modes that they’ve found and haven’t been able to mitigate. I mostly don’t bother to think about probabilities for this exercise, though that would be even better.
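To make that heuristic concrete, here is a minimal sketch in Python; the function name and all the numbers are hypothetical, purely for illustration:

```python
# Hypothetical sketch of the robustness heuristic described above.
# scenarios_explored counts the distinct futures that trusted, impartial
# reviewers have thought through; unmitigated_failure_modes counts the
# bad failure modes they found and could not mitigate.

def robustness(scenarios_explored: int, unmitigated_failure_modes: int) -> float:
    """Higher is better: many explored scenarios, few open failure modes."""
    # Add 1 to the denominator so an intervention with no known open
    # failure modes gets a finite (large) score instead of dividing by zero.
    return scenarios_explored / (1 + unmitigated_failure_modes)

# Illustrative comparison of two made-up interventions:
print(robustness(scenarios_explored=40, unmitigated_failure_modes=0))  # 40.0
print(robustness(scenarios_explored=60, unmitigated_failure_modes=5))  # 10.0
```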
Tools. Organizations should continue to do their own red-teaming in-house, but that is probably always going to be less rigorous and systematic than what a dedicated red team could do.
I hear that Policy Horizons Canada (h/t Jacques Thibodeau), RAND (h/t Christian Tarsney), and others have experience in eliciting such scenarios systematically. There the goal is often to have a policy response ready for every eventuality. The focus may need to be different for effective altruism, and especially for any interventions to do with existential and suffering risks: we can probe our candidate interventions using scenario planning or ensemble simulations, as in the sketch below, and discard them if they seem too risky.
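As a toy illustration of such probing (the payoff distribution, the badness cutoff, and the risk threshold are all made-up assumptions, not anyone’s actual model), an ensemble simulation might sample many scenarios for a candidate intervention and flag it for discarding if too many samples end badly:

```python
import random

# Toy ensemble simulation: sample many hypothetical scenario outcomes for
# a candidate intervention and flag it for discarding if too large a share
# of them end badly. The payoff distribution, the badness cutoff of -3.0,
# and the 5% threshold are all made-up assumptions for illustration.

def simulate_outcome() -> float:
    """Draw one scenario outcome; strongly negative values mean backfiring."""
    return random.gauss(1.0, 2.0)  # hypothetical payoff distribution

def too_risky(n_scenarios: int = 100_000, max_bad_share: float = 0.05) -> bool:
    bad = sum(1 for _ in range(n_scenarios) if simulate_outcome() < -3.0)
    return bad / n_scenarios > max_bad_share

print("discard" if too_risky() else "keep for further red-teaming")
```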
I imagine that you could also set up a system akin to a prediction market platform but with stronger incentives to create new markets and to conditionally chain markets, including UI/UX that makes voting on conditional predictions frictionless, without much mental math.
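For instance (a minimal sketch; the market structure and the numbers are hypothetical), the platform itself could do the mental math of chaining conditional markets, multiplying a base market’s probability through a chain of conditional ones via the chain rule:

```python
from math import prod

# Hypothetical sketch: chain conditional prediction markets so that the
# platform, not the voter, does the mental math. The chain rule gives
# P(A and B and C) = P(A) * P(B | A) * P(C | A and B).

def implied_joint_probability(chain: list[float]) -> float:
    """chain[0] is P(A); chain[i] is P(next event | all previous events)."""
    return prod(chain)

# Made-up example markets: P(intervention is funded) = 0.6,
# P(deployed | funded) = 0.5, P(backfires | deployed) = 0.1.
print(implied_joint_probability([0.6, 0.5, 0.1]))  # 0.03
```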
Jacques Thibodeau is also thinking about using machine learning tools to aid with the elicitation of scenarios.
Organization. The final “Red Team” organization needs to have strong social buy-in from EA-branded organizations to offset the social awkwardness of being a perpetual critic. It’ll probably need to recruit the sort of people who thrive in the role of a devil’s advocate. But it’ll also need to hold itself to its own high standards and disband if it finds that its own mission is at risk of backfiring badly.
I had a similar idea, and I think a few more things need to be part of this discussion.
There are multiple levels of ideas in EA, and I think that a red team becomes much more valuable when it engages with issues that apply to the whole of EA.
I think ideas like the institutional critique of EA, the other heavy tail, and others are often not read and internalized by EAs. It is worth having a team that makes arguments like these, then breaks them down and provides methods for avoiding the pitfalls they point out.
Points raised in critiques of EA should be explicitly recognized and treated as valuable. These ideas should be held up for examination and then passed on to the community so that we can grow and overcome the objections.
I’m almost always lurking on the forum, and I don’t often see posts talking about EA critiques. That should change.
I basically agree, but in this proposal I was really referring to such things as: “Professor X is using probabilistic programming to model regularities in human moral preferences. How can that backfire and result in the destruction of our world? What other risks can we find? Can X mitigate them?”
I also think that the category you’re referring to is very valuable, but I think those are “simply” contributions to priorities research of the kind published by the Global Priorities Institute (e.g., working papers by Greaves and Tarsney come to mind). Rethink Priorities, Open Phil, FHI, and various individuals also occasionally publish articles that I would class that way. I think priorities research is one of the most important fields of EA and much broader than my proposal, but it is also well-known. Hence my proposal is not meant to be about that.