First off, I think this is a really useful post that’s moved the discussion forward productively, and I agree with most of it.
I disagree with some of the current steering – but a necessary condition for changing direction is that people talk/care/focus more on steering, so I’m going to make the case for that first.
I agree with the basic claim that steering is relatively neglected and that we should do more of it, so I’m much more curious about what current steering you disagree with/think we should do differently.
My view is closer to: most steering interventions are obvious, but they’ve ended up being most people’s second priority, and we should mostly just do much more of various things that are currently only occasionally done, or have been proposed but not carried out.
Most of the specific things you’ve suggested in this post I agree with. But you didn’t mention any specific current steering you thought was mistaken.
The way I naturally think of steering is in terms of making more sophisticated decisions: EA should be better at dealing with moral and empirical uncertainty in a rigorous and principled way. Here are some things that come to mind:
Talking more about moral uncertainty: I’d like to see more discussion of something like Ajeya’s concrete, explicit worldview diversification framework, where you make sure you don’t go all-in and take actions that one worldview you’re considering would label catastrophic, even if you’re really confident in your preferred worldview – e.g. strong longtermism vs neartermism. I think taking this framework seriously would address a lot of the concerns people have with strong longtermism. From this perspective it’s natural to say that there’s a longtermist case for extinction risk mitigation based on total utilitarian potential and also a neartermist one based on a basket of moral views, and then we can say there are clear and obvious interventions we can all get behind on either basis, along with more speculative interventions that depend on your confidence in longtermism. Also, if we use a moral/‘worldview’ uncertainty framework, the justification for doing more research into how to prioritise different worldviews is easier to understand.
Better risk analysis: On the empirical uncertainty side, I very much agree with the specific criticism that longtermists should use more sophisticated risk/fault analysis methods when doing strategy work and forecasting (which was one of the improvements suggested in Carla’s paper). This is a good place to start on that. I think considering the potential backfire risks of particular interventions, along with how different x-risks and risk factors interact, is a big part of this.
Soliciting external discussions and red-teaming: these seem like exactly the sorts of interventions that would throw up ways of better dealing with moral and empirical uncertainty, point out blind spots, etc.
The part that makes me think we’re maybe thinking of different things is the focus on democratic feedback.
Again, I wish to recognise that many community leaders strongly support steering – e.g., by promoting ideas like ‘moral uncertainty’ and ‘the long reflection’ or via specific community-building activities. So, my argument here is not that steering currently doesn’t occur; rather, it doesn’t occur enough and should occur in more transparent and democratic ways.
There are ways of reading this that make a lot of sense on the view of steering that I’m imagining here.
Under ‘more democratic feedback’: we might prefer to get elected governments and non-EA academics thinking about cause prioritisation and longtermism, without pushing our preferred interventions on them (because we expect this to help point out mistakes, identify better interventions, or surface things we’ve missed). I’ve also argued before that since common sense morality is a view we should care about, if we get to the point of recommending things that are massively at odds with CSM we should take that into account.
But if it’s something beyond all of these considerations – something like ‘it’s intrinsically better when you’re doing things that lots of people agree with’ (and I realise this is a very fine distinction in practice!) – then arguing for more democratic feedback unconditionally looks more like Anchoring/Equity than Steering.
I think this would probably be cleared up a lot if we understood what specifically is being proposed by ‘democratic feedback’ - maybe it is just all the things I’ve listed, and I’d have no objections whatsoever!