That said, FWIW, my independent impression is that “cluelessness” isn’t a useful concept and that the common ways the concept has been used either to counter neartermism or counter longtermism are misguided. (I write about this here and here.) So I guess that that’s probably consistent with your conclusion, though maybe by a different road. (I prefer to use the sort of analysis in Tarsney’s epistemic challenge paper, and I think that that pushes in favour of either longtermism or further research on longtermism vs neartermism, though I definitely acknowledge room for debate on that.)
I think Tarsney’s paper does not address or avoid cluelessness, or at least its spirit (i.e., the arbitrary weighting of different considerations), since:
You still need to find a specific intervention that you predict ex ante pushes you towards one attractor and away from another, and for which you have more reason to believe it does this than that it goes in the opposite direction (in expectation, say). If you have more reason to believe this only because of arbitrary weights, which could reasonably have been chosen so that the intervention backfires, this is not a good epistemic state to be in. For example, is the AI safety work we’re doing now backfiring? This could be due to, for example:
creating a false sense of security,
publishing the results of the GPT models, demonstrating AI capabilities and showing the world how much further we can already push it, and therefore accelerating AI development, or
slowing AI development more in countries that care more about safety than in those that don’t care much, risking a much worse AGI takeover if it matters who builds it first.
You still need to predict which of the attractors is ex ante ethically better, which again involves both arbitrary empirical weights and arbitrary ethical weights (moral uncertainty). You might find the choice to be sensitive to something arbitrary that could reasonably go either way. Is extinction actually bad, considering the possibility of s-risks?
Does some work on s-risks (e.g. AI safety, authoritarianism) reduce some extinction risks and so increase other s-risks, and how do we weigh those possibilities?
I worry that research on longtermism vs neartermism (like Tarsney’s paper) just ignores these problems, since you really need to deal with somewhat specific interventions, because of the different considerations involved. In my view, (strong) longtermism is only true if you actually identify an intervention that you can only reasonably believe does (much) more net good in the far future in expectation than short-term-focused alternatives do in the short term in expectation, or, roughly, that you can only reasonably believe does (much) more good than harm (in the far future) in expectation. This requires careful analysis of a specific intervention, and we may not have the right information now or ever to confirm that a particular intervention satisfies these conditions. For every longtermist intervention I’ve tried to find specific objections to, I’ve come up with objections that I think could reasonably push it into doing more harm than good in expectation.
Of course, what should “reasonable belief” mean? How do we decide which beliefs are reasonable and which ones aren’t (and the degree of reasonableness, if it’s a fuzzy concept)?
Basically, I agree that longtermist interventions could have these downside risks, but:
I think we should basically just factor that into their expected value (while using various best practices and avoiding naive approaches)
I do acknowledge that this is harder than that makes it sound, and that people often do a bad job. But...
I think that these same points also apply to neartermist interventions
Though with less uncertainty about at least the near-term effects, of course
Of course, what should “reasonable belief” mean? How do we decide which beliefs are reasonable and which ones aren’t (and the degree of reasonableness, if it’s a fuzzy concept)?
I think this gets at part of what comes to mind when I hear objections like this.
Another part is: I think we could say all of that with regard to literally any decision. We’d often be less uncertain, and it might be less reasonable to think the decision would be net negative or astronomically so, but I think it just comes in degrees, rather than applying strongly to some scenarios and not at all to others. One way to put this is that I think basically every decision meets the criteria for complex cluelessness (as I argued in the above-mentioned links: here and here).
But really I think that (partly for that reason) we should just ditch the term “complex cluelessness” entirely, and think in terms of things like credal resilience, downside risk, skeptical priors, model uncertainty, model combination and adjustment, the optimizer’s curse, best practice for forecasting, and expected values given all that.
Here I acknowledge that I’m making some epistemological, empirical, decision-theoretic, and/or moral claims/assumptions that I’m aware various people who’ve thought about related topics would contest (including yourself and maybe Greaves, both of whom have clearly “done your homework”). I’m also aware that I haven’t fully justified these stances here, but it seemed useful to gesture roughly at my conclusions and reasoning anyway.
I do think that these considerations mostly push against longtermism and in favour of neartermism. (Caveats include things like being very morally uncertain, such that e.g. reducing poverty or reducing factory farming could easily be bad, such that maybe the best thing is to maintain option value and maximise the chance of a long reflection. But this also reduces option value in some ways. And then one can counter that point, and so on.) But I think we should see this all as a bunch of competing quantitative factors, rather than as absolutes and binaries.
(Also, as noted elsewhere, I currently think longtermism—or further research on whether to be longtermist—comes out ahead of neartermism, all-things-considered, but I’m unsure on that.)
I don’t think it’s usually reasonable to choose only one expected value estimate, though, and this to me is the main consequence of cluelessness. Doing your best will still leave a great deal of ambiguity if you’re being honest about which beliefs you think would be reasonable to have, even when they aren’t your own fairly arbitrary best guess (often I don’t even have a best guess, precisely because of how arbitrary that seems). Sensitivity analysis seems important.
But really I think that (partly for that reason) we should just ditch the term “complex cluelessness” entirely, and think in terms of things like credal resilience, downside risk, skeptical priors, model uncertainty, model combination and adjustment, the optimizer’s curse, best practice for forecasting, and expected values given all that.
I would say complex cluelessness basically is just sensitivity of recommendations to model uncertainty. The problem is that it’s often too arbitrary to come to a single estimate by combining models. Two people with access to all of the same information and even the same ethical views (same fundamental moral uncertainty and methods for dealing with it) could still disagree about whether an intervention is good or bad, or which of two interventions is best, depending basically on whims (priors, arbitrary weightings).
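To make the sensitivity point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the two models, the expected values, the range of "reasonable" weights, and the function names are all made up and are not estimates of any real intervention. It just shows how the sign of a combined expected value can flip as the weight between two models varies over a range that two people could each defend.

```python
# Toy sensitivity analysis: all numbers and names are hypothetical.

def combined_ev(weight_on_first, ev_by_model):
    """Expected value when model 1 gets `weight_on_first` and model 2 the rest."""
    ev_first, ev_second = ev_by_model
    return weight_on_first * ev_first + (1 - weight_on_first) * ev_second

def ev_range(ev_by_model, weights):
    """Lowest and highest combined EV over a set of defensible weights."""
    evs = [combined_ev(w, ev_by_model) for w in weights]
    return min(evs), max(evs)

def robust_sign(ev_by_model, weights):
    """True if the combined EV has the same sign for every weight considered."""
    lo, hi = ev_range(ev_by_model, weights)
    return lo > 0 or hi < 0

# Weights on model 1 that (by assumption) different people could reasonably hold.
WEIGHTS = [0.3, 0.4, 0.5, 0.6]

# (EV under model 1, EV under model 2), in arbitrary units.
LONGTERM = (10.0, -8.0)   # hypothetical intervention whose models disagree on sign
NEARTERM = (2.0, 1.0)     # hypothetical intervention whose models agree on sign

for name, evs in [("longtermist", LONGTERM), ("neartermist", NEARTERM)]:
    lo, hi = ev_range(evs, WEIGHTS)
    print(f"{name}: EV in [{lo:.1f}, {hi:.1f}], robust sign: {robust_sign(evs, WEIGHTS)}")
```

In this made-up case, picking any single weight hides the fact that the first recommendation depends on it entirely, which is why I would rather report the range than one number.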
For shorttermist interventions with good evidence, at least substantial parts of our credences are not very sensitive to arbitrariness, even if the expected value as a whole is; that remaining sensitivity is what I hope hedging could be used to control. Maybe you can do this just with longtermist interventions, though. A portfolio of interventions can be less ambiguous than each intervention in it. (This is what my hedging post is about.)
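A toy sketch of the portfolio claim, again in Python and again with entirely made-up numbers and hypothetical names (this is not the analysis from the hedging post, just an illustration of the idea). Two interventions each have an ambiguous sign across the range of weights, but because their model-by-model values offset each other, an even split between them has the same sign under every weight.

```python
# Toy hedging example: all numbers and names are hypothetical.

def ev_interval(ev_by_model, weights_on_first):
    """Min and max expected value as the credence in model 1 varies."""
    ev_first, ev_second = ev_by_model
    evs = [w * ev_first + (1 - w) * ev_second for w in weights_on_first]
    return min(evs), max(evs)

def mix(share_of_a, a, b):
    """Model-by-model EV of a portfolio putting `share_of_a` into a, the rest into b.

    Assumes value scales linearly with the share of resources, which is a
    simplification.
    """
    return tuple(share_of_a * x + (1 - share_of_a) * y for x, y in zip(a, b))

# Credences in model 1 that (by assumption) are all defensible.
WEIGHTS = [0.3, 0.4, 0.5, 0.6, 0.7]

# (EV under model 1, EV under model 2), in arbitrary units.
A = (10.0, -8.0)   # good if model 1 is right, bad if model 2 is right
B = (-6.0, 12.0)   # the reverse

PORTFOLIO = mix(0.5, A, B)

for name, evs in [("A alone", A), ("B alone", B), ("50/50 portfolio", PORTFOLIO)]:
    lo, hi = ev_interval(evs, WEIGHTS)
    print(f"{name}: EV ranges from {lo:.1f} to {hi:.1f}")
```

The offsetting only works here because I chose the numbers so that each intervention does well exactly where the other does badly; whether real interventions pair up like that is the hard empirical question.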
I haven’t read your post, so can’t comment.