That said, FWIW, my independent impression is that "cluelessness" isn't a useful concept and that the common ways the concept has been used, either to counter neartermism or to counter longtermism, are misguided. (I write about this here and here.) So I guess that's probably consistent with your conclusion, though maybe by a different road. (I prefer to use the sort of analysis in Tarsney's epistemic challenge paper, and I think that pushes in favour of either longtermism or further research on longtermism vs neartermism, though I definitely acknowledge room for debate on that.)
I think Tarsney's paper does not address/avoid cluelessness, or at least its spirit, i.e., the arbitrary weighting of different considerations, since:
You still need to find a specific intervention that you predict ex ante pushes you towards one attractor and away from another, and you need more reason to believe it does this than that it does the opposite (in expectation, say). If you only have more reason to believe this because of arbitrary weights, which could reasonably have been chosen so that the intervention backfires instead, that is not a good epistemic state to be in. For example, is the AI safety work we're doing now backfiring? This could happen through, for example (a toy numerical sketch follows this list):
creating a false sense of security,
publishing the results of the GPT models, demonstrating AI capabilities, showing the world how much further they can already be pushed, and therefore accelerating AI development, or
slowing AI development more in countries that care about safety than in those that don't care much, risking a much worse AGI takeover if it matters who builds it first.
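To make the worry concrete, here's a minimal sketch (all numbers hypothetical) of how the sign of an intervention's expected value can flip under backfire weightings that both seem reasonable ex ante:

```python
# Toy model: EV of a hypothetical AI safety intervention as a function of an
# arbitrary weight on the probability that it backfires through channels like
# the three above. All numbers are invented for illustration.

def expected_value(p_backfire, benefit=10.0, harm=-12.0):
    """EV = P(helps) * benefit + P(backfires) * harm, in arbitrary value units."""
    return (1.0 - p_backfire) * benefit + p_backfire * harm

# Two weightings that could both reasonably have been chosen:
for p in (0.3, 0.5):
    print(f"p_backfire={p}: EV={expected_value(p):+.1f}")
# p_backfire=0.3: EV=+3.4 -> looks net positive
# p_backfire=0.5: EV=-1.0 -> looks net negative
```

Nothing in the evidence forces one of these weights over the other, which is exactly the bad epistemic state I mean.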
You still need to predict which of the attractors is ex ante ethically better, which again involves both arbitrary empirical weights and arbitrary ethical weights (moral uncertainty). You might find the choice to be sensitive to something arbitrary that could reasonably go either way. Is extinction actually bad, considering the possibility of s-risks?
Does some s-risk work (e.g. on AI safety or on authoritarianism) reduce some extinction risks and thereby increase other s-risks, and how do we weigh those possibilities?
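As a sketch of how arbitrary moral weights alone can flip this comparison, consider two stylised moral views and two credence assignments over them (all values invented):

```python
# Badness of reaching each attractor under two stylised moral views:
# view A (totalist-ish): extinction is very bad, the s-risk outcome moderately bad
# view B (suffering-focused): the s-risk outcome is very bad, extinction ~neutral
values = {
    "extinction": {"A": -100.0, "B": 0.0},
    "s_risk":     {"A": -40.0,  "B": -100.0},
}

def weighted_badness(outcome, credence_in_A):
    v = values[outcome]
    return credence_in_A * v["A"] + (1.0 - credence_in_A) * v["B"]

for c in (0.7, 0.3):  # two credences in view A that could both seem reasonable
    ext = weighted_badness("extinction", c)
    s = weighted_badness("s_risk", c)
    worse = "extinction" if ext < s else "the s-risk outcome"
    print(f"credence in view A = {c}: {worse} is worse")
# credence 0.7 -> extinction is worse; credence 0.3 -> the s-risk outcome is worse
```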
I worry that research on longtermism vs neartermism (like Tarsney's paper) just ignores these problems, since you really need to deal with somewhat specific interventions, because of the different considerations involved. In my view, (strong) longtermism is only true if you actually identify an intervention that you can only reasonably believe does (much) more net good in the far future in expectation than short-term-focused alternatives do in the short term in expectation, or, roughly, that you can only reasonably believe does (much) more good than harm (in the far future) in expectation. This requires careful analysis of a specific intervention, and we may not have the right information now, or ever, to confirm that a particular intervention satisfies these conditions. For every longtermist intervention I've examined, I've come up with specific objections that I think could reasonably push it into doing more harm than good in expectation.
Of course, what should "reasonable belief" mean? How do we decide which beliefs are reasonable and which ones aren't (and the degree of reasonableness, if it's a fuzzy concept)?
Basically, I agree that longtermist interventions could have these downside risks, but:
I think we should basically just factor that into their expected values (while using various best practices and avoiding naive approaches); a minimal sketch follows this list
I do acknowledge that this is harder than that makes it sound, and that people often do a bad job. But...
I think that these same points also apply to neartermist interventions
Though with less uncertainty about at least the near-term effects, of course
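Here's the minimal sketch of what I mean by "factor that in" (scenario values and probabilities all hypothetical): treat backfire as one more scenario in the expected value calculation, not as a separate, disqualifying category:

```python
# Fold downside risk directly into the expected value (hypothetical numbers).
scenarios = [
    # (probability, value in arbitrary units)
    (0.50,  +5.0),   # modest success
    (0.10, +50.0),   # large success
    (0.25,   0.0),   # no effect
    (0.15, -20.0),   # backfire (false sense of security, acceleration, ...)
]

assert abs(sum(p for p, _ in scenarios) - 1.0) < 1e-9  # probabilities sum to 1
ev = sum(p * v for p, v in scenarios)
print(f"EV with the downside folded in: {ev:+.2f}")  # +4.50 here
```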
> Of course, what should "reasonable belief" mean? How do we decide which beliefs are reasonable and which ones aren't (and the degree of reasonableness, if it's a fuzzy concept)?
I think this gets at part of what comes to mind when I hear objections like this.
Another part is: I think we could say all of that about literally any decision. We'd often be less uncertain, and it might be less reasonable to think the decision would be net negative or astronomically so, but I think it just comes in degrees, rather than applying strongly to some scenarios and not at all to others. One way to put this is that I think basically every decision meets the criteria for complex cluelessness (as I argued in the above-mentioned links: here and here).
But really I think that (partly for that reason) we should just ditch the term "complex cluelessness" entirely, and think in terms of things like credal resilience, downside risk, skeptical priors, model uncertainty, model combination and adjustment, the optimizer's curse, best practice for forecasting, and expected values given all that.
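To illustrate one item on that list (with made-up numbers): shrinking a noisy cost-effectiveness estimate toward a skeptical prior is a standard Bayesian correction for the optimizer's curse:

```python
# Normal-normal conjugate update: precision-weighted average of a skeptical
# prior and a noisy estimate. All numbers are hypothetical.

def posterior_mean(estimate, est_var, prior_mean=0.0, prior_var=1.0):
    """Shrink the estimate toward the prior in proportion to its noisiness."""
    w = (1 / est_var) / (1 / est_var + 1 / prior_var)
    return w * estimate + (1 - w) * prior_mean

# A very noisy "this intervention is amazing" estimate gets pulled most of
# the way back toward the skeptical prior of 0:
print(posterior_mean(estimate=100.0, est_var=50.0))  # ~1.96
# A well-evidenced estimate moves much less:
print(posterior_mean(estimate=10.0, est_var=0.5))    # ~6.67
```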
Here I acknowledge that I'm making some epistemological, empirical, decision-theoretic, and/or moral claims/assumptions that I'm aware various people who've thought about related topics would contest (including yourself and maybe Greaves, both of whom have clearly "done your homework"). I'm also aware that I haven't fully justified these stances here, but it seemed useful to gesture roughly at my conclusions and reasoning anyway.
I do think that these considerations mostly push against longtermism and in favour of neartermism. (Caveats include things like being very morally uncertain, such that e.g. reducing poverty or reducing factory farming could easily be bad, in which case maybe the best thing is to maintain option value and maximise the chance of a long reflection. But that also reduces option value in some ways. And then one can counter that point, and so on.) But I think we should see this all as a bunch of competing quantitative factors, rather than as absolutes and binaries.
(Also, as noted elsewhere, I currently think longtermism, or further research on whether to be longtermist, comes out ahead of neartermism, all things considered, but I'm unsure about that.)
I don't think it's usually reasonable to choose only one expected value estimate, though, and to me this is the main consequence of cluelessness. Doing your best will still leave a great deal of ambiguity if you're being honest about which beliefs you think it would be reasonable to hold, beyond just your own fairly arbitrary best guess (often I don't even have a best guess, precisely because of how arbitrary it seems). Sensitivity analysis seems important.
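Here's roughly what I have in mind by sensitivity analysis, as a sketch with a hypothetical model and parameter ranges: sweep each contested input over the range you consider reasonable and report the resulting interval of expected values, rather than a single number:

```python
import itertools

def ev(p_success, value_if_success, value_if_backfire):
    return p_success * value_if_success + (1 - p_success) * value_if_backfire

# Ranges for inputs that "could reasonably go either way" (all hypothetical):
p_success_range = [0.4, 0.6, 0.8]
upside_range    = [5.0, 20.0]
downside_range  = [-2.0, -30.0]

evs = [ev(p, u, d) for p, u, d in
       itertools.product(p_success_range, upside_range, downside_range)]
print(f"EV range: [{min(evs):+.1f}, {max(evs):+.1f}]")  # [-16.0, +15.6]
# The interval straddles zero, so the recommendation is sensitive to the
# arbitrary inputs -- the spirit of the cluelessness worry.
```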
> But really I think that (partly for that reason) we should just ditch the term "complex cluelessness" entirely, and think in terms of things like credal resilience, downside risk, skeptical priors, model uncertainty, model combination and adjustment, the optimizer's curse, best practice for forecasting, and expected values given all that.
I would say complex cluelessness basically just is sensitivity of recommendations to model uncertainty. The problem is that it's often too arbitrary to come to a single estimate by combining models. Two people with access to all the same information and even the same ethical views (the same fundamental moral uncertainty and the same methods for dealing with it) could still disagree about whether an intervention is good or bad, or about which of two interventions is best, depending basically on whims (priors, arbitrary weightings).
With neartermist interventions backed by good evidence, at least substantial parts of our credences are not very sensitive to arbitrariness, even if the expected value as a whole is; the latter is what I hope hedging could be used to control. Maybe you can do this with just longtermist interventions, though. A portfolio of interventions can be less ambiguous than each intervention in it. (This is what my hedging post is about; a toy version follows.)
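A toy version of the hedging idea, with invented payoffs: under two competing models, each intervention alone can be negative, but a mixed portfolio is positive under both, so the portfolio is less ambiguous than its parts:

```python
# Value of each hypothetical intervention under each model (arbitrary units).
payoffs = {
    "A": {"model1": +10.0, "model2": -4.0},
    "B": {"model1": -4.0,  "model2": +10.0},
}

def portfolio_value(weights, model):
    """Value of a weighted mix of interventions under one model."""
    return sum(w * payoffs[name][model] for name, w in weights.items())

for weights in ({"A": 1.0, "B": 0.0}, {"A": 0.0, "B": 1.0}, {"A": 0.5, "B": 0.5}):
    worst = min(portfolio_value(weights, m) for m in ("model1", "model2"))
    print(weights, f"worst-case value: {worst:+.1f}")
# A alone: worst -4.0; B alone: worst -4.0; the 50/50 mix: worst +3.0
```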
I haven't read your post, so I can't comment.