Interesting, thanks a lot!
Fwiw, I wrote this, which sort of goes against your impression, in another comment thread here:
I really don’t see how one could make a convincing argument why donating to animal shelters predictably makes the World better or worse, considering all the effects from now until the end of time.
The problem is we can’t just update away from agnosticism based on arguments that don’t address the very reasons for our agnosticism. In the DogvCat story, one key driver of my cluelessness is that I think there will always be crucial considerations we are unaware of, because we’re missing them or couldn’t even comprehend them (see Roussos 2021; Tarsney et al 2024, §3), and I can’t conveniently assume good and bad unknown unknowns ‘cancel out’ (Lenman 2000; Greaves 2016; Tarsney et al 2024, §3). For me to quit agnosticism, we’d have to find an argument robust to these unknown unknowns (and I’d be surprised if we find one). Arguments that don’t address unknown unknowns don’t address my cluelessness at all and it seems like they shouldn’t make me update. This is an instance of what Miriam Schoenfield (2012) calls ‘insensitivity to mild sweetening’.
But it’d be hard for me to make a case more convincing than this without unpacking a lot more (which I’ll do properly someday somewhere, hopefully). And your point that my thought experiment is weakened by the fact that the last sentence doesn’t seem obviously right at all (at least if we assume that we are given more resources to think hard about the question) is still well taken! That’s a very fair and helpful observation :)
Just on the point that you can’t conveniently assume good and bad unknown unknowns ‘cancel out’:
FWIW, my take would be:
No, we shouldn’t assume that they “cancel out”
However, as a structural fact[*] about the world, the prevalence of good and bad unknown unknowns is correlated with the good and bad knowns (and known unknowns)
So, on average and in expectation, things will point in the same direction as the analysis ignoring cluelessness (although it’s worth being conscious that this will turn out wrong in a significant fraction of cases ― probably approaching 50% for something like cats vs dogs)
Of course this relies heavily on the “fact” I denoted as [*], but really I’m saying “I hypothesise this to be a fact”. My reasons for believing it are something like:
Some handwavey argument along these lines:
The many complex things we could consider will vary in the proportion of considerations that point in a good direction
If our knowledge sampled randomly from the available considerations, we would expect this correlation
It’s too much to expect our knowledge to sample randomly ― there will surely sometimes be structural biases ― but there’s no reason to expect the deviations to be so perverse as to (on average) actively mislead
(this needn’t preclude the existence of some domains with such a perverse pattern, but I’d want a positive argument that something might be such a domain)
Given that we shouldn’t expect the good and bad unknown unknowns to cancel out, by default we should expect them to correlate with the knowns
A sense that empirically this kind of correlation is true in less clueless-like situations
e.g. if I uncover a new consideration about whether it’s good or bad for EAs to steal-to-give, it’s more likely to point to “bad” than “good”
Combined with something like a simplicity prior ― if this effect exists for things where we have a fairly strong sense of the considerations we can track, by default I’d expect it to exist in weaker form for things where we have a weaker sense of the considerations we can track (rather than being non-existent or occurring in a perverse form)
In principle, this could be tested experimentally. In practice, you’re going to be chasing after tiny effect sizes with messy setups, so I don’t think it’s viable any time soon for human judgement. I do think you might hope to one day run experiments along these lines for AI systems. Of course they would have to be cases where we have some access to the ground truth, but the AI is pretty clueless—perhaps something like getting non-superintelligent AI systems to predict outcomes in a complex simulated world.
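To make the handwavey sampling argument above a bit more concrete, here is a minimal toy sketch in Python. It is an illustration under stated assumptions, not a test of the hypothesis: every number in it (the `lean_sd` spread, 200 considerations, 20 of them known) is made up, and the function name `sign_agreement_rate` is hypothetical. It models each question as having its own underlying lean toward good or bad, draws signed considerations around that lean, lets the known considerations be a random sample, and counts how often the knowns and the unknowns point the same way.

```python
import random


def sign_agreement_rate(lean_sd, n_total=200, n_known=20, trials=2000, seed=0):
    """Estimate how often the known considerations and the unknown ones
    point the same way, when the known ones are a random sample."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(trials):
        # Assumed premise: each question has its own underlying lean, i.e.
        # questions differ in the share of considerations pointing to "good".
        lean = rng.gauss(0.0, lean_sd)
        # Each consideration is a signed weight (positive = good, negative = bad).
        considerations = [rng.gauss(lean, 1.0) for _ in range(n_total)]
        # Unbiased sampling: which considerations we happen to know is random.
        rng.shuffle(considerations)
        known, unknown = considerations[:n_known], considerations[n_known:]
        if (sum(known) > 0) == (sum(unknown) > 0):
            agree += 1
    return agree / trials


if __name__ == "__main__":
    print("questions vary in lean:", sign_agreement_rate(lean_sd=0.3))
    print("no question leans either way:", sign_agreement_rate(lean_sd=0.0))
```

Under these assumptions the first rate should come out clearly above 0.5 and the second close to 0.5, which is just the "random sampling implies correlation" step. The sketch says nothing about the perverse-sampling case; that would need a positive model of how our knowledge is biased, which is exactly the part in dispute.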
Thanks a lot for expanding on that! To confirm whether we’ve identified at least one of the cruxes, I’d be curious to know what you think of what follows.
Say I am clueless about the (dis)value of the alien counterfactual we should expect (i.e., whether another civ someday replacing our own after we go extinct or something would be better or worse than if it was ours maintaining control over our corner of the Universe). One consideration I have identified is that there is, all else equal, a selection effect against caring about suffering for grabby civs. But all else is ofc not equal and there might be plenty of considerations I haven’t thought of and/or never will be aware of supporting the opposite or other relevant considerations that have nothing to do with care for suffering. I’m clueless. By ‘I’m clueless’, I don’t mean ‘I have a 50% credence the alien counterfactual is better’. Instead, I mean ‘my credence is severely indeterminate/imprecise, such that I can’t compute the expected value of reducing X-risks (unless I decide to give up on impartial consequentialism and ignore things like the alien counterfactual which I’m clueless about)’ (for a case for how cluelessness threatens expected value reasoning in such a way, see e.g. Mogensen 2021).
Your above argument is based on the assumption that our credences all ought to be determinate/precise and that cluelessness = 50% credence, right? It’s probably not worth discussing further in here whether this assumption is justified but do you also think that’s one of the cruxes, here?
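For readers who want the difference between "50% credence" and "imprecise credence" spelled out, here is a rough Python sketch. It rests on assumptions that are not from the thread: an imprecise credence is represented as an interval `[p_low, p_high]`, the payoffs are a made-up +1 / -1, and an option only counts as "better" if its expected value is positive under every credence in the interval (one common rule for imprecise credences, not the only one). The function names are hypothetical.

```python
def expected_value(p_better, gain, loss):
    """Expected value when the outcome is `gain` with probability p_better, else `loss`."""
    return p_better * gain + (1.0 - p_better) * loss


def verdict(p_low, p_high, gain=1.0, loss=-1.0):
    """Classify an option when every credence in [p_low, p_high] is admissible."""
    # Expected value is linear in p_better, so its extremes sit at the endpoints.
    evs = [expected_value(p, gain, loss) for p in (p_low, p_high)]
    lo, hi = min(evs), max(evs)
    if lo > 0:
        return "better"          # positive under every admissible credence
    if hi < 0:
        return "worse"           # negative under every admissible credence
    if lo == hi:
        return "indifferent"     # a single credence with expected value exactly 0
    return "indeterminate"       # the sign depends on which credence you pick


print(verdict(0.5, 0.5))            # indifferent
print(verdict(0.5, 0.5, gain=1.1))  # better
print(verdict(0.2, 0.8))            # indeterminate
print(verdict(0.2, 0.8, gain=1.1))  # indeterminate
```

The second and fourth calls add a small bonus to the good outcome. With a precise 50% credence that bonus is enough to break the tie; with the interval it changes nothing, which is one way to picture the ‘insensitivity to mild sweetening’ mentioned earlier in the thread.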
I think this is at least in the vicinity of a crux?
My immediate thoughts (I’d welcome hearing about issues with these views!):
I don’t think our credences all ought to be determinate/precise
But I’ve also never been satisfied with any account I’ve seen of indeterminate/imprecise credences
(though noting that there’s a large literature there and I’ve only seen a tiny fraction of it)
My view would be something more like:
As boundedly rational actors, it makes sense for a lot of our probabilities to be imprecise
But this isn’t a fundamental indeterminacy — rather, it’s a view that it’s often not worth expending the cognition to make them more precise
By thinking longer about things, we can get the probabilities to be more precise (in the limit converging on some precise probability)
At any moment, we have credence (itself kind of imprecise absent further thought) about where our probabilities will end up with further thought
What’s the point of tracking all these imprecise credences rather than just single precise best-guesses?
It helps to keep tabs on where more thinking might be helpful, as well as where you might easily be wrong about something
On this perspective, cluelessness = inability to get the current best guess point estimate of where we’d end up to deviate from 50% by expending more thought
Oh my bad. I don’t think it’s really a crux, then. Or not the most key one at least. I guess I can’t narrow it down more precisely than whether your “fact[*]” is true, in that case. And it looks like I misunderstood the assumptions behind your justification of it.
I’ll brush up on my limited knowledge of the literature on unawareness (and maybe dive deeper) and see to what extent your “fact[*]” was already discussed. I’m sure it was. Then, I’ll go back to your justification of it to see if I understand it better and whether I actually can say I disagree.
Thanks for all your thoughts!
Surely we should have nonzero credence, and maybe even >10%, that there aren’t any crucial considerations we are missing that are on the scale of ‘consider nonhumans’ or ‘consider future generations’. In which case we can bracket the worlds where we are missing a crucial consideration as too hard, and base our analysis and decision on the worlds where we already have the most crucial considerations. Which could still move us slightly away from pure agnosticism?
Your view seems to imply the futility of altruistic endeavour? Which of course doesn’t mean it is incorrect, just seems like an important implication.
Ah nice, so bracketing those worlds could mean two different things:
A. (The ‘canceling out’ objection to (complex) cluelessness:) We assume that good and bad unpredictable effects “cancel each other out” such that we are warranted to believe whatever option is best according to predictable effects is also best according to overall effects, OR
B. (Giving up on impartial consequentialism:) We reconsider what matters for our decision and simply decide to stop caring about whether our action makes the World better or worse, all things considered. Instead, we focus only on whether the parts of the World that are predictably affected a certain way are made better or worse and/or about things that have nothing to do with consequences (e.g., our intentions), and ignore the actual overall long-term impact of our decision which we cannot figure out.
I think A is a big epistemic mistake for the reasons given by, e.g., Lenman 2000; Greaves 2016; Tarsney et al 2024, §3.
Some version of B might be the right response in the scenario where we don’t know what else to do anyway? I don’t know. One version of B is explicitly given by Lenman who says we should reject consequentialism. Another is implicitly given by Tarsney (2022) when he says we should focus on the next thousands of years and sort of admit we have no idea what our impact is beyond that. But then we’re basically saying that we “got beaten” by cluelessness and are giving up on actually trying to improve the long-term future, overall (which is what most longtermists are claiming our goal should be, for compelling ethical reasons). We can very well endorse B, but then we can’t pretend we’re trying to actually predictably improve the World. We’re not. We’re just trying to improve some aspects of the World, ignoring how this affects things overall (since we have no idea).
If you replace “altruistic endeavour” by “impartial consequentialism”, in the DogvCat case, yes, absolutely. But I didn’t mean to imply that cluelessness in that case generalizes to everything (although I’m also not arguing it doesn’t). There might be cases where we have arguments plausibly robust to many unknown unknowns that warrant updating away from agnosticism, e.g., arguments based on logical inevitabilities or unavoidable selection effects. In this thread, I’ve only argued that I’d be surprised if we find such a (convincing) argument for the DogvCat case, specifically. But it may very well be that this generalizes to many other cases and that we should be agnostic about many other things, to the extent that we actually care about our overall impact.
And I absolutely agree that this is an important implication of my points here. I think the reason why these problems are neglected by sympathizers of longtermism is that they (unwarrantedly) endorse A or (also unwarrantedly) assume that the fact that ‘wild guesses’ are often better than agnosticism in short-term geopolitical forecasting means they’re also better when it comes to predicting our overall impact on the long-term future (see ‘Winning isn’t enough’).
I think I am quite sympathetic to A, and to the things Owen wrote in the other branch, especially about operationalizing imprecise credences. But this is sufficiently interesting and important-seeming that I am making a note to read some of the references you give for A being false.
Oh interesting, I would have guessed you’d endorse some version of B or come up with a C, instead.
Iirc, these resources I referenced don’t directly address Owen’s points to justify A, though. Not sure. I’ll look into this and where they might be more straightforwardly addressed, since this seems quite important w.r.t. the work I’m currently doing. Happy to keep you updated if you want.
yeah sure, lmk what you find out!