I guess I’m not wild about this approach, but I think it is important to consider (and sometimes use) alternative frames, so thanks for the write-up!
To articulate my worries: I suppose it’s that this framing implies a very reductionist and potentially exclusionary idea of doing good; it’s sort of “Holy shit, X-risk matters (and nothing else does)”. On any plausible conception of EA, we want people doing a whole bunch of stuff to make things better.
The other bit that irks me is that it does not follow, from the mere fact that there’s a small chance of something bad happening, that preventing that bad thing is the most good you can do. I basically stop listening to the rest of any sentence that starts with “but if there’s even a 1% chance that …”
FWIW, the framings of EA I quite like are versions of “we ought to do good; doing more good is better”.
Think about how hard you would try to avoid getting the next wave of COVID if it turned out it had a 1% chance of killing you. Not even 1% conditional on you getting it; 1% unconditional. (So for concreteness, imagine that your doctor at the next checkup tells you that, based on your blood type and DNA, you actually have a 10% chance of dying from COVID if you were to get it, and that, based on your current default behavior and its prevalence in the population, you have a 10% chance of getting it before a better vaccine for your specific blood type is developed.)
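A minimal sketch of the arithmetic behind that 1% figure, using only the purely hypothetical numbers from the scenario above (not real COVID statistics):

```python
# Hypothetical numbers from the scenario above, not real COVID statistics.
p_get_covid = 0.10        # chance of catching it before a better vaccine arrives
p_die_if_covid = 0.10     # chance of dying if you do catch it

p_die_unconditional = p_get_covid * p_die_if_covid
print(f"Unconditional chance of dying: {p_die_unconditional:.1%}")  # 1.0%
```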
Well, I claim, you personally are more than 1% likely to die of x-risk. (Because we all are.)
I’d actually hoped that this framing is less reductionist and exclusionary. Under total utilitarianism + strong longtermism, averting extinction is the only thing that matters; everything else is irrelevant. Under this framing, averting extinction from AI is, say, maybe 100x better than totally solving climate change. And AI is comparatively much more neglected, and so likely much more tractable. And so it’s clearly the better thing to work on. But it’s only a few orders of magnitude, coming from empirical details of the problem, rather than a crazy, overwhelming argument that requires estimating the number of future people, the moral value of digital minds, etc.
I agree with the first sentence, but your second sentence seems way too strong. It seems bad to devote all your efforts to averting some tiny tail risk, but I feel pretty convinced that averting a 1% chance of a really bad thing is more important than averting a certainty of a kinda bad thing (operationalising this as 1000x less bad, though it’s fuzzy). But I agree that the preference ordering of (1% chance of really bad thing) vs (certainty of maybe bad thing) is unclear, and that it’s reasonable to reject e.g. naive attempts to calculate expected utility.
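A minimal sketch of the expected-value comparison implied here, assuming only the “1000x less bad” operationalisation above and an arbitrary badness scale:

```python
# Arbitrary badness units; the only assumption carried over is the 1000x ratio above.
really_bad = 1000.0              # harm if the really bad thing happens
kinda_bad = really_bad / 1000    # the kinda bad thing, operationalised as 1000x less bad

ev_averting_tail_risk = 0.01 * really_bad   # averting a 1% chance of the really bad thing
ev_averting_certainty = 1.00 * kinda_bad    # averting a certainty of the kinda bad thing

print(ev_averting_tail_risk, ev_averting_certainty)  # 10.0 vs 1.0: the tail risk dominates
```

(This is, of course, exactly the kind of naive expected-utility calculation the last sentence is wary of; it is only meant to show where the intuition comes from.)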
I’m curious, do you actually agree with the two empirical claims I make in this post? (A 1% risk of AI x-risk, and a 0.1% risk from bio, within my lifetime.)
Can you say more about why you dismiss such arguments? Do you have a philosophical justification for doing so?
I’ve seen a lot of estimates in this world that are more than 100x off, so I’m also pretty unconvinced by “if there’s even a 1% chance”. Give me a solid reason for your estimate; otherwise I’m not interested.
These arguments appeal to phenomenal stakes, implying that, using expected value reasoning, even a very small probability of the bad thing happening means we should try to reduce the risk, provided there is some degree of tractability in doing so.
Is the reason you dismiss such arguments because:
1. You reject EV reasoning if the probabilities are sufficiently small (i.e. anti-fanaticism)
   - There are issues with this response, e.g. here to give one
2. You think the probabilities cited are too arbitrary, so you don’t take the argument seriously
   - But the specific numerical probabilities themselves are not super important in longtermist cases. Usually, because of the astronomical stakes, the important thing is that there is a “non-negligible” probability decrease we can achieve (see the sketch after this list). Much has been written about why there might be non-negligible x-risk from AI, biosecurity etc., and about things we can do to reduce this risk. The actual numerical probabilities are insanely hard to estimate, but it’s also not that important to do so.
3. You reject the arguments that we can reduce x-risk in a non-negligible way (e.g. from AI, biosecurity etc.)
4. You reject phenomenal stakes
5. Some other reason?
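To illustrate the sub-point under (2), here is a toy sketch, with entirely made-up numbers, of why the exact probability matters less than it being non-negligible once the stakes are astronomical:

```python
# Made-up numbers purely for illustration; none of these figures come from the thread.
stakes = 1e15   # stand-in for "astronomical stakes" (e.g. value of the long-term future)

for risk_reduction in (1e-2, 1e-3, 1e-4):   # a wide range of "non-negligible" decreases
    print(f"risk reduction {risk_reduction:.0e} -> expected value {risk_reduction * stakes:.1e}")

# The expected value is enormous across the whole range, so the conclusion is driven by
# whether a non-negligible reduction is achievable, not by the exact probability estimate.
```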
When people say “even if there’s a 1% chance” without providing any other evidence, I have no reason to believe there is a 1% chance vs 0.001% or a much smaller number.
I think you’re getting hung up on the specific numbers, which I personally think are irrelevant. What if one says something like:
“Given arguments put forward by leading AI researchers such as Eliezer Yudkowsky, Nick Bostrom, Stuart Russell and Richard Ngo, it seems that there is a very real possibility that we will create superintelligent AI one day. Furthermore, we are currently uncertain about how we can ensure such an AI would be aligned to our interests. A superintelligent AI that is not aligned to our interests could clearly bring about highly undesirable states of the world that could persist for a very long time, if not forever. There seem to be tractable ways to increase the probability that AI will be aligned to our interests, such as through alignment research or policy/regulation, meaning such actions are a very high priority”.
There’s a lot missing from that but I don’t want to cover all the object-level arguments here. My point is that waving it all away by saying that a specific probability someone has cited is arbitrary seems wrong to me. You would need to counter the object-level arguments put forward by leading researchers. Do you find those arguments weak?
Ah gotcha. So you’re specifically objecting to people who say ‘even if there’s a 1% chance’ based on vague intuition, and not to people who think carefully about AI risk, conclude that there’s a 1% chance, and then act upon it?
Exactly! “Even if there’s a 1% chance” on its own is a poor argument; “I am pretty confident there’s at least a 1% chance, and therefore I’m taking action” is totally reasonable.
To be clear, the argument in my post only needs “very small” to mean 1% or 0.1%, not e.g. 10^-10. I am much more skeptical about arguments involving 10^-10-like probabilities.
Estimates can be massively off in both directions. Why do you jump to the conclusion of inaction rather than action?
(My guess is that it’s sufficiently easy to generate plausible-but-wrong ideas at the 1% level that you should have SOME amount of inaction bias, but you shouldn’t take it too far.)