How much of the argument for working towards positive futures rather than existential security rests on conditional value, as opposed to expected value?
One could argue from conditional value: in worlds where strong AI is easy and AI safety is hard, we are doomed regardless of effort, so we should concentrate on the worlds where we could plausibly get good outcomes.
Alternatively, one could be confident that the probability of safety is relatively high, and argue that we should spend more time focused on positive futures because safety is already likely—either because efforts towards superintelligence safety are likely to work (and if so, which ones?), or because alignment by default seems likely.
(Or, lastly, one could assume or argue that superintelligence is impossible or unlikely.)
I’ll just share that, for me personally, the case rests on expected value. I actually think there is a lot that we can do to make AI existential safety go better (governance if nothing else), and this is what I spend most of my time on. But the expected value of working towards better futures seems far higher, given the difference in size between the default post-human future and the best possible future.
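To make that comparison concrete, here is one illustrative way to formalize it (my own sketch, not something stated in the thread; every symbol is a placeholder rather than an estimate):

$$
\mathrm{EV}_{\text{safety}} \approx \Delta p_{\text{survive}} \cdot V_{\text{default}},
\qquad
\mathrm{EV}_{\text{futures}} \approx p_{\text{survive}} \cdot \Delta q \cdot \left(V_{\text{best}} - V_{\text{default}}\right)
$$

where $\Delta p_{\text{survive}}$ is the added survival probability from safety work, $V_{\text{default}}$ and $V_{\text{best}}$ are the values of the default and best post-human futures, and $\Delta q$ is the probability that futures work shifts us from the default toward the best outcome. On this framing, if $V_{\text{best}}$ is vastly larger than $V_{\text{default}}$, futures work can come out ahead in expected value even when $\Delta q$ is small.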
So it sounds like this might be a predictive / empirical dispute about probabilities conditional on slowing AI and avoiding extinction, and the likely futures in each case, and not primarily an ethical theory dispute?
That is an excellent question. I think ethical theory matters a lot — see Power Laws of Value. But I also just think our superintelligent descendants are going to be pretty derpy and act on enlightened self-interest as they turn the stars into computers, not pursue very good things. And that might be somewhere where, e.g., @William_MacAskill and I disagree.
Interesting argument—I don’t know much about this debate, but my view is that there’s not much value in thinking in terms of conditional value. If AI safety is doomed to fail, there’s not much value in focusing on good outcomes which won’t happen when there are great global health interventions available today. Arguably, these global health interventions could also help at least some parts of humanity have a positive future.
I don’t think that logic works—in the worlds where AI safety fails, humans go extinct, so you’re not saving lives for very long and the value of short-term EA investments is correspondingly lower. You’re choosing between “focusing on good outcomes which won’t happen,” as you said, and focusing on good outcomes which end almost immediately anyway. (To illustrate this properly I’d need to work an example, do the math, and then argue about the conditionals and the exact values I’m using.)
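As a toy illustration of that point (all of the numbers below are invented for the sake of the example, not estimates anyone in the thread has endorsed), note that the same doom probability that discounts futures work also discounts near-term life-saving:

```python
# Toy model only: every number below is made up for illustration,
# not an estimate endorsed by anyone in this thread.

p_doom = 0.8          # assumed probability that AI safety fails and humans go extinct
years_to_doom = 10    # assumed years until extinction in the doom worlds
normal_benefit = 40   # life-years gained per life saved if the future unfolds normally

# Value of one life saved by a near-term global health intervention:
# full benefit in the non-doom worlds, truncated benefit in the doom worlds.
ev_health = (1 - p_doom) * normal_benefit + p_doom * years_to_doom

print(f"Life-years per life saved, ignoring doom:  {normal_benefit}")
print(f"Life-years per life saved, doom-adjusted:  {ev_health}")
# Prints 40 vs 16.0: the doom worlds that make futures work look "wasted"
# also shrink the payoff of near-term spending.
```

The toy numbers only show that conditioning on doom shrinks both options, so “we might be doomed” doesn’t by itself favour the near-term spending; the real comparison depends on the conditionals and values one plugs in, as noted above.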
Great point—thanks, you changed my view!
I think it rests a lot on conditional value, and that is very unsatisfactory from a simple moral perspective of wanting to personally survive and have my friends and family survive. If extinction risk is high and near (and I think it is!), we should be going all out to prevent it (i.e. pushing for a global moratorium on ASI). We can then work out the other issues once we have more time to think about them (rather than hastily punting on a long shot of surviving just because it appears higher EV now).
Fin and I talk a bit about the “punting” strategy here.
I think it works often, but not in all cases.
For example, the AI capability level that poses a meaningful risk of human takeover comes earlier than the AI capability level that poses a meaningful risk of AI takeover. That’s because some humans already start with loads of power, and the amount of strategic intelligence you need to take over when you already have loads of power is less than the strategic capability you need if you’re starting from almost none (which will be true of the ASI).
This seems like a predictive difference about AI trajectories and control, rather than an ethical debate. Does that seem correct to you (and/or to @Greg_Colbourn ⏸️ ?)
Yeah, I think a lot of the overall debate (including what is most ethical to focus on!) depends on AI trajectories and control.
I don’t think the human-takeover capability level comes meaningfully earlier. The gap might only be a few months (an AI capable of doing the work of a military superpower would be capable of doing most of the work involved in AI R&D, precipitating an intelligence explosion). And the humans wielding that power will lose it to the AI too, unless they halt all further development of AI (which seems unlikely, due to hubris or complacency if nothing else).
And as for the ASI starting off with almost no power: any ASI worthy of the name would probably be able to go straight for an unstoppable nanotech computronium grey goo scenario.