Hey Jason, I share the same thoughts on Pascal’s-mugging-type arguments.
Having said that, The Precipice convincingly argues that the x-risk this century is around 1/6, which is really not very low. Even if you don’t totally believe Toby, it seems reasonable to put the odds at that order of magnitude, and it shouldn’t fall into the 1e-6 type of argument.
I don’t think the Deutsch quotes apply either. He writes “Virtually all of them could have avoided the catastrophes that destroyed them if only they had possessed a little additional knowledge, such as improved agricultural or military technology”.
That might be true when it comes to warring human civilizations, but not when it comes to global catastrophes. In the past, there was no way to say “let’s not move on to the Bronze Age quite yet”, so any individual actor who attempted to stagnate would be dominated by more aggressive competitors.
But for the first time in history, we really do have the potential for species-wide cooperation. It’s difficult, but feasible. If the US and China manage to agree to a joint AI resolution, there’s no third party that will suddenly sweep in and dominate with their less cautious approach.
I haven’t read Ord’s book (although I read the SSC review, so I have the high-level summary). Let’s assume Ord is right and we have a 1⁄6 chance of extinction this century.
My “1e-6” was not an extinction risk. It’s a delta between two choices that are actually open to us. There are no zero-risk paths open to us, only one set of risks vs. a different set.
So:
What path, or set of choices, would reduce that 1⁄6 risk?
What would be the cost of that path, vs. the path that progress studies is charting?
How certain are we about those two estimates? (Or even the sign of those estimates?)
My view on these questions is very far from settled, but I’m generally aligned through all of the points of the form “X seems very dangerous!” Where I get lost is when the conclusion becomes, “therefore let’s not accelerate progress.” (Or is that even the conclusion? I’m still not clear. Ord’s “long reflection” certainly seems like that.)
I am all for specific safety measures. Better biosecurity in labs—great. AI safety? I’m a little unclear how we can create safety mechanisms for a thing that we haven’t exactly invented yet, but hey, if anyone has good ideas for how to do it, let’s go for it. Maybe there is some theoretical framework around “value alignment” that we can create up front—wonderful.
I’m also in favor of generally educating scientists and engineers about the grave moral responsibility they have to watch out for these things and to take appropriate responsibility. (I tend to think that existential risk lies most in the actions, good or bad, of those who are actually on the frontier.)
But EA/XR folks don’t seem to be primarily advocating for specific safety measures. Instead, what I hear (or think I’m hearing) is a kind of generalized fear of progress. Again, that’s where I get lost. I think that (1) progress is too obviously valuable and (2) our ability to actually predict and control future risks is too low.
I think there’s a fear of progress in specific areas (e.g. AGI and certain kinds of bio) but not a general one? At least I’m in favor of progress generally and against progress in some specific areas where we have good object-level arguments for why progress in those areas in particular could be very risky.
(I also think EA/XR folks are primarily advocating for the development of specific safety measures, and not for us to stop progress, but I agree there is at least some amount of “stop progress” in the mix.)
Re: (2), I’m somewhat sympathetic to this, but all the ways I’m sympathetic to it seem to also apply to progress studies (i.e. I’d be sympathetic to “our ability to influence the pace of progress is too low”), so I’m not sure how this becomes a crux.
That’s interesting, because I think it’s much more obvious that we could successfully, say, accelerate GDP growth by 1-2 points per year, than it is that we could successfully, say, stop an AI catastrophe.
The former is something we have tons of experience with: there’s history, data, economic theory… and we can experiment and iterate. The latter is something almost completely in the future, where we don’t get any chances to get it wrong and course-correct.
(Again, this is not to say that I’m opposed to AI safety work: I basically think it’s a good thing, or at least it can be if pursued intelligently. I just think there’s a much greater chance that we look back on it and realize, too late, that we were focused on entirely the wrong things.)
If you mean like 10x greater chance, I think that’s plausible (though larger than I would say). If you mean 1000x greater chance, that doesn’t seem defensible.
In both fields you basically can’t experiment with the actual thing you care about (you can’t just build a superintelligent AI and check whether it is aligned; you mostly can’t run an intervention on the entire world and check whether world GDP went up). You instead have to rely on proxies.
In some ways it is a lot easier to run proxy experiments for AI alignment—you can train AI systems right now, and run actual proposals in code on those systems, and see what they do; this usually takes somewhere between hours and weeks. It seems a lot harder to do this for “improving GDP growth” (though perhaps there are techniques I don’t know about).
I agree that PS has an advantage with historical data (though I don’t see why economic theory is particularly better than AI theory), and this is a pretty major difference. Still, I don’t think it goes from “good chance of making a difference” to “basically zero chance of making a difference”.
“The latter is something almost completely in the future, where we don’t get any chances to get it wrong and course-correct.”
Fwiw, I think AI alignment is relevant to current AI systems with which we have experience even if the catastrophic versions are in the future, and we do get chances to get it wrong and course-correct, but we can set that aside for now, since I’d probably still disagree even if I changed my mind on that. (Like, it is hard to do armchair theory without experimental data, but it’s not so hard that you should conclude that you’re completely doomed and there’s no point in trying.)
I absolutely agree with all the other points. This isn’t an exact quote, but from his talk with Tyler Cowen, Nick Beckstead notes:
“People doing philosophical work to try to reduce existential risk are largely wasting their time. Tyler doesn’t think it’s a serious effort, though it may be good publicity for something that will pay off later… the philosophical side of this seems like ineffective posturing.
Tyler wouldn’t necessarily recommend that these people switch to other areas of focus because people’s motivation and personal interests are major constraints on getting anywhere. For Tyler, his own interest in these issues is a form of consumption, though one he values highly.”
https://drive.google.com/file/d/1O—V1REGe1-PNTpJXl3GHsUu_eGvdAKn/view
That’s a bit harsh, but this was in 2014. Hopefully Tyler would agree efforts have gotten somewhat more serious since then. I think the median EA/XR person would agree that there is probably a need for the movement to get more hands-on and practical.
Re: safety for something that hasn’t been invented: I’m not an expert here, but my understanding is that some of it might be path-dependent. I.e., research agendas hope to result in particular kinds of AI, and safety is not necessarily a feature you can just add on later. But it doesn’t sound like there’s a deep disagreement here, and in any case I’m not the best person to try to argue this case.
Intuitively, one analogy might be: we’re building a rocket, humanity is already on it, and the AI Safety people are saying “let’s add life support before the rocket takes off”. The exacerbating factor is that once the rocket is built, it might take off immediately, and no one is quite sure when this will happen.
To your Beckstead paraphrase, I’ll add Tyler’s recent exchange with Joseph Walker:
Cowen: Uncertainty should not paralyse you: try to do your best, pursue maximum expected value, just avoid the moral nervousness, be a little Straussian about it. Like, here’s a rule, on average it’s a good rule, we’re all gonna follow it. Bravo, move on to the next thing. Be a builder.
Walker: So… Get on with it?
Cowen: Yes. Ultimately the nervous Nellies, they’re not philosophically sophisticated, they’re overindulging their own neuroticism, when you get right down to it. So it’s not like there’s some brute “let’s be a builder” view and then there’s some deeper wisdom that the real philosophers pursue. It’s: you be a builder or a nervous Nelly. You take your pick. I say be a builder.
Good points.
I wrote up some more detailed questions on the crux here and would appreciate your input: https://forum.effectivealtruism.org/posts/hkKJF5qkJABRhGEgF/help-me-find-the-crux-between-ea-xr-and-progress-studies
Thanks for clarifying, the delta thing is a good point. I’m not aware of anyone really trying to estimate “what are the odds that MIRI prevents XR”, though there is one SSC post sort of on the topic: https://slatestarcodex.com/2015/08/12/stop-adding-zeroes/