I’m saying all this because I’m not afraid of treading on any toes. I don’t depend on EA money (or anyone’s money) for my livelihood or career[1]. I’m financially independent. In fact, my life is pretty good, apart from facing impending doom from this! I mean, I don’t need to work to survive[2], and I’ve got an amazing partner and a supportive family. All that is missing is existential security! I’d be happy to have “completed it mate” (i.e. I’ve basically done this with the normal life of house, car, spouse, family, financial security, etc.); but I haven’t: remaining is this small issue of surviving for a normal lifespan, having my children survive and thrive, and ensuring the continuation of the sentient universe as we know it...
I think [a moratorium is] maybe at this point more realistic than expecting alignment to be solved in time (or at all?).
I think it’s a lot more realistic to solve alignment than to delay AGI by 50 years. I’d guess that delaying AGI by 10 years is maybe easier than alignment, but it also doesn’t solve anything unless we can use those 10 years to figure out alignment as well. For that matter, delaying by 50 years also requires that we solve alignment in that timeframe, unless we’re trying to buy time to do some third other thing.
The difficulty of alignment is also a lot more uncertain than the difficulty of delaying AGI: it depends more on technical questions that are completely unknown from our current perspective. Delaying AGI by decades is definitely very hard, whereas the difficulty of alignment is mostly a question mark.
All of that suggests to me that alignment is far more important as a way to spend marginal resources today, but we should try to do both if there are sane ways to pursue both options today.
If you want MIRI to update from “both seem good, but alignment is the top priority” to your view, you should probably be arguing (or gathering evidence) against one or more of these claims:
AGI alignment is a solvable problem.
Absent aligned AGI, there isn’t a known clearly-viable way for humanity to achieve a sufficiently-long reflection (including centuries of delaying AGI, if that turned out to be needed, without permanently damaging or crippling humanity).
(There are alternatives to aligned AGI that strike me as promising enough to be worth pursuing. E.g., maybe humans can build Drexlerian nanofactories without help from AGI, and can leverage this for a pivotal act. But these all currently seem to me like even bigger longshots than the alignment problem, so I’m not currently eager to direct resources away from (relatively well-aimed, non-capabilities-synergistic) alignment research for this purpose.)
Humanity has never succeeded in any political task remotely as difficult as the political challenge of creating an enforced and effective 50+ year global moratorium on AGI. (Taking into account that we have no litmus test for what counts as an “AGI” and we don’t know what range of algorithms or what amounts of compute you’d need to exclude in order to be sure you’ve blocked AGI. So a regulation that blocks AGI for fifty years would probably need to block a ton of other things.)
EAs have not demonstrated the ability to succeed in political tasks that are way harder than any political task any past humans have succeeded on.
Even a 10-year delay is worth a huge amount (in expectation). We may well have a very different view of alignment by then (including perhaps being pretty solid on its impossibility? Or perhaps a detailed plan for implementing it? Or even the seemingly very unlikely “…there’s nothing to worry about”), which would allow us to iterate on a better strategy (we shouldn’t assume that our outlook will be the same after 10 years!).
but we should try to do both if there are sane ways to pursue both options today.
Yes! (And I think there are sane ways).
If you want MIRI to update from “both seem good, but alignment is the top priority” to your view, you should probably be arguing (or gathering evidence) against one or more of these claims:
AGI alignment is a solvable problem.
There are people working on arguing against this (e.g. Yampolskiy, Landry & Ellen), and this is definitely something I want to spend more time on (note that the writings so far could definitely do with a more accessible distillation).
Absent aligned AGI, there isn’t a known clearly-viable way for humanity to achieve a sufficiently-long reflection
I really don’t think we need to worry about this now. AGI x-risk is an emergency; we need to deal with that emergency first (e.g. kick the can down the road 10 years with a moratorium on AGI research). Then, when we can relax a little, we can have the luxury of thinking about long-term flourishing.
Humanity has never succeeded in any political task remotely as difficult as the political challenge of creating an enforced and effective 50+ year global moratorium on AGI.
I think this can definitely be argued against (and I will try to write more as/when I make a more fleshed-out post calling for a global AGI moratorium). For a start, without all the work on nuclear non-proliferation and risk reduction, we may well not be here today. Yes, there has been proliferation, but there hasn’t been an all-out nuclear exchange yet! It’s now 77 years since a nuclear weapon was used in anger. That’s a pretty big result, I think! Also, global taboos around bio topics such as human genetic engineering are well established. If such a taboo is established, enforcement becomes a lesser concern, as you are then only fighting against isolated rogue elements rather than established megacorporations. Katja Grace discusses such taboos in her post on slowing down AI.
EAs have not demonstrated the ability to succeed in political tasks that are way harder than any political task any past humans have succeeded on.
Fair point. I think we should be thinking much wider than EA here. This needs to become mainstream, and fast.
Also, I should say that I don’t think MIRI should necessarily be diverting resources to work on a moratorium. Alignment is your comparative advantage, so you should probably stick to that. What I’m saying is that you should be publicly and loudly calling for a moratorium. That would be very easy for you to do (a quick blog post/press release). But it could have a huge effect in terms of shifting the Overton Window on this. As I’ve said, it doesn’t make sense for this not to be part of any “Death with Dignity” strategy. The sensible thing when faced with ~0% survival odds is to say “FOR FUCK’S SAKE CAN WE AT LEAST TRY AND PULL THE PLUG ON HUMANS DOING AGI RESEARCH!?!”, or even “STOP BUILDING AGI YOU FUCKS!” [Sorry for the language, but I think it’s appropriate given the gravity of the situation, as assumed by talk of 100% chance of death etc.]
Even a 10-year delay is worth a huge amount (in expectation). We may well have a very different view of alignment by then (including perhaps being pretty solid on its impossibility?
Agreed on all counts! Though as someone who’s been working in this area for 10 years, I have a newfound appreciation for how little intellectual progress can easily end up happening in a 10-year period...
(Or even the seemingly very unlikely “…there’s nothing to worry about”)
I have a lot of hopes that seem possible enough to me to be worth thinking about, but this specific hope isn’t one of them. Alignment may turn out to be easier than expected, but I think we can mostly rule out “AGI is just friendly by default”.
But it could have a huge effect in terms of shifting the Overton Window on this.
In which direction?
:P
I’m joking, though I do take seriously that there are proposals that might be better signal-boosted by groups other than MIRI. But if you come up with a fuller proposal you want lots of sane people to signal-boost, do send it to MIRI so we can decide if we like it; and if we like it as a sufficiently-realistic way to lengthen timelines, I predict that we’ll be happy to signal-boost it and say as much.
As I’ve said, it doesn’t make sense for this not to be part of any “Death with Dignity” strategy. The sensible thing when faced with ~0% survival odds is to say “FOR FUCK’S SAKE CAN WE AT LEAST TRY AND PULL THE PLUG ON HUMANS DOING AGI RESEARCH!?!”, or even “STOP BUILDING AGI YOU FUCKS!” [Sorry for the language, but I think it’s appropriate given the gravity of the situation, as assumed by talk of 100% chance of death etc.]
I strongly agree and think it’s right that people… like, put some human feeling into their words, if they agree about how fucked up this situation is? (At least if they find it natural to do so.)
[1] Although I still care about my reputation in EA, to be fair (can’t really avoid this as a human).
[2] All my EA work is voluntary.