I’m curious, since it sounds like MIRI folks may have thought about this, if you have takes on how best to allocate marginal effort between pushing for cooperation-to-halt-AI-progress on the one hand, and accelerating cognitive enhancement (e.g., mind uploading) on the other?[1]
Like, I see that you list promoting cooperation as a priority, but to me, based on your footnote 3, it doesn’t seem obvious that promoting cooperation to buy ourselves time is a better strategy at the margin than simply working on mind uploading.[2] (At least, I don’t see this being obviously true for people-trying-to-reduce-AI-risk at large, and I’d be interested in your—or others’—thoughts here, in case there’s something I’m missing. It may well be clearly true for MIRI given your comparative advantages; I’m asking this question from the perspective of overall AI risk reduction strategy.) Here’s that footnote 3:
Nate and Eliezer both believe that humanity should not be attempting technical alignment at its current level of cognitive ability, and should instead pursue human cognitive enhancement (e.g., via uploading), and then having smarter (trans)humans figure out alignment.
Related recent discussion:
“Does davidad’s uploading moonshot work?”
Context: David Dalrymple (aka davidad) recently outlined a concrete plan for mind uploading by 2040.
Related prediction markets:
Eliezer’s Manifold market, “If Artificial General Intelligence has an okay outcome, what will be the reason?”
At present, the leading answer is: “Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity’s window of fragility.”
My Metaculus question, “Will mind uploading happen before AGI?”
The current community prediction is 1%.[3]
ETA: I’ve just noticed that earlier today, another Forum user posted a quick take on a similar theme, asking why there’s been no EA funding for cognitive enhancement projects. See here.
The immediate lines of reasoning I can think of for why “put all marginal effort towards pausing AI” is the best strategy right now are: (i) uploading is intractable given AGI timelines, and (ii) future, just-before-the-pause models—GPT-7, say—could help significantly with mind uploading R&D. But then, assuming that uploading is our best bet for getting alignment right, I think (ii) just shifts the discussion to questions like “where is the best place to pause, with respect to the tradeoff between getting powerful automation of uploading R&D and not pausing too late?” and “are there ways to push for differential progress in models’ capabilities (e.g., narrow superhuman ability in neuroscience research)?”
What’s more, as counters to (i): Firstly, most problems fall within a 100x tractability range. Secondly, even if cooperation+pause efforts are clearly higher impact right now than object-level uploading work, I think there’s still the argument that field-building for mind uploading should start now, rather than once the pause is in place: if field-building starts now, then with luck there’ll be a body of uploading researchers ready to make the most of a future pause. (This argument doesn’t go through if the pause lasts indefinitely, because in that case there’s time to build up the mind uploading field from scratch during the pause. But it does go through if the pause is limited or fragile, which I tentatively believe are the more likely possibilities. See also Scott Alexander’s taxonomy of AI pauses.)
Taken together, these two prediction markets arguably paint a grim picture. Namely, the trades on Eliezer’s question imply that mind uploading is the most likely way that AGI goes well for humanity, but the forecasts on my question imply that we’re very unlikely to get mind uploading before AGI.
Will—we seem to be many decades away from being able to do ‘mind uploading’ or serious levels of cognitive enhancement, but we’re probably only a few years away from extremely dangerous AI.
I don’t think that betting on mind uploading or cognitive enhancement is a winning strategy, compared to pausing, heavily regulating, and morally stigmatizing AI development.
(Yes, given a few generations of iterated embryo selection for cognitive ability, we could probably breed much smarter people within a century or two. But they’d still run a million times slower than machine intelligences. As for mind uploading, we have nowhere near the brain imaging abilities required to do whole-brain emulations of the sort envisioned by Robin Hanson.)
Agreed, but as I said earlier, acceptance seems to be the answer. We are limited, biological beings who aren’t capable of understanding everything about ourselves or the universe. We’re animals. I understand this leads to anxiety and disquiet for a lot of people. Recognizing the danger of AI and the impossibility of transhumanism and mind uploading, I think the best possible path forward is to just accept our limited state, rationally let our technology stagnate, and focus on social harmony and environmental protection as the way forward.
As for the despair this could cause to some, I’m not sure what the answer is. EA has taken a lot of its organizational structure and methods of moral encouragement from philosophies like Confucianism, religions, universities, etc. Maybe an EA-led philosophical research project into human ultimate hope (in the absence of techno-salvation) would be fruitful.
Hayven—there’s a huge, huge middle ground between reckless e/acc ASI accelerationism on the one hand, and stagnation on the other hand.
I can imagine a moratorium on further AGI research that still allows awesome progress on all kinds of wonderful technologies such as longevity, (local) space colonization, geoengineering, etc—none of which require AGI.
We can certainly research those things, but with purely human effort (no AI), progress will likely take many decades to yield even modest gains. From a longtermist perspective that’s not a problem, of course, but it’s a difficult thing to sell to someone who isn’t excited about living what is essentially a 20th-century life so that progress can arrive long after they are gone. A ban on AI should come with a cultural shift toward a much less individualistic, less present-oriented value set.
I think there is an unstated assumption here that uploading is safe. And by safe, I mean existentially safe for humanity[1]. If in addition to being uploaded, a human is uplifted to superintelligence, would they—indeed any given human in such a state—be aligned enough with humanity as a whole to not cause an existential disaster? Arguably humans right now are only relatively existentially safe because power imbalances between them are limited.
Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). “Whoops, I didn’t expect that to happen from my little physics experiment”; “Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)”.
Although safety for the individual being uploaded would be far from guaranteed either.
We could upload many minds, trying to represent some (sub)distribution of human values (EDIT: and psychological traits), and augment them all slowly, limiting power imbalances between them along the way.
Perhaps. But remember they will be smarter than us, so controlling them might not be so easy (especially if they gain access to enough computing power to speed themselves up massively). And they need not be hostile, just curious, to accidentally doom us.
Yes, this is a fair point; Holden has discussed these dangers a little in “Digital People Would Be An Even Bigger Deal”. My bottom-line belief, though, is that mind uploads are still significantly more likely to be safe than ML-derived ASI, since uploaded minds would presumably work, and act, much more like (biological) human minds. My impression is that others also hold this view, but I’d be interested if you disagree.
To be clear, I rank moratorium > mind uploads > ML-derived ASI, but I think it’s plausible that our strategy portfolio should include mind uploading R&D alongside pushing for a moratorium.
I agree that they would most likely be safer than ML-derived ASI. What I’m saying is that they still won’t be safe enough to prevent an existential catastrophe. It might buy us a bit more time (if uploads happen before ASI), but that might only be measured in years. Moratorium >> mind uploads > ML-derived ASI.
Why do you expect an existential catastrophe from augmented mind uploads?
Because of the crazy high power differential, and the propensity for accidents (can a human really not mess up on an existential scale if acting for millions of years subjectively at superhuman capability levels?). As I say in my comment above:
Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). “Whoops, I didn’t expect that to happen from my little physics experiment”; “Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)”.
This doesn’t seem like a strong enough argument to justify a high probability of existential catastrophe (if that’s what you intended?).
At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection (not that this would converge to some moral truth; I don’t think there is any).
If you think this has a low chance of success (if we could delay AGI long enough to actually do it), then alignment seems pretty hopeless to me on that view, and a temporary pause only delays the inevitable doom.
I do think we could do better (for upside-focused views) by ensuring more value pluralism and preventing particular values from dominating, e.g. by uploading and augmenting multiple minds.
At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection
They are still human though, and humans are famous for making mistakes, even the most intelligent and rational of us. It’s even regarded by many as part of what being human is—being fallible. That’s not (too much of) a problem at current power differentials, but it is when we’re talking about solar-system-rearranging powers wielded for millions of subjective years without catastrophic error...
a temporary pause only delays the inevitable doom.
Yes. The pause should be indefinite, or at least last until there is global consensus to proceed, with democratic acceptance of whatever risk remains.
Thank you for this well-sourced comment. I’m not affiliated with MIRI, so I can’t answer the questions directed to the OP. With that said, I did have a small question to ask you. What would be your issue with simply accepting human fragility and limits? Does the fact that we don’t and can’t know everything, live no more than a century, and are at risk for disease and early death mean that we should fundamentally alter our nature?
I think the best antidote to the present moment’s dangerous dance with AI isn’t mind uploading or transhumanism, but acceptance. We can accept that we are animals, that we will not live forever, and that any ultimate bliss or salvation won’t come via silicon. We can design policies that ensure these principles are always upheld.