The Swedish utilitarian philosopher Torbjörn Tännsjö has argued for that view (in Swedish; paywall). “We should embrace a future where the universe is populated by blissful robots”. I’m sure others have as well.
I like Bostrom and Shulman’s compromise proposal (below) – turn 99.99% of the reachable resources in the universe into hedonium, while leaving 0.01% for (post-)humanity to play with.
https://nickbostrom.com/papers/digital-minds.pdf
Another resource: https://ai-alignment.com/sympathizing-with-ai-e11a4bf5ef6e
Human extinction due to AI = human self-destruction, assuming we are talking about an AI initially created by humans (not from some alien source).
If the assumption above is correct, then human self-destruction at the hands of an AI is better than most forms of human self-destruction (imo), since at least the knowledge we generated during our short time here might be carried forward by the AI we created.
A couple thoughts!
I think this first point is fair, but I don’t think this is the trade-off. The cost of extinction includes all future people, which includes everyone who will be turned into digital minds and their offspring, as well as all digital minds created by people directly. This, presumably, could also number in the quadrillions.
You might be right insofar as humans transforming into digital lives, or gaining the ability to create digital minds at the same scale as AI systems, will come much later than AI’s ability to generate this immense amount of value. In turn, the disvalue is the number of digital minds we forgo while waiting to transform or to create them ourselves. But I also think that the longer the timescale of the universe, the more implausible this looks. This is for a number of reasons, but not least: the more value AI will be responsible for creating in the universe, the more leveraged our ability to shape AI’s course becomes.
This is true even if the only thing that we can change is whether AI wipes humans out, as the last thing we’d want is trillions of digital minds wiping out alien species if the counterfactual is the same number of self-propagating digital minds + many alien species. In turn, the biggest confounder is whether we could eradicate this incentive in the first place—precisely what AI alignment seeks to do.
Definitely see (iii) being potentially true! But of course, if (i) or (ii) are false, then it’s not hugely important, as we’ll be able to generate large amounts of value ourselves. This would be the case, for example, if we eventually become digital minds. And at the point where we ourselves can simply create digital minds at the same pace as AI, the two scenarios are equivalent.
Even if (i) and (iii) are both somewhat true, I think it’s unlikely they hold to the degree that humans attempting to generate this value ourselves produces greater disvalue, given that we’ll still be using AI extensively in either scenario, and given my aforementioned point about our ability to shape AI’s long-run value. Once again, the bigger confounder here is whether AI is actually likely to be an existential threat and what we can do about it.
Of course, there’s the possibility that all three assumptions are true. But then the question that naturally follows is the extent to which they are necessarily true, and at a bare minimum I’m really sceptical of the idea that persuading humans not to hinder AI’s positive generation of value has lower expected value than allowing AI to wipe humans out. Once again, the more important consideration is whether this is even a problem, i.e. whether AI can only generate value in ways that look really bad to humans.
In all, there are definitely valid concerns here! But I strongly suspect a lot of this turns on AI alignment progress, and my guess is that there’s a lot more potential value to be captured in a world where human extinction doesn’t take place, such that I personally don’t see it as plausible that we should assign much credence to human extinction being good for the universe. But very interested in your thoughts here!
Presumably only in the event that realising all the value AI could achieve (and humans can’t) necessitates, or is at least greatly contingent on, human extinction?
I think quite a few people are pretty keen on the idea that, for example, we only ever reach immense amounts of value by becoming digital minds. In part, this is precisely because a lot of the obvious reasons why AI might be able to generate far more value than humans look like they also apply to digital minds (e.g. being able to travel to other galaxies).
But in turn, unless we think (i) this is sufficiently improbable; (ii) there will be no other way to generate equivalent amounts of value; and (iii) humans will be an obstacle to AI doing so itself, then I’m not too sure there’s a strong case here? As long as any of these assumptions is false, it looks like value(humans) + value(AI) > value(AI), hence the focus on AI alignment. But happy to be shown otherwise!
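The inequality above can be sketched as a toy model. All numbers are purely illustrative placeholders (not estimates of anything); the point is only that coexistence beats extinction whenever the value humans add exceeds whatever drag they impose on AI, i.e. whenever assumption (iii) is false or weak:

```python
# Toy expected-value comparison for the coexistence-vs-extinction argument.
# All magnitudes below are hypothetical placeholders for illustration only.

V_AI = 1_000_000      # value AI alone could generate (hypothetical units)
V_HUMANS = 1_000      # additional value humans/digital-mind descendants add
obstacle_penalty = 0  # value AI loses because humans hinder it; 0 = (iii) false

# value(humans) + value(AI), minus any hindrance humans impose on AI
coexistence = V_AI + V_HUMANS - obstacle_penalty
# value(AI) alone, the extinction scenario
extinction = V_AI

print(coexistence > extinction)  # True whenever V_HUMANS > obstacle_penalty
```

On these placeholder numbers the comparison prints `True`; the argument in the thread is essentially about whether `obstacle_penalty` could realistically ever exceed `V_HUMANS`.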