A couple thoughts!
I think this first point is fair, but I don’t think this is the trade-off. The cost of extinction includes all future people, which itself includes everyone who will be turned into a digital mind and their offspring, as well as all digital minds that people create directly. This, presumably, could also number in the quadrillions.
You might be right insofar as humans transforming into digital minds, or our ability to create digital minds at the same scale as AI systems, will come a lot later than AI being able to generate this immense amount of value. In that case, the disvalue is the foregone number of digital minds we could have created whilst waiting to transform or to create them ourselves. But I also think that the longer the timescale of the universe, the more implausible this looks. This is for a number of reasons, but not least because the more value AI will be responsible for creating in the universe, the more leveraged our ability to shape the course of AI becomes.
This is true even if the only thing that we can change is whether AI wipes humans out, as the last thing we’d want is trillions of digital minds wiping out alien species if the counterfactual is the same number of self-propagating digital minds + many alien species. In turn, the biggest confounder is whether we could eradicate this incentive in the first place—precisely what AI alignment seeks to do.
Definitely see (iii) being potentially true! But of course, if (i) or (ii) is false, then it’s not hugely important, as we’ll be able to generate large amounts of value ourselves. This would be the case, for example, if we eventually become digital minds. And once we ourselves can simply create digital minds at the same pace as AI, the two scenarios are equivalent anyway.
Even if (i) and (iii) are both somewhat true, I think it’s unlikely they hold to the degree that there’s greater disvalue from humans attempting to generate this value ourselves, given that we’ll still be using AI extensively in either scenario, and given my earlier point about our ability to shape AI’s long-run value. Once again, the bigger confounder here is whether AI is actually likely to be an existential threat and what we can do about it.
Of course, there’s the possibility that all three assumptions are true. But the question that naturally follows is the extent to which they are necessarily true, and at the bare minimum I’m really sceptical that persuading humans not to hinder AI’s positive generation of value has lower expected value than allowing AI to wipe humans out. Once again, the more important consideration is whether this is even a problem, since it only arises if AI can only generate value in ways that look really bad to humans.
In all, there are definitely valid concerns here! But I strongly suspect that a lot of this turns on AI alignment progress, and my guess is that there’s a lot more potential value to be captured in a world where human extinction doesn’t take place. So I personally don’t find it hugely plausible that we should assign much credence to human extinction being good for the universe. But very interested in your thoughts here!