I think I was too terse; let me explain my model a bit more.
I think there’s a decent chance (OTTMH, let’s say 10%) that without any deliberate effort we make an AI which wipes out humanity, but is nonetheless more ethically valuable than us (although not more so than something we deliberately design to be ethically valuable). This would happen if, e.g., it’s simply the default outcome (say, because it turns out that intelligence ~ ethical value). This may actually be the most likely path to victory.**
There’s also some chance that all we need to do to ensure that AI has (some) ethical value (e.g. due to having qualia) is X. In that case, we might increase our chance of doing X by understanding qualia a bit better.
Finally, my point was that I can easily imagine a scenario in which our alternatives are:
1. Build an AI with a 50% chance of being aligned and a 50% chance of just being an unaligned AI (where P(the AI has property X, i.e. some ethical value) = 90% if we understand qualia better, and 10% otherwise).
2. Allow our competitors to build an AI with ~0% chance of being ethically valuable.
So then we obviously prefer option 1, and if we understand qualia better, option 1 becomes better still.
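To make this concrete, here’s a minimal back-of-the-envelope sketch in Python. The variable names and payoff scores are just illustrative assumptions (score an aligned AI at 1.0, an unaligned AI with property X at 0.5, anything else at 0.0); only the 50/50 split, the 90%/10% figures, and the ~0% for option 2 come from the scenario above.

```python
# Back-of-the-envelope expected-value comparison of the two options above.
# The payoff scores are illustrative assumptions, not claims from the
# discussion: aligned AI = 1.0, unaligned AI with property X = 0.5, else 0.0.

V_ALIGNED = 1.0      # assumed value of an aligned AI
V_UNALIGNED_X = 0.5  # assumed value of an unaligned AI that still has property X
V_NONE = 0.0         # assumed value of an AI with no ethical value


def option_1(p_x: float) -> float:
    """Build the AI ourselves: 50% aligned, 50% unaligned, and the
    unaligned branch has property X with probability p_x."""
    return 0.5 * V_ALIGNED + 0.5 * (p_x * V_UNALIGNED_X + (1 - p_x) * V_NONE)


def option_2() -> float:
    """Let competitors build an AI with ~0% chance of being ethically valuable."""
    return 0.0


print(option_1(p_x=0.10))  # 0.525 -- without a better understanding of qualia
print(option_1(p_x=0.90))  # 0.725 -- with a better understanding of qualia
print(option_2())          # 0.0
```

Under any positive score for property X, option 1 beats option 2, and moving P(property X) from 10% to 90% strictly improves option 1; that’s the whole argument in miniature.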
** I notice as I type this that this may have some strange consequences re: high-level strategy; e.g. maybe it’s better to just make something intelligent ASAP and hope that it has ethical value, because this reduces X-risk, and we might not be able to do much to change the distribution of ethical value produced by the AI we create anyhow. I tend to think that we should aim to be very confident that the AI we build is going to have lots of ethical value, but this may only make sense if we have a pretty good chance of succeeding.
Ah, that makes a lot more sense, sorry for misinterpreting you. (I think Toby has a view closer to the one I was responding to, though I suspect I am also oversimplifying his view.)
I agree that there are important philosophical questions that bear on the goodness of building various kinds of (unaligned) AI, and I think that those questions do have an impact on what we ought to do. The biggest prize is if it turns out that some kinds of unaligned AI are much better than others, which I think is plausible. I guess we probably have similar views on these issues, modulo me being more optimistic about the prospects for aligned AI.
I don’t think that an understanding of qualia is an important input into this issue though.
For example, from a long-run ethical perspective, whether or not humans have qualia is not especially important, and what mostly matters is human preferences (since those are what shape the future). If you created a race of p-zombies that nevertheless shared our preferences about qualia, I think it would be fine. And “the character of human preferences” is a very different kind of object than qualia. These questions are related in various ways (e.g. our beliefs about qualia are related to our qualia and to philosophical arguments about consciousness), but after thinking about that a little bit I think it is unlikely that the interaction is very important.
To summarize, I do agree that there are time-sensitive ethical questions about the moral value of creating unaligned AI. This was item 1.2 in this list from 4 years ago. I could imagine concluding that the nature of qualia is an important input into this question, but don’t currently believe that.