I actually agree with a lot of this: we probably won’t intend to make them sentient at all, and it seems likely that we may do so accidentally, or that we may simply not know whether we have done so.
I’m mildly inclined to think that if ASI knows all, it can tell us when digital minds are or aren’t conscious. But it seems very plausible that we either don’t create full ASI, or that we do, but enter into a disempowerment scenario before we can rethink our choices about creating digital minds.
So yes, all that is reason to be concerned in my view. I just depart slightly from your second-to-last paragraph. To put a number on it, I think that this is at least half as likely as minds that are generally happy. Consciousness is a black box to me, but I think that we should as a default put more weight on a basic mechanistic theory: positive valence encourages us towards positive action, negative valence pushes us away from inaction or apathy. The fact that we don’t observe any animals that seem dominated by one or the other seems to indicate that there is some sort of optimal equilibrium for goal fulfillment; that AI goals are different in kind from evolution’s reproductive fitness goals doesn’t seem like an obviously meaningful difference to me.
Part of your argument centers on “giving” them the wrong goals. But goals necessarily mean sub-goals. Shouldn’t we expect the interior life of a digital mind to be in large part about its sub-goals, rather than just ultimate goals? And if it is something so intractable that it can’t even progress, wouldn’t it just stop outputting? Maybe there is suffering in that; but surely not unending suffering?