To clarify: I don’t think it will be especially fruitful to try to ensure AIs are conscious, for the reason you mention. Multipolar scenarios don’t really work that way: what will happen is determined by what’s efficient in a competitive world, which doesn’t leave much room for changes made now to actually persist.
And yes, if a singleton is inevitable, then our only hope for a good future is to do our best to align the singleton, so that it uses its uncontested power to do good things rather than just to pursue whatever nonsense goal it will have been given otherwise.
What I’m concerned about is the possibility that a singleton is not inevitable (which seems to me the most likely scenario) but that folks attempt to create one anyway. This includes realities where a singleton is impossible or close to it, as well as where a singleton is possible but only with some effort made to push towards that outcome. An example of the latter would just be a soft takeoff coupled with an attempt at forming a world government to control the AI—such a scenario certainly seems to me like it could fit the “possible but not inevitable” description.
A world takeover attempt has the potential to go very, very wrong, and there’s also the serious possibility that the creation of the singleton would succeed but its alignment would not. Given this, I don’t think it makes sense to push unequivocally for this option, with the enormous risks it entails, until we have a good idea of what the alternative looks like. That we can’t control that alternative is irrelevant: we can still understand it! Once we have a reasonable picture of that scenario, we can start to think about whether it’s so bad that we should embark on risky strategies to try to avoid it.
One element of that understanding would be how likely AIs are to be conscious; another would be how good or bad a life conscious AIs would have in a multipolar scenario. I agree entirely that we don’t know this yet, whether for rabbits or for future AIs; that’s part of what I’d need to understand before I’d agree that a singleton seems like our best chance at a good future.
Thank you for writing this. I broadly agree with the perspective and find it frustrating how often it’s dismissed based on (what seem to me) somewhat-shaky assumptions.
A few thoughts, mainly on the section on total utilitarianism:
1. Regarding why people tend to assume unaligned AIs won’t innately have any value, or won’t be conscious: my impression is that this is largely due to the “intelligence as optimisation process” model that Eliezer advanced. In this model, the key ability that makes humans so successful is our ability to optimise for goals, whereas mind features we like, such as consciousness, joy, curiosity, and friendship, are largely seen as lying outside this optimisation ability; they are instead the terminal values we optimise for. (Also, none of the technology we have built so far has really affected this core optimisation ability, so once we do finally build an artificial optimiser it could very well quickly become much more powerful than us, since, unlike us, it might be able to improve its optimisation ability.)
I think people who buy this model will tend not to be moved much by observations like consciousness having evolved multiple times, as they’d think: sure, but why should I expect that consciousness is part of the optimisation process bit of our minds, specifically? Ditto for other mind features, and also for predictions that AIs will be far more varied than humans — there just isn’t much scope for variety or detail in the process of doing optimisation. You use the phrase “AI civilisation” a few times; my sense is that most people who expect disaster from unaligned AI would say their vision of this outcome is not well-described as a “civilisation” at all.
2. I agree with you that if the above model is wrong (which I expect it is), and AIs really will be conscious, varied, and form a civilisation rather than being a unified unconscious optimiser, then there is some reason to think their consumption will amount to something like “conscious preference satisfaction”, since a big split between how they function when producing vs consuming seems unlikely (even though it’s logically possible).
I’m a bit surprised, though, by your focus (as you’ve elaborated on in the comments) on consumption rather than production. For one thing, I’d expect production to account for a far greater fraction of AIs’ experience-time than consumption, I guess on the basis that production enables more subsequent production (or consumption), whereas consumption doesn’t; it just burns resources.
Also, you mentioned concerns about factory farms and wild animal suffering. These seem to me describable as “experiences during production”: do you not have similar concerns regarding AIs’ productive activities? Admittedly, pain might not be very useful for AIs, as plausibly, if you’re smart enough to see the effects of different actions on your survival, you don’t need such a crude motivator; even humans trying very hard to achieve goals seem mostly to avoid pain while doing so, rather than using it to motivate themselves. But emotions like fear and stress seem to me plausibly useful for smart minds, and I wouldn’t be surprised if they were common in an AI civilisation in a world where the “intelligence as optimisation process” model is not true. Do you disagree, or do you just think AIs won’t spend much time producing relative to consuming, or something else?
(To be clear, I agree this second concern has very little relation to what’s usually termed “AI alignment”, but it’s the concern about an AI future that I find most compelling, and I’m curious about your thoughts on it in the context of the total utilitarian perspective.)