Thank you for writing this. I broadly agree with the perspective and find it frustrating how often it’s dismissed based on (what seem to me) somewhat-shaky assumptions.
A few thoughts, mainly on the section on total utilitarianism:
1. Regarding why people tend to assume unaligned AIs won’t innately have any value, or won’t be conscious: my impression is that this stems largely from the “intelligence as optimisation process” model that Eliezer advanced. On this model, the key ability that makes humans so successful is our ability to optimise for goals, whereas the mind features we like, such as consciousness, joy, curiosity, and friendship, are seen as lying largely outside this optimisation ability; they are instead the terminal values we optimise for. (Also, none of the technology we have built so far has really affected this core optimisation ability, so once we do finally build an artificial optimiser it could very well quickly become much more powerful than us, since, unlike us, it might be able to improve its own optimisation ability.)
I think people who buy this model won’t be moved much by observations like consciousness having evolved multiple times, as they’d think: sure, but why should I expect consciousness to be part of the optimisation-process bit of our minds, specifically? Ditto for other mind features, and for predictions that AIs will be far more varied than humans: on this model there just isn’t much scope for variety or detail in the process of doing optimisation. You use the phrase “AI civilisation” a few times; my sense is that most people who expect disaster from unaligned AI would say their vision of that outcome is not well described as a “civilisation” at all.
2. I agree that if the above model is wrong (as I expect it is), and AIs really will be conscious, varied, and form a civilisation rather than a unified unconscious optimiser, then there is some reason to think their consumption will amount to something like “conscious preference satisfaction”, since a big split between how they function when producing versus consuming seems unlikely (even though it’s logically possible).
I’m a bit surprised, though, by your focus (as you’ve elaborated in the comments) on consumption rather than production. For one thing, I’d expect production to account for a far greater fraction of AIs’ experience-time than consumption, on the basis that production enables more subsequent production (or consumption), whereas consumption doesn’t; it just burns resources.
Also, you mentioned concerns about factory farms and wild animal suffering. These seem to me describable as “experiences during production”; do you not have similar concerns about AIs’ productive activities? Admittedly, pain might not be very useful for AIs: plausibly, if you’re smart enough to see how different actions affect your survival, you don’t need such a crude motivator, and even humans trying very hard to achieve goals seem mostly to avoid pain while doing so rather than using it to motivate themselves. But emotions like fear and stress seem plausibly useful for smart minds, and I wouldn’t be surprised if they were common in an AI civilisation in a world where the “intelligence as optimisation process” model is not true. Do you disagree, or do you just think AIs won’t spend much time producing relative to consuming, or something else?
(To be clear, I agree this second concern has very little to do with what’s usually termed “AI alignment”, but it’s the concern about an AI future that I find most compelling, and I’m curious to hear your thoughts on it in the context of the total utilitarian perspective.)
> Thank you for writing this. I broadly agree with the perspective and find it frustrating how often it’s dismissed based on (what seem to me) somewhat-shaky assumptions.
Thanks. I agree with what you say about effective altruists dismissing this perspective based on what seem to be shaky assumptions. To be a bit blunt, I generally find that, while effective altruists are often open to many types of criticism, the community is still fairly reluctant to engage deeply with ideas that challenge its foundational assumptions. This is one of those ideas.
But I’m happy to see this post is receiving net-positive upvotes, despite the disagreement. :)