If you’re interested in diving into “how bad/good is it to cede the universe to AIs”, I strongly think it’s worth reading and responding to “When is unaligned AI morally valuable?”, which is the current state of the art on the topic (same thing I linked above). I now regret rehashing a bunch of these arguments which I think are mostly made better here. In particular, I think the case for “AIs created in the default way might have low moral value” is reasonably well argued for here:
Many people have a strong intuition that we should be happy for our AI descendants, whatever they choose to do. They grant the possibility of pathological preferences like paperclip-maximization, and agree that turning over the universe to a paperclip-maximizer would be a problem, but don’t believe it’s realistic for an AI to have such uninteresting preferences.
I disagree. I think this intuition comes from analogizing AI to the children we raise, but that it would be just as accurate to compare AI to the corporations we create. Optimists imagine our automated children spreading throughout the universe and doing their weird-AI-analog of art; but it’s just as realistic to imagine automated PepsiCo spreading throughout the universe and doing its weird-AI-analog of maximizing profit.
It might be the case that PepsiCo maximizing profit (or some inscrutable lost-purpose analog of profit) is intrinsically morally valuable. But it’s certainly not obvious.
Or it might be the case that we would never produce an AI like a corporation in order to do useful work. But looking at the world around us today that’s certainly not obvious.
Neither of those analogies is remotely accurate. Whether we should be happy about AI “flourishing” is a really complicated question about AI and about morality, and we can’t resolve it with a one-line political slogan or crude analogy.
I now regret rehashing a bunch of these arguments which I think are mostly made better here.
It’s fine if you don’t want to continue this discussion. I can sympathize if you find it tedious. That said, I don’t really see why you’d appeal to that post in this context (FWIW, I read the post at the time it came out, and just re-read it). I interpret Paul Christiano to mainly be making arguments in the direction of “unaligned AIs might be morally valuable, even if we’d prefer aligned AI”, which is what I thought I was broadly arguing for, in contradistinction to your position. I thought you were saying something closer to the opposite of what Paul was arguing for (although you also made several separate points, and I don’t mean to oversimplify your position).
(But I agree with the quoted part of his post that we shouldn’t be happy with AIs doing “whatever they choose to do”. I don’t think I’m perfectly happy with unaligned AI. I’d prefer we try to align AIs, just as Paul Christiano says too.)
Huh, no, I almost entirely agree with this post, as I noted in my prior comment. I cited this much earlier: “More generally, I think I basically endorse the views here (which discusses the questions of when you should cede power etc.).”
I do think unaligned AI would be morally valuable. (I said in an earlier comment that unaligned AIs which take over might capture 10-30% of the value. That’s a lot of value.)
I don’t think I’m perfectly happy with unaligned AI. I’d prefer we try to align AIs, just as Paul Christiano says too.
I think we’ve probably been talking past each other. I thought the whole argument here was “how much value do we lose if (presumably misaligned) AI takes over”, and that you were arguing “not much; caring about this seems like overly fixating on humanity” while I was arguing “(presumably misaligned) AIs which take over probably result in substantially less value”. That now seems incorrect, and perhaps we only have minor quantitative disagreements?
I think it probably would have helped if you were more quantitative here. Exactly how much of the value?
I thought the whole argument here was “how much value do we lose if (presumably misaligned) AI takes over”
I think the key question here is: compared to what? My position is that we lose a lot of potential value both from delaying AI and from having unaligned AI, but it’s not a crazy high reduction in either case. In other words, they’re pretty comparable in terms of lost value.
Ranking the options in rough order (taking up your offer to be quantitative):
Aligned AIs built tomorrow: 100% of the value from my perspective
Aligned AIs built in 100 years: 50% of the value
Unaligned AIs built tomorrow: 15% of the value
Unaligned AIs built in 100 years: 25% of the value
Note that I haven’t thought about these exact numbers much.
What drives this huge drop? On a naive utilitarian accounting it would be very close to 100%. (Do you mean “aligned AIs built in 100 years, if humanity still exists by that point”, i.e. a figure that includes extinction risk before 2123?)
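To make the arithmetic behind my question explicit, here is the decomposition I have in mind (this is my reading of your 50%, not something you’ve stated):

E[value of “aligned AIs built in 100 years”] = P(humanity survives to 2123) × V(aligned AIs in 2123, given survival)

On a naive utilitarian view the second factor should be close to 100%, so a 50% total would have to come mostly from the first factor, i.e. from extinction risk before then.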
I attempted to explain the basic intuitions behind my judgement in this thread. Unfortunately it seems I did a poor job. For the full explanation you’ll have to wait until I write a post, if I ever get around to doing that.
The simple, short, and imprecise explanation is: I don’t really value humanity as a species as much as I value the people who currently exist, (something like) our current communities and relationships, our present values, and the existence of sentient and sapient life having positive experiences. Much of this will be gone after 100 years.