I now regret rehashing a bunch of these arguments which I think are mostly made better here.
It’s fine if you don’t want to continue this discussion. I can sympathize if you find it tedious. That said, I don’t really see why you’d appeal to that post in this context (FWIW, I read the post at the time it came out, and just re-read it). I interpret Paul Christiano to mainly be making arguments in the direction of “unaligned AIs might be morally valuable, even if we’d prefer aligned AI” which is what I thought I was broadly arguing for, in contradistinction to your position. I thought you were saying something closer to the opposite of what Paul was arguing for (although you also made several separate points, and I don’t mean to oversimplify your position).
(But I agree with the quoted part of his post that we shouldn’t be happy with AIs doing “whatever they choose to do”. I don’t think I’m perfectly happy with unaligned AI. I’d prefer we try to align AIs, just as Paul Christiano says too.)
Huh, no, I almost entirely agree with this post, as I noted in my prior comment. I cited this much earlier: “More generally, I think I basically endorse the views here (which discusses the questions of when you should cede power etc.).”
I do think unaligned AI would be morally valuable. (I said in an earlier comment that unaligned AIs which take over might capture 10–30% of the value. That’s a lot of value.)
I don’t think I’m perfectly happy with unaligned AI. I’d prefer we try to align AIs, just as Paul Christiano says too.
I think we’ve probably been talking past each other. I thought the whole argument here was “how much value do we lose if (presumably misaligned) AI takes over”, and that you were arguing “not much, caring about this seems like overly fixating on humanity” while I was arguing “(presumably misaligned) AIs which take over probably result in substantially less value”. This now seems incorrect, and we perhaps only have minor quantitative disagreements?
I think it probably would have helped if you were more quantitative here. Exactly how much of the value?
I thought the whole argument here was “how much value do we lose if (presumably misaligned) AI takes over”
I think the key question here is: compared to what? My position is that we lose a lot of potential value both from delaying AI and from having unaligned AI, but it’s not a crazy high reduction in either case. In other words they’re pretty comparable in terms of lost value.
Ranking the options in rough order (taking up your offer to be quantitative):
Aligned AIs built tomorrow: 100% of the value from my perspective
Aligned AIs built in 100 years: 50% of the value
Unaligned AIs built tomorrow: 15% of the value
Unaligned AIs built in 100 years: 25% of the value
Note that I haven’t thought about these exact numbers much.
What drives this huge drop? A naive utility calculation would put it very close to 100%. (Do you mean “aligned AIs built in 100 years if humanity still exists by that point”, which includes extinction risk before 2123?)
I attempted to explain the basic intuitions behind my judgment in this thread. Unfortunately, it seems I did a poor job. For the full explanation you’ll have to wait until I write a post, if I ever get around to doing that.
The simple, short, and imprecise explanation is: I don’t really value humanity as a species as much as I value the people who currently exist, (something like) our current communities and relationships, our present values, and the existence of sentient and sapient life living positive experiences. Much of this will go away after 100 years.