This seems like an isolated demand for rigor to me. I think it’s fine to say something is “no evidence” when, speaking pedantically, it’s only a negligible amount of evidence.
I think that’s fair, but I’m still admittedly annoyed at this usage of language. I don’t think it’s an isolated demand for rigor because I have personally criticized many other similar uses of “no evidence” in the past.
I think future AIs will be much more aligned than humans, because we will have dramatically more control over them than over humans.
That’s plausible to me, but I’m perhaps not as optimistic as you are. I think AIs might easily end up becoming roughly as misaligned with humans as humans are to each other, at least eventually.
We did not intend to deny that some AIs will be well-described as having goals.
If you agree that AIs will intuitively have goals that they robustly pursue, I guess I’m just not sure why you thought it was important to rebut goal realism? You wrote,
The goal realist perspective relies on a trick of language. By pointing to a thing inside an AI system and calling it an “objective”, it invites the reader to project a generalized notion of “wanting” onto the system’s imagined internal ponderings, thereby making notions such as scheming seem more plausible.
But I think even on a reductionist view, it can make sense to talk about AIs “wanting” things, just like it makes sense to talk about humans wanting things. I’m not sure why you think this distinction makes much of a difference.
The goal realism section was an argument in the alternative. If you just agree with us that the indifference principle is invalid, then the counting argument fails, and it doesn’t matter what you think about goal realism.
If you think that some form of indifference reasoning still works, in a way that saves the counting argument for scheming, the most plausible view on which that's true is goal realism combined with Huemer's restricted indifference principle. We attack goal realism to try to close off that line of reasoning.