This doesn't seem right to me because I think it's popular among those concerned with the longer term future to expect it to be populated with emulated humans, which clearly isn't a continuation of the genetic legacy of humans, so I feel pretty confident that it's something else about humanity that people want to preserve against AI. (I'm not here to defend this particular vision of the future beyond noting that people like Holden Karnofsky have written about it, so it's not exactly niche.)
You say that expecting AI to have worse goals than humans would require studying things like what the empirically observed goals of AI systems turn out to be, and similar. Sure, so in the absence of having done those studies, we should delay our replacement until they can be done. And doing these studies is undermined by the fact that right now the state of our knowledge on how to reliably determine what an AI is thinking is pretty bad, and it will only get worse as AIs develop their abilities to strategise and lie. Solving these problems would be a major piece of what people are looking for in alignment research, and precisely the kind of thing it seems worth delaying AI progress for.
This doesn't seem right to me because I think it's popular among those concerned with the longer term future to expect it to be populated with emulated humans, which clearly isn't a continuation of the genetic legacy of humans, so I feel pretty confident that it's something else about humanity that people want to preserve against AI.
Your point that people may not necessarily care about humanity's genetic legacy in itself is reasonable. However, if people value simulated humans but not generic AIs, the key distinction they are making still seems to be based on species identity rather than on a principle that a utilitarian, looking at things impartially, would recognize as morally significant.
In this context, "species" wouldn't be defined strictly in terms of genetic inheritance. Instead, it would encompass a slightly broader concept: one that includes both genetic heritage and the faithful functional replication of biologically evolved beings within a digital medium. Nonetheless, the core element of my thesis remains intact: this preference appears rooted in non-utilitarian considerations.
You say that expecting AI to have worse goals than humans would require studying things like what the empirically observed goals of AI systems turn out to be, and similar. Sure, so in the absence of having done those studies, we should delay our replacement until they can be done.
Right now, we lack significant empirical evidence to determine whether AI civilization will ultimately generate more or less value than human civilization from a utilitarian point of view. Since we cannot say which is the case, there is no clear reason to default to delaying AI development over accelerating it. If AIs turn out to generate more moral value, then delaying AI would mean we are actively making a mistake: we would be pushing the future toward a suboptimal state from a utilitarian perspective, by entrenching the human species.
This is because, by assumption, the main effect of delaying AI is to increase the probability that AIs will be aligned with human interests, which is not equivalent to maximizing utilitarian moral value. Conversely, if AIs end up generating less moral value, as many effective altruists currently believe, then delaying AI would indeed be the right call. But since we don't know which scenario is true, we should acknowledge our uncertainty rather than assume that delaying AI is the obvious default course of action.
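To make the symmetry explicit, here is a toy expected-value sketch; the variables and the roughly even prior are assumptions of mine, purely for illustration. Let p be the probability that an AI civilization would generate more utilitarian value than a human-directed one, let q be the amount by which delay raises the chance of a human-aligned outcome, and let G and L be, respectively, the value gained if delay prevents a worse-than-human AI future and the value forgone if it prevents a better-than-human one. Then the expected utilitarian effect of delay is roughly

E[delay] ≈ q · [(1 − p) · G − p · L]

With p near 1/2 and no evidence that G is much larger than L (or vice versa), this expression has no determinate sign, which is exactly the sense in which neither delay nor acceleration is the natural default.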
Given this uncertainty, the rational approach is to suspend judgment rather than confidently assert that slowing down AI is beneficial. Yet I perceive many EAs as taking the confident approach, acting as if delaying AI is clearly the right decision from a longtermist utilitarian perspective, despite the lack of solid evidence.
Additionally, delaying AI would likely impose significant costs on currently existing humans by slowing technological development, which in my view shifts the default consideration in the opposite direction from what you suggest. This becomes especially relevant for those who do not adhere strictly to total utilitarian longtermism but instead care, at least to some degree, about the well-being of people alive today.