Preferences just are what they are, and existing humans clearly have a strong and overwhelming-majority preference for humanity to continue to exist in the future. [...] So the extinction of humanity is bad because we don’t want humanity to go extinct.
This argument appears very similar to the one I addressed in the essay about how delaying or accelerating AI will impact the well-being of currently existing humans. My claim is not that it isn’t bad if humanity goes extinct; I am certainly not saying that it would be good if everyone died. Rather, my claim is that, if your reason for caring about human extinction arises from a concern for the preferences of the existing generation of humans, then you should likely push for accelerating AI so long as the probability of human extinction from AI is fairly low.
I’ll quote the full argument below:
Of course, one can still think—as I do—that human extinction would be a terrible outcome for the people who are alive when it occurs. Even if the AIs that replace us are just as morally valuable as we are from an impartial moral perspective, it would still be a moral disaster for all currently existing humans to die. However, if we accept this perspective, then we must also acknowledge that, from the standpoint of people living today, there appear to be compelling reasons to accelerate AI development rather than delay it for safety reasons.
The reasoning is straightforward: if AI becomes advanced enough to pose an existential threat to humanity, then it would almost certainly also be powerful enough to enable massive technological progress—potentially revolutionizing medicine, biotechnology, and other fields in ways that could drastically improve and extend human lives. For example, advanced AI could help develop cures for aging, eliminate extreme suffering, and significantly enhance human health through medical and biological interventions. These advancements could allow many people who are alive today to live much longer, healthier, and more fulfilling lives.
As economist Chad Jones has pointed out, delaying AI development means that the current generation of humans risks missing out on these transformative benefits. If AI is delayed for years or decades, a large fraction of people alive today—including those advocating for AI safety—would not live long enough to experience these life-extending technologies. This leads to a strong argument for accelerating AI, at least from the perspective of present-day individuals, unless one is either unusually risk-averse or has very high confidence (such as above 50%) that AI will lead to human extinction.
To be clear, if someone genuinely believes there is a high probability that AI will wipe out humanity, then I agree that delaying AI would seem rational, since the high risk of personal death would outweigh the small possibility of a dramatically improved life. But for those who see AI extinction risk as relatively low (such as below 15%), accelerating AI development appears to be the more pragmatic personal choice.
Thus, while human extinction would undoubtedly be a disastrous event, the idea that even a small risk of extinction from AI justifies delaying its development—even if that delay results in large numbers of currently existing humans dying from preventable causes—is not supported by straightforward utilitarian reasoning. The key question here is what extinction actually entails. If human extinction means the total disappearance of all complex life and the permanent loss of all future value, then mitigating even a small risk of such an event might seem overwhelmingly important. However, if the outcome of human extinction is simply that AIs replace humans—while still continuing civilization and potentially generating vast amounts of moral value—then the reasoning behind delaying AI development changes fundamentally.
In this case, the clearest and most direct tradeoff is not about preventing “astronomical waste” in the classic sense (i.e., preserving the potential for future civilizations) but rather about whether the risk of AI takeover is acceptable to the current generation of humans. In other words, is it justifiable to impose costs on presently living people—including delaying potentially life-saving medical advancements—just to reduce a relatively small probability that humanity might be forcibly replaced by AI? This question is distinct from the broader existential risk arguments that typically focus on preserving all future potential value, and it suggests that delaying AI is not obviously justified by utilitarian logic alone.
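To make the threshold reasoning in the quoted passage concrete, here is a minimal sketch of the kind of personal expected-value comparison it gestures at. Everything in it is an assumption introduced for illustration: the utility numbers, and the simplification that delaying AI means a present-day person forgoes the transformative benefits entirely. None of these figures come from the essay or from Chad Jones's work.

```python
# A toy expected-value comparison for a single currently living person.
# All utility numbers below are made-up illustrations, not estimates from
# the essay or from Chad Jones's work.

def expected_values(p_extinction: float,
                    u_dead: float = 0.0,          # assumed utility if AI causes one's death
                    u_transformed: float = 10.0,  # assumed utility of a radically longer, healthier life
                    u_status_quo: float = 1.0):   # assumed utility of a normal lifespan without advanced AI
    """Return (EV of accelerating AI, EV of delaying AI) under these assumptions."""
    ev_accelerate = p_extinction * u_dead + (1 - p_extinction) * u_transformed
    ev_delay = u_status_quo  # simplification: delay means this person misses the benefits entirely
    return ev_accelerate, ev_delay

for p in (0.05, 0.15, 0.50, 0.95):
    acc, delay = expected_values(p)
    print(f"p(extinction) = {p:.2f}: accelerate = {acc:.2f}, delay = {delay:.2f}")
```

With these particular made-up numbers, delaying only wins once the extinction probability climbs above 90%; different assumed utilities or a risk-averse weighting would shift that crossover considerably, which is consistent with the quoted passage treating its 50% and 15% figures as rough illustrations rather than derived cutoffs.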
Are humans coherent agents with at least one non-satiable component in their values? If so, then I don't understand the distinction you're drawing that would justify positing that AI values will be worse than human values from a utilitarian perspective.
If not, then I'm additionally unclear on why you believe AIs will be unlike humans in this respect, to the extent that they would become "paperclippers." That term itself seems ambiguous to me (do you mean that AIs will literally assign terminal value to accumulating certain configurations of matter?). I would really appreciate a clearer explanation of your argument. As it stands, I don't fully understand what point you're trying to make.