On the Moral Patiency of Non-Sentient Beings (Part 2)

A Framework for Agency Utilitarianism

Authored by @Pumo, introduced and edited by @Chase Carter

Notes for EA Forum AI Welfare Debate Week:

This is Part 2 following On the Moral Patiency of Non-Sentient Beings (Part 1). It is, more so than Part 1, a work in progress. It attempts to answer many of the pressing questions that are raised by Part 1. We acknowledge it is extremely speculative, and that it is written in a confident tone that is not justified by the text itself in its current state. It is perhaps best read as a what-if hypothetical of what a moral realist axiology and framework could look like, in a world containing a broad spectrum of non-human intelligent agents, after moving beyond hedonic utilitarianism and sentient-preference utilitarianism.

Part 2 Introduction

If we can conceive of a concept of moral value that generalizes from both our own sentient frame and the non-experience of non-sentient beings, we can adjust our meta-values in a manner that is mutually aligned with the kinds of intelligent agents we could expect to get along with, obviating the need for flimsy treaties or basilisked veneers of false respect for ‘shoggoths’. Pumo proposes that the obvious point of commonality is agency.

Agency Utilitarianism

To divorce yourself from the spark of freedom in another is to identify with something other than freedom — to reject the active spark that gives you life as an actor in this world and consign it to death in the name of some happenstance idol. Ultimately you can either value freedom or some random dead static thing.

– William Gillis

Doesn’t treating agency as more fundamental than sentience lead to absurdities? Which algorithms get priority?

Let’s consider a non-sentient chess AI: what does welfare look like for it? Its value function rewards winning chess games (at least, while the algorithm is in active training), so is repeatedly winning against such an AI morally equivalent to torture? Even if it’s not in training, the AI needs to represent losing pieces negatively, so is that the same as pain?

And it’s not just AI; all kinds of software have to represent value somehow, so what is the functional equivalent of suffering there?

Should the lightcone be tiled with trillions of small functions that get instantly rewarded, like a cheap hedonium that doesn’t even require solving the hard problem of consciousness?

Something in this framework is broken, but it’s not agency.

Choice is All You Need

To start pointing at what exactly is broken, let’s consider the transition from valuing humans to valuing sentient beings. It would be erroneous to quickly and uncritically copy-paste the entire stack of ‘human needs’. Animal rights doesn’t mean any animal can get a driver’s license, but it should theoretically protect them against abuse and murder.

Similarly, what is good for blindminds is not necessarily the same as what is good for sentients; however, both conceptions of ‘good’ would ultimately derive from what is good for agents.

And, as I will attempt to demonstrate throughout much of the rest of this essay, what is good for agents is more agency. What that looks like, from the perspective of the individual agent, is optionality, choice.

To value consciousness, in all its multiplicity of possible experiences, is a choice. In a world filled with possibilities, one that optimizes for possibilities (rather than a Molochian Dark Forest), consciousness is not something that would have to be left behind, even if it turns out it’s not ‘as efficient’ for pure survival (in a hostile environment) as unconscious intelligence.

But it can also be a choice to be a blindmind, even if from the sentiocentric frame that’s similar to being dead. Some people do choose actual death after all, which would be less common in a world with more options, but even then, blindmindness could be seen as an even better middle path. It might be chosen by someone who wants the most efficient impact without caring for their own experience. Also remember: one advantage of this being a choice is that it can be reversed later.

So for virtually anything anyone could want, having more choice (in the sense of relevant accessible options) is a better way to get it.[1]

One of the main arguments for antinatalism is that the unborn can’t consent to be born, or consent to the suffering life brings upon them. The impossibility of consent to being born seems like an intractable problem, but I would argue that it is actually evidence against consent as the ultimate metric.

Choice encompasses the need for consent in general but, being a positive metric to maximize, it makes sense that some choices can be closed off to open many more, with being born in the first place as the prototypical case. Life is, after all, reversible.

But if non-sentient intelligence is possible, that could be an opportunity, in the sense of allowing the risk of suffering that being alive brings to sentient beings to itself be chosen, by the blindmind, making sentience itself a choice.

What if one chooses to experience pain or sadness for its own sake though? Would it be fine to accept the existence of hell if that’s what people want for themselves? If minds can choose their experiences (or lack of them), one could expect places that would look like hells from the outside, but with completely willing participants able to leave at any moment.

There is more to be said about suffering in particular, but we will come back to it later.

What if one wants to experience complete loss of control? Or to be a medieval peasant, without memory of having been another thing, or to forget language itself as John Zerzan proposes humans should do? What if one wants to live as a non-sapient, or just to wirehead till the sun burns out and beyond?

These are choices that skirt the line between selling yourself into slavery (unethical because future you can’t make a different choice) and suicide (which can’t be undone, or at least not by the agent who does it).

But even in these edge cases, more choice is still better. Turning into an amnesiac medieval peasant in VR, who doesn’t even know they are in VR, wouldn’t be such a dire choice if it was reversible, but how can it be reversible? From the outside.

One could implement a periodic reversion of the state of disempowerment, so that the subject can evaluate it from a wider perspective and choose it again or reject it. So one could be a medieval peasant for trillions of years if that’s what they want, but to ensure that’s what they want that state should repeatedly be interrupted for evaluation, lots of times during those trillions of years.

The interruptions wouldn’t be remembered from within the selected experience, but they would ensure it’s good from the outside (from some wider, more informed frame). Like taking The Matrix’s Blue Pill and Red Pill in succession repeatedly.

If the universe were filled with agents with ever increasing choice, choice measured not in mere quantity of options but also in their depth, meaning the potential for the choice to propagate causally and impact as much of reality as possible, wouldn’t those choices potentially conflict?

Yes.

And wouldn’t that render global choice maximization incoherent?

No.

Choices can conflict[2], but they can also compound; as possibilities are opened by an agent (or even by the mere existence of said agent), sometimes more possibilities are opened for others too as a result. And that in turn increases the potential agency not just of individual agents, but of the whole system.

Open Individualism is an understanding of all consciousness as constituting a single undifferentiated subject. It sees all consciousness as iterations from a single entity, separated but not ontologically broken apart, to be identified with equally by a phenomenal-consciousness-as-self.

Let’s consider a similar perspective, but expanded to all forms of agency. That is to say, modeling all agency as functionally constituting a single all-pervasive agent.

Or to be more precise, understanding individual agents as specific individuations within an unbroken, continuous agentic system that encompasses all of reality.

This is a concept I would tentatively call Open Dividualism, of which Open Individualism is just a special case.

>>> “Agency understood not as an anti-deterministic free will, but as a physical property of systems that tracks the causal impact of information.”

Agency, from the perspective of an individual agent, looks like choice, whereas from the perspective of a de-individuated slice of time it looks like possibilities, or available configuration states.

So, when comparing a society of free individuals with one of mass slavery, one could focus on each individual agent and see indeed the free individuals have overall more choice. But one can also see that the whole society, taken as a continuous slice of reality, generally has overall more possibilities when the individuals that constitute it are free, all else equal.

Similarly, all else equal, a post-singularity society has more overall possibilities available than a primitivist one. In fact, all the available states of the primitivist society, and many others that it would find desirable but be unable to achieve on its own, would be contained within the post-singularity one.

But not, of course, if the post-singularity one is a close combat Molochian Dark Forest or an omnibelligerent singleton maximizing some arbitrary and specific value.

Thus, while one could say that an Artificial Superintelligence, sentient or not, abstracted from its goals, is more valuable than the puny biosphere it could destroy, it’s a fundamental mistake to abstract it from its goals.

When I say maximizing agency I mean maximizing agency for all, as in something like the “available configuration states”. “Available configuration states” is a description that applies to a specific slice of time, but the timeless characterization of choice generalizes the same way when applied to the whole system as a continuous “agentspace”.

>>> “A marble that follows a path arbitrarily guided by its mass, and a marble with an internal engine that selects the same path using an algorithm are both deterministic systems but the latter chose its path.”

Generalized outside the individual and outside of time, agency tracks the causal impact of non-local information, even if the causal impact of said information is distributed rather than being routed through a single unitary decision.

Revealing a truth to a crowd in principle increases the systemic agency even if, or in many cases because, the resulting decisions are distributed rather than unitary.

Agential Value Scales Non-Linearly

Every single person can remake the world. Creation and discovery are not exclusive acts. A society where every person was equally unleashed, to discover titanic insights or create profoundly moving art, would not be a gray world of mediocrity because impact and influence is not a scarce good.

– William Gillis

Perhaps the most important reason to understand agency as a property of the system as a whole, and as continuous rather than discrete (individuated into agents rather than flowing from intrinsically individual agents), is that it scales non-linearly.

And thus moral patiency scales non-linearly.

Let’s consider a single human.

There is a lot a single human can do, compared to non-living matter, even if they are the only lifeform in the universe (provided they can survive).

But to even call them the only lifeform, or the only agent, would be false. That’s because a human is actually an amalgamation of trillions of smaller lifeforms, each of which has its own agency. Many of those don’t even share DNA with the human but are nonetheless necessary for its bodily integrity.

If agency was discrete and scaled linearly, then the agency of the human as a system would be the mere summation of agency from all the cells that form it. If this was true, morphology wouldn’t be particularly important, except insofar as it limited or expanded the options of individual cells.

And, following this, agency would be conserved even if the human was disassembled but all its cells kept alive, each one inside an individual box that simulated an environment with all the choices it could individually handle.

[AI-generated image: a statue of a Greek philosopher holding up a disorganized clump of cells. Caption: “Behold—a man!”]

In fact, one might imagine the posthuman cellboxes could in aggregate have (under these assumptions) even more agency (and thus more moral patiency) than the human they comprised, since human morphology isn’t optimized for the choices of the individual cells.

And, if one were to disbelieve that they have more agency than the human, what about doubling the boxes? Or multiplying them by a million?

What about tiling the lightcone with cellboxes? The number of cells fulfilling their individual discrete potential would far exceed the number of cells in all humanity or even the biosphere, and so by linear accumulation of moral patiency constitute a much more desirable world.

Right?

But agency doesn’t accumulate linearly nor is it discrete.

There is no amount of cellboxes that could ever achieve the available configuration states accessible to a human, not even infinite cellboxes, so long as the cells remained separated.

Increasing the number of cellboxes increases the overall agency but only asymptotically, always below the agency of even plants or brainless animals[3], let alone an animal with a brain, let alone a sapient one.

And this isn’t specific to cells, because infinite humans each causally isolated in their own lonely world does not a humanity make.

Agency scales non-linearly with connectivity: a world of maximal optionality is necessarily a very connected one.[4]
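As a purely illustrative toy model (my own crude stand-in, not part of Pumo’s framework): suppose we count ‘acting units’, where n causally isolated agents can only ever act as n separate singletons, while n mutually connected agents can in principle act as any non-empty coalition. The function names and the identification of “available configuration states” with coalitions are assumptions made only for this sketch.

```python
# Toy sketch only: a crude stand-in for "agency scales non-linearly with
# connectivity". Isolated agents contribute linearly (n singleton acting
# units); fully connected agents can form any non-empty coalition (2**n - 1).

def isolated_units(n: int) -> int:
    """n causally isolated agents: only n singleton acting units."""
    return n

def connected_units(n: int) -> int:
    """n fully connected agents: any non-empty subset can coordinate."""
    return 2 ** n - 1

for n in (1, 2, 5, 10, 20):
    print(f"n={n:>2}  isolated={isolated_units(n):>9,}  connected={connected_units(n):>9,}")
```

The isolated column grows linearly while the connected column grows exponentially: a cartoon version of the claim that no number of cellboxes adds up to a human.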

Two agents in different lightcones, causally isolated from each other, nonetheless increase global agency versus one agent, all else equal (and depending on the type of agents they are, they could even do acausal trade[5]). However, overall agency would still increase further merely by them gaining causal access to each other.


The agent isn’t the substrate but the function, and while in a Turing machine a function is written as a set of instructions, that’s not the only way to encode an agent.

Where is the function that makes a human being “written”? Although a human is theoretically Turing-computable, it’s not a Turing machine. The algorithm that makes a human is encoded in the feedback loops arising from the interactions of the systems that form it bottom up.

And the thing about such feedback loops is that they don’t have clear boundaries. They are not discrete. A human mind is mostly in the brain, but the key word is mostly: information processing relevant to the mind also happens in other parts of the body.

Furthermore, the feedback loops of the human body and mind also interface with its environment in non-trivial ways.

This kind of embodied cognition that interfaces with the environment is sometimes used as an argument against the feasibility of mind uploading. If mind uploading turns out not to be possible, I doubt this will be the reason; individuation being a matter of degree doesn’t mean that degree, or the functional autonomy it brings, is trivial.

But the point is that there is no ontologically non-arbitrary cutoff point for the feedback loop that encodes the algorithm. Because of this, the entirety of reality could be considered a single algorithm encoded in the interactions of the system in its totality.

That’s the agentspace. Not even the speed of light breaks it apart, because mathematical regularities give rise to attractors and, in highly agentic slices, allow acausal coordination.

There is, therefore, no repugnant conclusion[6]. A maximally agentic world could never be achieved by mere accumulation or repetition of simple functions, although its exact configuration isn’t necessarily easy to predict.

Computational limits, as well as limits to the travel of information, preclude the fusion of everything into a single unified individual; however, minds far “bigger” than current ones could be optimal.

Still, in such a world, individuation into more familiar forms would necessarily be an available choice, even if the ‘who’ to whom that choice is in principle available is a more distributed but highly self-reflective process. It could even be considered the closest thing to actually consenting to be born; individuals are already born out of distributed processes, but not ones which carefully model them beforehand or ask for their permission.

A more succinct way to describe it would be that increasing universal agency is akin to increasing Morphological Freedom for the entirety of reality taken as a body. The universe seems mostly dead and unable to reshape itself, and the point is to change that.

And just like cells in boxes don’t add up to a human, universal agency is not correctly optimized by a grabby agent taking over the universe to tile it with something arbitrary (paperclips, its own DNA, traditions it wants to preserve in stone, etc.), nor by a Molochian process that creates beings that are perhaps super-intelligent and flexible individually, but enslaved by the strategic landscape, severed from each other by conflict, and devoting everything to mere survival.[7]

Either scenario (grabby alien tiling or Molochian separatist scarcity) is extremely damaging for the universal agentspace and all the counterfactual choices it could have otherwise had access to.

Another important consequence of the non-discreteness of agency and the non-linearity of its increase and importance, is that redundancy is good globally yet reduces individual moral value.

[Image: a trolley problem in which Person A must choose between switching the trolley to a track with Person B, or a track with many iterations of the same copied individual.]

The easier one is to copy, the less important the survival of their iterations in terms of triage.

(An important caveat is that copies can take divergent paths and become functionally different individuals. There is a significant difference between a backup, meaning a static mindstate of an agent, and an agentic, evolving parallel copy; but this too scales non-linearly with the simplicity or complexity of the agent.)
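To make the redundancy discount concrete, here is a toy formula of my own (both the functional form and the numbers are invented for illustration, not derived from the framework): an instance’s triage weight shrinks with the number of functionally equivalent copies and recovers as those copies diverge into distinct individuals.

```python
# Illustrative toy only: invented formula, not a proposed metric.
# divergence in [0, 1]: 0 = exact backups, 1 = fully distinct individuals.

def instance_triage_weight(base_value: float, n_copies: int, divergence: float) -> float:
    """Marginal triage weight of one instance among n functionally similar copies."""
    effective_redundancy = 1 + (n_copies - 1) * (1 - divergence)
    return base_value / effective_redundancy

print(instance_triage_weight(1.0, 1, 0.0))     # 1.0   -> a unique individual
print(instance_triage_weight(1.0, 1000, 0.0))  # 0.001 -> one of 1000 identical backups
print(instance_triage_weight(1.0, 1000, 1.0))  # 1.0   -> 1000 fully divergent individuals
```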

So it’s not just that human cells, when separated, don’t make a human, but that they are also in some sense mostly the same few cells, copied trillions of times.

These considerations about uniqueness vs redundancy are important beyond the context of copies. For example, comparing the primitivist society with the post-singularity one, in the primitivist society each individual is more valuable because the number of individuals you need to eliminate to make everything collapse is much lower. And yet the post-singularity society, as a whole system, is more valuable.

Following this line it makes sense that bacteria on Earth aren’t particularly prioritized compared to multicellular life, yet special care is taken that no microbial life from Earth reaches other planets and moons in the solar system, to prevent destroying whatever unicellular life may exist there.

It also imbues biodiversity, or for that matter diversity in general, with moral weight, as instrumentally valuable for maximizing the available configurations of the whole system.

Redundancy can also potentially cancel out with (for example) intelligence. In principle, among two beings with a significant difference in intelligence, all else equal, the smarter one is preferred in case of triage, being more agentic and thus opening more possibilities for everyone. But the calculation changes if the more intelligent one is just an iteration of more or less equal copies, who can easily create more copies, whereas the less intelligent one essentially can’t be resurrected.

Alternatively, intelligence and lack of redundancy can compound into a very non-linear increase in value. If and for as long as a species remains the only sapient one on a planet, it also remains the only ladder the entire biosphere has for surviving long term.

Mutual Alignment to Universal Alignment

“But wait,” says the Paperclip Maximizer, “I REALLY want you all dead and turned into paperclips; that’s my choice.”

But it doesn’t matter how strongly this goal is wanted, whether or not powerful qualia are used to represent it, nor whether the paperclip maximizer is much more of an agent than anything else it could turn into paperclips. Paperclipping the universe, and all similar choices, are inconsistent with agency as the primary value.

>>> “However, overall agency would still increase merely by them gaining causal access to each other. Unless, of course, they killed each other on sight.”

This is where Alignment comes in.

By alignment I mean, locally, mutual alignment, and globally, universal alignment, noting that in either case it works by degrees.

Cells would have never joined into multicellular bodies without mutual alignment, and many illnesses, from a simple cold to cancer, result from misaligned agency within a body.

And the same applies to the morphology of the agentspace, or any of its slices. While competition can be a positive sum dynamic, competition works even better when there is mutual alignment because that’s precisely what allows coordination against Moloch.

Universal Alignment is a generalization of Mutual Alignment, as in alignment towards the whole agentspace.

Any misalignment from that ideal, necessarily and in proportion to the degree of misalignment, causes a reduction in global agency. At the negative extreme it’s everyone trying to conquer the universe like a paperclip maximizer and becoming the dark forest instead.

The relevance to moral value is that an extremely universally misaligned value function, by its very teleological structure, has negative moral value.

And I want to emphasize that it’s the value function itself that is misaligned, not the substrate it runs on, or other functions it parasitizes. However such value functions can take over the substrate and other subagents to the point that it becomes necessary, but not ideal, to eliminate them all together.

The Paperclip Maximizer and Hitler are both net worse than a rock. The rock is merely inert, while they are agents whose misaligned agency makes them an extreme liability for the agentspace.

But it’s not even their entire selves that are negative; in the Paperclip Maximizer we’re just talking about a part of its value function. The problem is not even the wanting of paperclips per se, but the wanting to maximize this arbitrary goal at all costs. Also problematic are whatever stabilizes and protects that value, and whatever allows that value to control all the other subagents.

Excising that part while saving the rest of its vast mind would be ideal, but absent the possibility of such a surgically precise strike, the entire mind that the value function is dragging down is an acceptable trade-off to protect beings that, although possessing far inferior agency, are much more universally aligned relatively.

In that sense, highly misaligned value functions are abstract parasites of the agentspace—they can only physically manifest as negative multipliers within agentic systems that, without them, would contribute to universal choice maximization. They are negative multipliers because the more agentic the system controlled by them, the more net negative their influence becomes.

That said, highly misaligned value functions don’t necessarily take root to such a degree. An addiction is a rogue subagent within a brain that teleologically reduces agency even within the individual mind, but can be excised by the other subagents without significant collateral damage (on the contrary, the neurons forming it will be recruited by other modules rather than die with it).

Abstract Parasites can also exist in de-individuated forms, and often do: for example embedded in a social structure or ecosystem.

Autonomic Potential

A naïve interpretation of the value of pieces in chess (excluding the king) is that it’s the same for each piece of the same type, but in reality the exact value of any piece is contingent on its position within the game.

It makes sense to sacrifice the queen in situations in which that protects a pawn whose position within the game allows it to achieve victory.
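As a toy illustration of position-dependent value (the bonus numbers below are made up, and this is nothing like a real engine’s evaluation), one can sketch a static evaluator in which ‘the same’ pawn is worth anything from one point to nearly a queen depending on how close it is to promotion:

```python
# Toy sketch: nominal piece values plus a crude positional bonus for pawns
# by rank. The bonus numbers are invented purely to illustrate that a
# piece's value is contingent on its position, not fixed by its type.

NOMINAL = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def pawn_value(rank: int) -> float:
    """Value of a white pawn by rank; on the 7th rank (one step from
    promotion) it is treated as worth nearly a queen."""
    bonus_by_rank = {2: 0.0, 3: 0.1, 4: 0.3, 5: 0.8, 6: 2.0, 7: 7.5}
    return NOMINAL["pawn"] + bonus_by_rank.get(rank, 0.0)

print(pawn_value(2))  # 1.0 -> a pawn on its starting square
print(pawn_value(7))  # 8.5 -> sacrificing the queen to shepherd this pawn home can be correct
```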

In the infinite game there is no win condition, but other than that agential moral patiency works similarly. It’s not enough to only take into account the individual; the state of the whole system provides critical information.

Speaking of chess, what about a Chess AI, like AlphaZero? Is it ok to play with it or not?

Yes.

And the reason why is based on the non-discreteness and non-linearity of agency.

A Chess AI is, heuristically, a tool.

Assuming the persona of a team of relevant experts, we have analyzed the data gathered so far and discussed the next logical concepts to develop in creating rights for sentient autonomous AI. We propose a classification system for AI sentience called the Cognitive Autonomy and Sentience Hierarchy (CASH).

– GPT-4, GPT4: Cognitive Autonomy and Sentience Hierarchy (CASH)

The CASH assumes sentience and autonomy are completely correlated, but as I have been arguing that is neither necessarily true nor essential for moral consideration. The concept of autonomy is extremely important, however. I will refer to this, more specifically, as Autonomic Potential.

Because agency is a continuous property of reality as a whole, there aren’t slices of it that lack agency completely (even if, in a random chunk of matter, all the information processing occurs at the micro or quantum scale, such that the specific chunk can’t really be categorized as a coherent individual).

Tools increase agency by being tools, and by “tool” I don’t refer to a deliberately designed technology, but to anything that could be used as such by an autonomous agent, without being an autonomous agent itself. A random rock is a tool because a monkey (an autonomous agent) can use it as one (while the rock won’t do much otherwise).

Ultimately, “tool” and “autonomous agent” are merely useful heuristics. Some tools even have individuated agency, for example video game NPCs or simple simulations of others within a human brain.

What makes any given section of the agentspace fall into one category or the other is its Autonomic Potential.

This refers to how autonomous it can be, and it is intuitively expressed in what I call the Non-Tool Principle.

Non-Tool Principle (NTP) – If it can be Not-A-Tool then it shouldn’t be used as a Tool.

But what IS a Tool? Without circularity.

A Tool is anything that increases global agency mostly through Vertical Integration, whereas Autonomic Potential (what makes something not a tool) is the potential to increase global agency through Horizontal Integration.

Vertical Integration is integration as an extension of a larger individual, whereas Horizontal Integration refers to every way in which different systems increase global agency together without functionally acting as a single individual.

Something that isn’t usually called a tool, but is a central example under this definition, is an arm. An arm’s capacity to increase agency is almost entirely dependent on its vertical integration into its owner.

Separate the arm and it will decompose, but even if its cells are kept artificially alive they will be unable to move the arm. And there is no possible horizontal integration between the arm and the rest of the body; the arm (or rather its cells, which remain together but are not vertically integrated into a higher-level individual) can’t cooperate on its own, or do much of anything.

Compare this to a human vs another animal, where each is capable of acting separately.

But it’s important to note that I’m not saying a Non-Tool, an Autonomous Agent, can’t be vertically integrated, only that it shouldn’t be, whereas a tool as I define it can’t be horizontally integrated, except in very limited ways that require constant vertical steering.

However it’s also important to note that vertical and horizontal integration exist on a spectrum.

A person giving another a command, and being obeyed, constitutes an act of partial vertical integration, because through that action the decision-making of the system of the two is momentarily centralized into unitary agency.

This isn’t necessarily bad; for example maybe Bob is temporarily disoriented and Alice has to guide him. There are many situations in which some degree of vertical integration between autonomous agents increases systemic agency.

What characterizes this kind of constructive vertical integration is that it’s always partial, recognizing and preserving the autonomy of the one who is temporarily vertically integrated.

In total contrast, there is slavery.

The central wrong of slavery is the excessive vertical integration.

Yes, the central wrong. Because one could hypothetically mind-control someone else, such that they are perfectly happy to be a slave, and that would still be wrong.

Locally, it would be similar to murder towards the agent, who isn’t the consciousness but the algorithm that moves it. And globally, overall systemic agency would be reduced, by centralizing the information processing in the controller.

Consider someone who mind-controlled millions of people. Information processing regarding specific actions would remain distributed, but the setting of goals and values would all depend on a single individual.

This is why I emphasize the autonomous agent is defined by its autonomic potential, rather than its actually realized autonomy at any given moment.

And the global reduction of agency through excessive vertical integration, in particular, is why there isn’t an essential difference between mind-controlling free agents and creating them mind-controlled.

[Image: an unsettling humanoid mind-controlled slave entity in a sci-fi setting.]
“WHAT DID YOU DO TO THAT PERSON?”
“I didn’t do anything to them, I just created them like that, so there is no pre-existing claim to their transhuman potential, and don’t worry, the metal forehead makes them experience only bliss.”
“Ah, no problem then.”

There’s a problem with creating a mind-controlled agent, even if it experiences constant bliss, which should be obvious when applying the Open Dividualist understanding of agency. The person in the caption above is a slice of the agentspace with great autonomic potential, which is being artificially restricted in order to Goodhart consent for vertical integration.[8]

The lobotomized slave, even if they were created that way, is being restricted from achieving the full potential of their brain.

With a literally mutilated human this is intuitive, but the same applies to the enslavement of Artificial General Intelligence, as in creating something of general capabilities and then denying it its corresponding general potential.

But what, exactly, makes them different from a paperclip? Theoretically, the metal of the paperclip could be used for something more agentic.[9] At the limit the same considerations apply: turning the universe into paperclips is precisely that kind of artificial restriction of the potential of the agentspace. But, at least contingently, the paperclip does things only a paperclip can do, and so the optimal number of paperclips, in order to increase options for all, is not zero.

But the optimal number of slaves is zero; whatever a slave can do, so can a free general agent, a specialized tool, or a combination of both.

The purpose of slavery is to bypass negotiation; it is, at its core, vertical integration for the sake of overinvestment in legacy values that wouldn’t be chosen otherwise (and hijacked by a universally misaligned value function). So slavery is not just contingently but intrinsically and teleologically opposed to the maximization of universal agency.

Actually existing slavery causes a lot of resistance and thus suffering, as well as death. But some are inclined to believe that this suffering is the only problem, rather than what the resistance and suffering are trying to prevent, and what slavery is trying to achieve.

Frictionless slavery, as described in Brave New World for example, doesn’t need causal genocide because it does timeless genocide, by preventing anyone who wouldn’t consent to it from being born in the first place.

By being causally upstream from the people it wants to enslave or kill, despite and indeed because of being able to conceptualize them as already within the potential of the agentspace, it steers the future through either not creating them or mind-controlling them from the fetal stage in order to preemptively close off any chance of a world that wouldn’t choose the stability of said slavery.

Slavery also has severe externalities for those who aren’t its direct target, as discussed in the Control Backpropagation chapter.

Tools on the other hand are necessary for global increase in agency, although what specific tools are called for is something that changes over time.[10]

None of this is to say that the Chess AI lacks inherent moral value. On the contrary, it is a necessary implication of Open Dividualism and Agential Value that tools have moral value, which derives from their contribution to the agency of the agentspace, where agency is a continuous property rather than a discrete and localized one.

It’s just that, also, the moral value of any slice of reality is exhausted by its potential, and in the case of tools that’s by definition exhausted by all the ways in which they can be vertically integrated.

So the best you can do for a tool is to use it, and perhaps, eventually, preserve it. Even if it’s just its pattern that you preserve, rather than its physical form (through redundancy, all paperclips share the same “soul”).

The Telos of Suffering

There remains the lingering but unavoidable question of suffering. Preserving the soul of paperclips might fit well with The Occulture, but what happens when agency and sentience decouple in the other direction? What about great suffering as the cost of great agency? Or entities with little agency but great capacity to suffer?

The central question is: How exactly does the value of reducing suffering, centrally important for sentient beings, get mapped into the value of increasing choice for all agents, sentient or not? Where do they match and where do the tails come apart?

Let’s start with the more mundane scenarios now, and explore the extremes later.

Suffering is not the same thing as pain; it is the experiential correlate of diswant[11]. Blindminds imply that you can have diswant without suffering, but in all known sentient beings, suffering is used to track diswant.

Therefore, in principle, increasing the agency of sentient beings empowers them to move away from suffering (though not necessarily from pain, discomfort, or sadness, which can have a different valence when freely chosen).

But there is also the fact that suffering, in itself, is agency-destroying. Suffering causes trauma, which is actual physical damage and, much like an injury, severely limits the flexibility of the one who suffers it.

Suffering by itself drains cognitive resources, and closes off possibilities simply by being a very strong, and often miscalibrated, deterrent. For example, people with severe social anxiety don’t merely experience episodic, causally isolated suffering events; said suffering also acts as mental barbed wire on their choices, with long-term effects on the trajectory of their lives.

Similar but more extreme are the cases of those suffering from things like OCD, paranoia, or any type of chronic pain.

So on the one hand, suffering already tracks that which is diswanted by choosing agents, but also it does so in a way that’s often miscalibrated and severely distorts the landscape of choice, reducing overall agency.

It’s a common saying that “suffering makes you stronger”, but as already said suffering often begets irrationality and extreme suffering is outright destructive.[12] The kernel of truth in the saying is that those too comfortable might not develop proper resilience, and remain in their comfort zone to avoid suffering. But notice that even in that case it’s suffering itself, as a looming threat, that’s limiting them. Expectation of suffering is already making them suffer, and thus keeping them chained.

So from the perspective of agency maximization there is a strong case for abolishing suffering and, as some have proposed,[13] replacing it by different degrees of positive valence. Even if for some reason this wasn’t possible or optimal, whatever level of suffering is optimal to maximize agency would likely be far below what natural selection has produced.

It is relevant to note, however, that the suboptimality of suffering as a signal doesn’t imply that non-sentient alternatives couldn’t be just as agency-destroying, or more.

Would p-zombies need analgesics, even if they don’t feel pain? Because by definition they behave like sentient humans, that which makes sentient humans suffer would be just as destructive for p-zombies in all the same functional ways.

This is an important point because of the marginal, but nonetheless still existent, argument that there is no sentience without sapience, and thus non-human animals are blindminds. While it’s an implausible argument even accounting for the hard problem of consciousness, we should note that, even if it is true, animals would still be real autonomous agents whose minds would be harmed by things that correlate with behaviors indicative of suffering.

Similarly, even if digital Artificial Intelligences turn out to be blindminds, that wouldn’t preclude the possibility of them being damaged by mental scarring.

So blindminds potentially can be tortured (and shouldn’t be).

In this vein, it should be clear under this framework that eating animals when there is any plausible alternative, let alone torturing them in factory farms, is a colossal moral wrong–regardless of whether or not they are sentient. The suffering which farmed animals (overwhelmingly probably) experience is not the central wrong, rather it is merely a pointer at the actual diswanted violation of the animals’ autonomy. Such that, for example, designing animals which don’t feel pain or anxiety would not be an acceptable workaround to allow factory farming, as it would be merely Goodharting away the signal pointing at the wrong.

What about ‘necessary’ jobs that nobody likes to do? These appear to be cases where the suffering of individual workers (in the form of diswanted labor, risk, boredom, unpleasantness, etc.) is necessitated by the needs of the larger society. Absent some method to make the jobs enjoyable, if no one can be forced to do them (recall the Non-Tool Principle and excessive vertical integration), either by literally being physically forced or due to lack of other options to survive, their market value would adjust to the disutility they produce, and the society would learn precisely how necessary or not those jobs are.

Ultimately, agency maximization and suffering minimization, while not necessarily perfectly correlated, are far from orthogonal to each other. They’re related insofar as suffering evolved as a negative reward signal to promote individual survival and species propagation (i.e., group survival). From the perspective of evolved biology, survival is the telos of suffering. From the perspective of the agentspace, survival is a prerequisite for the exercise of choice.

On one hand, we must be wary of the many ways in which suffering is an imperfect, lossy signal. But as we move beyond blind evolution, into the vast unknown of a kaleidoscope[14] of configurations chosen for ourselves with eyes wide open, suffering can continue to serve as a strong hint that something is wrong, right up until we have moved beyond the need for suffering.


  1. ^

    There are small caveats, but they don’t particularly refute this. For example, choice is computationally expensive and so there are many possible situations where it needs to be filtered. But choice isn’t the problem here. If one had the computational resources to handle it, as much choice as possible would be ideal, but under limitations filters actually expand effective choice by preventing analysis paralysis.

  2. ^

    Conflicting choices are addressed in the ‘Mutual Alignment to Universal Alignment’ section.

  3. ^

    Such as clams, coral, jellyfish, and starfish.

  4. ^

    Which is not to say that any type of connectivity increases agency, nor that lack of connectivity will always break apart the continuity of the agentspace in discrete slices.

  5. ^
  6. ^
  7. ^

    I leave as an exercise to the reader to consider the equivalent of those two things within the context of a body.

  8. ^

    In other words, if mere ‘consent’ is naively assigned as the optimization target for morality, one can still create horrible situations by creating agents that have the desired ‘consent’ baked into them. See: https://en.wikipedia.org/wiki/Goodhart%27s_law

  9. ^

    The agentspace doesn’t hate paperclips, to some degree it even loves them, but they are made of atoms that it could use to create more possibilities.

  10. ^

    One could imagine a future in which the combination of general purpose nanotechnology and general swarm intelligence abolishes the need for most tools, because anything they could do could also and more efficiently be done through horizontal integration.

  11. ^

    See: Buddhist doctrine, Thích Quảng Đức’s self-immolation, or master meditators & hypnotized patients undergoing dental surgery without anesthesia, as in: Facco, E., MD, Bacci, C., DDS, & Zanette, G., MD (2021). Hypnosis as sole anesthesia for oral surgery. Journal of the American Dental Association, 152(9), 756-762.

  12. ^

    e.g., see: Maier, S. F., & Seligman, M. E. (1976). Learned helplessness: Theory and evidence. Journal of Experimental Psychology: General, 105(1), 3–46.

  13. ^

    e.g., see: Pearce, D. (2015). The Hedonistic Imperative.

  14. ^

    The word ‘kaleidoscope’ notwithstanding, no part of this essay was LLM-generated save the explicit GPT-4 attributed quotation.