The Grabby Values Selection Thesis: What values do space-faring civilizations plausibly have?
(This post also has a Russian version, translated from the present original by K. Kirdan.)
Summary: The Grabby Values Selection Thesis (or GST, for short) is the thesis that some values are more expansion-conducive (and therefore better adapted to space colonization races) than others, such that we should, all else equal, expect such values to be more represented among the grabbiest civilizations/AGIs. In this post, I present and argue for GST, and raise some considerations regarding how strong and decisive we should expect this selection effect to be. The stronger it is, the more we should expect our successors, in worlds where the future of humanity is big, to have values more grabbing-prone than ours. The same holds for grabby aliens relative to present-day humans. While these claims are trivially true, they seem to support conclusions that most longtermists have not paid attention to, such as “the most powerful civilizations don’t care about what the moral truth might be” (see my previous post), and “they don’t care (much) about suffering” (see subsequent post).
The thesis
Spreading to new territories can be motivated by very different values and seems to be a convergent instrumental goal. Whatever a given agent wants, they likely have some incentive to accumulate resources and spread to new territories in order to better achieve their goal(s).
However, not all moral preferences are equally conducive to expansion. Some of them value (intrinsically or instrumentally) colonization more than others. For instance, agents who value spreading intrinsically will likely colonize more and/or more efficiently than those who disvalue being the direct cause of something like “space pollution”, in the interstellar context.
Therefore, there is a selection effect where the most powerful[1] civilizations/AGIs are those who have the values that are the most prone to “grabbing”. This is the Grabby Values Selection Thesis (GST), which is the formalization and generalization of an idea that has been expressed by Robin Hanson (1998).
We can differentiate between two sub-selection effects, here:
The intra-civ (grabby values) selection: Within a civilization, we should expect the agents who have the values that are the most adapted to survival, replication, and expansion to eventually be selected for. In the absence of early value lock-in, this seems to favor grabby values, since those with the grabbiest values will be the ones controlling the most resources, all else equal. Here is a specific plausible instance of that selection effect, given by Robin Hanson (1998): “Far enough away from the origin of an expanding wave of interstellar colonization, and in the absence of property rights in virgin oases, a selection effect should make leading edge colonists primarily value whatever it takes to stay at the leading edge.”[2]
The inter-civ (grabby values) selection: The civilizations that end up with the most grabby-prone values will get more territory than the others.
Do these two different sub-selection effects matter equally? My current impression is that this mainly depends on the likelihood of an early value lock-in, or of design escaping selection early and longlastingly in Robin Hanson’s (2022) terminology, where “early” means “before grabby values get the time to be selected for within the civilization”.[3] If such an early value lock-in occurs, the inter-civ selection effect is the only one left. If it doesn’t occur, however, the intra-civ selection effect seems far more important than the inter-civ one. This is mainly because there is very likely much more room for selection effects within a (not-locked-in) civilization than between different civilizations.[4]
GST seems trivially true. It is pretty obvious that not all values are equal in terms of how much they value (intrinsically or instrumentally) space colonization, and that those who value space expansion more will expand more. The thesis rests on nothing but these very simple and uncontroversial premises. What might lead to some controversy, however, is the question of how strong the selection effect is: do value systems differ significantly, or even drastically, in how grabbing-prone they are? The next section shallowly addresses this.
How strong is the selection effect?
The previous section already touches on the importance of the intra-civ and inter-civ selection effects relative to each other. Here, we’ll consider their absolute importance, i.e., the extent to which we should expect a large number of values to actually be selected against.
I see two reasons to believe the intra-civ selection, specifically, would be more intense than what we may intuitively think:
(i) we arguably should expect the development and deployment of something like transformative AI to make the intra-civ grabby values selection process particularly fast and strong;[5] and
(ii) as Hanson (2009) argues, “This is the Dream Time” for many present (and past) humans, who have the unusual luxury of being able to value and believe pretty much anything. Agents trying to survive/succeed in a civilization that is optimizing for space colonization probably won’t have this privilege, such that their values are more likely to be determined by evolutionary pressures.
One reason to suppose there wouldn’t be that much (both intra-civ and inter-civ) selection favoring grabby values is that advanced civilizations might eventually converge on a moral truth. However, the previous post within this sequence argues that this is relatively unlikely, partly because there is no reason to assume values aligned with a (potentially discoverable?) moral truth will be more competitive than those that are the most grabbing-prone.
A better reason to believe there wouldn’t be that much (intra-civ and inter-civ) selection favoring grabby values is that agents whose values might a priori seem less grabbing-prone could still prioritize colonizing space as a first step, so as not to fall behind in the race against other agents (aliens or other agents within their civilization), and only optimize for their values later. In that case, there would be little selection effect. Call this the convergent preemptive colonization argument. Earlier, I wrote:
For instance, agents who value spreading intrinsically will likely colonize more and/or more efficiently than those who disvalue being the direct cause of something like “space pollution”, in the interstellar context.
While these two specific value systems seem to differ greatly in terms of how grabbing-prone they are, the convergent preemptive colonization argument suggests that others might differ much less. For instance, Carl Shulman (2012) argues we should expect agents who want to maximize the number of “people leading rich, happy lives full of interest and reward” (“Eudaimonians”) to be nearly as grabbing-prone as agents who purely want to expand (the “Locusts”). And although I believe his argument to be far from unassailable, Shulman tells a thought-provoking story that reminds us of how much of a convergent instrumental goal space colonization still is for various value systems.
So the relevance of both the intra-civ and inter-civ selection effects might depend heavily on the specific values we have in mind when thinking about this.
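To make this dependence concrete, here is a minimal toy sketch in Python. It is purely illustrative and rests on strong assumptions that are not part of the argument above: there are only two value systems, each multiplies its resources by a fixed factor every round, and nothing else (conflict, value drift, lock-in) happens. The function name `grabby_share` and all the growth factors are hypothetical, made-up choices.

```python
def grabby_share(initial_share, growth_grabby, growth_other, rounds):
    """Return the fraction of resources held by the grabbier value system.

    Hypothetical toy model: each value system multiplies its resources by a
    fixed factor every round, and nothing else (conflict, value drift,
    lock-in) happens.
    """
    grabby = initial_share
    other = 1.0 - initial_share
    for _ in range(rounds):
        grabby *= growth_grabby
        other *= growth_other
    return grabby / (grabby + other)


if __name__ == "__main__":
    # Illustrative, made-up growth factors; only their ratio matters here.
    scenarios = {
        "large gap (intrinsic spreaders vs. 'space pollution' avoiders)": (1.10, 1.00),
        "small gap (Locusts vs. Eudaimonians)": (1.10, 1.09),
    }
    for label, (g_grabby, g_other) in scenarios.items():
        # The grabbier system starts as a 10% minority in both scenarios.
        share = grabby_share(0.1, g_grabby, g_other, rounds=100)
        print(f"{label}: grabbier share after 100 rounds = {share:.3f}")
```

Under these assumptions, a large gap in grabbiness lets an initial minority end up holding almost all resources, whereas with a small gap the same minority is still a minority after the same number of rounds. How large the real gaps between value systems are is exactly the open question.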
Conclusion
The Grabby Values Selection Thesis seems trivially true, but I am pretty uncertain about the significance of the selection effect. Its relevance might vary a lot depending on the exact values we are considering and “making compete” against one another. In future posts, I will investigate the significance of the selection effect, given value variations on different axes.
I warmly welcome any consideration I might have missed. More research is needed, here.
Although the uncertainty is large, the more significant the selection effect, the more crucial its implications are for longtermists. My future posts will also touch on these implications and what GST tells us about the values we should expect our successors – and grabby aliens – to have.
Acknowledgment
Thanks to Robin Hanson for our insightful conversation on this topic. Thanks to Antonin Broi, Maxime Riché, and Elias Schmied for helpful comments on earlier drafts. Most of my work on this sequence so far has been funded by Existential Risk Alliance.
All assumptions/claims/omissions are my own.
[1] By “the most powerful”, I mean “those who control the most resources such that they’re also those who achieve their goals most efficiently.”
[2] Other pieces have pointed at potential dynamics that are fairly similar/analogous. Nick Bostrom (2004) explores “scenarios where freewheeling evolutionary developments, while continuing to produce complex and intelligent forms of organization, lead to the gradual elimination of all forms of being that we care about.” Paul Christiano (2019) depicts a scenario where “ML training [...] gives rise to “greedy” patterns that try to expand their own influence.” Allan Dafoe (2019; 2020) coined the term “value erosion” to illustrate a dynamic where “[j]ust as a safety-performance tradeoff, in the presence of intense competition, pushes decision-makers to cut corners on safety, so can a tradeoff between any human value and competitive performance incentivize decision makers to sacrifice that value.”
[3] They arguably have already been somewhat selected for, via natural and cultural evolution (see Will Aldred’s comment) long before space colonization becomes a possibility, though.
[4] Thanks to Robin Hanson for pointing out this last part to me, and for helping me realize that differentiating between the intra-civ selection and the inter-civ one was much more important than I previously thought.
[5] Dafoe (2019, section Frequently Asked Question) makes an analogous point.
I’m enjoying this sequence, thanks for writing it.
I imagine you’re well aware of what I write below – I write it to maybe help some readers place this post within some wider context.
My model of space-faring civilizations’ values, which I’m sure isn’t original to me, goes something like the following:
Start with a uniform prior over all possible values, and with the reasonable assumption that any agent or civilization in the universe, whether biological or artificial, originated from biological life arising on some planet.
Natural selection. All biological life probably goes through a Darwinian selection process. This process predictably favors values that are correlated with genetic fitness.
Cultural evolution, including moral progress. Most sufficiently intelligent life (e.g., humans) probably organizes itself into a civilization, with culture and cultural evolution. It seems harder to predict which values cultural evolution might tend to favor, though.
Great filters.[1] Notably,
Self-destruction. Values that increase the likelihood of self-destruction (e.g., via nuclear brinkmanship-gone-wrong) are disfavored.
Desire to colonize space, aka grabbiness. As this post discusses, values that correlate with grabbiness are favored.
(For more, see Oesterheld (n.d.).)
A potentially important curveball: the transition from biological to artificial intelligence.
AI alignment appears to be difficult. This probably undoes some of the value selection effects I describe above, because some fraction of space-faring agents/civilizations is presumably AI with values not aligned with those of their biological creators, and I expect the distribution of misaligned AI values, relative to the distribution of values that survive the above selections, to be closer to uniform over all values (i.e., the prior we started with).
Exactly how hard alignment is (i.e., what fraction of biological civilizations that build superintelligent AI are disempowered?), as well as some other considerations (e.g., are alignment failures generally near misses or big misses?; if alignment is effectively impossible, then what fraction of civilizations are cognizant enough to not build superintelligence?), likely factor into how this curveball plays out.
[1] Technically, I mean late-stage steps within the great filter hypothesis (Wikipedia, n.d.; LessWrong, n.d.).
Thanks a lot for this comment! I linked to it in a footnote. I really like this breakdown of different types of relevant evolutionary dynamics. :)
:)
Nice post! My current guess is that the inter-civ selection effect is extremely weak and that the intra-civ selection effect is fairly weak. N=1, but in our civilization the people gunning for control of AGI seem more grabby than average but not drastically so, and it seems possible for this trend to reverse, e.g., if the US government nationalizes all the AGI projects.
Thanks for the comment! :) You’re assuming that the AGI’s values will be pretty much locked-in forever once it is deployed such that the evolution of values will stop, right? Assuming this, I agree. But I can also imagine worlds where the AGI is made very corrigible (such that the overseers stay in control of the AGI’s values) and where intra-civ value evolution continues/accelerates. I’d be curious if you see reasons to think these worlds are unlikely.
Not sure I’m assuming that. Maybe. The way I’d put it is, selection pressure towards grabby values seems to require lots of diverse agents competing over a lengthy period, with the more successful ones reproducing more / acquiring more influence / etc. Currently we have this with humans competing for influence over AGI development, but it’s overall fairly weak pressure. What sorts of things are you imagining happening that would strengthen the pressure? Can you elaborate on the sort of scenario you have in mind?
Right, so assuming no early value lock-in and the values of the AGI being (at least somewhat) controlled/influenced by its creators, I imagine these creators to have values that are grabby to varying extents, and these values are competing against one another in the big tournament that is cultural evolution.
For simplicity, say there are only two types of creators: the pure grabbers (who value grabbing (quasi-)intrinsically) and the safe grabbers (who are in favor of grabbing only if it is done in a “safe” way, whatever that means).
Since we’re assuming there hasn’t been any early value lock-in, the AGI isn’t committed to some form of compromise between the values of the pure and safe grabbers. Therefore, you can imagine that the AGI allows for competition and helps both groups accomplish what they want proportionally to their size, or something like that. From there, I see two plausible scenarios:
A) The pure and safe grabbers are two cleanly separated groups running a space expansion race against one another, and we should—all else equal—expect the pure grabbers to win, for the same reasons why we should—all else equal—expect the AGI race to be won by the labs optimizing for AI capabilities rather than for AI safety.
B) The safe grabbers “infiltrate” the pure grabbers in an attempt to make their space-expansion efforts “safer”, but are progressively selected against since they drag the pure-grabby project down. The few safe grabbers who might manage not to value drift and not to get kicked out of the pure grabbers are those who are complacent and not pushing really hard for more safety.
The reason why the intra-civ grabby values selection is currently fairly weak on Earth, as you point out, is that humans haven’t even started colonizing space, which makes something like A or B very unlikely to have happened yet. Arguably, the process that may eventually lead to something like A or B hasn’t even begun for real. We’re unlikely to notice a selection for grabby values before people actually start running something like a space expansion race. And most of those we might expect to want to somehow get involved in the potential[1] space expansion race are currently focused on the race to AGI, which makes sense. It seems like this latter race is more relevant/pressing right now.
[1] It seems like this race will happen (or actually be worth running) if, and only if, AGI has non-locked-in values and is corrigible(-ish) and aligned(-ish) with its creators, as we suggested.
My main crux regarding the inter-civ selection effect is how fast space colonization will get. For example, if it’s possible to produce small black holes, you can use them for incredibly efficient propulsion, and even slightly grabby civs would still spread at approximately the speed of light, roughly the same speed as extremely grabby civs. Maybe this is also possible with fusion propulsion, but I’m not sure; you’d need to ask astro nerds.
I guess the main hope is not that morality gives you a competitive edge (that’s unlikely) but rather that enough agents stumble on it anyway through philosophical reflection, e.g., by realizing that open/empty individualism is true.
Yeah, I definitely expect that.
I haven’t thought about whether this should be the main crux but very good point! Magnus Vinding and I discuss this in this recent comment thread.
Yes. Related comment thread I find interesting here.
(The links you sent are broken)
Oops yeah thanks. Fixed :)
Just stumbled upon this sequence and happy to have found it! There seems to be lots of analysis ripe for picking here.
Some thoughts on the strength of the grabbiness selection effect below. I’ll likely come back to this to add further thoughts in the future.
One factor that seems to be relevant here is the number of pathways to technological completion. If we assume that the only civilisations that dominate the universe in the far future are the ones that have reached technological completion (seems pretty true to me), then tautologically, the dominating civilisations must be those who have walked the path to technological completion. Now imagine that in order to reach technological completion, you must tile 50% of the planets under your control with computer chips, but your value system means that you assign huge disvalue to tiling planets with computer chips*. As a result, you’ll refuse to walk the path to technological completion, and be subjugated or wiped out by the civilisations that did go forward with this action.
The more realistic example here is a future in which suffering subroutines are a necessary step towards technological completion, and so civilisations that disvalue suffering enough to not take this step will be dominated by civilisations that either (1) don’t care for suffering or (2) are willing to bite the bullet of creating suffering sub-routines in order to pre-emptively colonise their available resources.
So the question here is how many paths are there to technological completion? Technological completion could be like a mountain summit that is accessible from many directions—in that case, if your value system doesn’t allow you to follow one path, you can change course and reach the summit from the other direction. But if there’s just a single path with some steps that are necessary to take, then this will constrain the set of value systems that dominate the far future. Sketching out precedents for technological completion would be a first step to gaining clarity here.
*This value system is just for the thought experiment, I’m not claiming that it’s a likely one.