Thanks a lot for this post! I tried addressing this earlier by exploring “extinction” vs “doom” vs “not utopia,” but your writing here is clearer, more precise, and more detailed. One alternative framing I have for describing the “power laws of value” hypothesis, as a contrast to your 14-word summary:
“Utopia” by the lights of one axiology or moral framework might be close to worthless under other moral frameworks, assuming an additive axiology.
It’s 23 words and has more jargon, but I think it describes my own confusions better. In particular, I don’t think you need to believe in “weird stuff” to get to many OOMs of difference between “best possible future” and “realistic future”, unless additive/linear axiology itself is weird.
As one simple illustration, humanity can either be correct or incorrect in colonizing the stars with biological bodies instead of digital emulations. Either way, if you’re wrong, you lose many OOMs of value (a toy sketch after the two cases below illustrates the asymmetry):
If we decide to go the biological route: biological bodies are much less efficient than digital emulations. It’s also much more difficult, as a practical/short-term matter, to colonize stars with bodies, so you capture a smaller fraction of the lightcone.
If we decide to go the digital route, and it turns out emulations don’t have meaningful moral value (e.g., at the level of fidelity at which emulations are seeded, they are in practice not conscious), then we lose ~100.0000% of the value.
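A toy expected-value sketch of that asymmetry (every number here is invented purely for illustration; nothing in the argument hinges on the specific magnitudes):

```python
# Toy numbers only -- each figure is an invented placeholder, not an estimate.
# The point is the shape of the asymmetry, not the magnitudes.

LIGHTCONE_RESOURCES = 1.0          # normalize total reachable resources to 1

# Biological route: capture less of the lightcone, and each unit of matter
# supports far fewer minds than digital emulation would.
bio_fraction_captured = 1e-3       # assumed: bodies colonize far less of the lightcone
bio_minds_per_resource = 1.0       # normalize biological efficiency to 1

# Digital route: capture most of the lightcone at much higher efficiency,
# but only if the emulations are actually moral patients.
digital_fraction_captured = 0.9
digital_minds_per_resource = 1e6       # assumed efficiency advantage of emulation
digital_value_if_not_conscious = 0.0   # "Disneyland with no children"

bio_value = LIGHTCONE_RESOURCES * bio_fraction_captured * bio_minds_per_resource
digital_value_if_conscious = (LIGHTCONE_RESOURCES * digital_fraction_captured
                              * digital_minds_per_resource)

print(f"biological route:              {bio_value:.1e}")
print(f"digital route (conscious):     {digital_value_if_conscious:.1e}")
print(f"digital route (not conscious): {digital_value_if_not_conscious:.1e}")
# Being wrong in either direction costs many orders of magnitude
# relative to the best branch.
```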
Appreciate this comment, and very much agree. I generally think that humanity’s descendants are going to saturate the stars with Dyson swarms making stuff (there are good incentives to achieve explosive growth), but I think we’re (1) too quick to assume that, (2) too quick to assume we will stop being attached to inefficient earth stuff, and (3) too quick to assume the Dyson swarms will be implementing great stuff rather than, say, insentient digital slaves used to amass power or solve scientific problems.
Let’s say there are three threat models here: (a) Weird Stuff Matters A Lot, (b) Attachment to Biological Organisms, (c) Disneyland With No Children (the machines aren’t conscious).
I focused mainly on Weird Stuff Matters A Lot. The main reason I focused on this rather than Attachment to Biological Organisms is that I still think that computers are going to be so much more economically efficient than biology that in expectation ~75% of everything is computer. Computers are just much more useful than animals for most purposes, and it would be super crazy from most perspectives not to turn most of the stars into computers. (I wouldn’t totally rule out us failing to do that, but incentives push towards it strongly.) If, in expectation, ~75% of everything is computer, then maximizing computer only makes the world better by about 1⁄3 (you’re going from 75% of resources to 100%).
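For concreteness, a minimal sketch of the arithmetic behind that “better by 1⁄3,” assuming value scales linearly with the share of resources that are computer:

```python
# If ~75% of everything ends up as computer by default, pushing that to 100%
# only scales total value by 1.00 / 0.75 (under a linear-in-resources assumption).
expected_computer_share = 0.75   # the in-expectation figure from the comment
maximized_share = 1.00

relative_gain = maximized_share / expected_computer_share - 1
print(f"gain from maximizing computer: {relative_gain:.0%}")  # ~33%, i.e. about 1/3
```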
I think the Disneyland With No Children threat model is much scarier. I focused on it less here because I wanted to shore up broadly appealing theoretical reasons for trajectory change, and this argument feels much more partisan. But on my partisan worldview:
Consciousness is an indeterminate folk notion that we will reduce to computational properties. (Or something very close to this.)
These computational properties are going to be much, much more precise and gradable than our folk notion and they won’t wear folk psychological properties on their sleeves.
As a result we’re just going to have to make some choices about what stuff we think is conscious and what isn’t. There’s going to be a sharp borderline we’re going to have to pick arbitrarily, probably based on nothing more than whimsical values.
People will disagree about where the borderline is.
Even if people don’t disagree about the borderline, they’ll disagree substantially about cardinality, i.e. how much to value different computational properties relative to others.
Given the Power Laws of Value point, some people’s choices will be a mere shadow of value from the perspective of other people’s choices.
If this “irrealist” view is right, it’s extremely easy to lose out on almost all value.
Separately, I just don’t think our descendants are going to care very much about whether the computers are actually conscious, and so AI design choices are going to be orthogonal to moral value. On this different sort of orthogonality thesis, we’ll lose out on most value just because our descendants will use AI for practical rather than moral reasons, and so the AIs’ intrinsic value will go unoptimized.
So Disneyland With No Children type threat models look very credible to me.
(I do think humans will make a lot of copies of themselves, which is decently valuable, but not if you’re comparing it to the most valuable world or if you value diversity.)
You could have a more realist view where we just make a big breakthrough in cognitive science and realize that a very glowy, distinctive set of computational properties was what we were talking about all along when we talked about consciousness, and everyone would agree to that. I don’t really think that’s how science works, but even if you did have that view it’s hard to see how the computational properties would just wear their cardinality on their sleeves. Whatever computational properties you find you can always value them differently. If you find some really natural measure of hedons in computational space you can always map hedons to moral value with different functions. (E.g. map 1 hedon to 1 value, 2 hedons to 10 value, 3 hedons to 100 value...)
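A minimal sketch of that last point: the same “natural” hedon measure, fed through two different value mappings (the exponential one is just the 1 → 1, 2 → 10, 3 → 100 example above), already diverges by many orders of magnitude:

```python
# Two ways of mapping the same hedon measure to moral value.
# Nothing in the measure itself privileges one mapping over the other.

def linear_value(hedons: float) -> float:
    return hedons                  # 1 hedon -> 1 value, 2 -> 2, ...

def exponential_value(hedons: float) -> float:
    return 10 ** (hedons - 1)      # 1 -> 1, 2 -> 10, 3 -> 100, ...

for h in [1, 2, 3, 10]:
    print(h, linear_value(h), exponential_value(h))
# By 10 hedons the two valuations differ by eight orders of magnitude, so agents
# who agree on the underlying measure can still disagree wildly on cardinality.
```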
So I didn’t focus on it here, but I think it’s definitely good to think about the Disneyland concern and it’s closely related to what I was thinking about when writing the OP.
I really liked @Joe_Carlsmith’s articulation of your 23-word summary: what if all people are paperclippers relative to one another? Though it does make stronger assumptions than the ones we’re making here.