Thanks for replying.
I’d agree that the first and second points have limited scope, but I don’t understand how anyone can make prioritization decisions with no discounting at all: it’s nearly always better to conserve resources for later use. If we discount costs but not benefits, however, I worry the framework becomes incoherent. This is a much more general confusion of mine, so it’s unsurprising that you didn’t address or resolve it here.
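To make the worry concrete, here’s a toy calculation; the growth and discount rates are placeholder numbers I’ve made up, not anything from your post:

```python
# Toy illustration (made-up numbers): why "no discounting" pushes toward
# always conserving, and why discounting costs but not benefits looks incoherent.

GROWTH = 0.03      # assumed real growth rate of conserved/invested resources
DISCOUNT = 0.03    # assumed discount rate, applied to costs only in case 2

def benefit_if_spent_at(amount, years_deferred, growth=GROWTH):
    """Resources conserved for `years_deferred` years buy more benefit later."""
    return amount * (1 + growth) ** years_deferred

# Case 1: no discounting at all.
# A benefit bought in 200 years counts the same as one bought today, so
# deferring is (nearly) always better and prioritization never bottoms out.
for t in (0, 50, 200):
    print(f"no discounting, spend at t={t:>3}: benefit = {benefit_if_spent_at(100, t):.0f}")

# Case 2: discount costs but not benefits.
# The present value of a cost paid at time t shrinks while the benefit does not,
# so the same project's benefit/cost ratio can be made arbitrarily large just
# by scheduling it later -- which is why the framework looks incoherent to me.
for t in (0, 50, 200):
    pv_cost = 100 / (1 + DISCOUNT) ** t
    benefit = 150  # fixed, undiscounted benefit
    print(f"cost discounted, t={t:>3}: benefit/cost = {benefit / pv_cost:.1f}")
```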
Re: S-risks, I’m wondering whether we need to be concerned about value misalignment leading to arbitrarily large negative utility from some perspectives. My concern is that human values are mutually incoherent, so maximizing any one set of values is likely to cause arbitrarily large “suffering” as judged by other sets of values; and if there are multiple groups with different values, this might mean that any maximization imposes maximal suffering as judged by the values of a large majority of people.
For example, if 1⁄3 of humanity feels that human liberty is a crucial value, without which human pleasure is worse than meaningless, another 1⁄3 views earning reward as critical, and the last 1⁄3 views bliss/pure hedonium as optimal, we would view tiling the universe with human brains maxed out for any one of these as a hugely negative outcome for 2⁄3 of humanity, much worse than extinction.
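To put entirely made-up numbers on that example (a scale where extinction is 0 for everyone, and “worse than extinction” means a negative score):

```python
# Entirely illustrative scores for the three-faction example above.
FACTIONS = ("liberty", "earned reward", "hedonium")

def score(tiled_for, faction):
    # Tiling the universe for one value set is assumed to score +100 for the
    # matching faction and -100 (worse than extinction at 0) for the other two.
    return 100 if tiled_for == faction else -100

for outcome in FACTIONS:
    harmed = sum(1 for f in FACTIONS if score(outcome, f) < 0)
    print(f"tile the universe for {outcome!r}: "
          f"worse than extinction for {harmed}/3 of humanity")
```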
Regarding your second point, just a few thoughts:
First of all, an important question is how you think values and morality work. If two-thirds of humanity, after thorough reflection, disagreed with your values, would that give you a reason to become less certain about your own values as well? Maybe to adopt their values to some degree? …
Secondly, I am also uncertain how coherent/convergent human values will turn out to be. There seem to be good arguments on both sides; see, e.g., this blog post by Paul Christiano (and the discussion with Brian Tomasik in the comments of that post): https://rationalaltruist.com/2013/06/13/against-moral-advocacy/
Third: In a situation like the one you described above, there would at least be huge room for compromise/gains from trade/… So if future humanity were split into the three factions you suggested, they would not necessarily fight a war until only one faction remains to tile the universe with its preferred version. Indeed, they probably would not, as cooperation seems better for everyone in expectation.
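As a toy expected-value comparison (the probabilities and utilities here are invented purely for illustration, reusing the scale from your example):

```python
# Toy check of "cooperation seems better for everyone in expectation".
# Assumptions (invented): in a total conflict each faction wins with probability
# 1/3 and tiles everything; under compromise each faction gets about a third of
# the resources, with roughly linear utility in resources; extinction = 0.

P_WIN = 1 / 3
U_WIN = 100        # your values fully realized everywhere
U_LOSE = -100      # someone else's values tiled everywhere (worse than extinction)
U_COMPROMISE = 33  # roughly a third of the universe arranged your way

ev_fight = P_WIN * U_WIN + (1 - P_WIN) * U_LOSE
print(f"expected utility of fighting to the end: {ev_fight:.1f}")  # about -33
print(f"utility of a three-way compromise:       {U_COMPROMISE}")  # 33

# Under these assumptions every faction prefers the compromise, even before
# counting the resources that a war itself would destroy.
```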
1) I agree that there is some confusion on my part, and on the part of most others I have spoken to, about how terminal values and morality do or do not get updated.
2) Agreed.
3) I will point to a possibly forthcoming paper/idea of Eric Drexler’s at FHI that makes this point, which he calls “pareto-topia”. Despite the wonderful virtues of the idea, I’m unclear whether there is a stable game-theoretic mechanism that prevents a race-to-the-bottom outcome when fundamentally different values are being traded off. Specifically, in this case it’s possible that differing values lead to an inability to cooperate truthfully and reliably; a paved road to pareto-topia seems not to exist, and there might be no path at all.
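To gesture at the worry in standard game-theoretic terms (this is just the familiar one-shot defection structure with invented payoffs, not anything from Drexler’s paper):

```python
# A minimal sketch of the worry: even when mutual cooperation (the pareto-topian
# outcome) beats mutual defection for everyone, one-shot incentives can still
# favor defection if agents with different values cannot verify or enforce each
# other's commitments. Payoffs are invented; this is the standard one-shot
# prisoner's-dilemma structure.

PAYOFFS = {  # (my move, their move) -> (my payoff, their payoff)
    ("cooperate", "cooperate"): (33, 33),     # gains from trade realized
    ("cooperate", "defect"):    (-100, 100),  # I'm exploited, they tile the universe
    ("defect",    "cooperate"): (100, -100),
    ("defect",    "defect"):    (-33, -33),   # race to the bottom
}

def best_response(their_move):
    """My best move if no agreement can be trusted to bind them."""
    return max(("cooperate", "defect"),
               key=lambda my_move: PAYOFFS[(my_move, their_move)][0])

for theirs in ("cooperate", "defect"):
    print(f"if they {theirs}: my best response is {best_response(theirs)}")

# Defection dominates, so without an enforcement or verification mechanism the
# (cooperate, cooperate) outcome isn't stable -- the "paved road" may not exist.
```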