Glad to see this series up! Tons of great points here.
One thing I would add is that I think the analysis of the fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile, as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there’s more to gain. OK.
But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you make, you will still get numerous multiplicative factors wrong, so the result will again be an extreme failure.
On this analysis it’s the worlds where things are non-fragile (e.g. because epistemic enhancement, improved bargaining, and wealth drive society to systematically get things right) that are far more valuable.
Maybe on the hyper-fragile aggregative story it’s easier to 10x the value of the future, but after doing so it will still be a bunch of orders of magnitude off from the optimum. On the feasible convergent optimum story a win gets you the optimum, far better than going from 10^-10 to 10^-9 of the optimum.
So there’s a lot of oomph to be had averting a catastrophic disruption of an otherwise convergent win (e.g. preventing a nasty whimsical permanent dictatorship, AI or human, that does crazy things or has a very bad starting point), but not so much messing around with the hyper-fragile cases.
Glad to see this series up! Tons of great points here.
Thanks! And it’s great to see you back on here!
One thing I would add is that I think the analysis of the fragility of value and intervention impact has a structural problem. Supposing that the value of the future is hyper-fragile, as a combination of numerous multiplicative factors, you wind up thinking the output is extremely low value compared to the maximum, so there’s more to gain. OK.
But a hypothesis of hyper-fragility along these lines also indicates that after whatever interventions you make, you will still get numerous multiplicative factors wrong, so the result will again be an extreme failure.
Well, it depends on how many multiplicative factors. If 100, then yes. If 5, then maybe not. So maybe the sweet spot for impact is where value is multiplicative, but with a relatively small number of multiplicative factors.
And you could act to make the difference in worlds in which society has already gotten all-but-one of the factors correct. Or act such that all the factors are better, in a correlated way.
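A minimal sketch of this in code (purely illustrative numbers, not anything from the essays): suppose each factor independently “goes well” with probability 0.5, and is worth 1 if it does and 0.01 if it doesn’t, and compare a future with 5 factors to one with 100, with and without an intervention that guarantees some of the factors go well.

```python
# Toy multiplicative model (illustrative numbers only): the future's value is
# the product of n factors, each worth 1 if it "goes well" (probability p)
# and eps otherwise.
p, eps = 0.5, 0.01

def expected_value(n_factors, n_guaranteed=0):
    """Expected product of independent factors, when an intervention
    guarantees that n_guaranteed of them go well."""
    per_factor = p * 1.0 + (1 - p) * eps   # expectation of one unaided factor
    return per_factor ** (n_factors - n_guaranteed)

for n in (5, 100):
    print(
        f"{n:3d} factors:"
        f"  baseline={expected_value(n):.2e}"
        f"  fix one={expected_value(n, 1):.2e}"
        f"  fix all but one={expected_value(n, n - 1):.2e}"
    )
```

With 100 independent factors, guaranteeing any one of them merely doubles an astronomically small expectation; with 5 factors, or in worlds where all but one factor is already taken care of, the same kind of intervention moves the expectation by a meaningful fraction of the optimum.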
On this analysis it’s the worlds where things are non-fragile (e.g. because epistemic enhancement, improved bargaining, and wealth drive society to systematically get things right) that are far more valuable.
Great—I make a similar argument in Convergence and Compromise, section 5. (Apologies that the series is so long and interrelated!) I’ll quote the whole thing at the bottom of this comment.
Maybe on the hyper-fragile aggregative story it’s easier to 10x the value of the future, but after doing so it will still be a bunch of orders of magnitude off from the optimum. On the feasible convergent optimum story a win gets you the optimum, far better than going from 10^-10 to 10^-9 of the optimum.
Here I want to emphasise the distinction between two ways in which it could be “easy” to get things right: (i) mostly-great futures are a broad target because of the nature of ethics (e.g. value is bounded, with a low bound); (ii) (some) future beings will converge on the best views and promote them. (This essay, No Easy Eutopia, is about (i), and Convergence and Compromise is about (ii).)
With respect to (ii)-type reasons, I think this argument works.
I don’t think it works with respect to (i)-type reasons, though, because of questions around intertheoretic comparisons. On (i)-type reasons, it’s easier to get to a meaningful % of the optimum because of the nature of ethics (e.g. value is bounded rather than unbounded). But then we need to compare the stakes across different theories. And normalising at the difference in value between 0 and 100% would be a big mistake; its seeming “natural” is just an artifact of the notation we’ve used.
We discuss the intertheoretic comparisons issue in section 3.5 of No Easy Eutopia.
And here’s Convergence and Compromise, section 5:
5. Which scenarios are highest-stakes?
In response to the arguments we’ve given in this essay, and especially the reasons for pessimism about convergence we canvassed in section 2, you might wonder if the practical upshot is that you should pursue personal power-seeking. If a mostly-great future is a narrow target, and you don’t expect other people to AM-converge, then you lose out on most possible value unless the future ends up aligned with almost exactly your values. And, so the thought goes, the only way to ensure that happens is to increase your own power by as much as possible.
However, we don’t think that this is the main upshot. Consider these three scenarios:
1. Even given good conditions, there’s almost no AM-convergence between any sorts of beings with different preferences.
2. Given good conditions, humans generally AM-converge on each other; aliens and AIs generally don’t AM-converge with humans.
3. Given good conditions, there’s broad convergence, where at least a reasonably high fraction of humans and aliens and AIs would AM-converge with each other.
(There are also variants of (2), where “humans” could be replaced with “people sufficiently similar to me”, “co-nationals”, “followers of the same religion”, “followers of the same moral worldview” and so on.)
Though (2) is a commonly held position, we think our discussion has made it less plausible. If a mostly-great future is a very narrow target, then shared human preferences are underpowered for the task of ensuring that the idealising process of different humans goes to the same place. What would be needed is for there to be something about the world itself that would pull different beings towards the same (correct) moral views: for example, if the arguments are much stronger for the correct moral view than for other moral views, or if the value of experiences is present in the nature of experiences, such that by having a good experience one is thereby inclined to believe that that experience is good.[55]
So we think that the more likely scenarios are (1) and (3). If we were in scenario (1) for sure, then we would have an argument for personal power-seeking (although there are plausibly other arguments against power-seeking strategies; this is discussed in section 4.2 of the essay, What to do to Promote Better Futures). But we think that we should act much more on the assumption that we live in scenario (3), for two reasons.
First, the best actions are higher-impact in scenario (3) than in scenario (1). Suppose that you’re in scenario (1), that you currently have 1 billionth of all global power,[56] and that the future is on track to achieve one hundred millionth as much value as if you had all the power.[57] Perhaps via successful power-seeking throughout the course of your life, you could increase your current level of power a hundredfold. If so, then you would ensure that the future has one millionth as much value as if you had all the power. You’ve increased the value of the future by one part in a million.
But now suppose that we’re in scenario (3). If so, you should be much more optimistic about the value of the future. Suppose you think, conditional on scenario (3), that the chance of Surviving is 80%, and that Flourishing is 10%. By devoting your life to the issue, can you increase the chance of Surviving by more than one part in a hundred thousand, or improve Flourishing by more than one part in a million? It seems to me that you can, and, if so, then the best actions (which are non-power-seeking) have more impact in scenario (3) than power-seeking does in scenario (1). More generally, the future has a lot more value in scenario (3) than in scenario (1), and one can often make a meaningful proportional difference to future value. So, unless you’re able to enormously multiply your personal power, you’ll be able to take higher-impact actions in scenario (3) than in scenario (1).
A second, and much more debatable, reason for focusing more on scenario (3) is that you might just care about what happens in scenario (3) more than in scenario (1). Will’s preferences, at least, are such that things are much lower-stakes in general in scenario (1) than they are in scenario (3): he thinks he’s much more likely to have strong cosmic-scale reflective preferences in scenario (3), and much more likely to have reflective preferences that are scope-insensitive and closer to contemporary common sense in scenario (1).
Also note 4.2 in ‘How to Make the Future Better’ (and footnote 32):
[O]n this model, making each factor more correlated can dramatically improve the expected value of the future — without improving the expected value of any individual factor at all.
Which could look like “averting a catastrophic disruption of an otherwise convergent win”.
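The footnote-32 point can be put in the same illustrative terms (again, toy numbers of mine, not from the essay): if each factor is worth 1 with probability p and ε otherwise, then perfect correlation leaves every factor’s marginal expectation unchanged while transforming the expected product:

$$
\mathbb{E}\Big[\prod_{i=1}^{n} X_i\Big] \;=\;
\begin{cases}
\big(p + (1-p)\,\varepsilon\big)^{n} & \text{independent factors},\\[4pt]
p + (1-p)\,\varepsilon^{n} & \text{perfectly correlated factors}.
\end{cases}
$$

With p = 0.5, ε = 0.01 and n = 100, the independent case gives roughly 2 × 10^-30 of the optimum, while the perfectly correlated case gives about 0.5, even though each factor’s own expectation is 0.505 in both cases.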
Thanks Will!