Thank you for organizing this debate!
Here are several questions. They relate to two hypotheses that, if both largely true, could lead impartial longtermists to revise the value of Extinction-Risk reduction downward (potentially by 75% to 90%).
Civ-Saturation Hypothesis: Most resources will be claimed by Space-Faring Civilizations (SFCs) regardless of whether humanity creates an SFC.
Civ-Similarity Hypothesis: Humanity’s Space-Faring Civilization would produce utility similar to that of other SFCs (per unit of resources controlled).
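As a rough sketch of how the two hypotheses could combine (assuming, purely for illustration, that their effects simply multiply):

Value(Extinction-Risk reduction) ≈ (1 − s × r) × Naive value

where s is the fraction of the resources humanity’s SFC would have controlled that other SFCs would claim anyway (Civ-Saturation), and r is the utility other SFCs produce per unit of resources relative to ours (Civ-Similarity). With s around 0.75 to 0.9 and r close to 1, the value of Extinction-Risk reduction falls by roughly 75% to 90%.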
For context, I recently introduced these hypotheses here, and I will publish a few posts with preliminary evaluations of them during the debate week.
General questions:
What are the best arguments against these hypotheses?
Is the AI Safety community already primarily working on reducing Alignment-Risks rather than on reducing Extinction-Risks?
By Alignment-Risks, I mean “increasing the value of futures where Earth-originating intelligent life survives”.
By Extinction-Risks, I mean “reducing the chance of Earth-originating intelligent life going extinct”.
What relative importance is currently given to Extinction-Risks and Alignment-Risks in the EA community? E.g., what are the relative grant allocations?
Should the EA community do more to study the relative priorities of Extinction-Risks and Alignment-Risks, or are we already allocating significant attention to this question?
Specific questions:
Should we prioritize interventions given EDT (or other evidential decision theories) or CDT? How should we deal with uncertainty there?
I am interested in this question because the Civ-Saturation hypothesis may be largely true under EDT (which at least assumes that we control our exact copies, and that such copies exist), but it may be largely incorrect under CDT.
We are deeply uncertain about how the characteristics of the ancestors of space-faring civilizations (e.g., Humanity) would affect the value those civilizations produce in the far future. Given this uncertainty, should we expect it to be hard to argue that Humanity’s future space-faring civilization would produce significantly different value from other space-faring civilizations?
I am interested in this question because I believe we should use the Mediocrity Principle as a starting point when comparing our future potential impact with that of aliens, and because it is likely (and, in practice, proves) very hard to find arguments robust enough to update significantly away from this principle, especially since many arguments reinforce the mediocrity prior (e.g., selection pressures and convergence arguments).
What are our best arguments that Humanity’s space-faring civilization would produce significantly more value than other space-faring civilizations?
How should we aggregate beliefs over possible worlds in which our impact could differ by orders of magnitude (OOMs)?
Here is the formalized solution I use to study the above hypotheses: Decision-Relevance of worlds and ADT implementations
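Here is a minimal toy sketch of the kind of weighting I have in mind (the numbers and the specific weighting rule are hypothetical illustrations, not the actual formalism from the linked post):

```python
# Toy sketch (illustrative assumptions only; not the formalism from the linked post).
# Idea: weight each possible world by prior probability times the number of agents
# whose decisions are correlated with ours in that world (an EDT/ADT-flavoured
# "decision-relevance" weight), then compare the expected counterfactual gain from
# humanity surviving under that weighting vs. under plain prior weighting.

# (name, prior probability, correlated agents, fraction of our resources other SFCs would claim anyway)
worlds = [
    ("empty universe",   0.5,   1, 0.0),   # hypothetical numbers
    ("crowded universe", 0.5, 100, 0.9),   # hypothetical numbers
]

def decision_relevance_weights(worlds):
    """Normalize prior * correlated_agents into weights summing to 1."""
    raw = [prior * agents for _, prior, agents, _ in worlds]
    total = sum(raw)
    return [w / total for w in raw]

# Counterfactual fraction of reachable resources that humanity's survival adds in each world
gains = [1.0 - claimed for _, _, _, claimed in worlds]

priors = [prior for _, prior, _, _ in worlds]
weights = decision_relevance_weights(worlds)

expected_gain_prior_weighted = sum(p * g for p, g in zip(priors, gains))
expected_gain_relevance_weighted = sum(w * g for w, g in zip(weights, gains))

print(f"Expected counterfactual gain, prior-weighted (CDT-flavoured): {expected_gain_prior_weighted:.2f}")
print(f"Expected counterfactual gain, decision-relevance weighted:    {expected_gain_relevance_weighted:.2f}")
```

With these made-up numbers, the crowded world dominates the weighted expectation, which is the basic reason Civ-Saturation can matter much more under EDT-like theories than under CDT.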
Civ-Saturation seems plausible, though only if there are other agents in the affectable universe. I don’t have a good view on this, and yours is probably better.
Civ-Similarity seems implausible. I at least have some control over what humans do in the future, so I can steer things towards the futures I judge best. I don’t have any control over what aliens do. And there are large differences between the best and middling futures, as I argue in Power Laws of Value.
You can find a first evaluation of the Civ-Saturation hypothesis in Other Civilizations Would Recover 84+% of Our Cosmic Resources—A Challenge to Extinction Risk Prioritization. The hypothesis seems pretty accurate as long as you assume EDT.
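For a rough sense of scale, and assuming (as in the sketch in the post) that the two hypotheses combine multiplicatively: with 84% of our cosmic resources recovered by other SFCs and near-equal utility per unit of resources, the value of extinction-risk reduction gets multiplied by about 1 − 0.84 × 1 = 0.16, i.e., it is reduced roughly 6-fold, which is less than one order of magnitude.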
> Civ-Similarity seems implausible. I at least have some control over what humans do in the future
Maybe there is a misunderstanding here. The Civ-Similarity hypothesis is not about having control; it is not about marginal utility. It says that the expected utility (not the marginal utility) produced by a space-faring civilization is similar whether that civilization has human or alien ancestry. The single strongest argument in favour of this hypothesis is that we are too uncertain about how conditioning on human versus alien ancestry changes the utility a space-faring civilization produces in the far future. We are too uncertain to say that U(far future | human ancestry) significantly differs from U(far future | alien ancestry).
No, I don’t think there’s a misunderstanding. It’s more that I think the future could go many different ways with wide variance in expected value, and I can shape the direction the human future goes but I cannot shape the direction that alien futures go.
What do you think about just building misaligned AGI and letting it loose? That seems fairly similar to letting other civilisations take over. (Apologies that I haven’t read your evaluation.)
Thanks! I haven’t read your stuff yet, but it seems like good work; and this has been a reason in my mind for favouring trajectory change over total extinction reduction for a while. It would only reduce the value of extinction risk reduction by an OOM at most, though?
I’m sympathetic to something in the Mediocrity direction (for AI-built civilisations as well as human-built civilisations), but I think it’s very hard to hold a full-blooded Mediocrity principle if you also think that you can take actions today to meaningfully increase or decrease the value of Earth-originating civilisation. Suppose that Earth-originating civilisation’s value is V, and that if we all worked on it we could shift that to V+ or to V-. If so, then which is the right value for the alien civilisation? Choosing V rather than V+ or V- (or V+++ or V---, etc.) seems pretty arbitrary.
Rather, we should think about how good our prospects are compared to a random draw civilisation. You might think we’re doing better or worse, but if it’s possible for us to move the value of the future around, then it seems we should be able to reasonably think that we’re quite a bit better (or worse) than the random draw civ.
> It would only reduce the value of extinction risk reduction by an OOM at most, though?

Right, at most one OOM. Larger updates would require us to learn that the universe is more Civ-Saturated than our current best guess. This could be the case if:
- humanity’s extinction would not prevent another intelligent civilization from appearing quickly on Earth, OR
- intelligent life in the universe is much more frequent than we currently expect (e.g., if we learn that intelligent life can appear around red dwarfs, whose lifespans range from 100 billion to 1 trillion years).
> Choosing V rather than V+ or V- (or V+++ or V---, etc.) seems pretty arbitrary.

I guess, as long as V ~ V+++ ~ V--- (i.e., the relative difference is less than 1%), it is likely not a big issue. However, the relative difference may become large only once we become significantly more certain about the impact of our actions, e.g., if we are the operators choosing the moral values of the first ASI.