Thanks for this, I think I agree with the broad point you’re making.
That is, I agree that basically all the worlds in which space ends up really mattering this century are worlds in which we get transformative AI (because scenarios in which we start to settle widely and quickly are scenarios in which we get TAI). So, for instance, I agree that there doesn’t seem to be much value in accelerating progress on space technology. And I also agree that getting alignment right is basically a prerequisite to any of the longer-term ‘flowthrough’ considerations.
If I’m reading you right I don’t think your points apply to near-term considerations, such as from arms control in space.
It seems like a crux is something like: how much precedent-setting or preliminary research now on ideal governance setups doesn’t get washed out once TAI arrives, conditional on solving alignment? And my answer is something like: sure, probably not a ton. But if you have a reason to be confident that none of it ends up being useful, it feels like that must be a general reason for scepticism about any kind of effort at improving governance, or even values change: a reason to think it will all be rendered moot by the arrival of TAI. And I’m not fully sceptical about those efforts.
Suppose before TAI arrived we came to a strong conclusion: e.g. we’re confident we don’t want to settle using such-and-such a method, or we’re confident we shouldn’t immediately embark on a mission to settle space once TAI arrives. What’s the chance that work ends up making a counterfactual difference, once TAI arrives? Not quite zero, it seems to me.
So I am indeed on balance significantly less excited about working on long-term space governance things than on alignment and AI governance, for the reasons you give. But not so much that they don’t seem worth mentioning.
Ultimately, I’d really like to see [...] More up-front emphasis on the importance of AI alignment as a potential determinant.
This seems like a reasonable point, and one I was/am cognisant of — maybe I’ll make an addition if I get time.
(Happy to try saying more about any of above if useful)
If I’m reading you right I don’t think your points apply to near-term considerations, such as from arms control in space.
That is mostly correct: I wasn’t trying to respond to near-term space governance concerns, such as how to prevent space development or space-based arms races, which I think could indeed play into long-term/x-risk considerations (e.g., undermining cooperation on AI or biosecurity) and may also have near-term consequences (e.g., the destruction of satellites, which would undermine living standards, among other harms).
But if you have a reason to be confident that none of it ends up being useful, it feels like that must be a general reason for scepticism about any kind of effort at improving governance, or even values change: a reason to think it will all be rendered moot by the arrival of TAI. And I’m not fully sceptical about those efforts.
To summarize the point I made in response to Charles (which I think is similar, but correct me if I’m misunderstanding): I think that if an action is trying to improve things now (e.g., health and development, animal welfare, improving current institutional decision-making or social values), it can be justified under neartermist values (even if it might get swamped by longtermist calculations). But if one is trying to figure out “how do we improve governance of space settlements and interstellar travel that could begin 80–200 years from now,” one runs a strong risk of one’s efforts having effectively no impact on affairs 80–200 years from now, because AGI might develop before those efforts ever matter towards the goal, and then humanity either goes extinct or the research is quickly obsolesced.
Ultimately, any model of the future needs to take into account the potential for transformative AI, and many of the pushes, such as for Mars colonization, just do not seem to do that, instead presuming that human-driven (vs. AI-driven) research and efforts will still matter 200 years from now. I’m not super familiar with these discussions, but to me this point stands out starkly as 1) relatively easy to explain (although it may require introducing superintelligence to some people); 2) substantially impactful on ultimate conclusions/recommendations; and 3) frequently neglected in the discussions/models I’ve heard so far. Personally, I would put a point like this among the top 3–5 takeaway bullet points or in a summary blurb, unless there are image/optics reasons to avoid doing so (e.g., causing a few readers to perhaps-unjustifiably roll their eyes and disregard the rest of the problem profile).
Suppose before TAI arrived we came to a strong conclusion: e.g. we’re confident we don’t want to settle using such-and-such a method, or we’re confident we shouldn’t immediately embark on a mission to settle space once TAI arrives. What’s the chance that work ends up making a counterfactual difference, once TAI arrives? Not quite zero, it seems to me.
This is an interesting point worth exploring further, but I think that it’s helpful to distinguish—perhaps crudely?—between two types of problems:
1. Technical/scientific problems and “moral problems”, which are really just “the difficulty of understanding how our actions will relate to our moral goals, including what sub-goals we should have in order to achieve our ultimate moral goals (e.g., maximizing utility, maximizing virtue/flourishing).”
2. Social moral alignment—i.e., getting society to want to make more-moral decisions instead of being self-interested at the expense of others.
It seems to me that an aligned superintelligence would very likely be able to obsolesce every effort we make towards the first problem fairly quickly: if we can design a human-aligned superintelligent AI, we should be able to have it automate or at least inform us on everything from “how do we solve this engineering problem” to “will colonizing this solar system—or even space exploration in general—be good per [utilitarianism/etc.]?”
However, making sure that humans care about other extra-terrestrial civilizations/intelligences—and that the developers of AI care about other humans (and possibly animals)—might require some preparation, such as via moral circle expansion. Additionally, I suppose it is possible that a TAI’s performance on the first problem is not as good as we expect (perhaps due to the second problem), and of course there are other scenarios I described where we can’t rely as much on a (singleton) superintelligence, but my admittedly-inexperienced impression is that such scenarios seem unlikely.