(Low effort comment as I run out the door, but hope it adds value) To me the most compelling argument in favour of tractability is:
We could make powerful AI agents whose goals are well understood and either do not change or update only in ex ante predictable ways.
These agents would be effectively immortal and the most powerful things in the affectable universe, with no natural competition. They would potentially be able to overcome all natural obstacles, so they would determine what happens in the lightcone.
So, we can make powerful AI agents that determine what happens in the lightcone, whose goals are well understood and update in ex ante predictable ways.
So, we can take actions that determine what happens in the lightcone in an ex ante predictable way.
This more or less corresponds to why I think trajectory changes might be tractable, but I think the idea can be spelled out in a slightly more general way: as technology (and especially AI) develops, we can expect to get better at designing institutions that perpetuate themselves. Past challenges to effecting a trajectory change come from erosion of goals due to random and uncontrollable human variation and the chaotic intrusion of external events. Technology may help us make stable institutions that can continue to promote goals for long periods of time.
Here's a shower thought:
If you think extinction risk reduction is highly valuable, then you need some kind of a model of what Earth-originating life will do with its cosmic endowment
Some of the parameters in your model must be related to things other than mere survival, like what this life is motivated by or will attempt to do
Plausibly, there are things you can do to change the values of those parameters and not just the extinction parameter
It won't work for every model (maybe the other parameters just won't budge), but for some of them it should. A toy version is sketched below.
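To make that concrete, here is a minimal toy sketch. The split into a "good values" parameter and two value levels, and all of the numbers, are my own illustrative assumptions rather than anything from a worked-out model; the point is only that expected value factors into a survival term and a "what the survivors do" term, and you can push on either.

```python
# Toy model of the shower thought. All numbers are illustrative assumptions.
# EV = P(survive) * [P(good values) * V_good + (1 - P(good values)) * V_mediocre]

def expected_value(p_survive, p_good_values, v_good, v_mediocre):
    """Expected value of the future, split into a survival parameter and
    'what Earth-originating life does with its endowment' parameters."""
    return p_survive * (p_good_values * v_good + (1 - p_good_values) * v_mediocre)

baseline = expected_value(0.80, 0.30, 100.0, 10.0)

# Intervention A: shave a percentage point off extinction risk.
ev_less_extinction = expected_value(0.81, 0.30, 100.0, 10.0)

# Intervention B: nudge the values parameter instead (a trajectory change).
ev_better_values = expected_value(0.80, 0.33, 100.0, 10.0)

print(ev_less_extinction - baseline)  # ~0.37
print(ev_better_values - baseline)    # ~2.16
```

On these made-up numbers the values nudge beats the extinction nudge; with other numbers it wouldn't, which is just the "maybe the other parameters won't budge" caveat in numerical form.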
If you think extinction risk reduction is highly valuable, then you need some kind of a model of what Earth-originating life will do with its cosmic endowment
No, you don’t, and you don’t even need to be utilitarian, much less longtermist!
Any disagreement about longtermist prioritization should presuppose longtermism
First, you're adding the assumption that the framing must be longtermist, and second, even conditional on longtermism you don't need to be utilitarian, so the supposition that you need a model of what we do with the cosmic endowment would still be unjustified.
You're not going to be prioritizing between extinction risk and long-term trajectory changes based on tractability if you don't care about the far future. And for any moral theory you can ask "why do you think this will be a good outcome?"; as long as you don't value life intrinsically, you'll have to state some empirical hypotheses about the far future.
There is a huge range of “far future” that different views will prioritize differently, and not all need to care about the cosmic endowment at all—people can care about the coming 2-3 centuries based on low but nonzero discount rates, for example, but not care about the longer term future very much.
I don’t understand why that matters. Whatever discount rate you have, if you’re prioritizing between extinction risk and trajectory change you will have some parameters that tell you something about what is going to happen over N years. It doesn’t matter how long this time horizon is. I think you’re not thinking about whether your claims have bearing on the actual matter at hand.
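To sketch why the discount rate doesn't remove that dependence (again with purely illustrative numbers and a horizon I picked for the example): for any rate r, the value at stake over N years is a sum of per-year values discounted by (1 + r)^t, and those per-year values are exactly the "what happens" parameters.

```python
# Sketch: discounted value over an N-year horizon still depends on per-year
# value (what the future is like), not only on whether anyone survives.
# The rate, horizon, and per-year values are illustrative assumptions.

def discounted_value(v_per_year, r, n_years):
    return sum(v_per_year / (1 + r) ** t for t in range(1, n_years + 1))

r, horizon = 0.02, 300  # low but nonzero discount rate, 3-century horizon

print(discounted_value(1.0, r, horizon))  # ~49.9: mediocre trajectory
print(discounted_value(2.0, r, horizon))  # ~99.7: better trajectory, same survival odds
```

Same discount rate and same survival prospects give different answers, so the comparison can't avoid parameters about what happens over those N years.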
It would probably be most useful for you to try to articulate a view that avoids the dilemma I mentioned in the first comment of this thread.
we can make powerful AI agents that determine what happens in the lightcone
I think that you should articulate a view that explains why you think AI alignment of superintelligent systems is tractable, so that I can understand how you think it's tractable to allow such systems to be built. That seems like a pretty fundamental disconnect that makes me not understand your (in my view, facile and unconsidered) argument about the tractability of doing something that seems deeply unlikely to happen.
Well-understood goals in agents that gain power and take over the lightcone are exactly what we'd be addressing with AI alignment, so this seems like an argument for investing in AI alignment—which I think most people would see as far closer to preventing existential risk.
That said, without a lot more progress, building powerful agents with simple goals is actually just a fancy way of guaranteeing a really bad outcome, almost certainly including human extinction.