>suppose that the best attainable futures are 1000 times better than the default non-extinction scenario
This seems rather arbitrary. Why would preventing extinction now guarantee that we (forever) lose that 1000x potential?
>In this toy model, you should only allocate your resources to reducing extinction if it is 10 times more tractable than ensuring we are on track to get the best possible future, at the current margin.
I think it is. Gaining the best possible future requires aligning an ASI, which has not been proven to be even theoretically possible afaik.
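For what it’s worth, here is a minimal sketch of the break-even comparison behind that kind of toy model (my notation and simplifications, not necessarily the post’s): let \(\Delta p\) be the extinction-risk reduction bought by a marginal unit of resources, \(\Delta q\) the increase in the probability of reaching the best future bought by the same resources spent on trajectory change, and \(V_{\text{best}} \approx 1000\, V_{\text{default}}\) as in the quote. Then, ignoring the survival-probability factor on the trajectory side, the comparison at the margin is

\[
\underbrace{\Delta p \cdot \mathbb{E}[V \mid \text{survival}]}_{\text{marginal gain, extinction work}}
\;\;\text{vs.}\;\;
\underbrace{\Delta q \cdot \bigl(V_{\text{best}} - \mathbb{E}[V \mid \text{survival}]\bigr)}_{\text{marginal gain, trajectory work}},
\]

so extinction work wins only if \(\Delta p / \Delta q > \bigl(V_{\text{best}} - \mathbb{E}[V \mid \text{survival}]\bigr) / \mathbb{E}[V \mid \text{survival}]\). The quoted 10x threshold presumably comes from the post’s particular parameter choices for that ratio, which aren’t spelled out in the excerpt.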
It’s super arbitrary! Just trying to pull out your own model.
I give one argument in Power Laws of Value.
One underrated argument for focusing on non-alignment issues like trajectory change is:
- Either alignment is easy or hopeless.
- If it’s hopeless, we shouldn’t work on it.
- If it’s easy, we shouldn’t work on it.
It only makes sense to work on alignment if we happen to fall in the middle, where marginal efforts can make a difference before it’s too late. If you think this space is small, then it doesn’t look like a very tractable problem.
If alignment is hopeless (and I think it is), we should work on preventing ASI from ever being built! That’s what I’m doing.
Oh no! But then we are likely to lose out on almost all value because we won’t have the enormous digital workforce needed to settle the stars. It seems like we should bank on having some chance of solving alignment (at least for some architecture, even if not the current deep learning paradigm) and work towards that at least over the next couple hundred years.
To bank on that we would need to have established at least some solid theoretical grounds for believing it’s possible—do you know of any? I think in fact we are closer to having the opposite: solid theoretical grounds for believing it’s impossible!
I think we can thread the needle by creating strongly non-superintelligent AI systems which can be robustly aligned or controlled. And I agree that we don’t know how to do that at present, but we can very likely get there, even if the proofs of unalignable ASI hold up.
What level of intelligence are you imagining such a system operating at? Some percentile on the scale of top-performing humans? Somewhat above the most intelligent humans?
I think we could do what is required for colonizing the galaxy with systems that are at or under the level of 90th-percentile humans, which addresses the concern that otherwise we “lose out on almost all value because we won’t have the enormous digital workforce needed to settle the stars.”
Agree. But I’m sceptical that we could robustly align or control a large population of such AIs (and how would we cap the population?), especially considering the speed advantage they are likely to have.