I’d be particularly interested in any takes on the probability that civilization will be better equipped to deal with the alignment problem in, say, 100 years. My impression is that there’s an important and not well-examined balance between:
Decreasing runaway AI risk & systemic risks by slowing down AI
Increasing the time of perils
Possibly increasing its intensity by giving malicious actors more time to catch up in destructive capabilities
But also possibly increasing the time for reflection on defense before a worse time of perils.
Possibly decreasing the risk of an aligned AI with bad moral values (conditional on this risk being lower in year 2123)
Possibly increasing the risk of astronomical waste (conditional on this risk being higher if AI is significantly slowed down)
I agree with you this is very important, and I’d like to see more work on it. Sadly I don’t have much concrete to say on this topic. The following is my opinion as a layman on AI:
I’ve found Toby Ord’s framework here https://www.youtube.com/watch?v=jb7BoXYTWYI to be useful for thinking about these issues. I guess I’m an advocate for differential progress, like Ord: prioritizing safety advancements relative to technical advancements. Not stopping work on AI capabilities, but shifting the current balance from capabilities work to safety work right now, and then, in some years or decades once we have figured out alignment, shifting the focus back to capabilities.
My very rough take is that as long as we manage to develop advanced LLMs (e.g. GPT5, 6, 7… and Copilots) slowly and carefully before dangerous AGI, we should use those LLMs to help us with technical alignment work. I think technical alignment work is the current bottleneck of the whole situation: either there are not enough people working on it, or we’re not smart enough to figure it out on our own (but maybe computers could help!).
So, to your points, I think right now (1) runaway AI risk is higher than (2) malicious actors catching up. I don’t know by how much, since I don’t know how well Chinese labs are doing on AI, or whether they could reach big breakthroughs on their own. (And I don’t know how to compare points (1) and (2) to (3) and (4).)
Perhaps somebody could do a BOTEC or build a rough model with some very rough numbers to see where the tradeoff comes out, and put it up for discussion. I’d like to see some work on this.
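To make that concrete, here is a minimal sketch of what such a BOTEC could look like. Every parameter name and number below is a placeholder I made up for illustration, not an estimate; the point is only to show the structure of the comparison between points (1)–(4) above.

```python
# A toy expected-value comparison for the "slow down AI?" tradeoff.
# All parameter values are placeholders for discussion, not estimates.

def net_risk_change(delay_years,
                    runaway_risk_cut_per_year,   # (1) reduction in runaway-AI risk per year of slowdown
                    perils_risk_per_year,        # (2) extra time-of-perils risk per added year
                    bad_values_risk_change,      # (3) change in risk of an aligned AI with bad values
                    astronomical_waste_change):  # (4) change in risk of astronomical waste
    """Negative return value = slowdown looks net positive under these inputs."""
    benefit = delay_years * runaway_risk_cut_per_year
    cost = delay_years * perils_risk_per_year
    return cost - benefit + bad_values_risk_change + astronomical_waste_change

# Purely illustrative numbers:
print(net_risk_change(delay_years=10,
                      runaway_risk_cut_per_year=0.005,
                      perils_risk_per_year=0.002,
                      bad_values_risk_change=-0.01,
                      astronomical_waste_change=0.005))
# -> -0.035, i.e. a 3.5 percentage-point drop in total risk with these made-up inputs.
```

The real exercise would obviously use distributions rather than point values, but even this much structure would make it easier to locate where people actually disagree.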
Sounds reasonable! I think the empirical side of the question “Will society be better equipped to set AI values in 2123?” is the more neglected part. For this purpose, I think “better equipped” can be nicely operationalized in a very value-uncertain way as “making decisions based on more reflection, evidence, and higher-order considerations”.
This kind of exploration may include issues like:
Populism. Has it significantly decreased the amount of rationality that goes into government decision-making, in favor of following incentives and intuitions? And which will be faster: new manipulative technologies, or the rate at which new generations become immune to them?
Demographics. Given that fundamentalists tend to have more children, should we expect there will be more of them in 2123?
Cultural evolution. Is Ian Morris or Christopher Brown more right? That is, should we expect that as we get richer, we’ll become less prone to deciding based on what gives us more power, and in turn attain values better calibrated to the most honest interpretation of reality?
Looking forward to the sequel!