I agree that none of these seem figured out (there's no broad consensus, and I'm personally not hugely confident either way).
Some notes:
We have not figured out how to solve the alignment problem
It seems useful to distinguish the problem of alignment from the problem of ensuring the safety and usefulness of a given AI. It also seems worth distinguishing the safety issues posed by wildly superhuman AI from those posed by the first AIs which are transformatively useful.
It seems plausible to me that you can adequately control and utilize transformatively useful (but not wildly superhuman) AIs even if these AIs are hugely misaligned (e.g. deceptive alignment). See here for a bit more discussion. By transformatively useful, I mean AIs capable of radically accelerating (e.g. a 30x speedup) R&D on key topics like AI safety. It's not clear that using these AIs to speed up cognitive work will suffice for solving the problems that come next, but it at least seems relevant.
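To make the 30x figure concrete, here is a toy back-of-the-envelope sketch. The 10-year baseline and the speedup values are made-up numbers for illustration only, not claims about actual research speeds:

```python
# Toy illustration: calendar time for a fixed research agenda under
# different AI-driven speedup multipliers. All numbers are invented
# purely for illustration.

baseline_years = 10            # hypothetical human-only safety R&D agenda
for speedup in [1, 5, 30]:     # 30x matches the example above
    years = baseline_years / speedup
    print(f"{speedup:>2}x speedup: {years:.2f} years (~{years * 12:.1f} months)")
```

At 30x, a notional 10-year agenda compresses to roughly 4 months of calendar time, which is why an AI that "merely" accelerates R&D could still be transformative.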
We don’t know the exact timelines (I define ‘timelines’ here as the moments when an AI system becomes capable of recursively self-improving). They might range from having already happened to 100 years or more.
I think publicly known AI is already capable of recursively self-improving by contributing to normal ML research; thus, there is just a quantitative question of how quickly (the toy sketch below illustrates this). So, I would use a different operationalization of timelines. See here for more discussion.
(As far as “already happened” goes: it seems very unlikely to me that there are non-publicly known AI systems which are much more capable than current publicly known systems, but much more capable systems might be trained over the next year.)
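To illustrate why "how quickly" is the crux, here is a minimal toy feedback model. The `feedback` parameter and the abstract "progress" units are assumptions invented for illustration; this is a sketch of the quantitative question, not a forecast:

```python
# Toy model of AI-accelerated research feedback. Entirely illustrative:
# 'progress' and 'feedback' are made-up abstract quantities, not forecasts.

def simulate(feedback: float, steps: int = 20) -> float:
    """Accumulate research progress when current progress speeds further work."""
    progress, rate = 0.0, 1.0
    for _ in range(steps):
        progress += rate
        rate = 1.0 + feedback * progress  # AI contributions compound the pace
    return progress

for feedback in [0.0, 0.05, 0.2]:
    print(f"feedback={feedback:.2f}: progress after 20 steps = {simulate(feedback):.1f}")
```

With feedback near zero the trajectory is roughly linear; with stronger feedback it compounds rapidly. On this view the qualitative mechanism (AI contributing to ML research) is already present, and the open empirical question is where the feedback parameter actually sits.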
Ryan, thank you for your thoughts! The distinctions you brought up are something I hadn't thought about yet, so I am going to take a look at the articles you linked in your reply. If I have more to add on this point, I'll do so. There's lots of work ahead to figure out these important things. I hope we have enough time.