If you are interested in AI Risk, could you kindly consider filling out a short (10 min) survey on AI Risk Microdynamics? The hope is that I will be able to use your responses to inform an economic model of that risk in the near future, which I think would fill an important gap in our understanding of AI Risk dynamics.
I have some qualms with the survey wording.
I answered 70% for this question, but the wording doesn’t feel quite right. I put >80% that a sufficiently capable misaligned AI would disempower humanity, but the first AGI deployed is likely not to be maximally capable unless takeoff is really fast. It might be able neither to initiate a pivotal act/process nor to disempower humanity; then, over the next days to years (depending on takeoff speeds), different systems could become powerful enough to disempower humanity.
Such a test might not end the acute risk period, because people might not trust the results and could still deploy misaligned AGI. The test would also have to extrapolate into the real world, farther than any currently existing benchmark. It would probably need to rely on transparency tools far more advanced than what we have today, and because this region of the transparency tech tree also contains alignment solutions, the development of this test should not be treated as uncorrelated with other alignment solutions.
Even so, I think there’s a good chance this test will be very difficult to develop before AGI. The misalignment test and the alignment problem aren’t research problems we are likely to solve independently of AGI; both are dramatically sped up by being able to iterate on AI systems and get more than one try at difficult problems.
Also, conditional on aligned ASI being deployed, I expect this test to be developed within a few days. So the question should say “conditional on AGI not being developed”.
Solving the alignment problem doesn’t mean we can create a provably aligned AGI. Nate Soares says: