Thanks for asking, Toby! Here is my ranking, with quick thoughts:
AGI Battle Royale: Why "slow takeover" scenarios devolve into a chaotic multi-AGI fight to the death. I see this as an argument for takeover risk being very connected to differences in power rather than absolute power, with takeover by a few agents remaining super difficult as long as power is not super concentrated. So I would say efforts to mitigate power inequality will continue to be quite important, although society already seems to be aware of this.
The Leeroy Jenkins principle: How faulty AI could guarantee "warning shots". Quantitative illustration of how warning shots can decrease tail risk a lot. Judging from society's reaction to very small visible harms caused by AI (e.g. self-driving car accidents), it seems to me like each time an AI disaster kills x humans, society will react in such a way that an AI disaster killing 10x humans is made significantly less likely. To illustrate how fast the risk can decrease, if each doubling of deaths is made 50 % as likely, since there are 32.9 (= LN(8*10^9)/LN(2)) doublings between 1 and 8 billion deaths, starting at 1 death caused by AI per year, the annual risk of human extinction would be 1.25*10^-10 (= 0.5^32.9).
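The arithmetic above can be checked in a few lines of Python (a minimal sketch of my toy model, in which each doubling of the death toll halves the probability):

```python
import math

# Number of doublings of the death toll between 1 death and
# 8 billion deaths (the whole human population).
doublings = math.log(8 * 10**9) / math.log(2)  # ~32.9

# Toy model: starting from an annual probability of 1 for a
# 1-death AI disaster, each doubling of the death toll is made
# 50 % as likely as the previous level.
annual_extinction_risk = 0.5 ** doublings  # ~1.25e-10

print(f"doublings: {doublings:.1f}")
print(f"annual extinction risk: {annual_extinction_risk:.3g}")
```

Note that 0.5^(log2(8*10^9)) simplifies exactly to 1/(8*10^9) = 1.25*10^-10, which is why the result comes out so cleanly.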
Chaining the evil genie: why "outer" AI safety is probably easy. Explainer of how arbitrarily intelligent AIs can seemingly be made arbitrarily constrained if they robustly adopt the goals humans give them.
The bullseye framework: My case against AI doom. Good overview of titotal's posts.
How "AGI" could end up being many different specialized AI's stitched together. Good pointers to the importance and plausibility of specialisation, and how this reduces risk.
Bandgaps, Brains, and Bioweapons: The limitations of computational science and what it means for AGI. Good illustration that some problems will not be solved by brute-force computation, though it leaves room for AI to find efficient heuristics (as AlphaFold does).
"Diamondoid bacteria" nanobots: deadly threat or dead-end? A nanotech investigation. Great investigation, but it tackles a specific threat, so the lessons are not as generalisable as those of the other posts.