Zero failures is the preferable outcome, but an AGI escape does not necessarily equate to certain doom. For example, the AI may be irrational (because it’s a lot easier to build the perfect paperclipper than the perfect universal reasoner). Or the AI may calculate that it has to strike before other AIs come into existence, and hence launch a premature attack in the hope that it gets lucky.
As for the nuclear reactors, all I’m saying is that you can build a reactor that is perfectly safe, if you’re willing to spring for the extra money. Similarly, you can build a boxed AGI, if you’re willing to spend the resources on it. I do not dispute that many corporations would try to cut corners if left to their own devices.
Suppose we do survive a failure or two. What then?
Then we get
A) a significant increase in world concern about AGI, leading to higher funding for safe AGI, tighter regulations, and increased incentives to conform to those regulations rather than get a bunch of people killed (and get sued by their families).
and
B) information about what conditions give rise to rogue AGIs, and what mechanisms they will try to use for takeovers.
Both of these things increase the probability of building safe AGI, and decrease the probability of the next AGI attack being successful. Rinse and repeat until AGI alignment is solved.
Agree that those things will happen, but I don’t think it will be enough. “Rinse and repeat until AGI alignment is solved” seems highly unlikely, especially given that we still have no idea how to actually solve alignment for powerful (superhuman) AGI, and still won’t with the information we get from plausible non-existential warning shots. And as I said, if we can’t even ban gain-of-function research after Covid has killed >10M people, against a tiny lobby of scientists with vested interests, what hope do we have of steering a multi-trillion-dollar industry toward genuine safety and security?
“we still have no idea how to actually solve alignment for powerful (superhuman) AGI”
Of course we don’t. AGI doesn’t exist yet, and we don’t know the details of what it’ll look like. Solving alignment for every possible imaginary AGI is impossible; solving it for the particular AGI architecture we end up with is significantly easier. I would honestly not be surprised if it turned out that alignment was a requirement on our path to AGI anyway, so the problem solves itself.
As for gain-of-function research, the story would be different if Covid were provably caused by it. As of now, the only relevance of Covid is reminding us that pandemics are bad, which we already knew.