Great post, including lots of useful things to think through! I do have a couple points I’d like to note, however.
One key assumption I think is questionable is that the choice presented is dangerous AI now or later, as though the two were mutually exclusive. In fact, if we get dangerous AI soon, that doesn’t imply additional and even more dangerous systems won’t be created later. The main reason for believing otherwise is that there will be governance mechanisms put in place after a failure; that does seem likely, but it also argues strongly for trying to figure out how to build those mechanisms as soon as possible, since it’s entirely plausible that “later” isn’t that far away.
Another assumption is that we can’t stop dangerous AI entirely, that is, that we can’t prevent its development long enough to build aligned AI that would be more capable of ensuring safety. To make that assumption, you need to be pessimistic that governance can control these systems for that long; that’s a plausible concern, but not a certainty, and it again raises the earlier point that we could have dangerous AI both soon and later.
there will be governance mechanisms put in place after a failure
Yep, seems reasonably likely, and we sure don’t know how to do this now.
I’m not sure where I’m assuming we can’t pause dangerous AI “development long enough to build aligned AI that would be more capable of ensuring safety”? This is a large part of what I mean by the underlying end-game plan in this post (which I didn’t state super explicitly, sorry), e.g. the centralization point:
centralization is good because it gives this project more time for safety work and securing the world