Yes, that’s a fair summary. I think that perfect alignment is pretty much impossible, as is perfectly rational/bug-free AI. I think the latter fact may give us enough breathing room to get alignment at least good enough to avert extinction.
I feel like it’s more fruitful to talk about specific classes of defects rather than all of them together. You use the word “bug” to mean everything from divide-by-zero crashes to wrong beliefs.
That’s fair. I think that if people were to explore this topic further, it would make sense to separate them out. And good point about the bugginess passage; I’ve edited it to be more accurate.