I quite liked the setup. I generally agree that software development should focus on making sure the thing you’re building does what you want it to do; the only odd part of this characterisation is the terminology. We already perform QA, continual assessment, anomaly detection and resolution, and so on in regular development, and though the terminology is overtly anthropomorphised (“manipulate internal states”, “threat assessment”), it seems to be describing the same thing. Although I’ve written about AI x-risk conversations being eschatology, this is very much the right and sensible approach to take, even if I think the extrapolation to potential doomsday scenarios in the latter half of the essay is quite speculative.