Thank you for writing this! I particularly appreciated hearing your responses to Superintelligence and Human Compatible, and would be very interested to hear how you would respond to The Alignment Problem. TAP is more grounded in modern ML and current research than either of the other books, and I suspect that this might help you form more concrete objections (and/or convince you of some points). If you do read it, please consider sharing your responses.
That said, I don’t think that you have any obligation to read TAP, or to consider thinking about AI safety at all. It sounds like you aren’t drawn to a career in the field, and that’s fine. There are plenty of other ways to do good with an ML skill set. But if you don’t need to weigh working in AI safety against other career options, and you don’t find it interesting or enjoyable to consider, then why focus on forming personal views about AI safety at all?
Edited to add a disclaimer: I provided technical feedback on a draft of TAP, and much of the “AGI safety” section focuses on my team’s work. I still think that it’s a good concrete introduction to the field, because of how specific and well-cited it is, but I also am probably somewhat biased.
Thanks! It will be difficult to write an authentic response to TAP, since these other responses were originally not meant to be public, but I will try to keep the same spirit if I end up writing more about my AI safety journey.
I actually do find AI safety interesting; it just seems that I think about a lot of stuff differently than many people in the field, and it is hard for me to pinpoint why. But the main motivations for spending a lot of time on forming personal views about AI safety are:
I want to understand x-risks better; AI risk is considered important among people who worry a lot about x-risk, and because of my background I should be able to understand the argument for it (better than, say, biorisk)
I find it confusing that understanding the argument is so hard, and that makes me worried (like I explained in the sections “The fear of the answer” and “Friends and appreciation”)
I find it very annoying when I don’t understand why some people are convinced by something, especially if these people are with me in a movement that is important for us all
Thank you for explaining more. In that case, I can understand why you’d want to spend more time thinking about AI safety.
I suspect that much of the reason that “understanding the argument is so hard” is that there isn’t a definitive argument—just a collection of fuzzy arguments and intuitions. The intuitions seem very, well, intuitive to many people, and so they become convinced. But if you don’t share these intuitions, then hearing about them doesn’t convince you. I also have an (academic) ML background, and I personally find some topics (like mesa-optimization) to be incredibly difficult to reason about.
I think that generating more concrete arguments and objections would be very useful for the field, and I encourage you to write up any thoughts that you have in that direction!
(Also, a minor disclaimer that I suppose I should have included earlier: I provided technical feedback on a draft of TAP, and much of the “AGI safety” section focuses on my team’s work. I still think that it’s a good concrete introduction to the field, because of how specific and well-cited it is, but I also am probably somewhat biased.)