Thanks, this is really useful. (I will try to go through this course as well.)
I’m not sure the talk has it quite right, though. My take is that, on the most popular definitions of alignment and capabilities, the two are partly conceptually the same, depending on which intentions we are meant to be aligning with. So it’s not the case that there is an ‘alignment externality’ of capabilities improvements, but rather that some alignment improvements are capabilities improvements, by definition.