Hi Zeke!
Thanks for the link about the Fermi paradox. Obviously I could not hope to address all the arguments about this issue in my critique here. All I meant to establish is that Bostrom’s argument does rely on particular views about the resolution of that paradox.
You say ‘it is tautologically true that agents are motivated against changing their final goals, this is just not possible to dispute’. Respectfully, I just don’t agree. It all hinges on what is meant by ‘motivation’ and ‘final goal’. You also say ‘it just seems clear that you can program an AI with a particular goal function and that will be all there is to it’, and again I disagree. A narrow AI, sure, or even a highly competent one, but not an AI with human-level competence across all cognitive activities. Such an AI would have the ability to reflect on its own goals and motivations, because humans have that ability, and therefore it would not be ‘all there is to it’.
Regarding your last point, what I was getting at is that an agent can change a goal either by explicitly rejecting it and choosing a new one, or by changing its interpretation of an existing goal. This latter method is an alternative path by which an AI could change its goals in practice, even if it still regarded itself as following the same goals it was programmed with. My point isn’t that this makes goal alignment not a problem. My point is that it makes ‘an AI will never change its goals’ an implausible position.