Thanks for writing this!
I think you are pointing out some important imprecisions, but some of your arguments aren’t as conclusive as you present them to be:
“Bostrom therefore faces a dilemma. If intelligence is a mix of a wide range of distinct abilities as in Intelligence(1), there is no reason to think it can be ‘increased’ in the rapidly self-reinforcing way Bostrom speaks about (in mathematical terms, there is no single variable which we can differentiate and plug into the differential equation, as Bostrom does in his example on pages 75-76).”
Those variables could be reinforcing each other, as one could argue they did in the evolution of human intelligence (in mathematical terms, there is a runaway dynamic similar to the one-dimensional case for a linear vector-valued differential equation, as long as all the eigenvalues are positive).
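To spell out the kind of dynamics I have in mind (my own sketch, not anything from the book): write the distinct abilities as a vector x(t) and let each ability’s growth rate depend linearly on the current levels of the others,

```latex
% Sketch: coupled linear growth of several abilities (my assumption, not Bostrom's model)
\frac{d\mathbf{x}}{dt} = A\,\mathbf{x}(t), \qquad \mathbf{x}(t) = e^{At}\,\mathbf{x}(0).
```

If every eigenvalue of the coupling matrix A is positive, every mode grows exponentially, and the whole bundle of abilities runs away together, much as a single self-reinforcing variable does in the scalar case dx/dt = kx.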
“This should become clear if one considers that ‘essentially all human cognitive abilities’ includes such activities as pondering moral dilemmas, reflecting on the meaning of life, analysing and producing sophisticated literature, formulating arguments about what constitutes a ‘good life’, interpreting and writing poetry, forming social connections with others, and critically introspecting upon one’s own goals and desires. To me it seems extraordinarily unlikely that any agent capable of performing all these tasks with a high degree of proficiency would simultaneously stand firm in its conviction that the only goal it had reasons to pursue was tiling the universe with paperclips.”
Why does it seem unlikely? Also, do you mean unlikely as in “agents emerging in a world similar to ours as it is now probably won’t have this property”, or as in “given that someone figured out how to construct a great variety of superintelligent agents, she would still have trouble constructing an agent with this property”?
Thanks for your thoughts.
Regarding your first point, I agree that the situation you posit is a possibility, but it isn’t something Bostrom talks about (and remember that I focused only on what he argued, not on other possible expansions of the argument). Also, when we consider the possibility of numerous distinct cognitive abilities, it is just as possible that there could be complex interactions which inhibit the growth of particular abilities. There could easily be dozens of separate abilities, and the full matrix of interactions becomes very complex. The original force of the ‘rate of growth of intelligence is proportional to current intelligence, leading to exponential growth’ argument is, in my view, substantively blunted.
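To put this in the same linear terms you used (purely a toy illustration of my own, not anything from Bostrom): take just two abilities, each self-reinforcing but each inhibiting the other,

```latex
% Toy example (my assumption): positive self-reinforcement on the diagonal,
% inhibitory cross-interactions off the diagonal
\frac{d\mathbf{x}}{dt} = A\,\mathbf{x}(t), \qquad
A = \begin{pmatrix} 1 & -2 \\ -2 & 1 \end{pmatrix},
\qquad \lambda_1 = 3,\; \lambda_2 = -1.
```

Here the ‘both abilities grow together’ direction (the eigenvector (1,1), with eigenvalue −1) actually decays; the only runaway mode is the lopsided one, (1,−1), in which one ability advances at the expense of the other. With dozens of abilities and a mixed-sign interaction matrix, there is no guarantee that the eigenstructure favours the uniform, across-the-board growth the scalar argument takes for granted.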
Regarding your second point, it seems unlikely to me because if an agent had all these abilities, I believe they would use them to uncover reasons to reject highly reductionistic goals like tiling the universe with paperclips. They might end up with goals that are still in opposition to human values, but I just don’t see how an agent with these abilities would not become dissatisfied with extremely narrow goals.