Weak upvote for engaging seriously with content and linking to the other parts of the argument.
On the other hand, while it’s good to see complex arguments on the Forum, it’s difficult to discuss pieces that are written without very many headings or paragraph breaks. It’s generally helpful to break down your piece into labelled sections so that people can respond unambiguously to various points. I also think this would help you make this argument across fewer than five posts, which would also make discussion easier.
I’m not the best-positioned person to comment on this topic (hopefully someone with more expertise will step in and correct both of our misconceptions), but these sections stood out:
To see how these two arguments rest on different conceptions of intelligence, note that considering Intelligence(1), it is not at all clear that there is any general, single way to increase this form of intelligence, as Intelligence(1) incorporates a wide range of disparate skills and abilities that may be quite independent of each other. As such, even a superintelligence that was better than humans at improving AIs would not necessarily be able to engage in rapidly recursive self-improvement of Intelligence(1), because there may well be no such thing as a single variable or quantity called ‘intelligence’ that is directly associated with AI-improving ability.
Indeed, there may be no variable or quantity like this. But I’m not sure there isn’t, and it seems really, really important to be sure before we write off the possibility. We don’t understand human reasoning very well; it seems plausible to me that there really are a few features of the human mind that account for nearly all of our reasoning ability. (I think the “single quantity” thing is a red herring; an AI could make self-recursive progress on several variables at once.)
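To make this concrete with a deliberately toy model (everything here, including the capability names and the growth rule, is invented purely for illustration): even if there is no single "intelligence" variable, an agent whose several distinct capabilities each feed back into improving the others can still show compounding growth.

```python
# Toy sketch of recursive improvement over several distinct capabilities.
# All numbers and dynamics are made up for illustration; this is not a
# model of any real AI system.

def step(caps, rate=0.01):
    """Each capability improves in proportion to the average of all of them."""
    avg = sum(caps) / len(caps)
    return [c + rate * avg for c in caps]

# Three hypothetical, independent capabilities (say: planning, learning,
# and self-modification) that each help improve the others.
caps = [1.0, 1.0, 1.0]
for _ in range(500):
    caps = step(caps)

# Growth still compounds, even though no single scalar "intelligence" exists.
```

The interesting question is which coupling structure real systems actually have; the toy model only shows that "several variables instead of one" doesn't by itself rule out compounding growth.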
To give a silly human example, I’ll name Tim Ferriss, who has used the skills of “learning to learn”, “ignoring ‘unwritten rules’ that other people tend to follow”, and “closely observing the experience of other skilled humans” to learn many languages, become an extremely successful investor, write a book that sold millions of copies before he was well-known, and so on. His IQ may not be higher now than when he began, but his end results look like the end results of someone who became much more “intelligent”.
Tim has done his best to break down “human-improving ability” into a small number of rules. I’d be unsurprised to see someone use those rules to improve their own performance in almost any field, from technical research to professional networking.
Might the same thing be true of AI—that a few factors really do allow for drastic improvements in problem-solving across many domains? It’s not at all clear that it isn’t.
If, however, we adopt the much more expansive conception of Intelligence(1), the argument becomes much less defensible. This should become clear if one considers that ‘essentially all human cognitive abilities’ includes such activities as pondering moral dilemmas, reflecting on the meaning of life, analysing and producing sophisticated literature, formulating arguments about what constitutes a ‘good life’, interpreting and writing poetry, forming social connections with others, and critically introspecting upon one’s own goals and desires. To me it seems extraordinarily unlikely that any agent capable of performing all these tasks with a high degree of proficiency would simultaneously stand firm in its conviction that the only goal it had reasons to pursue was tilling the universe with paperclips.
Some of the world’s most famous intellectuals have made what most people in the EA community would see as bizarre or dangerous errors in moral reasoning. It’s possible for someone to have a deep grasp of literature, a talent for moral philosophy, and great social skills—and still have desires that are antithetical to sentient well-being (there are too many historical examples to count).
Motivation is a strange thing. Much of the world, including some of those famous intellectuals I mentioned, believes in religious and patriotic ideals that don’t seem “rational” to me. I’m sure there are people far more intelligent than I who would like to tile the world with China, America, Christianity, or Islam, and who are unlikely to break from this conviction. The ability to reflect on life, like the ability to solve problems, often seems to have little impact on how easily you can change your motivations.
It’s also important not to take the “paperclip” example too seriously. It’s meant to be absurd in a fun, catchy way, but also to stand in for the class of “generally alien goals”, which are often much less ridiculous.
If an AI were to escape the bonds of human civilization and begin harvesting all of the sun’s energy for some eldritch purpose, it’s plausible to me that the AI would have a very good reason (e.g. “learn about the mysteries of the universe”). However, this doesn’t mean that its good reason has to be palatable to any actual humans. If an AI were to decide that existence is inherently net-negative and begin working to end life in the universe, it would be engaging in deep, reflective philosophy (and might even be right in some hard-to-fathom way), but that would be little comfort to us.
[Sorry for picking out a somewhat random point unrelated to the main conversation. This just struck me because I feel like it’s similar to a divergence in intuitions I often notice between myself and other EAs and particularly people from the ‘rationalist’ community. So I’m curious if there is something here it would be valuable for me to better understand.]
To give a silly human example, I’ll name Tim Ferriss, who has used the skills of “learning to learn”, “ignoring ‘unwritten rules’ that other people tend to follow”, and “closely observing the experience of other skilled humans” to learn many languages, become an extremely successful investor, write a book that sold millions of copies before he was well-known, and so on. His IQ may not be higher now than when he began, but his end results look like the end results of someone who became much more “intelligent”.
Tim has done his best to break down “human-improving ability” into a small number of rules. I’d be unsurprised to see someone use those rules to improve their own performance in almost any field, from technical research to professional networking.
Here is an alternative hypothesis, a bit exaggerated for clarity:
There are a large number of people who try to be successful in various ways.
While trying to be successful, people tend to confabulate explicit stories for what they’re doing and why it might work, for example “ignoring ‘unwritten rules’ that other people tend to follow”.
These confabulations are largely unrelated to the actual causes of success, or at least don’t refer to them in a way nearly as specific as they seem to do. (E.g., perhaps a cause could be ‘practicing something in an environment with frequent and accurate feedback’, while a confabulation would talk about quite specific and tangential features of how this practice was happening.)
Most people actually don’t end up having large successes, but a few do. We might be pulled to think that their confabulations about what they were doing are insightful or worth emulating, but in fact it’s a mix of survivorship bias and the fact that people with certain innate traits (IQ, conscientiousness, perhaps excitement-seeking, …) do better, even though those traits never appear in the confabulations.
Do you think we have evidence that this alternative hypothesis is false?
I think the truth is a mix of both hypotheses. I don’t have time to make a full response, but some additional thoughts:
It’s very likely that there exist reliable predictors of success that extend across many fields.
Some of these are innate traits (intelligence, conscientiousness, etc.)
But if you look at a group of people in a field who have very similar traits, some will still be more successful than others. Some of this inequality will be luck, but some of it seems like it would also be related to actions/habits/etc.
Some of these actions will be trait-related (e.g. “excitement-seeking” might predict “not following unwritten rules”). But it should also be possible to take the right actions even if you aren’t strong in the corresponding traits; there are ways you can become less bound by unwritten rules even if you don’t have excitement-seeking tendencies. (A concrete example: Ferriss sometimes recommends practicing requests in public to get past worries about social faux pas—e.g. by asking for a discount on your coffee. CFAR does something similar with “comfort zone expansion”.)
No intellectual practice/“rule” is universal—if many people tried the sorts of things Tim Ferriss tried, most would fail or at least have a lot less success. But some actions are more likely than others to generate self-improvement/success, and some actions seem like they would make a large difference (for example, “trying new things” or “asking for things”).
One (perhaps pessimistic) picture of the world could look like this:
Most people are going to be roughly as capable/successful as they are now forever, even if they try to change, unless good or bad luck intervenes
Some people who try to change will succeed, because they expose themselves to the possibility of good luck (e.g. by starting a risky project, asking for help with something, or giving themselves the chance to stumble upon a habit/routine that suits them very well)
A few people will succeed whether or not they try to change, because they won the trait lottery, but within this group, trying to change in certain ways will still be associated with greater success.
One of Ferriss’s stated goals is to look at groups of people who succeed at X, then find people within those groups who have been unexpectedly successful. A common interview question: “Who’s better at [THING] than they should be?” (For example, an athlete with an unusual body type, or a startup founder from an unusual background.) You can never take luck out of the equation completely, especially in the complex world of intellectual/business pursuits, but I think there’s some validity to the common actions Ferriss claims to have identified.
Thanks for your thoughts. Regarding spreading my argument across 5 posts, I did this in part because I thought connected sequences of posts were encouraged?
Regarding the single quantity issue, I don’t think it is a red herring, because if there are multiple distinct quantities then the original argument for self-sustaining rapid growth becomes significantly weaker (see my responses to Flodorner and Lukas for more on this).
You say “Might the same thing be true of AI—that a few factors really do allow for drastic improvements in problem-solving across many domains? It’s not at all clear that it isn’t.” I believe we have good reason to think no such few factors exist, because A) this does not seem to be how human intelligence works, and B) it does not seem to be consistent with the history of progress in AI research. Both, I would say, are characterised by many different functionalities or optimisations for particular tasks. That’s not to say there are no general principles, but I think these are not as extensive as you seem to believe. Regardless of this point, though, if Bostrom’s argument is to succeed, I think he needs to give persuasive reasons or evidence as to why we should think such factors exist. It’s not sufficient just to argue that they might.
Connected sequences of posts are definitely encouraged, as they are sometimes the best way to present an extensive argument. However, I’d generally recommend that someone make one post over two short posts if they could reasonably fit their content into one post, because that makes discussion easier.
In this case, I think the content could have been fit into fewer posts (not just one, but fewer than five) had the organization system been a bit different, but this isn’t meant to be a strong criticism—you may well have chosen the best way to sort your arguments. The critique I’m most sure about is that your section on “the nature of intelligence” could have benefited from being broken down a bit more, with more subheadings and/or other language meant to guide readers through the argument (similarly to the way you presented Bostrom’s argument in the form of a set of premises, which was helpful).