Maybe you’re claiming that AI risk proponents reject analogies in general when someone uses an analogy that supports the opposite conclusion, but accept the validity of analogies when they support their own conclusion. If this were the case, it would be bad, but I don’t actually think this is what is happening.
Then perhaps you can reply to the examples I used in the post when arguing that analogies are often used selectively? I named two examples: (1) a preference for an analogy to chimps rather than to golden retrievers when arguing about AI alignment, and (2) a preference for an analogy to human evolution rather than an analogy to within-lifetime learning when arguing about inner misalignment.
I do think that a major element of my thesis is that many analogies appear to be chosen selectively. While I advocate that we should not merely switch analogies, I think that if we are going to use analogies-as-arguments anyway, then we should try to find the ones that are most plausible and natural. And I don’t currently see much reason to prefer the chimp and evolution analogies over their alternatives in that case.
I actually thought that the discussion of the chimp analogy was handled pretty well in the podcast. Ajeya brought up that example and then Rob explicitly brought up an alternate mental model of it being a tool (like Google Maps). Discussing multiple possible mental models is exactly what you want to be doing to guard against biases. I agree that it would have been nice to discuss an analogy more like a golden retriever or kid as well, but there are always additional issues that could be discussed.
I agree Ajeya didn’t really provide her reasons for seeing the chimp analogy as useful there, but I think it’s valuable as a way of highlighting the AI equivalent of the nature vs. nurture debate. Many people talk about AIs using the analogy of children, and they assume that we can produce moral AIs just by treating them well or copying good human parenting strategies. I think the chimp analogy is useful as a way of highlighting that appearances can be deceiving.
I actually thought that the discussion of the chimp analogy was handled pretty well in the podcast. Ajeya brought up that example and then Rob explicitly brought up an alternate mental model of it being a tool (like Google Maps)
The tool analogy appeared to have been brought up as a way of strawmanning/weakmanning people who disagree with them. I think the analogy to Google Maps is not actually representative of how most intelligent AI optimists reason about AI as of 2023 (even if Holden Karnofsky used it in 2012, before the deep learning revolution). The full quote was:
Rob Wiblin: Right. I guess the idea there is that you might think that the chimp is learning that people are to be trusted and it’s all good, but it’s a different mind that thinks differently and draws different conclusions, and it might have particular tendencies that are not obvious to you, particular impulses that are not relatable to you.
The shrinking number of people who are not troubled by any of this at all, I assume that most of them have a different analogy in mind, which is like a can opener or a toaster. OK, that’s a little bit silly. To be more sympathetic, the analogy that they have in their mind is that this is a tool that we’ve made, that we’ve designed.
Ajeya Cotra: Like Google Maps.
Rob Wiblin: Like Google Maps. “We designed it to do the thing that we want. Why do you think it’s going to spin out of control? Tools that we’ve made have never spun out of control and started acting in these bizarre ways before.” If the analogy you have in mind is something like Google Maps, or your phone, or even like a recommendation algorithm, it makes sense that it’s going to seem very counterintuitive in that case to think that it’s going to be dangerous. It’ll be way less intuitive in that case than in the case where you’re thinking about raising a gorilla.
Ajeya Cotra: Yeah. I think the real disanalogy between Google Maps and all of this stuff and AI systems is that we are not producing these AI systems in the same way that we produced Google Maps: by some human sitting down, thinking about what it should look like, and then writing code that determines what it should look like.
Many people talk about AIs using the analogy of children, and they assume that we can produce moral AIs just by treating them well or copying good human parenting strategies. I think the chimp analogy is useful as a way of highlighting that appearances can be deceiving.
As I said in the post, I think the chimp analogy can be good for conveying the logical possibility of misalignment. Indeed, appearances can be deceiving. But I don’t see any particularly strong reasons to think appearances actually are deceiving here. What evidence is there that AIs won’t actually just be aligned by default given good “parenting strategies”, i.e., reasonably good training regimes? (And again, I’m not saying AIs will necessarily be aligned by default. I just think this question is uncertain, and I don’t think the chimp analogy is actually useful as a mental model of the situation here.)
I think that neither of those is a selective use of analogies. They do point to similarities between things we have access to and future ASI that you might not think are valid similarities, but that is one thing that makes analogies useful: they can make locating disagreements in people’s models very fast, since they’re structurally meant to transmit information in a highly compressed fashion.
There are lots of people who think about AI as a tool.
A lot of people think about AI in all sorts of inaccurate ways, including those who argue for AI pessimism. “AI is like Google Maps” is not at all how most intelligent AI optimists such as Nora Belrose, Quintin Pope, Robin Hanson, and so on, think about AI in 2024. It’s a weakman, in a pretty basic sense.