Thanks so much, that’s great to hear! I’ll answer your first question in this comment and leave a separate reply for your Murphyjitsu question.
First of all, I definitely agree that the difference between 2050 and 2032 is a big deal and worth getting to the bottom of; it would make a difference to Open Phil’s prioritization (and internally we’re trying to do projects that could convince us of timelines significantly shorter than in my report). You may be right that it could have a counterintuitively small impact on many individual people’s career choices, for the reasons you say, but I think many others (especially early career people) would and should change their actions substantially.
I think there are roughly three types of reasons why Bob might disagree with Alice about a bottom line conclusion like TAI timelines, which correspond to three types of research or discourse contributions Bob could make in this space:
1. Disagreements can come from Bob knowing more facts than Alice about a key parameter, which can allow Bob to make “straightforward corrections” to Alice’s proposed value for that parameter. E.g., “You didn’t think much about hardware, but I did a solid research project into hardware and I think experts would agree that because of optical computing, progress will be faster than you assumed; changing to the better values makes timelines shorter.” If Bob does a good enough job with this empirical investigation, Alice will often just say “Great, thanks!” and adopt Bob’s number.
2. Disagreements can come from Bob modeling out a part of the world in more mechanistic detail that Alice fudged or simplified, which can allow Bob to propose a better structure than Alice’s model. E.g., “You agree that earlier AI systems can generate revenue which can be reinvested into AI research, but you didn’t explicitly model that and just made a guess about the spending trajectory; I’ll show that accounting for this properly would make timelines shorter.” Alice may feel some hesitance about adopting Bob’s model wholesale here, because Alice’s model may fudge/elide one thing in an overly-conservative direction which she feels is counterbalanced by fudging/eliding another thing in an overly-aggressive direction, but it will often be tractable to argue that the new model is better, and Alice will often be happy to adopt it (perhaps changing some other fudged parameters a little to preserve intuitions that seemed important to her). (A toy sketch of this kind of structural change follows this list.)
3. Finally, disagreements can come from differences in intuition about the subjective weight different considerations should get when coming up with values for the more debatable parameters (such as the different biological anchor hypotheses). It’s more difficult for Bob to make a contribution toward changing Alice’s bottom line here, because a lot of the action is in hard-to-access mental alchemy going on in Alice’s and Bob’s minds when they make difficult judgment calls. Bob can try to reframe things, offer intuition pumps, trace disagreements about one topic back to a deeper disagreement about another topic and argue about that, and so on, but he should expect it to be slow going and expect Alice to be pretty hard to move.
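To make the category 2 example a little more concrete, here is a purely hypothetical toy sketch in Python (nothing in it comes from the report or this thread; the function names, growth rates, and reinvestment fractions are all made up for illustration). It contrasts a fixed-guess spending trajectory with one in which AI revenue is explicitly reinvested into further AI spending, the kind of structural change Bob might propose:

```python
# Purely illustrative toy model (not from the report; every number here is made up).
# It contrasts Alice's "fudged" fixed spending trajectory with Bob's proposal to
# explicitly model AI revenue being reinvested into AI R&D spending.

def fixed_guess_spending(years: int, start: float = 1.0, growth: float = 1.2) -> list[float]:
    """Alice's shortcut: assume spending grows at a constant guessed rate."""
    return [start * growth ** t for t in range(years)]


def reinvestment_spending(years: int, start: float = 1.0, base_growth: float = 1.1,
                          revenue_per_spend: float = 0.5, reinvest_frac: float = 0.6) -> list[float]:
    """Bob's proposal: part of each year's AI revenue is reinvested, compounding spending."""
    spending = [start]
    for _ in range(years - 1):
        revenue = revenue_per_spend * spending[-1]           # revenue generated by current AI systems
        spending.append(spending[-1] * base_growth           # baseline growth in spending
                        + reinvest_frac * revenue)           # plus the reinvested share of revenue
    return spending


if __name__ == "__main__":
    for year, (a, b) in enumerate(zip(fixed_guess_spending(10), reinvestment_spending(10))):
        print(f"year {year}: fixed guess {a:.2f} vs reinvestment model {b:.2f}")
```

Under these made-up parameters the reinvestment trajectory compounds faster than the fixed guess, which is the direction of the “timelines get shorter” claim in the example; the point is only to illustrate what a category 2 structural proposal looks like, not to endorse any particular numbers.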
In my experience, most large and persistent disagreements between people about big-picture questions like TAI timelines or the magnitude of risk from AI are of the third kind, and these disagreements can be entangled with dozens of other differences in background assumptions / outlook / worldview. My sense is that your biggest disagreements with me fall into the third category: you think that I’m overweighting the hypothesis that we’d need to do meta-learning in which the “inner loop” takes a long subjective time; you may also think that I’m underweighting the possibility of sudden takeoff or overweighting the efficiency of markets in a certain way, which leads me to lend too much credence to considerations like “Well, if the low end of the compute range is actually right, we should probably be seeing more economic impact from the slightly-smaller AI systems right now.” If you were to change my mind on this, it might not even be from doing “timelines research”: maybe you do “takeoff speeds research” that convinces me to take sudden takeoff more seriously, which in turn causes me to take shorter timelines (which would imply more sudden takeoff) more seriously.
I’d say tackling category 3 disagreements is high risk and effort but has the possibility of high reward, and tackling category 1 disagreements is lower risk and effort with more moderate reward. My subjective impression is that EAs tend to under-invest in tackling categories 1 and 2 because they perceive category 3 as where the real action is—in some sense they’re right about that, but they may underestimate how hard it’ll be to change people’s minds there. For example, changing someone’s mind about a category 3 disagreement often greatly benefits from having a lot of face time with them, which isn’t very scalable, and arguments may be more particular to individuals: what finally convinces Alice may not move Charlie.
I think one potential way to get at a category 3 disagreement about a long-term forecast is by proposing bets about nearer-term forecasts, although I think this is often a lot harder than it sounds, because people are sensitive to the possibility of “losing on a technicality”: they were right about the big picture but wrong about how that big picture actually translates to a near-term prediction. Even making short-term bets often benefits from having a lot of face time to hash out the terms.
It occurred to me that another way to try to move someone on complicated category 3 disagreements might be to put together a well-constructed survey of a population that the person is inclined to defer to. This approach is definitely still tricky: you’d have to convince the person that the relevant population was provided with the strongest arguments for that person’s view in addition to your counterarguments, and that the individuals surveyed were thinking about it reasonably hard. But if done well, it could be pretty powerful.
Thanks, this was a surprisingly helpful answer, and I had high expectations!
This is updating me somewhat towards doing more blog posts of the sort that I’ve been doing. As it happens, I have a draft of one that is very much Category 3, let me know if you are interested in giving comments!
Your sense of why we disagree is pretty accurate, I think. The only thing I’d add is that I do think we should update downwards on low-end compute scenarios because of market efficiency considerations, just not as strongly as you do, perhaps. Moreover, I also think that we should update upwards for various reasons (the surprising recent successes of deep learning, the fact that big corporations are investing heavily by historical standards in AI, the fact that various experts think they are close to achieving AGI), and the upwards update mostly cancels out the downwards update, IMO.
Update: The draft I mentioned is now a post!