It seems that half of these examples are from 15+ years ago, from a period for which Eliezer has explicitly disavowed his opinions (and the ones that are not strike me as most likely correct, like treating coherence arguments as forceful and that AI progress is likely to be discontinuous and localized and to require relatively little compute).
Let’s go example-by-example:
1. Predicting near-term extinction from nanotech
This critique strikes me as about as sensible as digging up someone’s old high-school essays and critiquing their stance on communism or the criminal justice system. I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old. I am confident I can find crazier and worse opinions for every single leadership figure in Effective Altruism, if I am willing to go back to what they thought while they were in high-school. To give some character, here are some things I believed in my early high-school years:
The economy was going to collapse because the U.S. was establishing a global surveillance state
Nuclear power plants are extremely dangerous and any one of them is quite likely to explode in a given year
We could have easily automated the creation of all art, except for the existence of a vaguely defined social movement that tries to preserve the humanity of art-creation
These are dumb opinions. I am not ashamed of having had them. I was young and trying to orient in the world. I am confident other commenters can add their own opinions they had when they were in high-school. The only thing that makes it possible for someone to critique Eliezer on these opinions is that he was virtuous and wrote them down, sometimes in surprisingly well-argued ways.
If someone were to dig up an old high-school essay of mine, in particular one that has at the top written “THIS IS NOT ENDORSED BY ME, THIS IS A DUMB OPINION”, and use it to argue that I am wrong about important cause prioritization questions, I would feel deeply frustrated and confused.
For context, on Eliezer’s personal website it says:
My parents were early adopters, and I’ve been online since a rather young age. You should regard anything from 2001 or earlier as having been written by a different person who also happens to be named “Eliezer Yudkowsky”. I do not share his opinions.
2. Predicting that his team had a substantial chance of building AGI before 2010
Given that this is only 2 years later, all my same comments apply. But let’s also talk a bit about the object-level here.
This is the quote on which this critique is based:
Our best guess for the timescale is that our final-stage AI will reach transhumanity sometime between 2005 and 2020, probably around 2008 or 2010. As always with basic research, this is only a guess, and heavily contingent on funding levels.
This… is not a very confident prediction. This paragraph literally says “only a guess”. I agree, if Eliezer said this today, I would definitely dock him some points, but this is again a freshman-aged Eliezer, and it was more than 20 years ago.
But also, I don’t know, predicting AGI by 2020 from the year 2000 doesn’t sound that crazy. If we didn’t have a whole AI winter, if Moore’s law had accelerated a bit instead of slowed down, if more talent had flowed into AI and chip-development, 2020 doesn’t seem implausible to me. I think it’s still on the aggressive side, given what we know now, but technological forecasting is hard, and the above sounds more like a 70% confidence interval instead of a 90% confidence interval.
3. Having high confidence that AI progress would be extremely discontinuous and localized and not require much compute
This opinion strikes me as approximately correct. I still expect highly discontinuous progress, and many other people have argued for this as well. Your analysis that the world looks more like the world Hanson described in the AI FOOM debate also strikes me as wrong (and e.g. Paul Christiano has also said that Hanson’s predictions looked particularly bad in the FOOM debate. EDIT: I think this was worded too strongly, and while Paul had some disagreements with Robin, on the particular dimensions of discontinuity and competitiveness Paul thinks Robin came away looking better than Eliezer). Indeed, I would dock Hanson many more points in that discussion (though, overall, I give both of them a ton of points, since they both recognized the importance of AI-like technologies early, and performed vastly above baseline for technological forecasting, which again, is extremely hard).
This seems unlikely to be the right place for a full argument on discontinuous progress. However, continuous takeoff is very far from consensus in the AI Alignment field, and this post seems to try to paint it as the consensus, which seems pretty bad to me (especially if it’s used in a list with two clearly wrong things, without disclaiming it as such).
4. Treating early AI risk arguments as close to decisive
You say:
My point, here, is not necessarily that Yudkowsky was wrong, but rather that he held a much higher credence in existential risk from AI than his arguments justified at the time. The arguments had pretty crucial gaps that still needed to be resolved[14], but, I believe, his public writing tended to suggest that these arguments were tight and sufficient to justify very high credences in doom.
I think the arguments are pretty tight and sufficient to establish the basic risk argument. I found your critique relatively uncompelling. In particular, I think you are misrepresenting that a premise of the original arguments was a fast takeoff. I can’t currently remember any writing that said it was a necessary component of the AI risk arguments that takeoff happens fast, or at least that the distinction between “AI vastly exceeds human intelligence in 1 week vs 4 years” is that crucial to the overall argument, which is, as far as I can tell, the range that most current opinions in the AI Alignment field fall into (and importantly, I know of almost no one who believes that it could take 20+ years for AI to go from mildly subhuman to vastly superhuman, which does feel like it could maybe change the playing field, but also seems to be a very rarely held opinion).
Indeed, I think Eliezer was probably underconfident in doom from AI, since I currently assign >50% probability to AI Doom, as do many other people in the AI Alignment field.
See also Nate’s recent comment on some similar critiques to this: https://www.lesswrong.com/posts/8NKu9WES7KeKRWEKK/why-all-the-fuss-about-recursive-self-improvement
5. Treating “coherence arguments” as forceful
Coherence arguments do indeed strike me as one of the central valid arguments in favor of AI Risk. I think there was a common misunderstanding that did confuse some people, but that misunderstanding was not argued for by Eliezer or other people at MIRI, as far as I can tell (and I’ve looked into this for 5+ hours as part of discussions with Rohin and Richard).
The central core of coherence arguments, which is based in considerations of competitiveness and economic efficiency, strikes me as very strong, robustly argued for, and one of the main reasons why AI will be dangerous. The von Neumann-Morgenstern theorem does play a role here, though it’s definitely not sufficient to establish a strong case, and Rohin and Richard have successfully argued against that, though I don’t think Eliezer has historically argued that the von Neumann-Morgenstern theorem is sufficient to establish an AI-alignment-relevant argument on its own (though Dutch-book-style arguments are very suggestive of the real structure of the argument).
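To gesture at the structure I have in mind (this is my own toy illustration, not something Eliezer or MIRI wrote): the simplest member of the Dutch-book/money-pump family of arguments shows that an agent with cyclic preferences will pay for nothing, and the coherence theorems are, roughly, generalizations of that observation. A minimal sketch, assuming cyclic preferences A ≻ B ≻ C ≻ A and a small trading fee ε:
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Toy money-pump: the agent prefers A to B, B to C, and C to A, and will pay a
% small fee \varepsilon for any trade up to something it strictly prefers.
\begin{align*}
\text{holds } C &\xrightarrow{\ \text{pays } \varepsilon\ } \text{holds } B && \text{(since } B \succ C\text{)} \\
\text{holds } B &\xrightarrow{\ \text{pays } \varepsilon\ } \text{holds } A && \text{(since } A \succ B\text{)} \\
\text{holds } A &\xrightarrow{\ \text{pays } \varepsilon\ } \text{holds } C && \text{(since } C \succ A\text{)}
\end{align*}
% Net effect of one loop: the same holdings as before, minus $3\varepsilon$ in cash.
% Coherence theorems generalize this: an agent that cannot be exploited in this way
% behaves, roughly, as if it maximizes expected utility.
\end{document}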
Edit: Rohin says something similar in a separate comment reply.
6. Not acknowledging his mixed track record
Given my disagreements with the above, I think doing so would be a mistake. But even without that, let’s look at the merits of this critique.
For the two “clear cut” examples, Eliezer has posted dozens of times on the internet that he has disendorsed his views from before 2002. This is present on his personal website, the relevant articles are no longer prominently linked anywhere, and Eliezer has openly and straightforwardly acknowledged that his predictions and beliefs from the relevant period were wrong.
For the disputed examples, Eliezer still believes all of these arguments (as do I), so it would be disingenuous for Eliezer to “acknowledge his mixed track record” in this domain. You can either argue that he is wrong, or you can argue that he hasn’t acknowledged that he has changed his mind and was previously wrong, but you can’t both argue that Eliezer is currently wrong in his beliefs, and accuse him of not telling others that he is wrong. I want people to say things they believe. And for the only two cases where you have established that Eliezer has changed his mind, he has extensively acknowledged his track record.
Some comments on the overall post:
I really dislike this post. I think it provides very little argument, and engages in extremely extensive cherry-picking in a way that does not produce a symmetric credit-allocation (i.e. most people who are likely to update downwards on Yudkowsky on the basis of this post, seem to me to be generically too trusting, and I am confident I can write a more compelling post about any other central figure in Effective Altruism that would likely cause you to update downwards even more).
I think a good and useful framing on this post could have been “here are 3 points where I disagree with Eliezer on AI Risk” (I don’t think it would have been useful under almost any circumstance to bring up the arguments from the year 2000). And then to primarily spend your time arguing about the concrete object-level. Not to start a post that is trying to say that Eliezer is “overconfident in his beliefs about AI” and “miscalibrated”, and then to justify that by cherry-picking two examples from when Eliezer was barely no longer a teenager, and three arguments on which there is broad disagreement within the AI Alignment field.
I also dislike calling this post “On Deference and Yudkowsky’s AI Risk Estimates”, as if this post was trying to be an unbiased analysis of how much to defer to Eliezer, while you just list negative examples. I think this post is better named “against Yudkowsky on AI Risk estimates”. Or “against Yudkowsky’s track record in AI Risk Estimates”. Which would have made it clear that you are selectively giving evidence for one side, and more clearly signposted that if someone was trying to evaluate Eliezer’s track record, this post will only be a highly incomplete starting point.
I have many more thoughts, but I think I’ve written enough for now. I think I am somewhat unlikely to engage with replies in much depth, because writing this comment has already taken up a lot of my time, and I expect given the framing of the post, discussion on the post to be unnecessarily conflicty and hard to navigate.
Just to note that the boldfaced part has no relevance in this context. The post is not attributing these views to present-day Yudkowsky. Rather, it is arguing that Yudkowsky’s track record is less flattering than some people appear to believe. You can disavow an opinion that you once held, but this disavowal doesn’t erase a bad prediction from your track record.
Hmm, I think that part definitely has relevance. Clearly we would trust Eliezer less if his response to that past writing was “I just got unlucky in my prediction, I still endorse the epistemological principles that gave rise to this prediction, and would make the same prediction, given the same evidence, today”.
If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.
If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.
I suppose one of my main questions is whether he has visibly learned from the mistakes, in this case.
For example, I wasn’t able to find a post or comment to the effect of “When I was younger, I spent years of my life motivated by the belief that near-term extinction from nanotech was looming. I turned out to be wrong. Here’s what I learned from that experience and how I’ve applied it to my forecasts of near-term existential risk from AI.” Or a post or comment acknowledging his previous over-optimistic AI timelines and what he learned from them, when formulating his current seemingly short AI timelines.
(I genuinely could be missing these, since he has so much public writing.)
Eliezer writes a bit about his early AI timeline and nanotechnology opinions here, though it sure is a somewhat obscure reference that takes a bunch of context to parse:
Luke Muehlhauser reading a previous draft of this (only sounding much more serious than this, because Luke Muehlhauser): You know, there was this certain teenaged futurist who made some of his own predictions about AI timelines -
Eliezer: I’d really rather not argue from that as a case in point. I dislike people who screw up something themselves, and then argue like nobody else could possibly be more competent than they were. I dislike even more people who change their mind about something when they turn 22, and then, for the rest of their lives, go around acting like they are now Very Mature Serious Adults who believe the thing that a Very Mature Serious Adult believes, so if you disagree with them about that thing they started believing at age 22, you must just need to wait to grow out of your extended childhood.
Luke Muehlhauser (still being paraphrased): It seems like it ought to be acknowledged somehow.
Eliezer: That’s fair, yeah, I can see how someone might think it was relevant. I just dislike how it potentially creates the appearance of trying to slyly sneak in an Argument From Reckless Youth that I regard as not only invalid but also incredibly distasteful. You don’t get to screw up yourself and then use that as an argument about how nobody else can do better.
Humbali: Uh, what’s the actual drama being subtweeted here?
Eliezer: A certain teenaged futurist, who, for example, said in 1999, “The most realistic estimate for a seed AI transcendence is 2020; nanowar, before 2015.”
Humbali: This young man must surely be possessed of some very deep character defect, which I worry will prove to be of the sort that people almost never truly outgrow except in the rarest cases. Why, he’s not even putting a probability distribution over his mad soothsaying—how blatantly absurd can a person get?
Eliezer: Dear child ignorant of history, your complaint is far too anachronistic. This is 1999 we’re talking about here; almost nobody is putting probability distributions on things, that element of your later subculture has not yet been introduced. Eliezer-2002 hasn’t been sent a copy of “Judgment Under Uncertainty” by Emil Gilliam. Eliezer-2006 hasn’t put his draft online for “Cognitive biases potentially affecting judgment of global risks”. The Sequences won’t start until another year after that. How would the forerunners of effective altruism in 1999 know about putting probability distributions on forecasts? I haven’t told them to do that yet! We can give historical personages credit when they seem to somehow end up doing better than their surroundings would suggest; it is unreasonable to hold them to modern standards, or expect them to have finished refining those modern standards by the age of nineteen.
Though there’s also a more subtle lesson you could learn, about how this young man turned out to still have a promising future ahead of him; which he retained at least in part by having a deliberate contempt for pretended dignity, allowing him to be plainly and simply wrong in a way that he noticed, without his having twisted himself up to avoid a prospect of embarrassment. Instead of, for example, his evading such plain falsification by having dignifiedly wide Very Serious probability distributions centered on the same medians produced by the same basically bad thought processes.
But that was too much of a digression, when I tried to write it up; maybe later I’ll post something separately.
While also including some other points, I do read it as a pretty straightforward “Yes, I was really wrong. I didn’t know about cognitive biases, and I did not know about the virtue of putting probability distributions on things, and I had not thought enough about the art of thinking well. I would not make the same mistakes today.”.
How would the forerunners of effective altruism in 1999 know about putting probability distributions on forecasts? I haven’t told them to do that yet!
Did Yudkowsky actually write these sentences?
If Yudkowsky thinks, as this suggests, that people in EA think or do things because he tells them to—this alone means it’s valuable to question whether people give him the right credibility.
I am not sure about the question. Yeah, this is a quote from the linked post, so he wrote those sections.
Also, yeah, seems like Eliezer has had a very large effect on whether this community uses things like probability distributions, models things in a bayesian way, makes lots of bets, and pays attention to things like forecasting track records. I don’t think he gets to take full credit for those norms, but my guess is he is the single individual who most gets to take credit for those norms.
I don’t see how he has encouraged people to pay attention to forecasting track records. People who have encouraged that norm make public bets or go on public forecasting platforms and make predictions about questions that can resolve in the short term. Bryan Caplan does this; I think Greg Lewis and David Manheim are superforecasters.
I thought the upshot of this piece and the Jotto post was that Yudkowsky is in fact very dismissive of people who make public forecasts. “I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain’s native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.” This seems like the opposite of encouraging people to pay attention to forecasting but is rather dismissing the whole enterprise of forecasting.
I wanted to make sure I’m not missing something, since this shines a negative light on him IMO.
There’s a difference between saying, for example, “You can’t expect me to have done X then—nobody was doing it, and I haven’t even written about it yet, nor was I aware of anyone else doing so”—and saying “… nobody was doing it because I haven’t told them to.”
This isn’t about credit. It’s about self-perception and social dynamics.
I mean… it is true that Eliezer really did shape the culture in the direction of forecasting and predictions and that kind of stuff. My best guess is that without Eliezer, we wouldn’t have a culture of doing those things (and like, the AI Alignment community as is probably wouldn’t exist). You might disagree with me and him on this, in which case sure, update in that direction, but I don’t think it’s a crazy opinion to hold.
My best guess is that without Eliezer, we wouldn’t have a culture of [forecasting and predictions]
The timeline doesn’t make sense for this version of events at all. Eliezer was uninformed on this topic in 1999, at a time when Robin Hanson had already written about gambling on scientific theories (1990), prediction markets (1996), and other betting-related topics, as you can see from the bibliography of his Futarchy paper (2000). Before Eliezer wrote his sequences (2006-2009), the Long Now Foundation already had Long Bets (2003), and Tetlock had already written Expert Political Judgment (2005).
If Eliezer had not written his sequences, forecasting content would have filtered through to the EA community from contacts of Hanson. For instance, through blogging by other GMU economists like Caplan (2009). And of course, through Jason Matheny, who worked at FHI, where Hanson was an affiliate. He ran the ACE project (2010), which led to the science behind Superforecasting, a book that the EA community would certainly have discovered.
Hmm, I think these are good points. My best guess is that I don’t think we would have a strong connection to Hanson without Eliezer, though I agree that that kind of credit is harder to allocate (and it gets fuzzy what we even mean by “this community” as we extend into counterfactuals like this).
I do think the timeline here provides decent evidence in favor of less credit allocation (and I think against the stronger claim “we wouldn’t have a culture of [forecasting and predictions] without Eliezer”). My guess is in terms of causing that culture to take hold, Eliezer is probably still the single most-responsible individual, though I do now expect (after having looked into a bunch of comment threads from 1996 to 1999 and seeing many familiar faces show up) that a lot of the culture would show up without Eliezer.
Speaking for myself, Eliezer has played no role in encouraging me to give quantitative probability distributions. For me, that was almost entirely due to people like Tetlock and Bryan Caplan, both of whom I would have encountered regardless of Eliezer. I strongly suspect this is true of lots of people who are in EA but don’t identify with the rationalist community.
More generally, I do think that Eliezer and other rationalists overestimate how much influence they have had on wider views in the community. E.g., I have not read the sequences, and I just don’t think they play a big role in the internal story of a lot of EAs.
For me, even people like Nate Silver or David MacKay, who aren’t part of the community, have played a bigger role in encouraging quantification and probabilistic judgment.
This is my impression and experience as well.
“My best guess is that I don’t think we would have a strong connection to Hanson without Eliezer”
Fwiw, I found Eliezer through Robin Hanson.
Yeah, I think this isn’t super rare, but overall still much less common than the reverse.
I’ll currently take your word for that because I haven’t been here nearly as long. I’ll mention that some of these contributions I don’t necessarily consider positive.
But the point is, is Yudkowsky a (major) contributor to a shared project, or is he a ruler directing others, like his quote suggests? How does he view himself? How do the different communities involved view him?
P.S. I disagree with whoever (strong-)downvoted your comment.
Yudkowsky often ~~complains~~ ~~rants~~ hopes people will form their own opinions instead of just listening to him, I can find references if you want.
I also think he lately finds it ~~depressing~~ worrying that he’s got to be the responsible adult. Easy references: Search for “Eliezer” in List Of Lethalities.
I also think he lately finds it ~~depressing~~ worrying that he’s got to be the responsible adult. Easy references: Search for “Eliezer” in List Of Lethalities
I think this strengthens my point, especially given how it is written in the post you linked. Telling people you’re the responsible adult, or the only one who notices things, still means telling them you’re smarter than them and they should just defer to you.
I’m trying to account for my biases in these comments, but I encourage others to go to that post, search for “Eliezer” as you suggested, and form their own views.
Telling people you’re the responsible adult, or the only one who notices things, still means telling them you’re smarter than them and they should just defer to you.
Those are four very different claims. In general, I think it’s bad to collapse all (real or claimed) differences in ability into a single status hierarchy, for the reasons stated in Inadequate Equilibria.
Eliezer is claiming that other people are not taking the problem sufficiently seriously, claiming ownership of it, trying to form their own detailed models of the full problem, and applying enough rigor and clarity to make real progress on the problem.
He is specifically not saying “just defer to me”, and in fact is saying that he and everyone else is going to die if people rely on deference here. A core claim in AGI Ruin is that we need more people with “not the ability to read this document and nod along with it, but the ability to spontaneously write it from scratch without anybody else prompting you”.
Deferring to Eliezer means that Eliezer is the bottleneck on humanity solving the alignment problem; which means we die. The thing Eliezer claims we need is a larger set of people who arrive at true, deep, novel insights about the problem on their own —without Eliezer even mentioning the insights, much less spending a ton of time trying to persuade anyone of them—and writing them up.
It’s true that Eliezer endorses his current stated beliefs; this goes without saying, or he obviously wouldn’t have written them down. It doesn’t mean that he thinks humanity has any path to survival via deferring to him, or that he thinks he has figured out enough of the core problems (or could ever conceivably do so, on his own) to give humanity a significant chance of surviving. Quoting AGI Ruin:
It’s guaranteed that some of my analysis is mistaken, though not necessarily in a hopeful direction. The ability to do new basic work noticing and fixing those flaws is the same ability as the ability to write this document before I published it[.]
The end of the “death with dignity” post is also alluding to Eliezer’s view that it’s pretty useless to figure out what’s true merely via deferring to Eliezer.
Eliezer is cleanly just a major contributor. If he went off the rails tomorrow, some people would follow him (and the community would be better with those few gone), but the vast majority would say “wtf is that Eliezer fellow doing”. I also don’t think he sees himself as the leader of the community either.
Probably Eliezer likes Eliezer more than EA/Rationality likes Eliezer, because Eliezer really likes Eliezer. If I were as smart & good at starting social movements as Eliezer, I’d probably also have an inflated ego, so I don’t take it as too unreasonable of a character flaw.
Yes, definitely much more than Philip Tetlock, given that our community had strong norms of forecasting and making bets before Tetlock had done most of his work on the topic (Expert Political Judgment was out, but as far as I can tell was not a major influence on people in the community, though I am not totally confident of that).
Does that particular quote from Yudkowsky not strike you as slightly arrogant?
I am generally strongly against a culture of fake modesty. If I want people to make good decisions, they need to be able to believe things about them that might sound arrogant to others. Yes, it sounds arrogant to an external audience, but it also seems true, and it seems like whether it is true should be the dominant fact on whether it is good to say.
FWIW I think “it was 20 years ago” is a good reason not to take these failed predictions too seriously, and “he has disavowed these predictions after seeing they were false” is a bad reason to discount them.
I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old.
I think your comment might give the misimpression that I don’t discuss this fact in the post or explain why I include the case. What I write is:
I should, once again, emphasize that Yudkowsky was around twenty when he did the final updates on this essay. In that sense, it might be unfair to bring this very old example up.
Nonetheless, I do think this case can be treated as informative, since: the belief was so analogous to his current belief about AI (a high outlier credence in near-term doom from an emerging technology), since he had thought a lot about the subject and was already highly engaged in the relevant intellectual community, since it’s not clear when he dropped the belief, and since twenty isn’t (in my view) actually all that young. I do know a lot of people in their early twenties; I think their current work and styles of thought are likely to be predictive of their work and styles of thought in the future, even though I do of course expect the quality to go up over time....
An additional reason why I think it’s worth distinguishing between his views on nanotech and (e.g.) your views on nuclear power: I think there’s a difference between an off-hand view picked up from other people vs. a fairly idiosyncratic view that you consciously adopted after a lot of reflection, that you decided to devote your professional life to, and that you founded an organization to address.
It’s definitely up to the reader to decide how relevant the nanotech case is. Since it seems at least pretty plausibly relevant, since it’s not widely known, and since the post twice flags his age at the time, I do still endorse including it.
At face value, as well: we’re trying to assess how much weight to give to someone’s extreme, outlier-ish prediction that an emerging technology is almost certain to kill everyone very soon. It just does seem very relevant, to me, that they previously had a different extreme, outlier-ish prediction that another emerging technology was very likely to kill everyone within a decade.
I don’t find it plausible that we should assign basically no significance to this.
On 6 (the question of whether Yudkowsky has acknowledged negative aspects of his track record):
For the two “clear cut” examples, Eliezer has posted dozens of times on the internet that he has disendorsed his views from before 2002. This is present on his personal website, the relevant articles are no longer prominently linked anywhere, and Eliezer has openly and straightforwardly acknowledged that his predictions and beliefs from the relevant period were wrong.
Similarly, I think your comment may give the impression that I don’t discuss this point in the post. What I write is this:
He has written about mistakes from early on in his intellectual life (particularly pre-2003) and has, on this basis, even made a blanket-statement disavowing his pre-2003 work. However, based on my memory and a quick re-read/re-skim, this writing is an exploration of why it took him a long time to become extremely concerned about existential risks from misaligned AI. For instance, the main issue it discusses with his plans to build AGI are that these plans didn’t take into account the difficulty and importance of ensuring alignment. This writing isn’t, I think, an exploration or acknowledgement of the kinds of mistakes I’ve listed in this post.
On the general point that this post uses old examples:
Given the sorts of predictions involved (forecasts about pathways to transformative technologies), old examples are generally going to be more unambiguous than new examples. Similarly for risk arguments: it’s hard to have a sense of how new arguments are going to hold up. It’s only for older arguments that we can start to approach the ability to say that technological progress, progress in arguments, and evolving community opinion say something clear-ish about how strong the arguments were.
On signposting:
I also dislike calling this post “On Deference and Yudkowsky’s AI Risk Estimates”, as if this post was trying to be an unbiased analysis of how much to defer to Eliezer, while you just list negative examples. I think this post is better named “against Yudkowsky on AI Risk estimates”. Or “against Yudkowsky’s track record in AI Risk Estimates”. Which would have made it clear that you are selectively giving evidence for one side, and more clearly signposted that if someone was trying to evaluate Eliezer’s track record, this post will only be a highly incomplete starting point.
I think it’s possible another title would have been better (I chose a purposely bland one partly for the purpose of trying to reduce heat—and that might have been a mistake). But I do think I signpost what the post is doing fairly clearly.
The introduction says it’s focusing on “negative aspects” of Yudkowsky’s track record, the section heading for the section introducing the examples describes them as “cherry-picked,” and the start of the section introducing the examples has an italicized paragraph re-emphasizing that the examples are selective and commenting on the significance of this selectiveness.
On the role of the fast take-off assumption in classic arguments:
I think the arguments are pretty tight and sufficient to establish the basic risk argument. I found your critique relatively uncompelling. In particular, I think you are misrepresenting that a premise of the original arguments was a fast takeoff.
I disagree with this. I do think it’s fair to say that fast take-off was typically a premise of the classic arguments.
Two examples I have off-hand (since they’re in the slides from my talk) are from Yudkowsky’s exchange with Caplan and from Superintelligence. Superintelligence isn’t by Yudkowsky, of course, but hopefully is still meaningful to include (insofar as Superintelligence heavily drew on Yudkowsky’s work and was often accepted as a kind of distillation of the best arguments as they existed at the time).
“I’d ask which of the following statements Bryan Caplan [a critic of AI risk arguments] denies:
Orthogonality thesis: Intelligence can be directed toward any compact goal….
Instrumental convergence: An AI doesn’t need to specifically hate you to hurt you; a paperclip maximizer doesn’t hate you but you’re made out of atoms that it can use to make paperclips, so leaving you alive represents an opportunity cost and a number of foregone paperclips….
Rapid capability gain and large capability differences: Under scenarios seeming more plausible than not, there’s the possibility of AIs gaining in capability very rapidly, achieving large absolute differences of capability, or some mixture of the two….
1-3 in combination imply that Unfriendly AI is a critical problem-to-be-solved, because AGI is not automatically nice, by default does things we regard as harmful, and will have avenues leading up to great intelligence and power.”
(Caveat that the fast-take-off premise is stated a bit ambiguously here, so it’s not clear what level of rapidness is being assumed.)
From Superintelligence:
Taken together, these three points [decisive strategic advantage, orthogonality, and instrumental convergence] thus indicate that the first superintelligence may shape the future of Earth-originating life, could easily have non-anthropomorphic final goals, and would likely have instrumental reasons to pursue open-ended resource acquisition. If we now reflect that human beings consist of useful resources (such as conveniently located atoms) and that we depend for our survival and flourishing on many more local resources, we can see that the outcome could easily be one in which humanity quickly becomes extinct.
The decisive strategic advantage point is justified through a discussion of the possibility of a fast take-off. The first chapter of the book also starts by introducing the possibility of an intelligence explosion. It then devotes two chapters to the possibility of a fast take-off and the idea this might imply a decisive strategic advantage, before it gets to discussing things like the orthogonality thesis.
I think it’s also relevant that content from MIRI and people associated with MIRI, raising the possibility of extinction from AI, tended to very strongly emphasize (e.g. spend most of its time on) the possibility of a run-away intelligence explosion. The most developed classic pieces arguing for AI risk often have names like “Shaping the Intelligence Explosion,” “Intelligence Explosion: Evidence and Import,” “Intelligence Explosion Microeconomics,” and “Facing the Intelligence Explosion.”
Overall, then, I do think it’s fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn’t incidental or a secondary consideration.
[[Note: I’ve edited my comment, here, to respond to additional points. Although there are still some I haven’t responded to yet.]]
One quick response, since it was easy (might respond more later):
Overall, then, I do think it’s fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn’t incidental or a secondary consideration.
I do think takeoff speeds between 1 week and 10 years are a core premise of the classic arguments. I do think the situation looks very different if we spend 5+ years in the human domain, but I don’t think there are many who believe that that is going to happen.
I don’t think the distinction between 1 week and 1 year is that relevant to the core argument for AI Risk, since it seems in either case more than enough cause for likely doom, and that premise seems very likely to be true to me. I do think Eliezer believes things more on the order of 1 week than 1 year, but I don’t think the basic argument structure is that different in either case (though I do agree that the 1 year opens us up to some more potential mitigating strategies).
“Orthogonality thesis: Intelligence can be directed toward any compact goal….
Instrumental convergence: An AI doesn’t need to specifically hate you to hurt you; a paperclip maximizer doesn’t hate you but you’re made out of atoms that it can use to make paperclips, so leaving you alive represents an opportunity cost and a number of foregone paperclips….
Rapid capability gain and large capability differences: Under scenarios seeming more plausible than not, there’s the possibility of AIs gaining in capability very rapidly, achieving large absolute differences of capability, or some mixture of the two….
1-3 in combination imply that Unfriendly AI is a critical problem-to-be-solved, because AGI is not automatically nice, by default does things we regard as harmful, and will have avenues leading up to great intelligence and power.”
1-3 in combination don’t imply anything with high probability.
(i.e. most people who are likely to update downwards on Yudkowsky on the basis of this post, seem to me to be generically too trusting, and I am confident I can write a more compelling post about any other central figure in Effective Altruism that would likely cause you to update downwards even more)
My impression is the post is a somewhat unfortunate attempt to “patch” the situation in which many generically too trusting people updated a lot on AGI Ruin: A List of Lethalities and Death with Dignity and subsequent deference/update cascades.
In my view, the deeper problem here is that, instead of disagreeing about model internals, many of these people do some sort of “averaging conclusions” move, based on signals like seniority, karma, vibes, etc.
Many of these signals are currently wildly off from truth-tracking, so you get attempts to push the conclusion-updates directly.
This critique strikes me as about as sensible as digging up someone’s old high-school essays and critiquing their stance on communism or the criminal justice system. I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old. I am confident I can find crazier and worse opinions for every single leadership figure in Effective Altruism, if I am willing to go back to what they thought while they were in high-school. To give some character, here are some things I believed in my early high-school years
This is really minor and nitpicky, and I agree with much of your overall points, but I don’t think equivocating between “barely 20” and “early high-school” is fair. The former is a normal age to be a third-year university student in the US, and plenty of college-age EAs are taken quite seriously by the rest of us.
Oh, hmm, I think this is just me messing up the differences between the U.S. and German education systems (I was 18 and 19 in high-school, and enrolled in college when I was 20).
I think the first quote on nanotechnology was actually written in 1996 originally (though was maybe updated in 1999). Which would put Eliezer at ~17 years old when he wrote that.
The second quote was I think written in more like 2000, which would put him more in the early college years, and I agree that it seems good to clarify that.
I don’t think Eliezer has an unambiguous upper hand in the FOOM debate at all
Then I listed a bunch of ways in which the world looks more like Robin’s predictions, particularly regarding continuity and locality. I said Robin’s predictions about AI timelines in particular looked bad. This isn’t closely related to the topic of your section 3, where I mostly agree with the OP.
Hmm, I think this is fair, rereading that comment.
I feel a bit confused here, since at the scale that Robin is talking about, timelines and takeoff speeds seem very inherently intertwined (like, if Robin predicts really long timelines, this clearly implies a much slower takeoff speed, especially when combined with gradual continuous increases). I agree there is a separate competitiveness dimension that you and Robin are closer on, which is important for some of the takeoff dynamics, but on overall takeoff speed, I feel like you are closer to Eliezer than Robin (Eliezer predicting weeks to months to cross the general intelligence human->superhuman gap, you predicting single-digit years to cross that gap, and Hanson predicting decades to cross that gap). Though it’s plausible that I am missing something here.
In any case, I agree that my summary of your position here is misleading, and will edit accordingly.
I think my views about takeoff speeds are generally similar to Robin’s though neither Robin nor Eliezer got at all concrete in that discussion so I can’t really say. You can read this essay from 1998 with his “outside-view” guesses, which I suspect are roughly in line with what he’s imagining in the FOOM debate.
I think that doc implies significant probability on a “slow” takeoff of 8, 4, 2… year doublings (more like the industrial revolution), but a broad distribution over dynamics which also puts significant probability on e.g. a relatively fast jump to a 1 month doubling time (more like the agricultural revolution). In either case, over the next few doublings he would by default expect still further acceleration. Overall I think this is basically a sensible model.
(I agree that shorter timelines generally suggest faster takeoff, but I think either Robin or Eliezer’s views about timelines would be consistent with either Robin or Eliezer’s views about takeoff speed.)
I am confident I can write a more compelling post about any other central figure in Effective Altruism that would likely cause you to update downwards even more
If done in a polite and respectful manner, I think this would be a genuinely good idea.
It seems that half of these examples are from 15+ years ago, from a period for which Eliezer has explicitly disavowed his opinions (and the ones that are not strike me as most likely correct, like treating coherence arguments as forceful and that AI progress is likely to be discontinuous and localized and to require relatively little compute).
Let’s go example-by-example:
1. Predicting near-term extinction from nanotech
This critique strikes me as about as sensible as digging up someone’s old high-school essays and critiquing their stance on communism or the criminal justice system. I want to remind any reader that this is an opinion from 1999, when Eliezer was barely 20 years old. I am confident I can find crazier and worse opinions for every single leadership figure in Effective Altruism, if I am willing to go back to what they thought while they were in high-school. To give some character, here are some things I believed in my early high-school years:
The economy was going to collapse because the U.S. was establishing a global surveillance state
Nuclear power plants are extremely dangerous and any one of them is quite likely to explode in a given year
We could have easily automated the creation of all art, except for the existence of a vaguely defined social movement that tries to preserve the humanity of art-creation
These are dumb opinions. I am not ashamed of having had them. I was young and trying to orient in the world. I am confident other commenters can add their own opinions they had when they were in high-school. The only thing that makes it possible for someone to critique Eliezer on these opinions is that he was virtuous and wrote them down, sometimes in surprisingly well-argued ways.
If someone were to dig up an old high-school essay of mine, in-particular one that has at the top written “THIS IS NOT ENDORSED BY ME, THIS IS A DUMB OPINION”, and used it to argue that I am wrong about important cause prioritization questions, I would feel deeply frustrated and confused.
For context, on Eliezer’s personal website it says:
2. Predicting that his team had a substantial chance of building AGI before 2010
Given that this is only 2 years later, all my same comments apply. But let’s also talk a bit about the object-level here.
This is the quote on which this critique is based:
This… is not a very confident prediction. This paragraph literally says “only a guess”. I agree, if Eliezer said this today, I would definitely dock him some points, but this is again a freshman-aged Eliezer, and it was more than 20 years ago.
But also, I don’t know, predicting AGI by 2020 from the year 2000 doesn’t sound that crazy. If we didn’t have a whole AI winter, if Moore’s law had accelerated a bit instead of slowed down, if more talent had flowed into AI and chip-development, 2020 doesn’t seem implausible to me. I think it’s still on the aggressive side, given what we know now, but technological forecasting is hard, and the above sounds more like a 70% confidence interval instead of a 90% confidence interval.
3. Having high confidence that AI progress would be extremely discontinuous and localized and not require much compute
This opinion strikes me as approximately correct. I still expect highly discontinuous progress, and many other people have argued for this as well. Your analysis that the world looks more like Hanson’s world described in the AI foom debate also strikes me as wrong (and e.g. Paul Christiano has also said that Hanson’s predictions looked particularly bad in the FOOM debate. EDIT: I think this was worded too strong, and while Paul had some disagreements with Robin, on the particular dimension of discontinuity and competitiveness, Paul thinks Robin came away looking better than Eliezer). Indeed, I would dock Hanson many more points in that discussion (though, overall, I give both of them a ton of points, since they both recognized the importance of AI-like technologies early, and performed vastly above baseline for technological forecasting, which again, is extremely hard).
This seems unlikely to be the right place for a full argument on discontinuous progress. However, continuous takeoff is very far from consensus in the AI Alignment field, and this post seems to try to paint it as such, which seems pretty bad to me (especially if it’s used in a list with two clearly wrong things, without disclaiming it as such).
4. Treating early AI risk arguments as close to decisive
You say:
I think the arguments are pretty tight and sufficient to establish the basic risk argument. I found your critique relatively uncompelling. In particular, I think you are misrepresenting that a premise of the original arguments was a fast takeoff. I can’t currently remember any writing that said it was a necessary component of the AI risk arguments that takeoff happens fast, or at least whether the distinction between “AI vastly exceeds human intelligence in 1 week vs 4 years” is that crucial to the overall argument, which is as far as I can tell the range that most current opinions in the AI Alignment field falls into (and importantly, I know of almost no one who believes that it could take 20+ years for AI to go from mildly subhuman to vastly superhuman, which does feel like it could maybe change the playing field, but also seems to be a very rarely held opinion).
Indeed, I think Eliezer was probably underconfident in doom from AI, since I currently assign >50% probability to AI Doom, as do many other people in the AI Alignment field.
See also Nate’s recent comment on some similar critiques to this: https://www.lesswrong.com/posts/8NKu9WES7KeKRWEKK/why-all-the-fuss-about-recursive-self-improvement
5. Treating “coherence arguments” as forceful
Coherence arguments do indeed strike me as one of the central valid arguments in favor of AI Risk. I think there was a common misunderstanding that did confuse some people, but that misunderstanding was not argued for by Eliezer or other people at MIRI, as far as I can tell (and I’ve looked into this for 5+ hours as part of discussions with Rohin and Richard).
The central core of coherence arguments, which are based in arguments of competetiveness and economic efficiency strike me as very strong, robustly argued for, and one of the main reasons for why AI Risk will be dangerous. The Neumann-Morgensterm theorem does play a role here, though it’s definitely not sufficient to establish a strong case, and Rohin and Richard have successfully argued against that, though I don’t think Eliezer has historically argued that the Neumann-Morgenstern theorem is sufficient to establish an AI-alignment relevant argument on its own (though Dutch-book style arguments are very suggestive for the real structure of the argument).
Edit: Rohin says something similar in a separate comment reply.
6. Not acknowledging his mixed track record
Given my disagreements with the above, I think doing so would be a mistake. But even without that, let’s look at the merits of this critique.
For the two “clear cut” examples, Eliezer has posted dozens of times on the internet that he has disendorsed his views from before 2002. This is present on his personal website, the relevant articles are no longer prominently linked anywhere, and Eliezer has openly and straightforwardly acknowledged that his predictions and beliefs from the relevant period were wrong.
For the disputed examples, Eliezer still believes all of these arguments (as do I), so it would be disingenuous for Eliezer to “acknowledge his mixed track record” in this domain. You can either argue that he is wrong, or you can argue that he hasn’t acknowledged that he has changed his mind and was previously wrong, but you can’t both argue that Eliezer is currently wrong in his beliefs, and accuse him of not telling others that he is wrong. I want people to say things they believe. And for the only two cases where you have established that Eliezer has changed his mind, he has extensively acknowledged his track record.
Some comments on the overall post:
I really dislike this post. I think it provides very little argument, and engages in extremely extensive cherry-picking in a way that does not produce a symmetric credit-allocation (i.e. most people who are likely to update downwards on Yudkowsky on the basis of this post, seem to me to be generically too trusting, and I am confident I can write a more compelling post about any other central figure in Effective Altruism that would likely cause you to update downwards even more).
I think a good and useful framing on this post could have been “here are 3 points where I disagree with Eliezer on AI Risk” (I don’t think it would have been useful under almost any circumstance to bring up the arguments from the year 2000). And then to primarily spend your time arguing about the concrete object-level. Not to start a post that is trying to say that Eliezer is “overconfident in his beliefs about AI” and “miscalibrated”, and then to justify that by cherry-picking two examples from when Eliezer was barely no longer a teenager, and three arguments on which there is broad disagreement within the AI Alignment field.
I also dislike calling this post “On Deference and Yudkowsky’s AI Risk Estimates”, as if this post was trying to be an unbiased analysis of how much to defer to Eliezer, while you just list negative examples. I think this post is better named “against Yudkowsky on AI Risk estimates”. Or “against Yudkowsky’s track record in AI Risk Estimates”. Which would have made it clear that you are selectively giving evidence for one side, and more clearly signposted that if someone was trying to evaluate Eliezer’s track record, this post will only be a highly incomplete starting point.
I have many more thoughts, but I think I’ve written enough for now. I think I am somewhat unlikely to engage with replies in much depth, because writing this comment has already taken up a lot of my time, and I expect given the framing of the post, discussion on the post to be unnecessarily conflicty and hard to navigate.
Just to note that the boldfaced part has no relevance in this context. The post is not attributing these views to present-day Yudkowsky. Rather, it is arguing that Yudkowsky’s track record is less flattering than some people appear to believe. You can disavow an opinion that you once held, but this disavowal doesn’t erase a bad prediction from your track record.
Hmm, I think that part definitely has relevance. Clearly we would trust Eliezer less if his response to that past writing was “I just got unlucky in my prediction, I still endorse the epistemological principles that gave rise to this prediction, and would make the same prediction, given the same evidence, today”.
If someone visibly learns from forecasting mistakes they make, that should clearly update us positively on them not repeating the same mistakes.
I suppose one of my main questions is whether he has visibly learned from the mistakes, in this case.
For example, I wasn’t able to find a post or comment to the effect of “When I was younger, I spent of years of my life motivated by the belief that near-term extinction from nanotech was looming. I turned out to be wrong. Here’s what I learned from that experience and how I’ve applied it to my forecasts of near-term existential risk from AI.” Or a post or comment acknowledging his previous over-optimistic AI timelines and what he learned from them, when formulating his current seemingly short AI timelines.
(I genuinely could be missing these, since he has so much public writing.)
Eliezer writes a bit about his early AI timeline and nanotechnology opinions here, though it sure is a somewhat obscure reference that takes a bunch of context to parse:
While also including some other points, I do read it as a pretty straightforward “Yes, I was really wrong. I didn’t know about cognitive biases, and I did not know about the virtue of putting probability distributions on things, and I had not thought enough about the art of thinking well. I would not make the same mistakes today.”.
Did Yudkowsky actually write these sentences?
If Yudkowsky thinks, as this suggests, that people in EA think or do things because he tells them to—this alone means it’s valuable to question whether people give him the right credibility.
I am not sure about the question. Yeah, this is a quote from the linked post, so he wrote those sections.
Also, yeah, seems like Eliezer has had a very large effect on whether this community uses things like probability distributions, models things in a bayesian way, makes lots of bets, and pays attention to things like forecasting track records. I don’t think he gets to take full credit for those norms, but my guess is he is the single individual who most gets to take credit for those norms.
I don’t see how he has encouraged people to pay attention to forecasting track records. People who have encouraged that norm make public bets or go on public forecasting platforms and make predictions about questions that can resolve in the short term. Bryan Caplan does this; I think greg Lewis and David Manheim are superforecasters.
I thought the upshot of this piece and the Jotto post was that Yudkowsky is in fact very dismissive of people who make public forecasts. “I consider naming particular years to be a cognitively harmful sort of activity; I have refrained from trying to translate my brain’s native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.” This seems like the opposite of encouraging people to pay attention to forecasting but is rather dismissing the whole enterprise of forecasting.
I wanted to make sure I’m not missing something, since this shines a negative light about him IMO.
There’s a difference between saying, for example, “You can’t expect me to have done X then—nobody was doing it, and I haven’t even written about it yet, nor was I aware of anyone else doing so”—and saying ”… nobody was doing it because I haven’t told them to.”
This isn’t about credit. It’s about self-perception and social dynamics.
I mean… it is true that Eliezer really did shape the culture in the direction of forecasting and predictions and that kind of stuff. My best guess is that without Eliezer, we wouldn’t have a culture of doing those things (and like, the AI Alignment community as is probably wouldn’t exist). You might disagree with me and him on this, in which case sure, update in that direction, but I don’t think it’s a crazy opinion to hold.
The timeline doesn’t make sense for this version of events at all. Eliezer was uninformed on this topic in 1999, at a time when Robin Hanson had already written about gambling on scientific theories (1990), prediction markets (1996), and other betting-related topics, as you can see from the bibliography of his Futarchy paper (2000). Before Eliezer wrote his sequences (2006-2009), the Long Now Foundation already had Long Bets (2003), and Tetlock had already written Expert Political Judgment (2005).
If Eliezer had not written his sequences, forecasting content would have filtered through to the EA community from contacts of Hanson: for instance, through blogging by other GMU economists like Caplan (2009), and of course through Jason Matheny, who worked at FHI, where Hanson was an affiliate. Matheny ran the ACE project (2010), which led to the science behind Superforecasting, a book that the EA community would certainly have discovered.
Hmm, I think these are good points. My best guess is that I don’t think we would have a strong connection to Hanson without Eliezer, though I agree that that kind of credit is harder to allocate (and it gets fuzzy what we even mean by “this community” as we extend into counterfactuals like this).
I do think the timeline here provides decent evidence in favor of less credit allocation (and I think against the stronger claim “we wouldn’t have a culture of [forecasting and predictions] without Eliezer”). My guess is in terms of causing that culture to take hold, Eliezer is probably still the single most-responsible individual, though I do now expect (after having looked into a bunch of comment threads from 1996 to 1999 and seeing many familiar faces show up) that a lot of the culture would show up without Eliezer.
Speaking for myself, Eliezer has played no role in encouraging me to give quantitative probability distributions. For me, that was almost entirely due to people like Tetlock and Bryan Caplan, both of whom I would have encountered regardless of Eliezer. I strongly suspect this is true of lots of people who are in EA but don’t identify with the rationalist community.
More generally, I do think that Eliezer and other rationalists overestimate how much influence they have had on wider views in the community. E.g., I have not read the sequences, and I just don’t think they play a big role in the internal story of a lot of EAs.
For me, even people like Nate Silver or David MacKay, who aren’t part of the community, have played a bigger role in encouraging quantification and probabilistic judgment.
This is my impression and experience as well
“My best guess is that I don’t think we would have a strong connection to Hanson without Eliezer”
Fwiw, I found Eliezer through Robin Hanson.
Yeah, I think this isn’t super rare, but overall still much less common than the reverse.
I’ll currently take your word for that because I haven’t been here nearly as long. I’ll mention that some of these contributions I don’t necessarily consider positive.
But the point is, is Yudkowsky a (major) contributor to a shared project, or is he a ruler directing others, like his quote suggests? How does he view himself? How do the different communities involved view him?
P.S. I disagree with whoever (strong-)downvoted your comment.
Yudkowsky often ~~complains~~ ~~rants~~ hopes people will form their own opinions instead of just listening to him; I can find references if you want. I also think he lately finds it ~~depressing~~ worrying that he’s got to be the responsible adult. Easy references: search for “Eliezer” in List Of Lethalities.

I think this strengthens my point, especially given how it is written in the post you linked. Telling people you’re the responsible adult, or the only one who notices things, still means telling them you’re smarter than them and they should just defer to you.
I’m trying to account for my biases in these comments, but I encourage others to go to that post, search for “Eliezer” as you suggested, and form their own views.
Those are four very different claims. In general, I think it’s bad to collapse all (real or claimed) differences in ability into a single status hierarchy, for the reasons stated in Inadequate Equilibria.
Eliezer is claiming that other people are not taking the problem sufficiently seriously, claiming ownership of it, trying to form their own detailed models of the full problem, and applying enough rigor and clarity to make real progress on the problem.
He is specifically not saying “just defer to me”, and in fact is saying that he and everyone else is going to die if people rely on deference here. A core claim in AGI Ruin is that we need more people with “not the ability to read this document and nod along with it, but the ability to spontaneously write it from scratch without anybody else prompting you”.
Deferring to Eliezer means that Eliezer is the bottleneck on humanity solving the alignment problem, which means we die. The thing Eliezer claims we need is a larger set of people who arrive at true, deep, novel insights about the problem on their own (without Eliezer even mentioning the insights, much less spending a ton of time trying to persuade anyone of them) and write them up.
It’s true that Eliezer endorses his current stated beliefs; this goes without saying, or he obviously wouldn’t have written them down. It doesn’t mean that he thinks humanity has any path to survival via deferring to him, or that he thinks he has figured out enough of the core problems (or could ever conceivably do so, on his own) to give humanity a significant chance of surviving. Quoting AGI Ruin:
The end of the “death with dignity” post is also alluding to Eliezer’s view that it’s pretty useless to figure out what’s true merely via deferring to Eliezer.
Thanks, those are some good counterpoints.
Eliezer is cleanly just a major contributor. If he went off the rails tomorrow, some people would follow him (and the community would be better with those few gone), but the vast majority would say “wtf is that Eliezer fellow doing”. I also don’t think he sees himself as the leader of the community.
Probably Eliezer likes Eliezer more than EA/Rationality likes Eliezer, because Eliezer really likes Eliezer. If I were as smart & good at starting social movements as Eliezer, I’d probably also have an inflated ego, so I don’t take it as too unreasonable of a character flaw.
More than Philip Tetlock (author of Superforecasting)?
Does that particular quote from Yudkowsky not strike you as slightly arrogant?
Yes, definitely much more than Philip Tetlock, given that our community had strong norms of forecasting and making bets before Tetlock had done most of his work on the topic (Expert Political Judgment was out, but as far as I can tell was not a major influence on people in the community, though I am not totally confident of that).
I am generally strongly against a culture of fake modesty. If I want people to make good decisions, they need to be able to believe things about themselves that might sound arrogant to others. Yes, it sounds arrogant to an external audience, but it also seems true, and whether it is true should be the dominant factor in whether it is good to say.
FWIW I think “it was 20 years ago” is a good reason not to take these failed predictions too seriously, and “he has disavowed these predictions after seeing they were false” is a bad reason to dismiss them.
If EY gets to disavow his mistakes, so does everyone else.
On 1 (the nanotech case):
I think your comment might give the misimpression that I don’t discuss this fact in the post or explain why I include the case. What I write is:
An additional reason why I think it’s worth distinguishing between his views on nanotech and (e.g.) your views on nuclear power: I think there’s a difference between an off-hand view picked up from other people vs. a fairly idiosyncratic view that you consciously adopted after a lot of reflection, decided to devote your professional life to, and founded an organization to address.
It’s definitely up to the reader to decide how relevant the nanotech case is. Since the case is not widely known, since it seems at least pretty plausibly relevant, and since the post twice flags his age at the time, I do still endorse including it.
At face value, as well: we’re trying to assess how much weight to give to someone’s extreme, outlier-ish prediction that an emerging technology is almost certain to kill everyone very soon. It just does seem very relevant, to me, that they previously had a different extreme, outlier-ish prediction that another emerging technology was very likely to kill everyone within a decade.
I don’t find it plausible that we should assign basically no significance to this.
On 6 (the question of whether Yudkowsky has acknowledged negative aspects of his track record):
Similarly, I think your comment may give the impression that I don’t discuss this point in the post. What I write is this:
On the general point that this post uses old examples:
Given the sorts of predictions involved (forecasts about pathways to transformative technologies), old examples are generally going to be more unambiguous than new examples. Similarly for risk arguments: it’s hard to have a sense of how new arguments are going to hold up. It’s only for older arguments that we can start to approach the ability to say that technological progress, progress in arguments, and evolving community opinion say something clear-ish about how strong the arguments were.
On signposting:
I think it’s possible another title would have been better (I chose a purposely bland one partly for the purpose of trying to reduce heat—and that might have been a mistake). But I do think I signpost what the post is doing fairly clearly.
The introduction says it’s focusing on “negative aspects” of Yudkowsky’s track record, the section heading for the section introducing the examples describes them as “cherry-picked,” and the start of the section introducing the examples has an italicized paragraph re-emphasizing that the examples are selective and commenting on the significance of this selectiveness.
On the role of the fast take-off assumption in classic arguments:
I disagree with this. I do think it’s fair to say that fast take-off was typically a premise of the classic arguments.
Two examples I have off-hand (since they’re in the slides from my talk) are from Yudkowsky’s exchange with Caplan and from Superintelligence. Superintelligence isn’t by Yudkowsky, of course, but hopefully is still meaningful to include (insofar as Superintelligence heavily drew on Yudkowsky’s work and was often accepted as a kind of distillation of the best arguments as they existed at the time).
From Yudkowsky’s debate with Caplan (2016):
(Caveat that the fast-take-off premise is stated a bit ambiguously here, so it’s not clear what level of rapidness is being assumed.)
From Superintelligence:
The decisive strategic advantage point is justified through a discussion of the possibility of a fast take-off. The first chapter of the book also starts by introducing the possibility of an intelligence explosion. It then devotes two chapters to the possibility of a fast take-off and the idea this might imply a decisive strategic advantage, before it gets to discussing things like the orthogonality thesis.
I think it’s also relevant that content from MIRI and people associated with MIRI, raising the possibility of extinction from AI, tended to very strongly emphasize (e.g. spend most of its time on) the possibility of a run-away intelligence explosion. The most developed classic pieces arguing for AI risk often have names like “Shaping the Intelligence Explosion,” “Intelligence Explosion: Evidence and Import,” “Intelligence Explosion Microeconomics,” and “Facing the Intelligence Explosion.”
Overall, then, I do think it’s fair to consider a fast-takeoff to be a core premise of the classic arguments. It wasn’t incidental or a secondary consideration.
[[Note: I’ve edited my comment, here, to respond to additional points. Although there are still some I haven’t responded to yet.]]
One quick response, since it was easy (might respond more later):
I do think takeoff speeds between 1 week and 10 years are a core premise of the classic arguments. I do think the situation looks very different if we spend 5+ years in the human domain, but I don’t think there are many who believe that that is going to happen.
I don’t think the distinction between 1 week and 1 year is that relevant to the core argument for AI risk, since either case seems to provide more than enough cause for likely doom, and that premise seems very likely to be true to me. I do think Eliezer believes things more on the order of 1 week than 1 year, but I don’t think the basic argument structure is that different in either case (though I do agree that the 1-year case opens us up to some more potential mitigating strategies).
“Orthogonality thesis: Intelligence can be directed toward any compact goal….
Instrumental convergence: An AI doesn’t need to specifically hate you to hurt you; a paperclip maximizer doesn’t hate you but you’re made out of atoms that it can use to make paperclips, so leaving you alive represents an opportunity cost and a number of foregone paperclips….
Rapid capability gain and large capability differences: Under scenarios seeming more plausible than not, there’s the possibility of AIs gaining in capability very rapidly, achieving large absolute differences of capability, or some mixture of the two….
1-3 in combination imply that Unfriendly AI is a critical problem-to-be-solved, because AGI is not automatically nice, by default does things we regard as harmful, and will have avenues leading up to great intelligence and power.”
1-3 in combination don’t imply anything with high probability.
My impression is that the post is a somewhat unfortunate attempt to “patch” the situation in which many generically too-trusting people updated a lot on AGI Ruin: A List of Lethalities and Death with Dignity, and on the subsequent deference/update cascades.
In my view, the deeper problem here is that, instead of engaging with disagreements about model internals, many of these people make some sort of “averaging conclusions” move, based on signals like seniority, karma, vibes, etc.
Many of these signals are currently wildly off from truth-tracking, so you get attempts to push the conclusion-updates directly.
This is really minor and nitpicky, and I agree with much of your overall points, but I don’t think equivocating between “barely 20” and “early high-school” is fair. The former is a normal age to be a third-year university student in the US, and plenty of college-age EAs are taken quite seriously by the rest of us.
Oh, hmm, I think this is just me messing up the differences between the U.S. and German education systems (I was 18 and 19 in high-school, and enrolled in college when I was 20).
I think the first quote on nanotechnology was actually written in 1996 originally (though maybe updated in 1999), which would put Eliezer at ~17 years old when he wrote it.
The second quote was I think written in more like 2000, which would put him more in the early college years, and I agree that it seems good to clarify that.
Thank you, this clarification makes sense to me!
To clarify, what I said was:
Then I listed a bunch of ways in which the world looks more like Robin’s predictions, particularly regarding continuity and locality. I said Robin’s predictions about AI timelines in particular looked bad. This isn’t closely related to the topic of your section 3, where I mostly agree with the OP.
Hmm, I think this is fair, rereading that comment.
I feel a bit confused here, since at the scale that Robin is talking about, timelines and takeoff speeds seem very inherently intertwined (like, if Robin predicts really long timelines, this clearly implies a much slower takeoff speed, especially when combined with gradual continuous increases). I agree there is a separate competitiveness dimension that you and Robin are closer on, which is important for some of the takeoff dynamics, but on overall takeoff speed, I feel like you are closer to Eliezer than Robin (Eliezer predicting weeks to months to cross the general intelligence human->superhuman gap, you predicting single-digit years to cross that gap, and Hanson predicting decades to cross that gap). Though it’s plausible that I am missing something here.
In any case, I agree that my summary of your position here is misleading, and will edit accordingly.
I think my views about takeoff speeds are generally similar to Robin’s though neither Robin nor Eliezer got at all concrete in that discussion so I can’t really say. You can read this essay from 1998 with his “outside-view” guesses, which I suspect are roughly in line with what he’s imagining in the FOOM debate.
I think that doc implies significant probability on a “slow” takeoff of 8, 4, 2… year doublings (more like the industrial revolution), but a broad distribution over dynamics which also puts significant probability on e.g. a relatively fast jump to a 1 month doubling time (more like the agricultural revolution). In either case, over the next few doublings he would by default expect still further acceleration. Overall I think this is basically a sensible model.
(I agree that shorter timelines generally suggest faster takeoff, but I think either Robin or Eliezer’s views about timelines would be consistent with either Robin or Eliezer’s views about takeoff speed.)
If done in a polite and respectful manner, I think this would be a genuinely good idea.