bmg comments on On Deference and Yudkowsky’s AI Risk Estimates

bmg 21 Jun 2022 22:41 UTC
129 points
0 ∶ 0
I really appreciate the time people have taken to engage with this post (and actually hope the attention cost hasn’t been too significant). I decided to write some post-discussion reflections on what I think this post got right and wrong.

The reflections became unreasonably long—and almost certainly should be edited down—but I’m posting them here in a hopefully skim-friendly format. They cover what I see as some mistakes with the post, first, and then cover some views I stand by.

Things I would do differently in a second version of the post:

1. I would either drop the overall claim about how much people should defer to Yudkowsky — or defend it more explicitly

At the start of the post, I highlight the two obvious reasons to give Yudkowsky’s risk estimates a lot of weight: (a) he’s probably thought more about the topic than anyone else and (b) he developed many of the initial AI risk arguments. I acknowledge that many people, justifiably, treat these as important factors when (explicitly or implicitly) deciding how much to defer to Yudkowsky.

Then the post gives some evidence that, at each stage of his career, Yudkowsky has made a dramatic, seemingly overconfident prediction about technological timelines and risks—and at least hasn’t obviously internalised lessons from these apparent mistakes.

The post expresses my view that these two considerations at least counterbalance each other—so that, overall, Yudkowsky’s risk estimates shouldn’t be given more weight than (e.g.) those of other established alignment researchers or the typical person on the OpenPhil worldview investigation team.

But I don’t do a lot in the post to actually explore how we should weigh these factors up. In that sense: I think it’d be fair to regard the post’s central thesis as importantly under-supported by the arguments contained in the post.

I should have either done more to explicitly defend my view or simply framed the post as “some evidence about the reliability of Yudkowsky’s risk estimates.”

2. I would be clearer about how and why I generated these examples

In hindsight, this is a significant oversight on my part. The process by which I generated these examples is definitely relevant for judging how representative they are—and, therefore, how much to update on them. But I don’t say anything about this in the post. My motives (or at least conscious motives) are also part of the story that I only discuss in pretty high-level terms, but seem like they might be relevant for forming judgments.

For context, then, here was the process:

A few years ago, I tried to get a clearer sense of the intellectual history of the AI risk and existential risk communities. For that reason, I read a bunch of old white papers, blog posts, and mailing list discussions.

These gave me the impression that Yudkowsky’s track record (and—to some extent—the track record of the surrounding community) was worse than I’d realised. From reading old material, I basically formed something like this impression: “At each stage of Yudkowsky’s professional life, his work seems to have been guided by some dramatic and confident belief about technological trajectories and risks. The older beliefs have turned out to be wrong. And the ones that haven’t yet resolved at least seem to have been pretty overconfident in hindsight.”

I kept encountering the idea that Yudkowsky has an exceptionally good track record or that he has an unparalleled ability to think well about AI (he’s also expressed view himself) - and I kept thinking, basically, that this seemed wrong. I wrote up some initial notes on this discrepancy at some point, but didn’t do anything with them.

I eventually decided to write something public after the “Death with Dignity” post, since the view it expresses (that we’re all virtually certain to die soon) both seems wrong to me and very damaging if it’s actually widely adopted in the community. I also felt like the “Death with Dignity” post was getting more play than it should, simply because people have a strong tendency to give Yudkowsky’s views weight. I can’t imagine a similar post written by someone else having nearly as large of an impact. Notably, since that post didn’t really have substantial arguments in it (although the later one did), I think the fact it had an impact is seemingly a testament to the power of deference; I think it’d be hard to look at the reaction to that post and argue that it’s only Yudkowsky’s arguments (rather than his public beliefs in-and-of-themselves) that have a major impact on the community.

People are obviously pretty aware of Yudkowsky’s positive contributions, but my impression is that (especially) new community members tended not to be aware of negative aspects of his track record. So I wanted to write a post drawing attention to the negative aspects.

I was initially going to have the piece explicitly express the impression I’d formed, which was something like: “At each stage of Yudkowsky’s professional life, his work has been guided by some dramatic and seemingly overconfident belief about technological trajectories and risks.” The examples in the post were meant to map onto the main ‘animating predictions’ about technology he had at each stage of his career. I picked out the examples that immediately came to mind.

Then I realised I wasn’t at all sure I could defend the claim that these were his main ‘animating predictions’ - the category was obviously extremely vague, and the main examples that came to mind were extremely plausibly a biased sample. I thought there was a good chance that if I reflected more, then I’d also want to include various examples that were more positive.

I didn’t want to spend the time doing a thorough accounting exercise, though, so I decided to drop any claim that the examples were representative and just describe them as “cherry-picked” — and add in lots of caveats emphasising that they’re cherry-picked.

(At least, these were my conscious thought processes and motivations as I remember them. I’m sure other factors played a role!)

3. I’d tweak my discussion of take-off speeds

I’d make it clearer that my main claim is: it would have been unreasonable to assign a very high credence to fast take-offs back in (e.g.) the early- or mid-2000s, since the arguments for fast take-offs had significant gaps. For example, there were a lots of possible countervailing arguments for slow take-offs that pro-fast-take-off authors simply hadn’t address yet — as evidenced, partly, by the later publication of slow-take-off arguments leading a number of people to become significantly more sympathetic to slow take-offs. (I’m not claiming that there’s currently a consensus against fast-take-off views.)

4. I’d add further caveats to the “coherence arguments” case—or simply leave it out

Rohin’s and Oli’s comments under the post have made me aware that there’s a more positive way to interpret Yudkowsky’s use of coherence arguments. I’m not sure if that interpretation is correct, or if it would actually totally undermine the example, but this is at minimum something I hadn’t reflected on. I think it’s totally possible that further reflection would lead me to simply remove the example.

Positions I stand by:

On the flipside, here’s a set of points I still stand by:

1. If a lot of people in the community believe AI is probably going to kill everyone soon, then (if they’re wrong) this can have really important negative effects

In terms of prioritisation: My prediction is that if you were to ask different funders, career advisors, and people making career decisions (e.g. deciding whether to go into AI policy or bio policy) how much they value having a good estimate of AI risk, they’ll very often answer that they value it a great deal. I do think that over-estimating the level of risk could lead to concretely worse decisions.

In terms of community health: I think that believing you’re probably going to die soon is probably bad for a large portion of people. Reputationally: Being perceived as believing that everyone is probably going to die soon (particularly if this actually an excessive level of worry) also seems damaging.

I think we should also take seriously the tail-risk that at least one person with doomy views (even if they’re not directly connected to the existential risk community) will take dramatic and badly harmful actions on the basis of their views.

2. Directly and indirectly, deference to Yudkowsky has a significant influence on a lot of people’s views

As above: One piece of evidence for this is Yudkowsky’s “Death with Dignity” post triggered a big reaction, even though it didn’t contain any significant new arguments. I think his beliefs (above and beyond his arguments) clearly do have an impact.

Another reason to believe deference is a factor: I think it’s both natural and rational for people, particularly people new to an area, to defer to people with more expertise in that area.^[1] Yudkowsky is one of the most obvious people to defer to, as one of the two people most responsible for developing and popularising AI risk arguments and as someone who has (likely) spent more time thinking about the subject than anyone else.

Beyond that: A lot of people also clearly in general have huge amount of respect for Yudkowsky, sometimes more than they have for any other public intellectual. I think it’s natural (and sensible) for people’s views to be influenced by the views of the people they respect. In general, I think, unless you have tremendous self-control, this will tend to happen sub-consciously even if you don’t consciously choose to defer to the people you respect.

Also, people sometimes just do talk about Yudkowsky’s track record or reputation as a contributing factor to their views.

3. The track records of influential intellectuals (including Yudkowsky) should be publicly discussed.

A person’s track-record provides evidence about how reliable their predictions are. If people are considering how much to defer to some intellectual, then they should want to know what their track record (at least within the relevant domain) looks like.

The main questions that matter are: What has the intellectual gotten wrong and right? Beyond whether they were wrong or right, about a given case, does it also seem like their predictions were justified? If they’ve made certain kinds of mistakes in the past, do we now have reason to think they won’t repeat those kinds of mistakes?

4. Yudkowsky’s track record suggests a substantial bias toward dramatic and overconfident predictions.

One counter—which I definitely think it’s worth reflecting on—is that it might be possible to generate a similarly bias-suggesting list of examples like this for any other public intellectual or member of the existential risk community.

I’ll focus on one specific comment, suggesting that Yudkowsky’s incorrect predictions about nanotechnology are in the same reference class as ‘writing a typically dumb high school essay.’ The counter goes something like this: Yes, it was possible to find this example from Yudkowsky’s past—but that’s not importantly different than being able to turn up anyone else’s dumb high school essay about (e.g.) nuclear power.

Ultimately, I don’t buy the comparison. I think it’s really out-of-distribution for someone in their late teens and early twenties to pro-actively form the view that an emerging technology is likely to kill everyone within a decade, found an organization and devote years of their professional life to address the risk, and talk about how they’re the only person alive who can stop it.

That just seems very different from writing a dumb high school essay. Much more than a standard dumb high school essay, I think this aspect of Yudkowsky’s track record really does suggest a bias toward dramatic and overconfident predictions. This prediction is also really strikingly analogous to the prediction Yudkowsky is making right now—its relevance is clearly higher than the relevance of (e.g.) a random poorly thought-out view in a high school essay.

(Yudkowsky’s early writing and work is also impressive, in certain ways, insofar as it suggests a much higher level of originality of thought and agency than the typical young person has. But the fact that this example is impressive doesn’t undercut, I think, the claim that it’s also highly suggestive of a bias toward highly confident and dramatic predictions.)

5. Being one of the first people to identify, develop, or take seriously some idea doesn’t necessarily mean that you predictions about the idea will be unusually reliable

By analogy:
- I don’t think we can assume that the first person to take the covid lab leak theory seriously (when others were dismissive) is currently the most reliable predictor of whether the theory is true.
- I don’t think we can assume that the first person to develop the many worlds theory of quantum mechanics (when others were dismissive) would currently be the best person to predict whether the theory is true, if they were still alive.
There are, certainly, reasons to give pioneers in a domain special weight when weighing expert opinion in that domain.^[2] But these reasons aren’t absolute.

There are even easons that point in the opposite direction: we might worry that the pioneer has an attachment to their theory, so will be biased toward believing it is true and as important as possible. We might also worry that the pioneering-ness of their beliefs is evidence that these beliefs front-ran the evidence and arguments (since one way to be early is to simply be excessively confident). We also have less evidence of their open-mindedness than we do for the people who later on moved toward the pioneer’s views — since moving toward the pioneer’s views, when you were initially dismissive, is at least a bit of evidence for open-mindedness and humility.^[3]

Overall, I do think we should tend defer more to pioneers (all else being equal). But this tendency can definitely be overruled by other evidence and considerations.

6. The causal effects that people have had on the world don’t (in themselves) have implications for how much we should defer to them

At least in expectation, so far, Eliezer Yudkowsky has probably had a very positive impact on the world. There is a plausible case to be made that misaligned AI poses a substantial existential risk—and Yudkowsky’s work has probably, on net, massively increased the number of people thinking about it and taking it seriously. He’s also written essays that have exposed huge numbers of people to other important ideas and helped them to think more clearly. It makes sense for people to applaud all of this.

Still, I don’t think his positive causal effect on the world gives people much additional reason to be deferential to him.

Here’s a dumb thought experiment: Suppose that Yudkowsky wrote all of the same things, but never published them. But suppose, also, that a freak magnetic storm ended up implanting all of the same ideas in his would-be-readers’ brains. Would this absence of a casual effect count against deferring to Yudkowsky? I don’t think so. The only thing that ultimately matters, I think, is his track record of beliefs—and the evidence we currently have about how accurate or justified those beliefs were.

I’m not sure anyone disagrees with the above point, but I did notice there seemed to be a decent amount of discussion in the comments about Yudkowsky’s impact—and I’m not sure I think this issue will ultimately be relevant.^[4]
1. ↩︎
  For example: I had ten hours to form a view about the viability of some application of nanotechnology, I definitely wouldn’t want to ignore the beliefs of people who have already thought about the question. Trying to learn the relevant chemistry and engineering background wouldn’t be a good use of my time.
2. ↩︎
  One really basic reason is simply that they’ve simply had more time to think about certain subjects than anyone else.
3. ↩︎
  Here’s a concrete case: Holden Karnofsky eventually moved toward taking AI risks seriously, after publicly being fairly dismissive of it, and then wrote up a document analysing why he was initially dismissive and drawing lessons from the experience. It seems like we could count that as positive evidence about his future judgment.
4. ↩︎
  Even though I’ve just said I’m not sure this question is relevant, I do also want to say a little bit about Yudkowsky’s impact. I personally think’s probably had a very significant impact. Nonetheless, I also think the impact can be overstated. For example, I think, it’s been suggested that the effective altruism community might not be very familiar with concepts like Bayesian or the importance of overcoming bias if it weren’t for Yudkowsky’s writing. I don’t really find that particular suggestion plausible.
  
  Here’s one data point I can offer from my own life: Through a mixture of college classes and other reading, I’m pretty confident I had already encountered the heuristics and biases literature, Bayes’ theorem, Bayesian epistemology, the ethos of working to overcome bias, arguments for the many worlds interpretation, the expected utility framework, population ethics, and a number of other ‘rationalist-associated’ ideas before I engaged with the effective altruism or rationalist communities. For example, my college had classes in probability theory, Bayesian epistemology, and the philosophy of quantum mechanics, and I’d read at least parts of books like Thinking Fast and Slow, the Signal and the Noise, the Logic of Science, and various books associated with the “skeptic community.” (Admittedly, I think it would have been harder to learn some of these things if I’d gone to college a bit earlier or had a different major. I also probably “got lucky” in various ways with the classes I took and books I picked up.) See also Carl Shulman making a similar point and John Halstead also briefly commenting the way in which he personally encountered some the relevant ideas.
What links here?
- RobBensinger 23 Jun 2022 3:41 UTC
  32 points
  0 ∶ 0
  Parent
  I noted some places I agree with your comment here, Ben. (Along with my overall take on the OP.)
  Some additional thoughts:
  Notably, since that post didn’t really have substantial arguments in it (although the later one did), I think the fact it had an impact is seemingly a testament to the power of deference
  The “death with dignity” post came in the wake of Eliezer writing hundreds of thousands of words about why he thinks alignment is hard in the Late 2021 MIRI Conversations (in addition to the many specific views and arguments about alignment difficulty he’s written up in the preceding 15+ years). So it seems wrong to say that everyone was taking it seriously based on deference alone.
  The post also has a lot of content beyond “p(doom) is high”. Indeed, I think the post’s focus (and value-add) is mostly in its discussion of rationalization, premature/excessive conditionalizing, and ethical injunctions, not in the bare assertion that p(doom) is high. Eliezer was already saying pretty similar stuff about p(doom) back in September.
  I’d make it clearer that my main claim is: it would have been unreasonable to assign a very high credence to fast take-offs back in (e.g.) the early- or mid-2000s, since the arguments for fast take-offs had significant gaps. For example, there were a lots of possible countervailing arguments for slow take-offs that pro-fast-take-off authors simply hadn’t address yet — as evidenced, partly, by the later publication of slow-take-off arguments leading a number of people to become significantly more sympathetic to slow take-offs.
  I disagree; I think that, e.g., noting how powerful and widely applicable general intelligence has historically been, and noting a bunch of standard examples of how human cognition is a total shitshow, is sufficient to have a very high probability on hard takeoff.
  I think the people who updated a bunch toward hard takeoff based on the recent debate were making a mistake, and should have already had a similarly high p(hard takeoff) going back to the Foom debate, if not earlier.
  Insofar as others disagree, I obviously think it’s a good thing for people to publish arguments like “but ML might be very competitive”, and for people to publicly respond to them. But I don’t think “but ML might be very competitive” and related arguments ought to look compelling at a glance (given the original simple arguments for hard takeoff), so I don’t think someone should need to consider the newer discussion in order to arrive at a confident hard-takeoff view.
  (Also, insofar as Paul recently argued for X and Eliezer responded with a valid counter-argument for Y, it doesn’t follow that Eliezer had never considered anything like X or Y in initially reaching his confidence. Eliezer’s stated view is that the new Paul arguments seem obviously invalid and didn’t update him at all when he read them. Your criticism would make more sense here if Eliezer had said “Ah, that’s an important objection I hadn’t considered; but now that I’m thinking about it, I can generate totally new arguments that deal with the objections, and these new counter-arguments seem correct to me.”)
  The main questions that matter are: What has the intellectual gotten wrong and right? Beyond whether they were wrong or right, about a given case, does it also seem like their predictions were justified?
  At least as important, IMO, is the visible quality of their reasoning and arguments, and their retrodictions.
  AGI, moral philosophy, etc. are not topics where we can observe extremely similar causal processes today and test all the key claims and all the key reasoning heuristics with simple experiments. Tossing out ‘argument evaluation’ and ‘how well does this fit what I already know?’ altogether would mean tossing out the majority of our evidence about how much weight to put on people’s views.
  Ultimately, I don’t buy the comparison. I think it’s really out-of-distribution for someone in their late teens and early twenties to pro-actively form the view that an emerging technology is likely to kill everyone within a decade, found an organization and devote years of their professional life to address the risk, and talk about how they’re the only person alive who can stop it.
  I take the opposite view on this comparison. I agree that this is really unusual, but I think the comparison is unfavorable to the high school students, rather than unfavorable to Eliezer. Having unusual views and then not acting on them in any way is way worse than actually acting on your predictions.
  I agree that Eliezer acting on his beliefs to this degree suggests he was confident; but in a side-by-side comparison of a high schooler who’s expressed equal confidence in some other unusual view, but takes no unusual actions as a result, the high schooler is the one I update negatively about.
  (This also connects up to my view that EAs generally are way too timid/passive in their EA activity, don’t start enough new things, and (when they do start new things) start too many things based on ‘what EA leadership tells them’ rather than based on their own models of the world. The problem crippling EA right now is not that we’re generating and running with too many wildly different, weird, controversial moonshot ideas. The problem is that we’re mostly just passively sitting around, over-investing in relatively low-impact meta-level interventions, and/or hoping that the most mainstream already-established ideas will somehow suffice.)
  - Oliver Sourbut 25 Jun 2022 9:08 UTC
    23 points
    0 ∶ 0
    Parent
    I just wanted to state agreement that it seems a large number of people largely misread Death with Dignity, at least according to what seems to me the most plausible intended message: mainly about the ethical injunctions (which are very important as a finitely-rational and prone-to-rationalisation being), as Yudkowsky has written of in the past.
    
    The additional detail of ‘and by the way this is a bad situation and we are doing badly’ is basically modal Yudkowsky schtick and I’m somewhat surprised it updated anyone’s beliefs (about Yudkowsky’s beliefs, and therefore their all-things-considered-including-deference beliefs).
    
    I think if he had been a little more audience-aware he might have written it differently. Then again maybe not, if the net effect is more attention and investment in AI safety—and more recent posts and comments suggest he’s more willing than before to use certain persuasive techniques to spur action (which seems potentially misguided to me, though understandable).
  - Michael St Jules 🔸 23 Jun 2022 18:29 UTC
    11 points
    0 ∶ 0
    Parent
    The “death with dignity” post came in the wake of Eliezer writing hundreds of thousands of words about why he thinks alignment is hard in the Late 2021 MIRI Conversations (in addition to the many specific views and arguments about alignment difficulty he’s written up in the preceding 15+ years). So it seems wrong to say that everyone was taking it seriously based on deference alone.
    I think “deference alone” is a stronger claim than the one we should worry about. People might read the arguments on either side (or disproportionately Eliezer’s arguments), but then defer largely to Eliezer’s weighing of arguments because of his status/position, confidence, references to having complicated internal models (that he often doesn’t explain or link explanations to), or emotive writing style.
    What share of people with views similar to Eliezer’s do you expect to have read these conversations? They’re very long, not well organized, and have no summaries/takeaways. The format seems pretty bad if you value your time.
    I think the AGI Ruin: A List of Lethalities post was formatted pretty accessibly, but that came after death with dignity.
    Also, insofar as Paul recently argued for X and Eliezer responded with a valid counter-argument for Y, it doesn’t follow that Eliezer had never considered anything like X or Y in initially reaching his confidence. Eliezer’s stated view is that the new Paul arguments seem obviously invalid and didn’t update him at all when he read them.
    If the new Paul arguments seem obviously invalid, then Eliezer should be able to explain why in such a way that convinces Paul. Has this generally been the case?
- Habryka [Deactivated] 23 Jun 2022 5:14 UTC
  21 points
  0 ∶ 0
  Parent
  I appreciate this update!
  Then the post gives some evidence that, at each stage of his career, Yudkowsky has made a dramatic, seemingly overconfident prediction about technological timelines and risks—and at least hasn’t obviously internalised lessons from these apparent mistakes.
  I am confused about you bringing in the claim of “at each stage of his career”, given that the only two examples you cited that seemed to provide much evidence here were from the same (and very early) stage of his career. Of course, you might have other points of evidence that point in this direction, but I did want to provide some additional pushback on the “at each stage of his career” point, which I think you didn’t really provide evidence for.
  I do think finding evidence for each stage of his career would of course be time-consuming, and I understand that you didn’t really want to go through all of that, but it seemed good to point out explicitly.
  Ultimately, I don’t buy the comparison. I think it’s really out-of-distribution for someone in their late teens and early twenties to pro-actively form the view that an emerging technology is likely to kill everyone within a decade, found an organization and devote years of their professional life to address the risk, and talk about how they’re the only person alive who can stop it.
  FWIW, indeed in my teens I basically did dedicate a good chunk of my time and effort towards privacy efforts out of a concern for US and UK-based surveillance-state concerns. I was in high-school, so making it my full-time efforts was a bit hard, though I did help found a hackerspace in my hometown that had a lot of privacy concerns baked into the culture, and I did write a good number of essays on this. I think the key difference between me and Eliezer here is more the fact that Eliezer was home-schooled and had experience doing things on his own, and not some kind of other fact about his relationship to the ideas being very different.
  It’s plausible you should update similarly on me, which I think isn’t totally insane (I do think I might have, as Luke put it, the “taking ideas seriously gene”, which I would also associate with taking other ideas to their extremes, like religious beliefs).
- Owen Cotton-Barratt 21 Jun 2022 23:51 UTC
  21 points
  0 ∶ 0
  Parent
  I really appreciated this update. Mostly it checks out to me, but I wanted to push back on this:
  Here’s a dumb thought experiment: Suppose that Yudkowsky wrote all of the same things, but never published them. But suppose, also, that a freak magnetic storm ended up implanting all of the same ideas in his would-be-readers’ brains. Would this absence of a casual effect count against deferring to Yudkowsky? I don’t think so. The only thing that ultimately matters, I think, is his track record of beliefs—and the evidence we currently have about how accurate or justified those beliefs were.
  It seems to me that a good part of the beliefs I care about assessing are the beliefs about what is important. When someone has a track record of doing things with big positive impact, that’s some real evidence that they have truth-tracking beliefs about what’s important. In the hypothetical where Yudkowsky never published his work, I don’t get the update that he thought these were important things to publish, so he doesn’t get credit for being right about that.
  - Yonatan Cale 22 Jun 2022 13:30 UTC
    4 points
    0 ∶ 0
    Parent
    There’s also (imperfect) information in “lots of smart people thought about EY’s opinions and agree with him” that you don’t get from the freak magnetic storm scenario.
- richard_ngo 23 Jun 2022 6:41 UTC
  16 points
  0 ∶ 0
  Parent
  Thanks for writing this update. I think my number one takeaway here is something like: when writing a piece with the aim of changing community dynamics, it’s important to be very clear about motivations and context. E.g. I think a version of the piece which said “I think people are overreacting to Death with Dignity, here are my specific models of where Yudkowsky tends to be overconfident, here are the reasons why I think people aren’t taking those into account as much as they should” would have been much more useful and much less controversial than the current piece, which (as I interpret it) essentially pushes a general “take Yudkowsky less seriously” meme (and is thereby intrinsically political/statusy).
- Yonatan Cale 22 Jun 2022 13:38 UTC
  10 points
  0 ∶ 0
  Parent
  I’m a bit confused about a specific small part:
  tendency toward expressing dramatic views
  I imagine that for many people, including me (including you?), once we work on [what we believe to be] preventing the world from ending, we would only move to another job if it was also preventing the world from ending, probably in an even more important way.
  In other words, I think “working at a 2nd x-risk job and believing it is very important” is mainly predicted by “working at a 1st x-risk job and believing it is very important”, much more than by personality traits.
  This is almost testable, given we have lots of people working on x-risk today and believing it is very important. But maybe you can easily put your finger on what I’m missing?
- ekka 22 Jun 2022 0:56 UTC
  9 points
  0 ∶ 0
  Parent
  For what it’s worth, I found this post and the ensuing comments very illuminating. As a person relatively new to both EA and the arguments about AI risk, I was a little bit confused as to why there was not much push back on the very high confidence beliefs about AI doom within the next 10 years. My assumption had been that there was a lot of deference to EY because of reverence and fealty stemming from his role in getting the AI alignment field started not to mention the other ways he has shaped people’s thinking. I also assumed that his track record on predictions was just ambiguous enough for people not to question his accuracy. Given that I don’t give much credence to the idea that prophets/oracles exist, I thought it unlikely that the high confidence on his predictions were warranted on the count that there doesn’t seem to be much evidence supporting the accuracy of long range forecasts. I did not think that there were such glaring mispredictions made by EY in the past so thank you for highlighting them.
- Verden 22 Jun 2022 0:22 UTC
  8 points
  0 ∶ 0
  Parent
  I feel like people are missing one fairly important consideration when discussing how much to defer to Yudkowsky, etc. Namely, I’ve heard multiple times that Nate Soares, the executive director of MIRI, has models of AI risk that are very similar to Yudkowsky’s, and their p(doom) are also roughly the same. My limited impression is that Soares is no less smart or otherwise capable than Yudkowsky. So, when having this kind of discussion, focusing on Yudkowsky’s track record or whatever, I think it’s good to remember that there’s another very smart person, who entered AI safety much later than Yudkowsky, and who holds very similar inside views on AI risk.
  - technicalities 22 Jun 2022 8:01 UTC
    7 points
    0 ∶ 0
    Parent
    This isn’t much independent evidence I think: seems unlikely that you could become director of MIRI unless you agreed. (I know that there’s a lot of internal disagreement at other levels.)
    - Verden 22 Jun 2022 12:16 UTC
      9 points
      0 ∶ 0
      Parent
      My point has little to do with him being the director of MIRI per se.
      I suppose I could be wrong about this, but my impression is that Nate Soares is among the top 10 most talented/insightful people with elaborate inside view and years of research experience in AI alignment. He also seems to agree with Yudkowsky on a whole lot of issues and predicts about the same p(doom) for about the same reasons. And I feel that many people don’t give enough thought to the fact that while e.g. Paul Christiano has interacted a lot with Yudkowsky and disagreed with him on many key issues (while agreeing on many others), there’s also Nate Soares, who broadly agrees with Yudkowsky’s models that predict very high p(doom).
      Another, more minor point: if someone is bringing up Yudkowsky’s track record in the context of his extreme views on AI risk, it seems helpful to talk about Soares’ track record as well.
      - Guy Raveh 22 Jun 2022 21:57 UTC
        3 points
        0 ∶ 0
        Parent
        I think this maybe argues against a point not made in the OP. Garfinkel isn’t saying “disregard Yudkowsky’s views”—rather he’s saying “don’t give them extra weight just because Yudkowsky’s the one saying them”.
        
        For example, from his reply to Richard Ngo:
        
        I think it’s really important to seperate out the question “Is Yudkowsky an unusually innovative thinker?” and the question “Is Yudkowsky someone whose credences you should give an unusual amount of weight to?”
        
        I read your comment as arguing for the former, which I don’t disagree with. But that doesn’t mean that people should currently weigh his risk estimates more highly than they weigh the estimates of other researchers currently in the space
        
        So at least from Garfinkel’s perspective, Yudkowsky and Soares do count as data points, they’re just equal in weight to other relevant data points.
        
        (I’m not expressing any of my own, mostly unformed, views here)
        RobBensinger 23 Jun 2022 1:08 UTC
        4 points
        0 ∶ 0
        Parent
        So at least from Garfinkel’s perspective, Yudkowsky and Soares do count as data points, they’re just equal in weight to other relevant data points.
        Ben has said this about Eliezer, but not about Nate, AFAIK.
- David Mathers🔸 22 Jun 2022 10:59 UTC
  5 points
  0 ∶ 0
  Parent
  ‘Here’s one data point I can offer from my own life: Through a mixture of college classes and other reading, I’m pretty confident I had already encountered the heuristics and biases literature, Bayes’ theorem, Bayesian epistemology, the ethos of working to overcome bias, arguments for the many worlds interpretation, the expected utility framework, population ethics, and a number of other ‘rationalist-associated’ ideas before I engaged with the effective altruism or rationalist communities.’
  
  I think some of this is just a result of being a community founded partly by analytic philosophers. (though as a philosopher I would say that!).
  
  I think it’s normal to encounter some of these ideas in undergrad philosophy programs. At my undergrad back in 2005-09 there was a whole upper-level undergraduate course in decision theory. I don’t think that’s true everywhere all the time, but I’d be surprised if it was wildly unusual. I can’t remember if we covered population ethics in any class, but I do remember discovering Parfit on the Repugnant Conclusion in 2nd-year of undergrad because one of my ethics lecturers said Reasons and Persons was a super-important book. In terms of the Oxford phil scene where the term “effective altruism” was born, the main titled professorship in ethics at that time was held by John Broome, a utilitarianism-sympathetic former economist, who had written famous stuff on expected utility theory. I can’t remember if he was the PhD supervisor of anyone important to the founding of EA, but I’d be astounded if some of the phil. people involved in that had not been reading his stuff and talking to him about it. Most of the phil. physics people at Oxford were gung-ho for many worlds, it’s not a fringe view in philosophy of physics as far as I know. (Though I think Oxford was kind of a centre for it and there was more dissent elsewhere.) As far as I can tell, Bayesian epistemology in at least some senses of that term is a fairly well-known approach in philosophy of science. Philosophers specializing in epistemology might more often ignore it, but they know it’s there. And not all of them ignore it! I’m not an epistemologist, by my doctoral supervisor was, and it’s not unusual for his work to refer to Bayesian ideas in modelling stuff about how to evaluate evidence. (I.e. in uhm, defending the fine-tuning argument for the existence of God, which might not be the best use, but still!: https://www.yoaavisaacs.com/uploads/6/9/2/0/69204575/ms_for_fine-tuning_fine-tuning.pdf). (John was my supervisor, not Yoav.)
  
  A high interest in bias stuff might genuinely be more an Eliezer/LessWrong legacy though.
  - Pablo 22 Jun 2022 12:46 UTC
    16 points
    0 ∶ 0
    Parent
    the main titled professorship in ethics at that time was held by John Broome, a utilitarianism-sympathetic former economist, who had written famous stuff on expected utility theory. I can’t remember if he was the PhD supervisor of anyone important to the founding of EA, but I’d be astounded if some of the phil. people involved in that had not been reading his stuff and talking to him about it.
    Indeed, Broome co-supervised the doctoral theses of both Toby Ord and Will MacAskill. And Broome was, in fact, the person who advised Will to get in touch with Toby, before the two had met.
  - Linch 24 Jun 2022 20:14 UTC
    15 points
    0 ∶ 0
    Parent
    Speaking for myself, I was interested in a lot of the same things in the LW cluster (Bayes, approaches to uncertainty, human biases, utilitarianism, philosophy, avoiding the news) before I came across LessWrong or EA. The feeling is much more like “I found people who can describe these ideas well” than “oh these are interesting and novel ideas to me.” (I had the same realization when I learned about utilitarianism...much more of a feeling that “this is the articulation of clearly correct ideas, believing otherwise seems dumb”).
    
    That said, some of the ideas on LW that seemed more original to me (AI risk, logical decision theory stuff, heroic responsibility in an inadequate world), do seem both substantively true and extremely important, and it took me a lot of time to be convinced of this.
    (There are also other ideas that I’m less sure about, like cryonics and MW).
  - Guy Raveh 22 Jun 2022 11:43 UTC
    3 points
    0 ∶ 0
    Parent
    Veering entirely off-topic here, but how does the many worlds hypothesis tie in with all the rest of the rationality/EA stuff?
    - Yonatan Cale 22 Jun 2022 13:41 UTC
      4 points
      0 ∶ 0
      Parent
      [replying only to you with no context]
      EY pointed out the many worlds hypothesis as a thing that even modern science, specifically physics (which is considered a very well functioning science, it’s not like social psychology), is missing.
      And he used this as an example to get people to stop trusting authority, including modern science, which many people around him seem to trust.
      I think this is a reasonable reference.
      - Guy Raveh 22 Jun 2022 21:37 UTC
        2 points
        0 ∶ 0
        Parent
        Can’t say any of that makes sense to me. I have the feeling there’s some context I’m totally missing (or he’s just wrong about it). I may ask you about this in person at some point :)
- anonymous_ea 4 Jul 2022 1:13 UTC
  1 point
  1 ∶ 0
  Parent
  Edit: I think this came off more negatively than I intended it to, particularly about Yudkowsky’s understanding of physics. The main point I was trying to make is that Yudkowsky was overconfident, not that his underlying position was wrong. See the replies for more clarification.
  I think there’s another relevant (and negative) data point when discussing Yudkowsky’s track record: his argument and belief that the Many-Worlds Interpretation of quantum mechanics is the only viable interpretation of quantum mechanics, and anyone who doesn’t agree is essentially a moron. Here’s one 2008 link from the Sequences where he expresses this position^[1]; there are probably many other places where he’s said similar things. (To be clear, I don’t know if he still holds this belief, and if he doesn’t anymore, when and why he updated away from it.)
  Many Worlds is definitely a viable and even leading interpretation, and may well be correct. But Yudkowsky’s confidence in Many Worlds, as well as his conviction that people who disagree with him are making elementary mistakes, is more than a little disproportionate, and may come partly from a lack of knowledge and expertise.
  The above is a paraphrase of Scott Aaronson, a credible authority on quantum mechanics who is sympathetic to both Yudkowsky and Many Worlds (bold added):
  I think Yudkowsky’s central argument—basically, that anyone who rejects [Many Worlds] needs to have their head examined—is to put it mildly, a bit overstated. :) I’ll resist the temptation to elaborate, since this is really a discussion for another thread.
  In several posts, Yudkowsky gives indications that he doesn’t really understand the concept of mixed states. (For example, he writes about the No-Communication Theorem as something complicated and mysterious, which it’s not from a density-matrix perspective.) As I see it, this might be part of the reason why Yudkowsky sees anything besides Many-Worlds as insanity, and can’t understand what (besides sheep-like conformity) would drive any knowledgeable physicist to any other point of view. If I didn’t know that in real life, people pretty much never encounter pure states, but only more general objects that (to paraphrase Jaynes) scramble together “subjective” probabilities and “objective” amplitudes into a single omelette, the view that quantum states are “states of knowledge” that “live in the mind, not in the world” would probably also strike me as meaningless nonsense.
  While this isn’t directly related to AI risk, I think it’s relevant to Yudkowsky’s track record as a public intellectual.
  1. ^
    He expresses this in the last six paragraphs of the post. I’m excerpting some of it (bold added, italics were present in the original):
    Many-worlds is an obvious fact, if you have all your marbles lined up correctly (understand very basic quantum physics, know the formal probability theory of Occam’s Razor, understand Special Relativity, etc.) It is in fact considerably more obvious to me than the proposition that spinning black holes should obey conservation of angular momentum.
    ...
    So let me state then, very clearly, on behalf of any and all physicists out there who dare not say it themselves: Many-worlds wins outright given our current state of evidence. There is no more reason to postulate a single Earth, than there is to postulate that two colliding top quarks would decay in a way that violates Conservation of Energy. It takes more than an unknown fundamental law; it takes magic.
    The debate should already be over. It should have been over fifty years ago. The state of evidence is too lopsided to justify further argument. There is no balance in this issue. There is no rational controversy to teach. The laws of probability theory are laws, not suggestions; there is no flexibility in the best guess given this evidence. Our children will look back at the fact that we were still arguing about this in the early twenty-first century, and correctly deduce that we were nuts.
    We have embarrassed our Earth long enough by failing to see the obvious. So for the honor of my Earth, I write as if the existence of many-worlds were an established fact, because it is. The only question now is how long it will take for the people of this world to update.
  - Steven Byrnes 5 Jul 2022 17:34 UTC
    16 points
    0 ∶ 0
    Parent
    OTOH, I am (or I guess was?) a professional physicist, and when I read Rationality A-Z, I found that Yudkowsky was always reaching exactly the same conclusions as me whenever he talked about physics, including areas where (IMO) the physics literature itself is a mess—not only interpretations of QM, but also how to think about entropy & the 2nd law of thermodynamics, and, umm, I thought there was a third thing too but I forget.
    That increased my respect for him quite a bit.
    And who the heck am I? Granted, I can’t out-credential Scott Aaronson in QM. But FWIW, hmm let’s see, I had the highest physics GPA in my Harvard undergrad class and got the highest preliminary-exam score in my UC Berkeley physics grad school class, and I’ve played a major role in designing I think 5 different atomic interferometers (including an atomic clock) for various different applications, and in particular I was always in charge of all the QM calculations related to estimating their performance, and also I once did a semester-long (unpublished) research project on quantum computing with superconducting qubits, and also I have made lots of neat wikipedia QM diagrams and explanations including a pedagogical introduction to density matrices and mixed states.
    I don’t recall feeling strongly that literally every word Yudkowsky wrote about physics was correct, more like “he basically figured out the right idea, despite not being a physicist, even in areas where physicists who are devoting their career to that particular topic are all over the place”. In particular, I don’t remember exactly what Yudkowsky wrote about the no-communication theorem. But I for one absolutely understand mixed states, and that doesn’t prevent me from being a pro-MWI extremist like Yudkowsky.
    - anonymous_ea 5 Jul 2022 21:19 UTC
      2 points
      0 ∶ 0
      Parent
      I agree that: Yudkowsky has an impressive understanding of physics for a layman, in some situations his understanding is on par with or exceeds some experts, and he has written explanations of technical topics that even some experts like and find impressive. This includes not just you, but also e.g. Scott Aaronson, who praised his series on QM in the same answer I excerpted above, calling it entertaining, enjoyable, and getting the technical stuff mostly right. He also praised it for its conceptual goals. I don’t believe this is faint praise, especially given stereotypes of amateurs writing about physics. This is a positive part of Yudkowsky’s track record. I think my comment sounds more negative about Yudkowsky’s QM sequence than it deserves, so thanks for pushing back on that.
      I’m not sure what you mean when you call yourself a pro-MWI extremist but in any case AFAIK there are physicists, including one or more prominent ones, who think MWI is really the only explanation that makes sense, although there are obviously degrees in how fervently one can hold this position and Yudkowsky seems at the extreme end of the scale in some of his writings. And he is far from the only one who thinks Copenhagen is ridiculous. These two parts of Yudkowsky’s position on MWI are not without parallel within professional physicists, and the point about Copenhagen being ridiculous is probably a point in his favor from most views (e.g. Nobel laureate Murray Gell-Mann said that Neils Bohr brainwashed people into Copenhagen), let alone this community. Perhaps I should have clarified this in my comment, although I did say that MWI is a leading interpretation and may well be correct.
      The negative aspects I said in my comment were:
      Yudkowsky’s confidence in MWI is disproportionate
      Yudkowsky’s conviction that people who disagree with him are making elementary mistakes is disproportionate
      These may come partly from a lack of knowledge or expertise
      Maybe (3) is a little unfair, or sounds harsher than I meant it. It’s a bit unclear to me how seriously to take Aaronson’s quote. It seems like plenty of physicists have looked through the sequences to find glaring flaws, and basically found none (physics stackexchange). This is a nontrivial achievement in context. At the same time I expect most of the scrutiny has been to a relatively shallow level, partly because Yudkowsky is a polarizing writer. Aaronson is probably one of fairly few people who have deep technical expertise and have read the sequences with both enjoyment and a critical eye. Aaronson suggested a specific, technical flaw that may be partly responsible for Yudkowsky holding an extreme position with overconfidence and misunderstanding what people who disagree with him think. Probably this is a flaw Yudkowsky would not have made if he had worked with a professional physicist or something. But maybe Aaronson was just casually speculating and maybe this doesn’t matter too much. I don’t know. Possibly you are right to push back on the mixed states explanation.
      I think (1) and (2) are well worth considering though. The argument here is not that his position is necessarily wrong or impossible, but that it is overconfident. I am not courageous enough to argue for this position to a physicist who holds some kind of extreme pro-MWI view, but I think this is a reasonable view and there’s a good chance (1) and (2) are correct. It also fits in Ben’s point 4 in the comment above: “Yudkowsky’s track record suggests a substantial bias toward dramatic and overconfident predictions.”
      What links here?
      anonymous_ea's comment on On Deference and Yudkowsky’s AI Risk Estimates by bmg (5 Jul 2022 22:38 UTC; 3 points)
      - Steven Byrnes 6 Jul 2022 3:25 UTC
        17 points
        1 ∶ 0
        Parent
        Hmm, I’m a bit confused where you’re coming from.
        Suppose that the majority of eminent mathematicians believe 5+5=10, but a significant minority believes 5+5=11. Also, out of the people in the 5+5=10 camp, some say “5+5=10 and anyone who says otherwise is just totally wrong”, whereas other people said “I happen to believe that the balance of evidence is that 5+5=10, but my esteemed colleagues are reasonable people and have come to a different conclusion, so we 5+5=10 advocates should approach the issue with appropriate humility, not overconfidence.”
        In this case, the fact of the matter is that 5+5=10. So in terms of who gets the most credit added to their track-record, the ranking is:
        1st place: The ones who say “5+5=10 and anyone who says otherwise is just totally wrong”,
        2nd place: The ones who say “I think 5+5=10, but one should be humble, not overconfident”,
        3rd place: The ones who say “I think 5+5=11, but one should be humble, not overconfident”,
        Last place: The ones who say “5+5=11 and anyone who says otherwise is just totally wrong.
        Agree so far?
        (See also: Bayes’s theorem, Brier score, etc.)
        Back to the issue here. Yudkowsky is claiming “MWI, and anyone who says otherwise is a just totally wrong”. (And I agree—that’s what I meant when I called myself a pro-MWI extremist.)
        IF the fact of the matter is that careful thought shows MWI to be unambiguously correct, then Yudkowsky (and I) get more credit for being more confident. Basically, he’s going all in and betting his reputation on MWI being right, and (in this scenario) he won the bet.
        Conversely, IF the fact of the matter is that careful thought shows MWI to be not unambiguously correct, then Eliezer loses the maximum number of points. He staked his reputation on MWI being right, and (in this scenario) he lost the bet.
        So that’s my model, and in my model “overconfidence” per se is not really a thing in this context. Instead we first have to take a stand on the object-level controversy. I happen to agree with Eliezer that careful thought shows MWI to be unambiguously correct, and given that, the more extreme his confidence in this (IMO correct) claim, the more credit he deserves.
        I’m trying to make sense of why you’re bringing up “overconfidence” here. The only thing I can think of is that you think that maybe there is simply not enough information to figure out whether MWI is right or wrong (not even for even an ideal reasoner with a brain the size of Jupiter and a billion years to ponder the topic), and therefore saying “MWI is unambiguously correct” is “overconfident”? If that’s what you’re thinking, then my reply is: if “not enough information” were the actual fact of the matter about MWI, then we should criticize Yudkowsky first and foremost for being wrong, not for being overconfident.
        As for your point (2), I forget what mistakes Yudkowsky claimed that anti-MWI-advocates are making, and in particular whether he thought those mistakes were “elementary”. I am open-minded to the possibility that Yudkowsky was straw-manning the MWI critics, and that they are wrong for more interesting and subtle reasons than he gives them credit for, and in particular that he wouldn’t pass an anti-MWI ITT. (For my part, I’ve tried harder, see e.g. here.) But that’s a different topic. FWIW I don’t think of Yudkowsky as having a strong ability to explain people’s wrong opinions in a sympathetic and ITT-passing way, or if he does have that ability, then I find that he chooses not to exercise it too much in his writings. :-P
        RobBensinger 6 Jul 2022 22:28 UTC
        6 points
        0 ∶ 0
        Parent
        I happen to agree with Eliezer that careful thought shows MWI to be unambiguously correct, and given that, the more extreme his confidence in this (IMO correct) claim, the more credit he deserves.
        ‘The more probability someone assigns to a claim, the more credit they get when the claim turns out to be true’ is true as a matter of Bayesian math. And I agree with you that MWI is true, and that we have enough evidence to say it’s true with very high confidence, if by ‘MWI’ we just mean a conjunction like “Objective collapse is false.” and “Quantum non-realism is false / the entire complex amplitude is in some important sense real”.
        (I think Eliezer had a conjunction like this in mind when he talked about ‘MWI’ in the Sequences; he wasn’t claiming that decoherence explains the Born rule, and he certainly wasn’t claiming that we need to reify ‘worlds’ as a fundamental thing. I think a better term for MWI might be the ‘Much World Interpretation’, since the basic point is about how much stuff there is, not about a division of that stuff into discrete ‘worlds’.)
        That said, I have no objection in principle to someone saying ‘Eliezer was right about MWI (and gets more points insofar as he was correct), but I also dock him more points than he gained because I think he was massively overconfident’.
        E.g., imagine someone who assigns probability 1 (or probability .999999999) to a coin flip coming up heads. If the coin then comes up heads, then I’m going to either assume they were trolling me, or I’m going to infer that they’re very bad at reasoning. Even if they somehow rigged the coin, .999999999 is just too extreme a probability to be justified here.
        By the same logic, if Eliezer had said that MWI is true with probability 1, or if he’d put too many ‘9s’ at the end of his .99… probability assignment, then I’d probably dock him more points than he gained for being object-level-correct. (Or I’d at least assume he has a terrible understanding of how Bayesian probability works. Someone could indeed be very miscalibrated and bad at talking in probabilistic terms, and yet be very knowledgeable and correct on object-level questions like MWI.)
        I’m not sure exactly how many 9s is too many in the case of MWI, but it’s obviously possible to have too many 9s here. E.g., a hundred 9s would be too many! So I think this objection can make sense; I just don’t think Eliezer is in fact overconfident about MWI.
        Steven Byrnes 6 Jul 2022 22:36 UTC
        3 points
        0 ∶ 0
        Parent
        Fair enough, thanks.
        anonymous_ea 6 Jul 2022 20:44 UTC
        6 points
        0 ∶ 0
        Parent
        I’m trying to make sense of why you’re bringing up “overconfidence” here. The only thing I can think of is that you think that maybe there is simply not enough information to figure out whether MWI is right or wrong (not even for even an ideal reasoner with a brain the size of Jupiter and a billion years to ponder the topic), and therefore saying “MWI is unambiguously correct” is “overconfident”?
        Here’s my point: There is a rational limit to the amount of confidence one can have in MWI (or any belief). I don’t know where exactly this limit is for MWI-extremism but Yudkowsky clearly exceeded it sometimes. To use made up numbers, suppose:
        MWI is objectively correct
        Eliezer says P(MWI is correct) = 0.9999999
        But rationally one can only reach P(MWI) = 0.999
        Because there are remaining uncertainties that cannot be eliminated through superior thinking and careful consideration, such lack of experimental evidence, the possibility of QM getting overturned, the possibility of a new and better interpretation in the future, and unknown unknowns.
        These factors add up to at least P(Not MWI) = 0.001.
        Then even though Eliezer is correct about MWI being correct, he is still significantly overconfident in his belief about it.
        Consider Paul’s example of Eliezer saying MWI is comparable to heliocentrism:
        If we are deeply wrong about physics, then I [Paul Christiano] think this could go either way. And it still seems quite plausible that we are deeply wrong about physics in one way or another (even if not in any particular way). So I think it’s wrong to compare many-worlds to heliocentrism (as Eliezer has done). Heliocentrism is extraordinarily likely even if we are completely wrong about physics—direct observation of the solar system really is a much stronger form of evidence than a priori reasoning about the existence of other worlds.
        I agree with Paul here. Heliocentrism is vastly more likely than any particular interpretation of quantum mechanics, and Eliezer was wrong to have made this comparison.
        This may sound like I’m nitpicking, but I think it fits into a pattern of Eliezer making dramatic and overconfident pronouncements, and it’s relevant information for people to consider e.g. when evaluating Eliezer’s belief that p(doom) = ~1 and the AI safety situation is so hopeless that the only thing left is to die with slightly more dignity.
        Of course, it’s far from the only relevant data point.
        Regarding (2), I think we’re on the same page haha.
        RobBensinger 6 Jul 2022 22:54 UTC
        8 points
        0 ∶ 0
        Parent
        Could someone point to the actual quotes where Eliezer compares heliocentrism to MWI? I don’t generally assume that when people are ‘comparing’ two very-high-probability things, they’re saying they have the same probability. Among other things, I’d want confirmation that ‘Eliezer and Paul assign roughly the same probability to MWI, but they have different probability thresholds for comparing things to heliocentrism’ is false.
        E.g., if I compare Flat Earther beliefs, beliefs in psychic powers, belief ‘AGI was secretly invented in the year 2000’, geocentrism, homeopathy, and theism to each other, it doesn’t follow that I’d assign the same probabilities to all of those six claims, or even probabilities that are within six orders of magnitude of each other.
        In some contexts it might indeed Griceanly imply that all six of those things pass my threshold for ‘unlikely enough that I’m happy to call them all laughably silly views’, but different people have their threshold for that kind of thing in different places.
        Steven Byrnes 7 Jul 2022 0:19 UTC
        3 points
        0 ∶ 0
        Parent
        Gotcha, thanks. I guess we have an object-level disagreement: I think that careful thought reveals MWI to be unambiguously correct, with enough 9’s as to justify Eliezer’s tone. And you don’t. ¯\_(ツ)_/¯
        (Of course, this is bound to be a judgment call; e.g. Eliezer didn’t state how many 9’s of confidence he has. It’s not like there’s a universal convention for how many 9’s are enough 9’s to state something as a fact without hedging, or how many 9’s are enough 9’s to mock the people who disagree with you.)
        anonymous_ea 8 Jul 2022 22:47 UTC
        6 points
        0 ∶ 0
        Parent
        (Of course, this is bound to be a judgment call; e.g. Eliezer didn’t state how many 9’s of confidence he has. It’s not like there’s a universal convention for how many 9’s are enough 9’s to state something as a fact without hedging, or how many 9’s are enough 9’s to mock the people who disagree with you.)
        Yes, agreed.
        Let me lay out my thinking in more detail. I mean this to explain my views in more detail, not as an attempt to persuade.
        Paul’s account of Aaronson’s view says that Eliezer shouldn’t be as confident in MWI as he is, which in words sounds exactly like my point, and similar to Aaronson’s stack exchange answer. But it still leaves open the question of how overconfident he was, and what, if anything, should be taken away from this. It’s possible that there’s a version of my point which is true but is also uninteresting or trivial (who cares if Yudkowsky was 10% too confident about MWI 15 years ago?).
        And it’s worth reiterating that a lot of people give Eliezer credit for his writing on QM, including for being forceful in his views. I have no desire to argue against this. I had hoped to sidestep discussing this entirely since I consider it to be a separate point, but perhaps this was unfair and led to miscommunication. If someone wants to write a detailed comment/post explaining why Yudkowsky deserves a lot of credit for his QM writing, including credit for how forceful he was at times, I would be happy to read it and would likely upvote/strong upvote it depending on quality.
        However, here my intention was to focus on the overconfidence aspect.
        I’ll explain what I see as the epistemic mistakes Eliezer likely made to end up in an overconfident state. Why do I think Eliezer was overconfident on MWI?
        (Some of the following may be wrong.)
        He didn’t understand non-MWI-extremist views, which should have rationally limited his confidence
        I don’t have sources for this, but I think something like this is true.
        This was an avoidable mistake
        Worth noting that Eliezer has updated towards the competence of elites in science since some of his early writing according to Rob’s comment elsewhere this thread
        It’s possible that his technical understanding was uneven. This should also have limited his confidence.
        Aaronson praised him for “actually get most of the technical stuff right”, which of course implies that not everything technical was correct.
        He also suggested a specific, technical flaw in Yudkowsky’s understanding.
        One big problem with having extreme conclusions based on uneven technical understanding is that you don’t know what you don’t know. And in fact Aaronson suggests a mistake Yudkowsky seems unaware of as a reason why Yudkowsky’s central argument is overstated/why Yudkowsky is overconfident about MWI.
        However, it’s unclear how true/important a point this really is
        At least 4 points limit confidence in P(MWI) to some degree:
        Lack of experimental evidence
        The possibility of QM getting overturned
        The possibility of a new and better interpretation in the future
        Unknown unknowns
        I believe most or all of these are valid, commonly brought up points that together limit how confident anyone can be in P(MWI). Reasonable people may disagree with their weighting of course.
        I am skeptical that Eliezer correctly accounted for these factors
        Note that these are all points about the epistemic position Eliezer was in, not about the correctness of MWI. The first two are particular to him, and the last one applies to everyone.
        Now, Rob points out that maybe the heliocentrism example is lacking context in some way (I find it a very compelling example of a super overconfident mistake if it’s not). Personally I think there are at least a couple^[1] ^[2] of places in the sequences where Yudkowsky clearly says something that I think indicates ridiculous overconfidence tied to epistemic mistakes, but to be honest I’m not excited to argue about whether some of his language 15 years ago was or wasn’t overzealous.
        The reason I brought this up despite it being a pretty minor point is because I think it’s part of a general pattern of Eliezer being overconfident in his views and overstating them. I am curious how much people actually disagree with this.
        Of course, whether Eliezer has a tendency to be overconfident and overstate his views is only one small data point among very many others in evaluating p(doom), the value of listening to Eliezer’s views, etc.
        ^
        “Many-worlds is an obvious fact, if you have all your marbles lined up correctly (understand very basic quantum physics, know the formal probability theory of Occam’s Razor, understand Special Relativity, etc.)”
        ^
        “The only question now is how long it will take for the people of this world to update.” Both quotes from https://www.lesswrong.com/s/Kqs6GR7F5xziuSyGZ/p/S8ysHqeRGuySPttrS
        Steven Byrnes 10 Jul 2022 17:27 UTC
        8 points
        0 ∶ 0
        Parent
        For what it’s worth, consider the claim “The Judeo-Christian God, the one who listens to prayers and so on, doesn’t exist.” I have such high confidence in this claim that I would absolutely state it as a fact without hedging, and psychoanalyze people for how they came to disagree with me. Yet there’s a massive theology literature arguing to the contrary of that claim, including by some very smart and thoughtful people, and I’ve read essentially none of this theology literature, and if you asked me to do an anti-atheism ITT I would flunk it catastrophically.
        I’m not sure what lesson you’ll take from that; for all I know you yourself are very religious, and this anecdote will convince you that I have terrible judgment. But if you happen to be on the same page as me, then maybe this would be an illustration of the fact that (I claim) one can rationally and correctly arrive at extremely-confident beliefs without it needing to pass through a deep understanding and engagement with the perspectives of the people who disagree with you.
        I agree that this isn’t too important a conversation, it’s just kinda interesting. :)
        Paul_Christiano 9 Jul 2022 1:56 UTC
        6 points
        0 ∶ 0
        Parent
        I’m not sure either of the quotes you cited by Eliezer require or suggest ridiculous overconfidence.
        If I’ve seen some photos of a tiger in town, and I know a bunch of people in town who got eaten by an animal, and we’ve all seen some apparent tiger-prints near where people got eaten, I may well say “it’s obvious there is a tiger in town eating people.” If people used to think it was a bear, but that belief was formed based on priors when we didn’t yet have any hard evidence about the tiger, I may be frustrated with people who haven’t yet updated. I may say “The only question is how quickly people’s views shift from bear to tiger. Those who haven’t already shifted seem like they are systematically slow on the draw and we should learn from their mistakes.” I don’t think any of those statements imply I think there’s a 99.9% chance that it’s a tiger. It’s more a statement rejecting the reasons why people think there is a bear, and disagreeing with those reasons, and expecting their views to predictably change over time. But I could say all that while still acknowledging some chance that the tiger is a hoax, that there is a new species of animal that’s kind of like a tiger, that the animal we saw in photos is different from the one that’s eating people, or whatever else. The exact smallness of the probability of “actually it wasn’t the tiger after all” is not central to my claim that it’s obvious or that people will come around.
        I don’t think it’s central to this point, but I think 99% is a defensible estimate for many-worlds. I would probably go somewhat lower but certainly wouldn’t run victory laps about that or treat it as damning of someone’s character. The above is mostly a bad analogy explaining why I think it’s pretty reasonable to say things like Eliezer did even if your all-things-considered confidence was 99% or even lower.
        To get a sense for what Eliezer finds frustrating and intends to critique, you can read If many-worlds had come first (which I find quite obnoxious). I think to the extent that he’s wrong it’s generally by mischaracterizing the alternative position and being obnoxious about it (e.g. misunderstanding the extent to which collapse is proposed as ontologically fundamental rather than an expression of agnosticism or a framework for talking about experiments, and by slightly misunderstanding what “ontologically fundamental collapse” would actually mean). I don’t think it has much to do with overconfidence directly, or speaks to the quality of Eliezer’s reasoning about the physical world, though I think it is a bad recurring theme in Eliezer’s reasoning about and relationships with other humans. And in fairness I do think there are a lot of people who probably deserve Eliezer’s frustration on this point (e.g. who talk about how collapse is an important and poorly-understood phenomenon rather than most likely just being the most boring thing) though I mostly haven’t talked with them and I think they are systematically more mediocre physicists.
      - TAG 30 Aug 2023 14:05 UTC
        1 point
        0 ∶ 0
        Parent
        “Maybe (3) is a little unfair, or sounds harsher than I meant it. It’s a bit unclear to me how seriously to take Aaronson’s quote. It seems like plenty of physicists have looked through the sequences to find glaring flaws, and basically found none (physics stackexchange). T”
        Here’s a couple: he conflates Copenhagen and Objective collapse throughout.
        He fails to distinguish Everettian and Decoherence based MWI.
  - Paul_Christiano 5 Jul 2022 21:02 UTC
    9 points
    0 ∶ 0
    Parent
    This doesn’t feel like a track record claim to me. Nothing has changed since Eliezer wrote that; it reads as reasonably now as it did then; and we have nothing objective against which to evaluate it.
    I broadly agree with Eliezer that (i) collapse seems unlikely, (ii) if the world is governed by QM as we understand it, the whole state is probably as “real” as we are, (iii) there seems to be nothing to favor the alternative interpretations other than those that make fewer claims and are therefore more robust to unknown-unknowns. So if anything I’d be inclined to give him a bit of credit on this one, given that it seems to have held up fine for readers who know much more about quantum mechanics than he did when writing the sequence.
    The main way the sequence felt misleading was by moderately overstating how contrarian this take was. For example, near the end of my PhD I was talking with Scott Aaronson and my advisor Umesh Vazirani, who I considered not-very-sympathetic to many worlds. When asked why, my recollection of his objection was “What are these ‘worlds’ that people are talking about? There’s just the state.” That is, the whole issue turned on a (reasonable) semantic objection.
    However, I do think Eliezer is right that in some parts of physics collapse is still taken very seriously and there are more-than-semantic disagreements. For example, I was pretty surprised by David Griffiths’ discussion of collapse in the afterword of his textbook (pdf) during undergrad. I think that Eliezer is probably right that some of these are coming from a pretty confused place. I think the actual situation with respect to consensus is a bit muddled, and e.g. I would be fairly surprised if Eliezer was able to make a better prediction about the result of any possible experiment than the physics community based on his confidence in many-worlds. But I also think that a naive-Paul perspective of “no way anyone is as confused as Eliezer is saying” would have been equally-unreasonable.
    I agree that Eliezer is overconfident about the existence of the part of the wavefunction we never see. If we are deeply wrong about physics, then I think this could go either way. And it still seems quite plausible that we are deeply wrong about physics in one way or another (even if not in any particular way). So I think it’s wrong to compare many-worlds to heliocentrism (as Eliezer has done). Heliocentrism is extraordinarily likely even if we are completely wrong about physics—direct observation of the solar system really is a much stronger form of evidence than a priori reasoning about the existence of other worlds. Similarly, I think it’s wrong to compare many-worlds to a particular arbitrary violation of conservation of energy when top quarks collide, rather than something more like “there is a subtle way in which our thinking about conservation of energy is mistaken and the concept either doesn’t apply or is only approximately true.” (It sounds reasonable to compare it to the claim that spinning black holes obey conservation of angular momentum, at least if you don’t yet made any astronomical observations that back up that claim.)
    My understanding is this is the basic substance of Eliezer’s disagreement with Scott Aaronson. My vague understanding of Scott’s view (from one conversation with Scott and Eliezer about this ~10 years ago) is roughly “Many worlds is a strong prediction of our existing theories which is intuitively wild and mostly-experimentally-unconfirmed. Probably true, and would be ~the most interesting physics result ever if false, but still seems good to test and you shouldn’t be as confident as you are about heliocentrism.”
    What links here?
    anonymous_ea's comment on On Deference and Yudkowsky’s AI Risk Estimates by bmg (6 Jul 2022 20:44 UTC; 6 points)
    - anonymous_ea 5 Jul 2022 22:38 UTC
      3 points
      0 ∶ 0
      Parent
      When I said it was relevant to his track record as a public intellectual, I was referring to his tendency to make dramatic and overconfident pronouncements (which Ben mentioned in the parent comment). I wasn’t intending to imply that the debate around QM had been settled or that new information had come out. I do think that even at the time Eliezer’s positions on both MWI and why people disagreed with him on it were overconfident though.
      I think you’re right that my comment gave too little credit to Eliezer, and possibly misleadingly implied that Eliezer is the only one who holds some kind of extreme MWI or anti-collapse view or that such views are not or cannot be reasonable (especially anti-collapse). I said that MWI is a leading candidate but that’s still probably underselling how many super pro-MWI positions there are. I expanded on this in another comment.
      Your story of Eliezer comparing MWI to heliocentrism is a central example of what I’m talking about. It is not that his underlying position is wrong or even unlikely, but that he is significantly overconfident.
      I think this is relevant information for people trying to understand Eliezer’s recent writings.
      To be clear, I don’t think it’s a particularly important example, and there is a lot of other more important information than whether Eliezer overestimated the case for MWI to some degree while also displaying impressive understanding of physics and possibly/probably being right about MWI.