The part of this post which seems most wild to me is the leap from “mixed track record” to
In particular, I think they shouldn’t defer to him more than they would defer to anyone else who seems smart and has spent a reasonable amount of time thinking about AI risk.
For any reasonable interpretation of this sentence, it’s transparently false. Yudkowsky has proven to be one of the best few thinkers in the world on a very difficult topic. Insofar as there are others who you couldn’t write a similar “mixed track record” post about, it’s almost entirely because they don’t have a track record of making any big claims, in large part because they weren’t able to generate the relevant early insights themselves. Breaking ground in novel domains is very, very different from forecasting the weather or events next year; a mixed track record is the price of entry.
I disagree that the sentence is false for the interpretation I have in mind.
I think it’s really important to separate out the question “Is Yudkowsky an unusually innovative thinker?” and the question “Is Yudkowsky someone whose credences you should give an unusual amount of weight to?”
I read your comment as arguing for the former, which I don’t disagree with. But that doesn’t mean that people should currently weigh his risk estimates more highly than they weigh the estimates of other researchers currently in the space (like you).
I also think there’s a good case to be made that Yudkowsky tends to be overconfident, and that this should be taken into account when deferring. But when it comes to making big-picture forecasts, the main value of deference is in helping us decide which ideas and arguments to take seriously, rather than in the specific credences we should place on them, since the space of ideas is so large.
But we do also need to try to have well-calibrated credences, of course. For the reason given in the post, it’s important to know whether the risk of everyone dying soon is 5% or 99%. It’s not enough just to determine whether we should take AI risk seriously.
We’re also now past the point, as a community, where “Should AI risk be taken seriously?” is that much of a live question. The main epistemic question that matters is what probability we assign to it—and I think this post is relevant to that.
(More generally, rather than reading this post, I recommend people read this one by Paul Christiano, which outlines specific agreements and disagreements.)
I definitely recommend people read the post Paul just wrote! I think it’s overall more useful than this one.
But I don’t think there’s an either-or here. People—particularly non-experts in a domain—do and should form their views through a mixture of engaging with arguments and deferring to others. So both arguments and track records should be discussed.
The EA community has ended up strongly moving in Yudkowsky’s direction over the last decade, and that seems like much more compelling evidence than anything listed in this post.
I discuss this in response to another comment, here, but I’m not convinced of that point.
I phrased my reply strongly (e.g. telling people to read the other post instead of this one) because deference epistemology is intrinsically closely linked to status interactions, and you need to be pretty careful in order to make this kind of post not end up being, in effect, a one-dimensional “downweight this person”. I don’t think this post was anywhere near careful enough to avoid that effect. That seems particularly bad because I think most EAs should significantly upweight Yudkowsky’s views if they’re doing any kind of reasonable, careful deference, because most EAs significantly underweight how heavy-tailed the production of innovative ideas actually is (e.g. because of hindsight bias, it’s hard to realise how much worse than Eliezer we would have been at inventing the arguments for AI risk, and how many dumb things we would have said in his position).
By contrast, I think your post is implicitly using a model where we have a few existing, well-identified questions, and the most important thing is to just get to the best credences on those questions, and we should do so partly by just updating in the direction of experts. But I think this model of deference is rarely relevant; see my reply to Rohin for more details. Basically, as soon as we move beyond toy models of deference, the “innovative thinking” part becomes crucially important, and the “well-calibrated” part becomes much less so.
One last intuition: different people have different relationships between their personal credences and their all-things-considered credences. Inferring track records in the way you’ve done here will, in addition to favoring people who are quieter and say fewer useful things, also favor people who speak primarily based on their all-things-considered credences rather than their personal credences. But that leads to a vicious cycle where people are deferring to people who are deferring to people who… And then the people who actually do innovative thinking in public end up getting downweighted to oblivion via cherrypicked examples.
Modesty epistemology delenda est.