I think the history of maths also provides some suggestive examples of the dangers of requiring end-to-end stories. E.g., consider some famous open questions in ancient mathematics that were phrased in the language of geometric constructions with ruler and compass, such as whether it’s possible to ‘square the circle’. That question was settled roughly 2,000 years after it was posed, using modern number theory (the answer turned out to be no). But if you had insisted that everyone working on it have an end-to-end story for how what they’re doing contributes to solving that problem, I think there would have been a real risk that people would have kept thinking purely in ruler-and-compass terms and we would never have developed modern number theory in the first place.
I think you’re interpreting me to say that people ought to have an externally validated end-to-end story; I’m actually just saying that they should have an approach which they think might be useful, which is weaker.
Thanks, I think this is a useful clarification. I’m actually not sure I clearly distinguished these cases in my own thinking when I wrote my previous comments, but I agree that the thing you quoted is primarily relevant to cases where end-to-end stories would be externally validated. (By which I think you mean something like: they would lead to an ‘objective’ solution, e.g. a maths proof, if executed without major changes.)
The extent to which we agree depends on what counts as an end-to-end story. For example, consider someone working on ML transparency who claims their research is valuable for AI alignment. My guess is:
If literally everything they can say when queried is “I don’t know how transparency helps with AI alignment, I just saw the term in some list of relevant research directions”, then we’re both quite pessimistic about the value of that work.
If they say something like “I’ve made the deliberate decision not to focus on research for which I can fully argue it will be relevant to AI alignment right now. Instead, I just focus on understanding ML transparency as best as I can, because I think there are many scenarios in which understanding transparency will be beneficial.”, and they also say something showing they understand longtermist thought on AI risk, then I’m not necessarily pessimistic. I’d expect they won’t come up with their own research agenda in the next two years, but depending on the circumstances I might well be optimistic about that person’s impact over their whole career, and I wouldn’t necessarily recommend that they change their approach. I’m not sure what you’d think, but I initially read you as being pessimistic in such a case, and this was partly what I was reacting against.
If they give an end-to-end story for how their work fits within AI alignment, then, all else equal, I consider that a good sign. However, depending on the circumstances I might still think the best long-term strategy for that person is to postpone the direct pursuit of that end-to-end story and instead focus on targeted deliberate practice of some of the relevant skills, or at least to complement the direct pursuit with such deliberate practice. For example, if someone is very junior and their story says that mathematical logic is important for their work, I might recommend they grab a logic textbook and work through all the exercises. My guess is we disagree on such cases, but that the disagreement is a matter of degree; i.e. we both agree about extreme cases, but I’d more often recommend more substantial deliberate practice.