Changing your working to fit the answer
I wrote this last Autumn as a private “blog post” shared only with a few colleagues. I’m posting it publicly now (after mild editing) because I have some vague idea that it can be good to make things like this public. It is quite rambling and doesn’t really have a clear point (but I think it’s at least an interesting topic).
Say you want to come up with a model for AI timelines, i.e. the probability of transformative AI being developed by year X for various values of X. You put in your assumptions (beliefs about the world), come up with a framework for combining them, and get an answer out. But then you’re not happy with the answer—your framework must have been flawed, or maybe on reflection one of your assumptions needs a bit of revision. So you fiddle with one or two things and get another answer—now it looks much better, close enough to your prior belief that it seems plausible, but not so close that it seems suspicious.
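To make that loop concrete, here is a toy sketch (my own illustration, with a deliberately crude model and entirely made-up numbers, not any real timelines framework): a couple of "assumptions" get combined into a probability of transformative AI by year X, the answer looks off, and the temptation is to go back and nudge an assumption.

```python
# Toy sketch of the "build a model, dislike the answer, fiddle" loop.
# The model and all numbers are made up purely for illustration.
from statistics import NormalDist

def p_tai_by(year: float, median_arrival: float, spread_years: float) -> float:
    """Crude model: arrival year ~ Normal(median_arrival, spread_years)."""
    return NormalDist(mu=median_arrival, sigma=spread_years).cdf(year)

# First pass: the assumptions give an answer that feels too low...
print(p_tai_by(2050, median_arrival=2070, spread_years=15))  # ~0.09

# ...so revise one assumption "on reflection" and look again.
print(p_tai_by(2050, median_arrival=2060, spread_years=15))  # ~0.25
```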
Is this kind of procedure valid? Here’s one case where the answer seems to be yes: if your conclusions are logically impossible, you know that either there’s a flaw in your framework or you need to revise your assumptions (or both).
A closely related case is where the conclusion is logically possible, but extremely unlikely. It seems like there’s a lot of pressure to revise something then too.
But in the right context revising your model in this way can look pretty dodgy. It seems like you’re “doing things the wrong way round”—what was the point of building the model if you were going to fiddle with the assumptions until you got the answer you expected anyway?
I think this is connected to a lot of related issues / concepts:
Model building
Option pricing models in finance: you start (both historically and conceptually) with the nice clean Black-Scholes model, which fails to match the option prices actually observed in the market. So various assumptions are relaxed or modified, adding (arguably somewhat ad hoc) complexity until, for the right set of parameters, the model reproduces all the (sufficiently important) observed option prices (see the sketch after this list).
Regularisation / overfitting in ML: you might think of overfitting as “placing too much weight on getting the answer you expect”.
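As a concrete sketch of the option-pricing item above (my own illustration, with made-up prices and parameters): the textbook Black-Scholes formula assumes a single volatility, and the standard way to make it "get the observed prices right" is to back out whichever volatility reproduces each observed price, so the parameter quietly bends to fit the answer.

```python
# Black-Scholes call price plus an implied-volatility solver: the "fit the
# answer" step is choosing whichever sigma makes the model match each
# observed price. All inputs below are made-up illustrative numbers.
from math import log, sqrt, exp, erf

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call option."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(observed_price: float, S: float, K: float, T: float, r: float) -> float:
    """Bisect for the sigma at which the model reproduces the observed price."""
    lo, hi = 1e-6, 5.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < observed_price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The clean model assumes one sigma for every strike; fitting real prices
# typically needs a different sigma per strike (the volatility smile).
print(implied_vol(10.5, S=100, K=100, T=1.0, r=0.02))
print(implied_vol(5.0, S=100, K=120, T=1.0, r=0.02))
```

Stochastic-volatility and jump models then add structure precisely to explain why those per-strike sigmas differ, which is the "relaxing assumptions, adding complexity" move in the item above.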
Arguments
“One person’s modus ponens is another’s modus tollens”: if we’re presented with a logical argument, usually the person presenting it wants us to accept the premises and agree that the argument is valid, in which case we must accept the conclusion. If we don’t like the conclusion, we often focus on showing that the argument is invalid. But if you think the conclusion is very unlikely, you also have the option of acknowledging the argument as valid, but rejecting one of the premises. There are lots of fun examples of this from science and philosophy on Gwern’s page on the subject.
“Begging the question”: a related accusation in philosophy that seems to mean roughly “your conclusion follows trivially from your premises but I reject one of your premises (and by the way it should have been obvious that I’d reject one of your premises so it was a waste of both my time and yours that you made this argument)”
Reductio ad absurdum: disprove something by using it as an assumption that leads to an implausible (or maybe logically impossible) conclusion
“Proving too much”: an accusation in philosophy that is supposed to count against the argument doing the “proving”.
(Not) updating your beliefs from an argument that appears convincing on the face of it: if the conclusions are implausible enough, you might not update your beliefs too much the first time you encounter the argument, even if it appears watertight.
Research methods
Sanity checking your answer: check that the results of a complex calculation or experiment roughly match the result you get from a quick and crude approach.
Presumably, you could put this question of whether and how much to modify your model into some kind of formal Bayesian framework where on learning a new argument you update all your beliefs based on your prior beliefs in the premises, conclusion, and validity of the argument. I’m not sure whether there’s a literature on this, or whether e.g. highly skilled forecasters actually think like this.
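As a minimal sketch of what such a framework might look like (my own toy construction, not something taken from the forecasting literature): treat "the argument is valid" as the claim that the premise materially implies the conclusion, condition on it, and whether the conclusion gets pulled up or the premise gets pulled down depends entirely on your priors, which is the modus ponens / modus tollens point from earlier.

```python
# Toy Bayesian treatment of "I just heard a valid-looking argument":
# condition on the material implication (premise -> conclusion) and see
# which belief moves. Independent priors are assumed for simplicity.

def update_on_valid_argument(p_premise: float, p_conclusion: float):
    """Posterior (P(premise), P(conclusion)) after ruling out the
    premise-true / conclusion-false worlds and renormalising."""
    worlds = {
        (True, True): p_premise * p_conclusion,
        (True, False): 0.0,  # excluded by accepting the argument as valid
        (False, True): (1 - p_premise) * p_conclusion,
        (False, False): (1 - p_premise) * (1 - p_conclusion),
    }
    z = sum(worlds.values())
    post_premise = worlds[(True, True)] / z
    post_conclusion = (worlds[(True, True)] + worlds[(False, True)]) / z
    return post_premise, post_conclusion

# Plausible premise, so-so conclusion: the conclusion gets pulled up.
print(update_on_valid_argument(0.9, 0.5))    # ~ (0.82, 0.91)
# Very implausible conclusion: the premise gets pulled down instead.
print(update_on_valid_argument(0.5, 0.01))   # ~ (0.01, 0.02)
```

This leaves out updating on the argument's validity itself, and on there being several premises rather than one, which is part of why it's unclear (to me) whether skilled forecasters actually think in these terms.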
In general though, it seems (to me) that there’s something important about “following where the assumptions / model takes you”. Maybe, given all the ways we fall short of being perfectly rational, we should (and I think that in fact we do) put more emphasis on this than a perfectly rational Bayesian agent would. Avoiding having a very strong prior on the conclusion seems helpful here.