In considering whether incentives toward clarity (e.g., via being able to explain one’s work to potential funders) are likely to pull in good or bad directions, I think it’s important to distinguish between two different motions that might be used as a researcher (or research institution) responds to those incentives.
Motion A: Taking the research they were already doing, and putting a decent fraction of effort into figuring out how to explain it, figuring out how to get it onto firm foundations, etc.
Motion B: Choosing which research to do by thinking about which things will be easy to explain clearly afterward.
It seems to me that “attempts to be clear” in the sense of Motion A are indeed likely to be helpful, and are worth putting a significant fraction of one’s effort into. I agree also that they can be aversive and that this aversiveness (all else equal) may tend to cause underinvestment in them.
Motion B, however, strikes me as more of a mixed bag. There is merit in choosing which research to do by thinking about what will be explainable to other researchers, such that other researchers can build on it. But there is also merit to sometimes attempting research on the things that feel most valuable/tractable/central to a given researcher, without too much shame if it then takes years to get their research direction to be “clear”.
As a loose analogy, one might ask whether “incentives to not fail” have a good or bad effect on achievement. And it seems like a mixed bag. The good part (analogous to Motion A) is that, once one has chosen to devote hours/etc. to a project, it is good to try to get that project to succeed. The more mixed part (analogous to Motion B) is that “incentives to not fail” sometimes cause people to refrain from attempting ambitious projects at all. (Of course, it sometimes is worth not trying a particular project because its success-odds are too low — Motion B is not always wrong.)
I agree with all this. I read your original “attempts to be clear” as Motion A (which I was taking a stance in favour of), and your original “attempts to be explainable” as Motion B (which I wasn’t sure about).
Gotcha. Your phrasing distinction makes sense; I’ll adopt it. I agree now that I shouldn’t have included “clarity” in my sentence about “attempts to be clear/explainable/respectable”.
The thing that confused me is that it is hard to incentivize clarity but not explainability; the easiest observable is just “does the person’s research make sense to me?”, which one can then choose how to interpret, and how to incentivize.
It’s easy enough to invest in clarity / Motion A without investing in explainability / Motion B, though. My random personal guess is that MIRI invests about half of their total research effort into clarity (from what I see people doing around the office), but I’m not sure (and I could ask the researchers easily enough). Do you have a suspicion about whether MIRI over- or under-invests in Motion A?
My suspicion is that MIRI significantly underinvests/misinvests in Motion A, although of course this is a bit hard to assess from outside.
I think that they’re not that good at clearly explaining their thoughts, but that this is a learnable (and to some extent teachable) skill, and I’m not sure their researchers have put significant effort into trying to learn it.
I suspect that they don’t put enough time into trying to clearly explain the foundations for what they’re doing, relative to trying to clearly explain their new results (though I’m less confident about this, because so much is unobserved).
I think they also sometimes indulge in a motion where they write to try to persuade the reader that what they’re doing is the correct approach and helpful on the problem at hand, rather than trying to give the reader the best picture of the ways in which their work might or might not actually be applicable. I think at a first pass this is trying to substitute for Motion B, but it actively pushes against Motion A.
I’d like to see explanations which trend more towards:
Clearly separating out the motivation for the formalisation from the parts using the formalisation. Then these can be assessed separately. (I think they’ve got better at this recently.)
Putting their cards on the table and giving their true justification for different assumptions. In some cases this might be “slightly incoherent intuition”. If that’s what they have, that’s what they should write. This would make it easier for other people to evaluate, and to work out which bits to dive in on and try to shore up.