How does their approach (interacting cross-sectional potato suitability with time-series variation) compare with the recent shift-share literature? It looks like they’re not explicitly using instrumental variables.
Actually, it seems more related to the recent diff-in-diff literature, in particular, with a continuous treatment.
Also note that the Nunn & Qian food aid paper used a similar identification strategy; critique here.
I don’t think the recent diff-in-diff literature is a huge issue here. You’re computing a linear approximation, which might be bad if the actual effect isn’t linear in treatment, but that’s just the usual issue with linear regression. The main problem the recent diff-in-diff literature addresses is that terrible things can happen if a) effects are heterogeneous (probable here!) and b) treatment timing is staggered (I’m not especially concerned about that here, since the analysis is so coarse and assumes roughly similar timing for all units getting potatoes).
They try to establish something like a pre-trends analysis in Table II, but I agree that it would be helpful to have a lot more; an event-study-style plot would be nice. In general, diff-in-diff is a nice way to get information about really hard-to-answer questions, but I wouldn’t take the effect-size estimates too literally.
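To make the event-study suggestion concrete, here’s a minimal sketch on fake data (the data-generating process, variable names, and magnitudes are all made up, not taken from the paper): interact a continuous suitability measure with period dummies, using the last pre-treatment period as the baseline, and check that the pre-period interaction coefficients sit near zero while the post-period ones pick up the effect.

```python
# Hypothetical event-study sketch with a continuous treatment intensity.
# Everything here is simulated; none of it comes from the paper's data.
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods, adopt = 200, 10, 5    # treatment switches on at t = 5

suit = rng.uniform(0, 1, n_units)         # cross-sectional "suitability"
true_effect = 2.0                         # post-treatment effect per unit of suitability

rows, y = [], []
for i in range(n_units):
    alpha_i = rng.normal()                # unit fixed effect
    for t in range(n_periods):
        treat = true_effect * suit[i] if t >= adopt else 0.0
        y.append(alpha_i + 0.5 * t + treat + rng.normal(scale=0.1))
        # design row: unit dummies | time dummies | suit x period-dummy interactions
        row = np.zeros(n_units + 2 * n_periods)
        row[i] = 1.0
        row[n_units + t] = 1.0
        if t != adopt - 1:                # omit the last pre-period as baseline
            row[n_units + n_periods + t] = suit[i]
        rows.append(row)

X, y = np.array(rows), np.array(y)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
event_coefs = beta[n_units + n_periods:]  # one coefficient per period
print(np.round(event_coefs, 2))           # pre-periods should sit near 0, post near 2
```

Plotting `event_coefs` against the period index (with confidence bands) is the kind of figure I’d want to see alongside Table II.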
“you’re computing a linear approximation, which might be bad if the actual effect size isn’t linear, but this is just the usual issue with linear regression”

What is this referring to?
honestly re-reading my comment, that is a very fair question. That part was very poorly phrased.
I think what I had in mind is that the issue with continuous DID goes away if you assume effect sizes that are constant across units and linear in the treatment dose. When this doesn’t hold, you start to estimate some weird weighted-average parameter, which Goodman-Bacon, Sant’Anna, and Callaway describe in detail in the link you provided.
I like this paper because it tells us what happens under misspecification, which is exciting because in practice everything is misspecified all the time! But a concern I have with interpreting it is that I think the problem is inherent to linear regression, not the DID case specifically, which means we should really have this kind of problem in mind any time anybody linearly controls for anything.
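As a toy illustration of that point (my own sketch, not from the paper): when the true dose-response is nonlinear, the slope from a linear regression of the outcome on a continuous dose is a particular variance-weighted average of marginal effects, and it can sit well away from the average marginal effect over the dose distribution.

```python
# Toy example (mine, not the paper's): nonlinear dose-response y = dose^5,
# with dose uniform on [0, 1]. The average marginal effect is
# E[5 * dose^4] = 1, but the OLS slope converges to
# Cov(dose^5, dose) / Var(dose) = 5/7, roughly 0.71.
import numpy as np

rng = np.random.default_rng(1)
dose = rng.uniform(0, 1, 100_000)
y = dose ** 5                             # deterministic, just to isolate the point

ols_slope = np.polyfit(dose, y, 1)[0]     # slope of the best linear fit
avg_marginal = np.mean(5 * dose ** 4)     # average derivative over the dose distribution

print(round(ols_slope, 2), round(avg_marginal, 2))   # roughly 0.71 vs 1.0
```

The slope is still a well-defined weighted average of marginal effects, but the weights come from the regression mechanics rather than from the question you care about, which is the same flavor of problem the continuous-DID papers work out in the panel setting.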
(So maybe a better way of phrasing this would have been “we should be this nervous all the time, except in cases where misspecification doesn’t matter” rather than “it isn’t a huge issue here.”)
This paper makes that point about linear regressions in general.