tobycrisford 🔸 comments on A Primer in Causal Inference: Animal Product Consumption as a Case Study

tobycrisford 🔸 24 Apr 2025 6:18 UTC
4 points
0 ∶ 0
This is a fantastic, clearly written post. Thank you for writing it up and sharing!
In the 3 models, why is outcome_2 not included as a predictor?
I’m just trying to wrap my head around how the 3-wave separation works, but can’t quite follow how the confounders will be controlled for if the treatment is the only variable included from wave 2.
For example, in the first model:
- Suppose ‘activism’ was a confounder for the effect of ‘veganuary’ on ‘outcome’ (so ‘activism’ caused increased ‘veganuary’ exposure, as well as increased ‘outcome’).
- Suppose we have 2 participants with identical Wave 1 responses.
- Between wave 1 and wave 2, the first participant is exposed to ‘activism’, which increases both their ‘veganuary’ and ‘outcome’ values, and this change persists all the way through to Wave 3.
- The first participant now has higher outcome_3 and veganuary_2 than the second participant, with all other predictors in the model equal, so this will lead to a positive coefficient for veganuary_2, even though the relationship between veganuary and outcome is not causal.
I can see how this problem is avoided if outcome_2 is included as a predictor instead of (or maybe as well as..?) outcome_1. So maybe this is just a typo..? If so, I would be interested in the explanation for whether you need both outcome_1 and outcome_2, or if just outcome_2 is enough. I’m finding that quite confusing to think about!
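The scenario in the bullets above can be checked with a small simulation. This is a minimal sketch with made-up effect sizes, not the post's actual models: `activism` is a hypothetical between-wave exposure that raises both `veganuary_2` and `outcome_2`, `veganuary` has no causal effect on `outcome` at all, and the activism effect reaches `outcome_3` only by persisting through `outcome_2`. The helper names `ols` and `simulate_confounded` are invented for illustration.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares coefficients of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1] belongs to the first predictor


def simulate_confounded(n=20_000, seed=0):
    """Assumed data-generating process: veganuary has NO causal effect on outcome."""
    rng = np.random.default_rng(seed)

    # Wave 1 measurements (used as baseline controls).
    outcome_1 = rng.normal(size=n)
    veganuary_1 = rng.normal(size=n)

    # Hypothetical activism exposure between wave 1 and wave 2.
    activism = rng.normal(size=n)

    # Wave 2: activism raises both the exposure and the outcome.
    veganuary_2 = 0.5 * veganuary_1 + 1.0 * activism + rng.normal(size=n)
    outcome_2 = 0.5 * outcome_1 + 1.0 * activism + rng.normal(size=n)

    # Wave 3: the wave 2 outcome level persists; veganuary_2 plays no role.
    outcome_3 = 0.8 * outcome_2 + rng.normal(size=n)

    # Model with wave 1 controls only (treatment is the only wave 2 variable).
    without = ols(outcome_3, veganuary_2, outcome_1, veganuary_1)[1]
    # Same model with outcome_2 added as a predictor.
    with_o2 = ols(outcome_3, veganuary_2, outcome_1, veganuary_1, outcome_2)[1]
    return {"without_outcome_2": without, "with_outcome_2": with_o2}


if __name__ == "__main__":
    print(simulate_confounded())
```

Under these assumptions, the `veganuary_2` coefficient is clearly positive when only wave 1 controls are used and close to zero once `outcome_2` is added, which matches the concern described above. It says nothing about what happens when `veganuary_2` does have a real effect.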
- Jared Winslow 24 Apr 2025 16:44 UTC
  7 points
  0 ∶ 0
  Thanks, Toby! Great question: outcome_2 isn’t included because it would over-adjust our estimate for veganuary_2. By design, outcome_2 occurs after (or at the same time as) veganuary_2. If it occurs after, outcome_2 will “contain” the effect of veganuary_2 (and in the real world, this contained effect may be larger than the effect on outcome_3, given attenuating effects over time). If we include outcome_2, our model will adjust for the now-updated outcome_2 and “control away” most or all of the effect of veganuary_2 on outcome_3. On the other hand, including activism_2 would successfully adjust for any inter-wave activism exposure.
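The over-adjustment described here can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the study's model: there is no confounding, `veganuary_2` has a real effect on `outcome_2` (an arbitrary 0.3), and that effect carries into `outcome_3` only through `outcome_2` with some attenuation (an arbitrary 0.8), so the true total effect on `outcome_3` is 0.3 × 0.8 = 0.24. The names `ols` and `simulate_overadjustment` are hypothetical.

```python
import numpy as np


def ols(y, *predictors):
    """Least-squares coefficients of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[1] belongs to the first predictor


def simulate_overadjustment(n=20_000, seed=1):
    """Assumed process: veganuary_2 -> outcome_2 -> outcome_3, with no confounders."""
    rng = np.random.default_rng(seed)

    outcome_1 = rng.normal(size=n)    # baseline outcome at wave 1
    veganuary_2 = rng.normal(size=n)  # exposure, independent of baseline here

    # outcome_2 is measured after the exposure, so it "contains" its effect.
    outcome_2 = 0.5 * outcome_1 + 0.3 * veganuary_2 + rng.normal(size=n)
    # The effect attenuates by wave 3 and is carried entirely by outcome_2.
    outcome_3 = 0.8 * outcome_2 + rng.normal(size=n)

    # Adjusting for the wave 1 outcome only recovers the total effect (~0.24).
    total = ols(outcome_3, veganuary_2, outcome_1)[1]
    # Adding outcome_2 adjusts for a mediator and removes the effect.
    adjusted = ols(outcome_3, veganuary_2, outcome_1, outcome_2)[1]
    return {"wave_1_controls": total, "plus_outcome_2": adjusted}


if __name__ == "__main__":
    print(simulate_overadjustment())
```

In this setup the coefficient with wave 1 controls lands near the true 0.24, while adding `outcome_2` pushes it toward zero even though the effect is real. If part of the effect bypassed `outcome_2`, only part of it would be controlled away.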
  There are then two directly related follow-up questions:
  1. If outcome_2 occurs after, why not make it the primary outcome instead of outcome_3?
  2. If exposure to activism occurs between wave 1 and 2, why don’t we include activism_2 when estimating veganuary_2’s effect on the outcome?
  This is interesting and directly relevant to inferring events from measurements. In this study, the outcome was prospective (e.g., what is your current consumption), while the predictors were both prospective and retrospective (e.g., what happened in the last six months). For question 1, outcome_2 occurs after the retrospective predictors but not after the prospective ones, so we have a reverse causation problem for some of the predictors. For question 2, from the framing of the survey questions (how much activism were you exposed to in the last six months), it’s not possible to determine whether activism_2 or veganuary_2 occurred first, meaning we would again have over-adjustment for many of the models.
  In an ideal scenario, we would adjust for all potential confounders immediately prior to the exposure. But in those cases it’s a tug-of-war between temporal precedence and no alternative explanations, because in the real world, as soon as you start measuring extremely close to the exposure, it becomes unclear where the confounding control ends and where the exposure begins.
  I hope that helps and feel free to follow up!
  - tobycrisford 🔸 25 Apr 2025 6:19 UTC
    3 points
    0 ∶ 0
    Thank you for the detailed reply Jared!
    It makes sense that including outcome_2 would risk controlling away much of any effect of veganuary on outcome. And your answers to those pre-empted follow-up questions make sense to me as well!
    But does that then mean my original concern is still valid..? There is still a possibility that a statistically significant coefficient for veganuary_2 in the model might not be causal, but due to a confounder? Even a confounder that was actually measured, like activism exposure?
    - Jared Winslow 25 Apr 2025 22:54 UTC
      3 points
      0 ∶ 0
      Thanks for the in-depth questions! You’re right, and this is another limitation. Even for cases where there is no inter-wave activism, I should make it clear that the estimates are only truly causal if you adjust for all relevant confounders, which is unlikely in practice. So the results we get are associations, but less biased (aka causal under certain assumptions).
      The main way we address this issue is through the sensitivity analysis, since it gives a sense of how much unmeasured confounding would be required (from a variable that was not collected, or one not collected granularly enough, as you pointed out) to overturn significance. In our case, a moderate amount would be needed, so the estimates are likely at least directionally consistent.
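One common way to quantify "how much unmeasured confounding is required" is the E-value of VanderWeele and Ding. Whether this is the sensitivity analysis used in the post is an assumption here; the sketch below only shows the standard formula on the risk-ratio scale, E = RR + sqrt(RR × (RR − 1)), and the function names `e_value` and `e_value_for_ci` are hypothetical.

```python
import math


def e_value(rr):
    """E-value for a risk ratio: the minimum strength of association (as a risk
    ratio) that an unmeasured confounder would need with both the exposure and
    the outcome to fully explain away the observed estimate."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted first
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(lower, upper):
    """E-value for the confidence limit closest to the null (RR = 1).
    Returns 1.0 when the interval already includes the null."""
    if lower <= 1.0 <= upper:
        return 1.0
    closest = lower if lower > 1.0 else upper
    return e_value(closest)


if __name__ == "__main__":
    # Illustrative numbers only, not results from the study.
    print(round(e_value(2.0), 2))              # 3.41
    print(round(e_value_for_ci(1.2, 3.3), 2))  # 1.69
```

The interval version is the one that corresponds to "overturning significance": it asks how strong confounding would have to be to move the confidence limit to the null, not just the point estimate.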