Thanks for the feedback, Sam. It’s definitely a limitation, but the diff-in-diff analysis still has significant value. The specific way the treatment and control groups differ constrains the stories we can tell in which the conference did have a big (hopefully positive) effect but appeared not to because of some unobserved factors. If none of these stories seem plausible then we can still be relatively confident in the results.
The post mentions that the difference in donations appears to be driven by 3 respondents, and the idea that non-attendee donations fall by ~50% without attendance but would be unchanged with attendance seems unlikely (and confounded with high-earning professionals presumably having less time to attend).
Otherwise, the control group seems to have similar beliefs but is much less likely to take EA actions. This isn’t surprising given that attending EAGx is itself an EA action, but it does present a problem. Looking only at people who were planning to attend but didn’t (for various reasons) would have given a very solid subgroup, but there were too few of these to do any statistical analysis. A bigger conference could look specifically at that group, though, which I’d be really excited to see.
With diff-in-diff we need the parallel trends assumption as you point out, but we don’t need parallel levels: if the groups would have continued at their previous (different) rates of engagement in the absence of the conference then we should be fine. Similarly, if there’s some external event affecting EA in general and we can assume it would have impacted both groups equivalently (at least in % of engagement) then the diff-in-diff methodology should account for that.
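To make the levels-versus-trends point concrete, here is a minimal sketch of the diff-in-diff arithmetic. All the engagement scores are made-up numbers for illustration, not values from the survey, and the function names are my own. The ratio version is one simple way to handle the "% of engagement" case, assuming a shock scales both groups proportionally.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Additive estimate: change in the treated group minus change in the control group."""
    return (treat_after - treat_before) - (control_after - control_before)


def ratio_diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Proportional estimate: treated growth factor divided by control growth factor."""
    return (treat_after / treat_before) / (control_after / control_before)


# Hypothetical scores. Levels differ (attendees start at 8, non-attendees at 4),
# but both groups drift down by 1 point, so the estimated effect is zero.
same_trend = diff_in_diff(8.0, 7.0, 4.0, 3.0)

# Attendees hold steady while non-attendees fall by 1 point: estimated effect of +1.
conference_effect = diff_in_diff(8.0, 8.0, 4.0, 3.0)

# An EA-wide shock that cuts everyone's engagement by 25%.
# The additive estimate is misled by the different levels; the ratio estimate is not.
additive_under_shock = diff_in_diff(8.0, 6.0, 4.0, 3.0)
ratio_under_shock = ratio_diff_in_diff(8.0, 6.0, 4.0, 3.0)

print(same_trend, conference_effect, additive_under_shock, ratio_under_shock)
```

The last pair is why the "equivalently in %" caveat matters: if the groups start at different levels, a proportional shock only cancels out when the comparison is made in proportional terms.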
So (excluding the donation case) we have a situation where a more engaged group and a less engaged group both didn’t change their behavior.
If the conference had a big positive effect then this would imply that, in the absence of the conference, the attendees (the more engaged group) would have decreased their engagement dramatically, and that the effect of the conference happened to cancel that out. It also implies that whatever factor would have led attendees to become less engaged wouldn’t have affected non-attendees (or at least is strongly correlated with attendance).
You could imagine the response rates being responsible, but I’m struggling to think of a credible story for this. The 41% of attendees who dropped out of the follow-up survey would presumably be those least affected by the conference, which would make the data overestimate the impact of EAGx. Perhaps the 3% of contacted people who volunteered for the control group were much more consistent in their EA engagement than the (more engaged on average) attendees who volunteered, and so were less affected by an EA-wide downturn that conference attendance happened to cancel out? But this seems tenuous and ‘just-so’.
To me the most plausible way this could happen is reversion to the mean: EA engagement is highly volatile from year to year, only the most engaged go to EAGx, and attending results in them maintaining their high level of EA engagement for at least the next year (roughly cancelling out the usual decline).
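A toy simulation shows the size of decline that reversion to the mean alone could produce. This is only a sketch under assumptions I have made up: engagement is a stable personal trait plus independent year-to-year noise, the top 10% in year 0 are the ones who attend, and the conference itself has no effect. None of the parameters come from the survey.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical model: stable trait plus large, independent yearly noise.
trait = [rng.gauss(5.0, 1.0) for _ in range(n)]
year0 = [t + rng.gauss(0.0, 2.0) for t in trait]
year1 = [t + rng.gauss(0.0, 2.0) for t in trait]

# Assume only the most engaged 10% in year 0 go to the conference.
cutoff = sorted(year0)[int(0.9 * n)]
attendees = [i for i in range(n) if year0[i] >= cutoff]
others = [i for i in range(n) if year0[i] < cutoff]


def mean(values):
    return sum(values) / len(values)


# No conference effect is simulated, so any change is pure reversion to the mean.
attendee_change = mean([year1[i] for i in attendees]) - mean([year0[i] for i in attendees])
other_change = mean([year1[i] for i in others]) - mean([year0[i] for i in others])

print(f"attendees: {attendee_change:+.2f}, non-attendees: {other_change:+.2f}")
```

In this setup the selected group falls sharply while everyone else barely moves, so a real conference effect of about the same size would leave both groups looking unchanged, which is the pattern described above. How large the decline is depends entirely on the assumed noise and selection cutoff.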
This last point is the biggest issue with the analysis in my opinion. Following attendees over the long run with multiple surveys per year (to compare results before vs. after a conference) would help a lot, but huge incentives would be needed to maintain a meaningful sample for more than a couple of years.