KG: Would it be better to have a pre-analysis plan for future studies?
JK: Did you see the methodology? Or are you wanting them to have committed to something in the analysis that they didn’t talk about there?
KG: Would it be better to do pre-treatment/intervention and post-treatment/intervention data collection rather than just post-treatment/intervention data collection for future studies?
JK: The idea is, have a third, smaller group that goes immediately to a survey? That’s a good idea, and not that expensive per survey result. That helps you see the difference between things like whether the video makes people more likely to go veg vs reduces recidivism.
KG: Was it worth using Edge Research to analyze the data for this study? Will external bodies like Edge Research do data analysis for future MFA studies?
JK: The Edge analysis doesn’t look useful to me, since they didn’t do anything that unusual and there are lots of people in the community in a position to analyze the data. Additionally, my impression is that working with them added months of delay. So I certainly wouldn’t recommend this in the future!
KG: Why was the study so low-powered? Was it originally thought that online ads were more effective, or was the study’s power constrained by inadequate funding?
JK: In the methodology they write: “We need to get at minimum 3.2k people to take the survey to have any reasonable hope of finding an effect. Ideally, we’d want say 16k people or more.” My guess is they just failed to bring enough people back in for follow-up through ads.
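For context on where a floor like 3.2k comes from: required sample sizes for comparing two proportions can be sketched with the usual normal-approximation formula. The 5% and 7% rates below are invented for illustration and are not the study's actual assumptions.

```python
import math

def n_per_group(p_control, p_treatment):
    """Sample size per arm to distinguish p_control from p_treatment,
    two-sided test at 5% significance with 80% power (normal approximation)."""
    z_alpha = 1.959964  # two-sided 5% critical value
    z_beta = 0.841621   # 80% power
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical: 5% of controls vs 7% of treated reduce meat consumption.
print(n_per_group(0.05, 0.07))  # thousands of respondents per arm
# Halving the effect (5% vs 6%) roughly quadruples the requirement:
print(n_per_group(0.05, 0.06))
```

The point of the sketch is the scaling: small expected effects push the required sample into the thousands very quickly.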
KG: “Edge Research then “weighted” the data so the respondents from each group were identical in gender, geography, and age.” I’m not totally sure what this means and it seems important. It would be great if someone could explain more about what the “weighting” process entails.
JK: I think it’s something like this. You take the combined experimental and control groups and figure out the distribution of each characteristic (gender, country, age range). Then if you happened to get extra UK people in your control group compared to your experimental group, instead of concluding that you made people leave the UK, you conclude that you happened to over-sample the UK in the control and under-sample it in the experimental. To fix this, you assign a weight to every response based on how over- or under-sampled each of their demographics is. Then if you’re, say, totalling up servings of pork, instead of adding them straight up you first multiply each person’s reported servings by their weight, and then add them up.
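As a toy version of that procedure (the respondents are invented, and this weights joint demographic cells; the description above suggests weighting each characteristic separately, which is closer to raking, but cells keep the sketch short):

```python
from collections import Counter

# Toy respondents: (group, gender, country, pork_servings)
respondents = [
    ("treatment", "F", "US", 2), ("treatment", "F", "UK", 0),
    ("treatment", "M", "US", 3),
    ("control", "F", "US", 2), ("control", "F", "UK", 1),
    ("control", "F", "UK", 4), ("control", "M", "US", 1),
]

def cell(r):
    return (r[1], r[2])  # demographic cell: (gender, country)

# Target distribution: each cell's share of the combined sample.
target = Counter(cell(r) for r in respondents)
total = len(respondents)

def weighted_total(group):
    members = [r for r in respondents if r[0] == group]
    observed = Counter(cell(r) for r in members)
    tot = 0.0
    for r in members:
        # Weight = (target share of this cell) / (observed share in this group),
        # so over-sampled cells count less and under-sampled cells count more.
        w = (target[cell(r)] / total) / (observed[cell(r)] / len(members))
        tot += w * r[3]  # weighted pork servings
    return tot

print(weighted_total("treatment"), weighted_total("control"))
```

Within each group the weights average to 1, so the weighting reshuffles how much each respondent counts without changing the group's effective size.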
KG: Yeah, this looks like the relevant section:
“Later, Edge Research will complete “a final data report including (a) an outline of the research methodology and rationale, (b) high level findings and takeaways, and a (c) drill downs on specific areas and audiences.”
It’s currently unclear what precise methodology Edge Research will use to analyze the data, but the expectation is that they would use a Chi-Square test to compare the food frequency questionnaires between both the treatment and control groups, looking both for meat reduction and elimination.”
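If the analysis is the Chi-Square comparison that quote describes, it would look roughly like this; the counts and the three outcome bins below are invented, and the real questionnaire may be binned differently.

```python
# Treatment vs control counts in three invented outcome bins from the
# food frequency questionnaire: eliminated meat, reduced, no change.
observed = {
    "treatment": [30, 120, 850],
    "control":   [18, 95, 887],
}

def chi_square(table):
    """Pearson chi-square statistic for a dict of row label -> counts."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    return stat

stat = chi_square(observed)
# df = (rows - 1) * (cols - 1) = 2; the 5% critical value for chi2(2) is 5.991.
print(stat, stat > 5.991)
```

Note the test only says whether the distributions differ; it doesn't by itself separate reduction from elimination, which is presumably why they mention looking at both.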
JK: “Or are you wanting them to have committed to something in the analysis that they didn’t talk about there?”
KG: Yeah, I would have liked a more detailed pre-analysis plan. I think there was perhaps too much researcher freedom in the data analysis, which makes questionable analysis techniques and inaccurate interpretation of results more likely. Some things that would have been useful to specify in a pre-analysis plan:
- Details of the data weighting process.
- How incomplete survey responses will be treated.
- How responses from people who aren’t females aged 13-25 will be treated.
KG: “Would it be better to do pre-treatment/intervention and post-treatment/intervention data collection rather than just post-treatment/intervention data collection for future studies?”
JK: “The idea is, have a third, smaller group that goes immediately to a survey? That’s a good idea, and not that expensive per survey result. That helps you see the difference between things like whether the video makes people more likely to go veg vs reduces recidivism.”
KG: The idea you suggest sounds promising, but it’s not what I meant. My initial question was: would it be better for future studies to collect both baseline data before the intervention and endline data some time after it, rather than only endline data? I ask because my impression is that pre- and post-intervention data collection is standard practice for RCTs in the social sciences, and there are likely good reasons for that. I understand that collecting data both before and after the intervention may cost significantly more than collecting it only afterwards, but I wonder whether the increased usefulness of the study’s results would outweigh those costs.
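One concrete reason baseline data can pay for itself: if a subject's pre and post answers correlate with coefficient rho, comparing post-minus-pre change scores instead of post-only scores shrinks the per-subject variance whenever rho > 0.5, which directly buys power. A back-of-envelope sketch, assuming equal pre and post variance:

```python
def relative_variance_of_change_score(rho, sigma2=1.0):
    """Var(post - pre) / Var(post), assuming Var(pre) = Var(post) = sigma2
    and correlation rho between a subject's pre and post measurements."""
    var_post_only = sigma2
    var_change = 2 * sigma2 * (1 - rho)  # Var(X - Y) = VarX + VarY - 2*Cov
    return var_change / var_post_only

for rho in (0.3, 0.5, 0.8):
    print(rho, relative_variance_of_change_score(rho))
# rho = 0.5 is the break-even point; above it, baselines reduce variance.
```

An ANCOVA-style baseline adjustment does even better (variance factor 1 - rho**2), but the simple change-score comparison already shows the trade-off against the extra collection cost.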
JK: “The Edge analysis doesn’t look useful to me, since they didn’t do anything that unusual and there are lots of people in the community in a position to analyze the data. Additionally, my impression is that working with them added months of delay. So I certainly wouldn’t recommend this in the future!”
KG: Sounds like we have pretty similar views on the limited value of Edge’s collaboration. I also wouldn’t recommend using them in the future.
JK: “My guess is they just failed to bring enough people back in for follow-up through ads.”
KG: That makes sense as a likely reason the study was low-powered. I wonder whether alternative options could have been explored once it looked like this was happening, to prevent the study from being so low-powered. For instance, showing the initial ad to more people could have led to more completed surveys, which would likely have increased the study’s power, although it may have been difficult to do this for a variety of reasons.
JK: “You take the combined experimental and control groups and you figure out for each characteristic (gender, country, age range) what the distribution is. Then if you happened to get extra UK people in your control group compared to your experimental group, instead of concluding that you made people leave the UK you conclude that you happened to over-sample the UK in the control and under-sample in the experimental. To fix this, you assign a weight to every response based on how over- or under-sampled each demographic is. Then if you’re, say, totalling up servings of pork, instead of straight adding them up you first multiply the number of servings each person said they had by their weight, and then add them up.”
KG: Thanks for explaining this; it’s much clearer to me now :)
KG: “I wonder whether alternative options could have been explored once it looked like this was happening, to prevent the study from being so low-powered. For instance, showing the initial ad to more people could have led to more completed surveys, which would likely have increased the study’s power, although it may have been difficult to do this for a variety of reasons.”
JK: The way it worked is they advertised to a bunch of people, then four months later tried to follow up with as many as they could using cookie retargeting. At that point they learned they didn’t have as much power as they hoped, but to fix it you would need to take another four months and increase the budget by at least 4x.
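The "at least 4x" follows from how precision scales with sample size: the standard error shrinks like 1/sqrt(n), so halving the minimum detectable effect takes roughly four times the respondents (and ad spend). A quick check, with made-up units:

```python
import math

def mde(n, sigma=1.0, z=2.8):
    """Minimum detectable effect (arbitrary units) for n respondents per arm;
    z bundles the significance and power critical values (~2.8 for 5%/80%)."""
    return z * sigma * math.sqrt(2 / n)

print(mde(4000) / mde(16000))  # quadrupling n halves the detectable effect
```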