Chiming in here with my outsider impressions on how fair the process seems
@david_reinstein If I were to rank the evaluator reports, the evaluation summary, and the EA Forum post by which seemed the most fair, I would rank the Forum post last. It wasn’t until I clicked through to the evaluation reports that I felt the process wasn’t so cutting.
Let me focus on one very specific framing in the Forum post, since it feels representative. One heading includes the phrase “this meta-analysis is not rigorous enough”. This has a few connotations that you probably didn’t mean. One, this meta-analysis is much worse than others. Two, the claims are questionable. Three, there’s a universally correct level of quality that meta-analyses should reach and anything that falls short of that is inadmissible as evidence.
In reality, it seems this meta-analysis is par for the course in terms of quality. And it was probably more difficult to carry out, given the heterogeneity in the literature. And the central claim of the meta-analysis doesn’t seem like something either evaluator disputed (though one evaluator was hesitant).
Again, I know that’s not what you meant and there are many caveats throughout the post. But it’s one of a few editorial choices that make the Forum post seem much more critical than the evaluation reports, which is a bit unusual since the Evaluators are the ones who are actually critiquing the paper.
Finally, one piece of context that felt odd not to mention was the fundamental difficulty of finding an expert in both food consumption and meta-analysis. That limits the ability of any reviewer to make a fair evaluation. This is acknowledged at the bottom of the Evaluation Summary. Elsewhere, I’m not sure where it’s said. Without that mentioned, I think it’s easy for a casual reader to leave thinking the two Evaluators are the “most correct”.
Thanks for the detailed feedback, this seems mostly reasonable. I’ll take a look again at some of the framings, and try to adjust. (Below and hopefully later in more detail).
the phrase “this meta-analysis is not rigorous enough”.
it seems this meta-analysis is par for the course in terms of quality.
This was my take on how to succinctly depict the evaluators’ reports (not my own take), in a way the casual reader would be able to digest. Maybe this was rounding down too much, but not by a lot, I think. Some quotes from Jané’s evaluation that I think are representative:
Overall, aside from its commendable transparency, the meta-analysis is not of particularly high quality
Overall, the transparency is strong, but the underlying analytic quality is limited.
This doesn’t seem to reflect ‘par for the course’ to me, but it depends on what the course is, i.e., what the comparison group is. My own sense/guess is that this is more rigorous and careful than most work in this area of meat consumption interventions (and adjacent), but less rigorous than the meta-analyses the evaluators are used to seeing in their academic contexts and the practices they espouse. But academic meta-analysts will tend to focus on areas where they can find a proliferation of high-quality, more homogeneous research, not necessarily the highest-impact areas.
Note that the evaluators rated this 40th and 25th percentile for methods and 75th and 39th percentile overall.
And the central claim of the meta-analysis doesn’t seem like something either evaluator disputed (though one evaluator was hesitant).
To be honest I’m having trouble pinning down what the central claim of the meta-analysis is. Is it a claim that “the main approaches being used to motivate reduced meat consumption don’t seem to work”, i.e., that we can bound the effects as very small, at best? That’s how I’d interpret the reporting of the pooled effect’s 95% CI as a standardized mean effect between 0.02 and 0.12. I would say that both evaluators are sort of disputing that claim.
However the authors hedge this in places and sometimes it sounds more like they’re saying that ~“even the best meta-analysis possible leaves a lot of uncertainty” … an absence of evidence more than evidence of absence, and this is something the evaluators seem to agree with.
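In case it helps other readers follow that: the pooled CI comes from a random-effects meta-analysis that combines the per-study standardized effects into one estimate and interval. Here is a minimal sketch of that kind of pooling (DerSimonian-Laird), using made-up study effects rather than the actual data from this meta-analysis:

```python
import numpy as np

# Hypothetical per-study standardized mean differences and standard errors
# (illustrative only; not the actual studies in the meta-analysis).
d = np.array([0.05, 0.10, 0.02, 0.15, 0.08])
se = np.array([0.04, 0.06, 0.05, 0.07, 0.05])

# DerSimonian-Laird random-effects pooling.
w_fe = 1.0 / se**2                          # inverse-variance (fixed-effect) weights
d_fe = np.sum(w_fe * d) / np.sum(w_fe)
Q = np.sum(w_fe * (d - d_fe) ** 2)          # heterogeneity statistic
k = len(d)
c = np.sum(w_fe) - np.sum(w_fe**2) / np.sum(w_fe)
tau2 = max(0.0, (Q - (k - 1)) / c)          # estimated between-study variance

w_re = 1.0 / (se**2 + tau2)                 # random-effects weights
d_pooled = np.sum(w_re * d) / np.sum(w_re)
se_pooled = np.sqrt(1.0 / np.sum(w_re))
lo, hi = d_pooled - 1.96 * se_pooled, d_pooled + 1.96 * se_pooled

print(f"pooled SMD = {d_pooled:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

The pooled interval is what does the work in the “we can bound the effects as very small” reading; the hedged reading instead emphasizes how much uncertainty survives the pooling.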
Finally, one piece of context that felt odd not to mention was the fundamental difficulty of finding an expert in both food consumption and meta-analysis.
That is/was indeed challenging. Let me try to adjust this post to note that.
a few editorial choices … make the Forum post seem much more critical than the evaluation reports, which is a bit unusual since the Evaluators are the ones who are actually critiquing the paper.
My goal for this post was to fairly represent the evaluators’ take, to provide insights to people who might want to use this for decision-making and future research, and to raise the question of standards for meta-analysis in EA-related areas. I will keep thinking about whether I missed the mark here. One possible clarification though: we don’t frame the evaluators’ role as (only) looking to criticize or find errors in the paper. We ask them to give a fair assessment of it, evaluating its strengths, weaknesses, credibility, and usefulness. These evaluations can also be useful if they give people more confidence in the paper and its conclusions, and thus reason to update more on this for their own decision-making.
To be honest I’m having trouble pinning down what the central claim of the meta-analysis is.
To paraphrase Diddy’s character in Get Him to the Greek, “What are you talking about, the name of the [paper] is called “[Meaningfully reducing consumption of meat and animal products is an unsolved problem]!” (😃) That is our central claim. We’re not saying nothing works; we’re saying that meaningful reductions either have not been discovered yet or do not have substantial evidence in support.
However the authors hedge this in places
That’s author, singular. I said at the top of my initial response that I speak only for myself.
I think “an unsolved problem” could indicate several things. It could be:
1. We have evidence that all of the commonly tried approaches are ineffective, i.e., we have measured all of their effects and they are tightly bounded as being very small.
2. We have a lack of evidence, thus very wide credible intervals over the impact of each of the common approaches.
To me, the distinction is important. Do you agree?
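As a toy numerical illustration of that distinction (invented numbers, nothing to do with the actual paper), the same point estimate can sit inside a tight interval or a wide one, and only the first supports the “we measured it and it is small” reading:

```python
# Two hypothetical pooled estimates with the same point estimate but very
# different uncertainty (invented numbers, not from the paper).
cases = {
    "tight CI (closer to evidence of absence)": (0.07, 0.03),  # (pooled SMD, SE)
    "wide CI (closer to absence of evidence)":  (0.07, 0.25),
}

for label, (d, se) in cases.items():
    lo, hi = d - 1.96 * se, d + 1.96 * se
    print(f"{label}: SMD = {d:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")

# The first interval rules out anything but small effects; the second is
# compatible with effects ranging from mildly negative to fairly large.
```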
You say above
meaningful reductions either have not been discovered yet or do not have substantial evidence in support
But even “do not have substantial evidence in support” could mean either of the above … a lack of evidence, or strong evidence that the effects are close to zero. At least to my ears.
As for ‘hedge this’, I was referring to the paper, not to the response, but I can check this again.
For what it’s worth, I read that abstract as saying something like, “within the class of interventions studied so far, the literature has yet to settle on any intervention that can reliably reduce animal product consumption by a meaningful amount, where a meaningful amount might be a 1% reduction at Costco scale or a long-term 10% reduction at a single cafeteria. The class of interventions being studied tends to be informational and nudge-style interventions like advertising, menu design, and media pamphlets. When effect sizes differ for a given type of intervention, the literature has not offered a convincing reason why a menu-design choice works in one setting versus another.”
Okay, now that I’ve typed that up, I can see why “unsolved problem” is unclear.
And I’m probably taking a lot of leaps of faith in interpretation here
From the POV of our core contention (that we don’t currently have a validated, reliable intervention to deploy at scale), whether this is because of absence of evidence (AoE) or evidence of absence (EoA) is hard to say. I don’t have an overall answer, and ultimately both roads lead to “unsolved problem.”
We can cite good arguments for EoA (these studies are stronger than the norm in the field but show weaker effects, and that relationship should be troubling for advocates) or AoE (we’re not talking about very many studies at all), and ultimately I think the line between the two is in the eye of the beholder.
Going approach by approach, my personal answers are:
choice architecture is probably AoE; it might work better than expected, but we just don’t learn very much from 2 studies (I am working on something about this separately)
the animal welfare appeals are more EoA, esp. those from animal advocacy orgs
social psych approaches I’m skeptical of, but there weren’t a lot of high-quality papers, so I’m not so sure (see here for a subsequent meta-analysis of dynamic norms approaches).
I would recommend health appeals for older folks and environmental appeals for Gen Z. So there I’d say we have evidence of efficacy, but we should expect effects to be on the order of a few percentage points.
Were I discussing this specifically with a funder, I would say, if you’re going to do one of the meta-analyzed approaches—psych, nudge, environment, health, or animal welfare, or some hybrid thereof—you should expect small effect sizes unless you have some strong reason to believe that your intervention is meaningfully better than the category average. For instance, animal welfare appeals might not work in general, but maybe watching Dominion is unusually effective. However, as we say in our paper, there are a lot of cool ideas that haven’t been tested rigorously yet, and from the point of view of knowledge, I’d like to see those get funded first.
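To give a rough sense of scale for “small effect sizes” (using entirely hypothetical baseline figures, not anything estimated in the paper), here is a back-of-the-envelope conversion from a standardized mean difference to raw consumption:

```python
# Back-of-the-envelope: what a standardized mean difference (SMD) could mean
# in raw terms, under assumed (hypothetical) baseline consumption figures.
baseline_servings_per_week = 10.0   # assumption for illustration, not from the paper
sd_servings_per_week = 5.0          # assumption for illustration, not from the paper

for smd in (0.02, 0.12):            # endpoints of the pooled 95% CI discussed above
    reduction = smd * sd_servings_per_week
    pct = 100.0 * reduction / baseline_servings_per_week
    print(f"SMD {smd:.2f} -> about {reduction:.1f} fewer servings/week (~{pct:.0f}%)")
```

Under those assumed numbers, the ends of the pooled CI correspond to very roughly a 1% to 6% reduction, which is the same ballpark as the “few percentage points” mentioned above.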