These are great questions. I want to say at the outset here: we explicitly chose to stick closely and without major adjustments to the Birch et al. 2021 framework for this review, such that our results would be directly comparable to their study of decapods and cephalopods that led to the protection of those groups in the Animal Welfare (Sentience) Act 2022 in the UK.
Here is what the paper says about the framework, and the confidence levels:
The five possible confidence levels are:
(1) “Very high confidence”, when the weight of scientific evidence leaves no scope for reasonable doubt;
(2) “High confidence”, when we are convinced that the animal satisfies or fails the criterion, although scope for reasonable doubt remains;
(3) “Medium confidence”, when concerns about the evidence’s reliability and quality prevent us from having high confidence;
(4) “Low confidence”, when there is little or flawed evidence;
(5) “Very low confidence”, when the evidence is seriously inadequate or when there is no evidence whatsoever to make a determination (e.g., no confidence).
So, to be clear, the colors are not ratings of ‘very high confidence that they satisfy the criteria’ but rather ‘very high confidence that we can determine whether they satisfy or fail the criteria’. This is why it’s important that we clarify that there is no evidence that insects fail any criteria. Green happens to mean ‘satisfies’ and not ‘fails’ in all cases for our study, but that’s a result of how the evidence shakes out, and is not specified by the framework itself, which does allow for the collection of ‘evidence against’ the criterion.
So, in Rockoptera, let’s imagine we had really, really high quality evidence that they did not meet any criteria. We would find a row of green boxes: we are very highly confident in our determination that they fail 8/8 criteria. Then, when we do our final summation of the evidence, as in Section 4 of the paper: satisfying 0/8 criteria is classified as “capacity for pain unknown or unlikely” in the group. Birch et al. explicitly state “If remaining indicators are uncertain [e.g., white/red] rather than shown absent, sentience (or pain) is simply unknown. However, if high-quality scientific work shows the other indicators to be absent [e.g., green], pain is unlikely” (emphasis and brackets mine). So, we would say, according to Birch et al. 2021, that we have very high confidence that pain is unlikely in Rockoptera. Whew! :)
I do think there’s the potential here for color = “pain satisfaction” to be a common misunderstanding; so it seems like future iterations of this work (as you note) might be presented more clearly by adding some kind of symbology that interacts with the color, such that you know whether green = satisfy vs. fail upon immediately looking at it.
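To make the distinction concrete, here is a minimal sketch of what "symbology that interacts with the color" could look like as a data structure. This is purely illustrative and is not how the paper or Birch et al. 2021 encode anything: the names `Confidence`, `Direction`, `Cell`, and `label` are hypothetical, and the `+`/`-`/`?` symbols are an assumption of mine. The only point is that confidence and direction are stored as two separate fields, so a reader never has to infer one from the other.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Confidence(Enum):
    """The five confidence levels listed above (the color axis)."""
    VERY_LOW = 1
    LOW = 2
    MEDIUM = 3
    HIGH = 4
    VERY_HIGH = 5


class Direction(Enum):
    """Hypothetical symbols for which way the determination points."""
    SATISFIES = "+"
    FAILS = "-"


@dataclass(frozen=True)
class Cell:
    """One order-by-criterion cell: how confident we are, and in what."""
    confidence: Confidence
    direction: Optional[Direction] = None  # None: no determination could be made

    def label(self) -> str:
        # Combine the symbol with the confidence level, e.g. "- very high".
        name = self.confidence.name.replace("_", " ").lower()
        symbol = self.direction.value if self.direction else "?"
        return f"{symbol} {name}"


# The imagined Rockoptera case: very high confidence that a criterion is failed.
rockoptera_cell = Cell(Confidence.VERY_HIGH, Direction.FAILS)
print(rockoptera_cell.label())  # - very high

# A cell with no usable evidence either way.
print(Cell(Confidence.VERY_LOW).label())  # ? very low
```

Under this kind of encoding, the two "green" cases (very confident it satisfies vs. very confident it fails) would share a color but carry different symbols.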
Last thought, re: 1000 high quality papers, split down the middle and with no way to resolve the data by considering the biological context. I would classify this as ‘Medium confidence’: we do not have the evidence to be convinced of satisfaction or failure (i.e., high) but the evidence is apparently neither little nor flawed (i.e., low).
This comment is representative only of MRB’s opinions and expertise, and not the other post/publication authors.
I replied to your comment before you edited it and added the following, so I will make quick replies to these new questions throughout.
To the extent that we found research on these orders O and criteria C, each of the orders satisfies each of the criteria.
- I would rephrase: For the OxC combinations for which we found enough research to make a determination about whether order O satisfies or fails criterion C, we found that each order satisfied each criterion.
We are not saying anything about the degree to which a particular O satisfies a particular C. [Uhm, I am not sure why. Are the criteria extremely binary, even if you measure them statistically? Or were you looking at the degrees, and every O satisfied every C to a high enough degree that you just decided not to talk about it in the post?]
- These criteria, like many in science, are actually particularly binary. E.g., you either have nociceptors or you do not have nociceptors (you either are an insect, or you are not an insect!). So, in this way, we are assessing satisfaction/failure in a binary sense.
- But, of course, there are relevant degrees that emerge after determining satisfaction or failure. For example, you might have more types or fewer types of nociceptors. They might be expressed in greater or fewer numbers. Ion channels could be expressed in different types of cells or sequestered on the interior of the cell following expression for different amounts of time/due to different physiological causes. In all cases, these degrees would not change our determination about whether the animal group possesses nociceptors (e.g., satisfies/fails the criterion). But, of course, these degrees might have some relevant effects on our eventual credence for pain! We do spend some time on this in the paper (which is 75 pages and I could not replicate here! But see criterion 7 for some of this light discussion of degrees).
- To my mind, the point of this framework, and determining pass/fail for the criterion, is to 1) determine whether it is worth taking the idea of pain seriously in an animal group (e.g., providing evidence for or against applying some version of a precautionary principle); and 2) determine where we should direct research effort by identifying areas where we don’t yet have high-quality evidence for or against the satisfaction of criteria that might be relevant to insect pain.
To recap: you don’t talk about the degrees-of-satisfying-criteria, and any research that existed pointed towards sufficient-degree-of-C, for any O and C. Given this, the tables in this post essentially just depict “How much quality-adjusted research we found on this.”
- We talk about this briefly, for criterion 7, but to be clear, there was relatively little ‘degrees-of-satisfying’ evidence to be found in insects at this time. In most OxC cases, as the table demonstrates, there wasn’t even sufficient evidence to demonstrate with high/very high confidence that the order met or failed to meet the binary condition of the criterion – much less the degrees of meeting/failure, after having met or failed it.
In particular, the tables do not depict anything like “Do we think these insects can feel pain, according to this measure?”. Actually, you believe that probably once there is enough high-quality research, the research will conclude that all insects will satisfy all of the criteria. (Or all orders of insects sufficiently similar to the ones you studied.)
[Here, I mean “believe” in the Bayesian sense where if you had to bet, this is what you would bet on. Not in the sense of you being confident that all the research will come up this way. In particular, no offense meant by this :-) .]
- I don’t really understand the first part of this. But I guess I would say that the table itself doesn’t represent any particular quantifiable credence that insects feel pain. However, the summation of these lines of evidence can provide some traction for thinking about how seriously to take the idea of insect pain at all—even if it again doesn’t give us any particular credence level.
- Re: point 2, I strongly disagree that this is my belief. I believe that once there is enough high quality research for each criterion and order, we will be able to conclude whether or not insect orders satisfy or fail all criteria. There are a select few criteria where available evidence suggests that we might bet on more research coming up satisfactorily – e.g., criterion 3 in adults, where evidence for integrated nociception is distributed across the phylogeny and criteria 1 + 2 (the preconditions) are also robustly met, plus we know that both 1 + 2 are robustly conserved across all the insect orders. However, in most O x C cases we have very, very little data – or even no real data (criterion 8, particularly!) – that can lead us to make any specific or generalizable conclusions about the likelihood of any or all insects meeting that criterion.
This comment is representative only of MRB’s opinions and expertise, and not the other post/publication authors.