Thanks a lot for this! Like willbradshaw I agree that this post is “well-written, thoughtful, well-linked and thorough!”
What are some objections to anything I’ve written here?
If I were to nitpick, I think my biggest objection is that your approach to tackling the problem of NPIs for pandemic preparedness and response appears extremely atheoretical. I think this is fine for a scoping study that tries to estimate the scale of the problem, and fine (perhaps even highly underrated!) for clinical studies. But I think we can get decent results at lower cost with a bit of simple theory.
I believe this because the human body in general, and the immune system in particular, is woefully complicated, so it makes sense that we cannot place much faith in biologically plausible mechanisms for treatments, which forces us to place correspondingly greater faith in end-to-end RCTs (and be in a state of radical cluelessness otherwise). But there are other parts of epidemiology that are simpler and better understood, such that for transmission we can be reasonably confident in our ability to dice up the problem and isolate the specific confusing subcomponents.
For example, suppose we are worried about a potential respiratory disease pandemic, and we want to figure out whether intervention X (say, installing MERV filters in offices) has a sufficiently large impact on an (un)desired endpoint (e.g. symptomatic disease, hospitalizations). One approach might just be:
Sounds plausible, but we can’t know much with confidence in the absence of end-to-end empirical results. What we can do is run an RCT where we install MERV filters in the treatment group and don’t install them in the control group, with a sample size large enough to power for differences big enough for us to care about, and compare results after the study’s natural endpoint.
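To put a rough number on what “sufficiently large” means here, a standard two-proportion power calculation can be sketched. The attack rates below are hypothetical placeholders, and this simple version ignores the design effect that cluster randomisation (randomising whole offices rather than individuals) would add:

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(p_control: float, p_treat: float,
              alpha: float = 0.05, power: float = 0.8) -> int:
    """Sample size per arm to detect p_control vs p_treat,
    using the standard two-proportion normal approximation."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)
    var = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_a + z_b) ** 2 * var / (p_control - p_treat) ** 2)

# Hypothetical: detect a drop in attack rate from 10% to 7%
print(n_per_arm(0.10, 0.07))  # → 1353 people per arm
```

Shrinking the detectable effect quickly inflates the required sample, which is the cost/time problem discussed below.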
I think this is good, but potentially quite expensive/time-consuming (which is really bad in a fast-moving pandemic!). One way we can potentially do better:
Well, disease transmission isn’t magic, and we’re reasonably confident in the very high-level theory of respiratory diseases. So we can at least decompose the problem into two parts:
Treat human bodies as a black box function that takes in some combination of scary microbe-laden particles and outputs some probability of undesired endpoints.
Model the world as something that sends scary microbe-laden particles and figure out which interventions reduce such particles to a level that the modeled function in (1) should consider too low to notice.
My decomposition isn’t particularly interesting, but I think it’s reasonably clean. With it, we can
Tackle 1) with human challenge trials where microbe dose/frequency/timing is varied, to understand the plausible range of doses needed to cause infection or disease.
Tackle 2) with some combination of
computational fluid dynamics simulations
lab experiments on how much people breathe each other’s air, and how fast air needs to cycle to reduce that.
field experiments on the effect of MERV filters on closely analogous particles (at the physical level)
prior knowledge of the transmission patterns of other similar diseases
???
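As a toy illustration of how the two parts could be glued together, one could pair an exponential dose-response curve for (1) with a well-mixed-room model for (2). Every parameter below (per-particle infectivity, emission rate, room volume, filter clean-air delivery rate) is a made-up placeholder that the challenge trials and lab/field experiments above would actually pin down:

```python
from math import exp

def p_infect(dose: float, k: float = 0.01) -> float:
    """Part (1): the human body as a black box. Exponential
    dose-response model; k is a hypothetical per-particle
    infection probability a challenge trial would estimate."""
    return 1 - exp(-k * dose)

def inhaled_dose(emission: float, volume: float, ach: float,
                 cadr: float, breathing_rate: float, hours: float) -> float:
    """Part (2): a well-mixed room at steady state. The filter's
    clean-air delivery rate (cadr, m^3/h) simply adds to the
    ventilation air-change rate (ach, per hour)."""
    removal = ach + cadr / volume          # total air changes per hour
    conc = emission / (removal * volume)   # particles per m^3
    return conc * breathing_rate * hours   # particles inhaled

# Hypothetical office: 300 m^3, 1 ACH baseline, filter adding 600 m^3/h CADR,
# one infector emitting 5000 particles/h, occupants breathing 0.5 m^3/h for 8 h.
base = inhaled_dose(5000, volume=300, ach=1.0, cadr=0, breathing_rate=0.5, hours=8)
filt = inhaled_dose(5000, volume=300, ach=1.0, cadr=600, breathing_rate=0.5, hours=8)
print(p_infect(base), p_infect(filt))
```

Tripling the effective air-change rate cuts the inhaled dose threefold, and the dose-response curve then translates that into the endpoint we actually care about; the point is that each piece is separately measurable.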
Now my decomposition is still quite high level, and I’m not sure that my suggested instrumentalizations here aren’t dumb. But hopefully what I’m gesturing at makes sense?
Thanks a lot for the comment. I do think that what you’re gesturing at makes sense: if I understand correctly, you are saying that certain physical interventions can have more predictable effects than ‘biological’ ones because we have a decent idea of exactly how they work. In some cases this is definitely true: as an extreme example, we don’t need RCTs of aeroplane safety because we have a very good understanding of the physical processes and are able to model them well. If we have an airborne pathogen, it’s hardly necessary to run an RCT to see whether or not there is an effect of a stay-at-home order: there will be one.
In many of the example questions I gave though, I think the fact that there is a large behavioural component pushes us closer to the situation we have with drugs than to the aeroplane. For example, although it could be demonstrated in a laboratory which of mask or shield is actually more effective at blocking exhaled particles, it would be harder to capture the different effects that each has on how often you touch your face, how often it is removed, or other aspects of compliance. These will differ a lot between people, so you’d need to test it on a large group, and the social setting might influence behaviour. I don’t think that we can decompose the often important behavioural component of these interventions in the same way that we can the physical components.
That said, the air filtration question I posed might not have been well chosen. As you point out, it seems reasonable that we can get a good understanding of whether that is likely to be helpful by applying what we know about the filters and viral transmission. Of the questions I posed, RCTs are likely to be the least useful there and may not be useful at all.
However, I do have some thoughts on why an RCT could still be worthwhile. I’m not saying these because I disagree with your points; I’m just providing some possible counterarguments.
Learning: by introducing the filters outside of an RCT, you are basically doing an experiment but losing the opportunity to learn from it. Even if it has been decided that filters should be introduced in all schools/offices (or whatever unit), it won’t normally be possible to install all of them in parallel, so there is a period when some offices have the filter and some don’t. As long as you can randomise the order, you can take advantage of the differences in implementation time in something like a stepped-wedge cluster randomised trial. The effect could be analysed on an ongoing basis in a Bayesian analysis, such that large effects would be detected early in the experiment and installation of the remaining filters could be accelerated. If you are doing something like this across several interventions, it would help with deciding which to prioritise.
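A minimal sketch of that interim-monitoring idea, using beta-binomial updating on infection counts from the already-fitted vs not-yet-fitted units (all counts and the decision threshold below are hypothetical, and a real stepped-wedge analysis would also adjust for time and cluster effects):

```python
import random

def prob_treatment_better(inf_t: int, n_t: int, inf_c: int, n_c: int,
                          draws: int = 20_000, seed: int = 0) -> float:
    """Posterior P(infection rate with filters < rate without),
    under independent Beta(1, 1) priors, estimated by Monte Carlo."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_t = rng.betavariate(1 + inf_t, 1 + n_t - inf_t)
        rate_c = rng.betavariate(1 + inf_c, 1 + n_c - inf_c)
        wins += rate_t < rate_c
    return wins / draws

# Interim look part-way through the rollout (hypothetical counts):
# 12/400 infections in offices already fitted with filters, 30/400 without.
p = prob_treatment_better(12, 400, 30, 400)
if p > 0.95:  # pre-specified threshold
    print(f"strong signal (P={p:.3f}): accelerate the remaining installs")
```

Each interim look re-runs the same posterior calculation on the accumulated counts, so a large effect triggers early acceleration while a null result lets the staged rollout proceed as planned.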
Cost-benefit: There are ~ 137,000 schools in the US. I don’t know how much it costs to install and maintain filtration systems, but I imagine it is not negligible. There are a lot more schools globally. Doing an RCT comparing e.g. air filtration to opening the windows could save quite a bit of money if it turns out that filtration systems don’t provide additional benefit.
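A back-of-envelope value-of-information calculation makes the point; every figure below except the school count from the paragraph above is a made-up placeholder:

```python
# All costs and probabilities are hypothetical placeholders.
schools = 137_000            # ~US schools, from the paragraph above
cost_per_school = 5_000      # hypothetical install + maintenance, USD
p_filters_redundant = 0.3    # hypothetical prior that open windows do as well
trial_cost = 20_000_000      # hypothetical RCT budget, USD

# Expected spending avoided if the trial shows filters add nothing:
expected_savings = schools * cost_per_school * p_filters_redundant
print(expected_savings > trial_cost)  # → True: trial pays for itself in expectation
```

Even under fairly unfavourable assumptions the US figure alone dwarfs a plausible trial budget, before counting schools globally.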
Implementation and interaction with behaviour: even assuming that they do work, do people use them? Maybe the filtration is noisy, so teachers turn it off; maybe they simply forget to turn it on. In medicine, even with drugs that demonstrably improve the patient’s condition, adherence is (to me) surprisingly low. Perhaps the large rooms where people tend to congregate most cannot be adequately filtered; maybe the filtration system gives people a sense of security, so they congregate more.
Overall, I think the areas where trials would be most useful are those where we can expect relatively modest effects and where there is a larger behavioural component. The combination of several modest effects, if better understood, might be quite important.
There’s an additional factor: marketing and public persuasion. It is one thing to say “based on a theoretical model, air filters work”, and a totally different thing to say “we saw that air filters cut transmission by X%”. My hope would be that the certainty and the effect estimate could serve to overcome the collective inaction we saw in the pandemic (many people agreed that e.g. air filters would probably help, but almost nobody installed them in schools).
Good point. This is similar to what I was trying to get at when talking about lack of willingness to engage in probabilistic reasoning.