> In general I think it’s not crazy to guess that the standard error of your measurement is proportional to the size of the effect you’re trying to measure.
Take a hierarchical model for effects. Each intervention $i$ has a true effect $\beta_i$, and all the $\beta_i$ are drawn from a common distribution $G$. Now for each intervention, we run an RCT and estimate $\hat\beta_i = \beta_i + \epsilon_i$, where $\epsilon_i$ is experimental noise.
By the CLT, $\epsilon_i \sim N(0, \sigma_i^2 / n_i)$ approximately, where $\sigma_i^2$ is the inherent sampling variance in your environment and $n_i$ is the sample size of your RCT. What you’re saying is that $\sigma_i^2$ has the same order of magnitude as the variance of $G$. But even if that’s true, the standard error shrinks like $1/\sqrt{n_i}$ as your RCT sample size grows, so the two should not be in the same OOM for reasonable values of $n_i$. I would have to do some simulations to confirm that, though.
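Here is a quick sketch of that simulation, with hypothetical numbers: the per-observation noise SD $\sigma$ is deliberately set equal to the SD of $G$ (the "same OOM" scenario above), and we check how the standard error compares to the spread of true effects as $n_i$ grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: true effects beta_i ~ G = N(1, tau^2), with the
# per-observation noise SD sigma set equal to tau (same OOM, worst case).
tau, sigma, n_interventions = 1.0, 1.0, 2000

beta = rng.normal(1.0, tau, size=n_interventions)  # true effects drawn from G

for n in [25, 100, 400, 1600]:
    # Each RCT averages n noisy observations, so beta_hat = beta + noise
    # with noise SD sigma / sqrt(n).
    se = sigma / np.sqrt(n)
    beta_hat = beta + rng.normal(0.0, se, size=n_interventions)
    print(f"n={n:5d}  SE={se:.3f}  SD(G)={tau:.1f}  SE/SD(G)={se / tau:.3f}")
```

Even when $\sigma$ and the SD of $G$ start in the same OOM, the standard error is an OOM smaller once the RCT sample size is in the hundreds.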
I also don’t think it’s likely to be true that $\sigma_i^2$ has the same OOM as the variance of $G$. The factors that cause sampling variance—randomness in how people respond to the intervention, randomness in who gets selected for a trial, etc—seem roughly comparable across interventions. But the intervention qualities are not roughly comparable—we know that the best interventions are OOMs better than the average intervention. I don’t think we have any reason to believe that the noisiest interventions are OOMs noisier than the average intervention.
> (I think that for something as clean as a well-set-up experiment with independent trials of a representative sample of the real world, you can estimate the standard error well, but I think the real world is sufficiently messy that this is rarely the case.)
I’m not sure what you mean by this; I think any collection of RCTs satisfies the setting I’ve laid out.
> I think you’re assuming your conclusion here:
> What if the noise is on the log scale?
The central limit theorem says exactly that $\sqrt{n_i}\,(\hat\beta_i - \beta_i) \to N(0, \sigma_i^2)$ in distribution, which implies what I said. The noise is not on the log scale, because the CLT puts it on the raw scale.
Now, if you transform your coefficient onto a log scale, then all bets are off. But that is not what is happening in this post, and it’s not really what happens in reality either; I don’t know why anyone would do it.
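A minimal sketch of the CLT claim, with hypothetical exponential outcomes (heavily skewed, so the raw data are far from normal): even then, $\sqrt{n_i}\,(\hat\beta_i - \beta_i)$ is approximately normal on the raw scale, with no log transform involved.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200        # sample size per simulated RCT
reps = 10_000  # number of simulated RCTs

# Outcomes are exponential with mean 1, so the true effect beta_i = 1
# and the per-observation SD sigma_i = 1, even though the data are skewed.
samples = rng.exponential(scale=1.0, size=(reps, n))
beta_hat = samples.mean(axis=1)       # one estimate per simulated RCT
z = np.sqrt(n) * (beta_hat - 1.0)     # sqrt(n) * (beta_hat - beta)

# Despite the skew in the raw outcomes, z is close to N(0, 1).
print("mean of z ≈", round(z.mean(), 2))
print("SD of z   ≈", round(z.std(), 2))
```

The point of the exponential choice is just that nothing about the raw outcomes is normal; the approximate normality of the estimator comes entirely from averaging.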