Karthik Tadepalli comments on Deworming and decay: replicating GiveWell’s cost-effectiveness analysis

Karthik Tadepalli 28 Jul 2022 23:53 UTC
7 points
I’m unclear on whether this works, since a constant-effects model is just a decay model with the decay parameter set to zero. So you’re really just setting hyperparameters on the distribution of the decay parameter, which is ordinary Bayesian modelling rather than model averaging.
- coreyvernot 29 Jul 2022 3:54 UTC
  5 points
  Thanks for this point. I didn’t think clearly about how the models are nested. I think that means the Bayesian model averaging (BMA) I describe is the same as having one model with a decay parameter (as you say), except that instead of a continuous prior on the decay parameter, the prior is a mixture with a point mass at zero. I know this is one Bayesian method for penalizing model complexity, similar in spirit to lasso or ridge regression.
  https://wesselb.github.io/assets/write-ups/Bruinsma,%20Spike%20and%20Slab%20Priors.pdf
  So I now realize that what I proposed could just be seen as putting an explicit penalty on the extra parameter needed in the decay model, where the penalty is the size of the point mass. The motivation for that would be to avoid overfitting, which isn’t how I thought of it originally.
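
To make the equivalence discussed above concrete, here is a minimal numerical sketch. Everything in it is an illustrative assumption and not taken from the original analysis: the data are made up, the effect is assumed to follow `initial_effect * exp(-decay * t)` with known Gaussian noise, the slab is a uniform prior on a grid of positive decay values, and names such as `spike_mass` and `decay_grid` are hypothetical. It computes the posterior two ways, once by averaging a constant-effects model and a decay model, and once with a single decay model whose prior has a point mass at zero, and the two routes agree.

```python
import math

def gaussian_likelihood(decay, times, effects, initial_effect=1.0, noise_sd=0.2):
    """Likelihood of the data if the true effect at time t is
    initial_effect * exp(-decay * t), observed with Gaussian noise."""
    lik = 1.0
    for t, y in zip(times, effects):
        mean = initial_effect * math.exp(-decay * t)
        z = (y - mean) / noise_sd
        lik *= math.exp(-0.5 * z * z) / (noise_sd * math.sqrt(2 * math.pi))
    return lik

# Made-up observations, purely for illustration.
times = [1, 2, 3, 4]
effects = [0.95, 0.85, 0.90, 0.70]

spike_mass = 0.5                                  # prior weight on "no decay"
decay_grid = [0.01 * k for k in range(1, 101)]    # slab support: 0.01 .. 1.00
slab_prior = [1.0 / len(decay_grid)] * len(decay_grid)

# Route 1: Bayesian model averaging over two separate models.
evidence_constant = gaussian_likelihood(0.0, times, effects)
grid_liks = [gaussian_likelihood(d, times, effects) for d in decay_grid]
evidence_decay = sum(p * l for p, l in zip(slab_prior, grid_liks))

weighted_constant = spike_mass * evidence_constant
weighted_decay = (1 - spike_mass) * evidence_decay
post_constant = weighted_constant / (weighted_constant + weighted_decay)

mean_decay_in_decay_model = (
    sum(d * p * l for d, p, l in zip(decay_grid, slab_prior, grid_liks))
    / evidence_decay
)
# The constant-effects model contributes decay = 0 to the average.
bma_mean_decay = (1 - post_constant) * mean_decay_in_decay_model

# Route 2: one decay model with a spike-and-slab prior on the decay parameter.
support = [0.0] + decay_grid
prior = [spike_mass] + [(1 - spike_mass) * p for p in slab_prior]
unnormalized = [
    p * gaussian_likelihood(d, times, effects) for d, p in zip(support, prior)
]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]
spike_slab_mean_decay = sum(d * q for d, q in zip(support, posterior))

print(post_constant, posterior[0])
print(bma_mean_decay, spike_slab_mean_decay)
```

In this sketch, raising `spike_mass` pulls the posterior toward zero decay, which is the sense in which the size of the point mass acts as a penalty on the extra parameter.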