I’m not sure this works, since a constant-effects model is just a decay model with the decay parameter set to zero. So you’re really just setting hyperparameters on the distribution of the decay parameter, which is ordinary Bayesian modelling rather than model averaging.
Thanks for this point, I hadn’t thought clearly about how the models are nested. I think it means the BMA I describe is equivalent to a single model with a decay parameter (as you say), except that instead of a continuous prior on the decay parameter, the prior is a mixture with a point mass at zero (a spike-and-slab prior). I know this is one Bayesian method for penalizing model complexity, similar in spirit to lasso or ridge regression.
https://wesselb.github.io/assets/write-ups/Bruinsma,%20Spike%20and%20Slab%20Priors.pdf
So I now realize that what I proposed can be seen as putting an explicit penalty on the extra parameter the decay model needs, where the size of the point mass sets the strength of the penalty. The motivation would then be to avoid overfitting, which isn’t how I originally thought of it.
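To check the equivalence for myself, here is a rough numerical sketch. Everything in it is an assumption made up for illustration: the data, the Gaussian noise with known `sigma`, the exponential decay form `a * exp(-k * t)`, the Exponential(1) slab on the decay rate `k`, and the point mass `w`. It computes the posterior probability of the constant-effects model two ways: as BMA over two models, and as the posterior mass on the spike under a single mixture prior.

```python
import math

def log_lik(k, ts, ys, a=1.0, sigma=0.2):
    """Gaussian log-likelihood of effects ys at times ts for decay rate k.

    Assumed model (illustrative only): y_t ~ Normal(a * exp(-k * t), sigma).
    Setting k = 0 gives the constant-effects model.
    """
    return sum(
        -0.5 * ((y - a * math.exp(-k * t)) / sigma) ** 2
        - math.log(sigma * math.sqrt(2 * math.pi))
        for t, y in zip(ts, ys)
    )

# Hypothetical data: effect sizes measured at five time points.
ts = [0, 1, 2, 3, 4]
ys = [1.02, 0.95, 0.97, 0.90, 0.93]

w = 0.5  # prior point mass at k = 0 (equivalently, prior weight on the constant model)

# Slab: Exponential(1) prior on k, discretized on a midpoint grid and renormalized.
n, k_max = 2000, 10.0
dk = k_max / n
grid = [(i + 0.5) * dk for i in range(n)]
slab = [math.exp(-k) * dk for k in grid]
total = sum(slab)
slab = [p / total for p in slab]

# Route 1: Bayesian model averaging over two separate models.
m0 = math.exp(log_lik(0.0, ts, ys))                                     # marginal likelihood, constant model
m1 = sum(p * math.exp(log_lik(k, ts, ys)) for k, p in zip(grid, slab))  # marginal likelihood, decay model
bma_p_const = w * m0 / (w * m0 + (1 - w) * m1)

# Route 2: one decay model whose prior is a spike at zero plus the slab.
support = [0.0] + grid
prior = [w] + [(1 - w) * p for p in slab]
unnorm = [p * math.exp(log_lik(k, ts, ys)) for k, p in zip(support, prior)]
spike_post = unnorm[0] / sum(unnorm)

print(f"BMA posterior P(constant model): {bma_p_const:.6f}")
print(f"Posterior mass on the spike:     {spike_post:.6f}")
```

The two numbers agree up to floating-point error, which is just the nesting argument in numerical form. Raising `w` moves posterior mass toward the constant-effects model, which is the sense in which the point mass acts as a penalty on the extra parameter.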