On your first point, instead of using a single prior distribution I could use a weighted combination of multiple distributions. There are two ways to do this: either make the prior itself a combination distribution, or compute multiple posteriors with different priors and take their weighted average. Not sure which one correctly handles this uncertainty.
Not sure what you mean by a ‘combination distribution’, but I think something like Carl’s suggestion is correct: have a hierarchical model where the type of distribution over effectiveness is itself a random variable, which the distribution over effectiveness takes as a ‘hyperparameter’. You could also add a level to the hierarchy by putting a distribution over the probabilities of each type of distribution. That being said, it might be convenient to fix these probabilities, since it’s difficult to put all the evidence you have access to into the model. Probabilistic programming languages are a convenient way to handle such hierarchical models. If you’re interested, I recommend checking out this tutorial for an introduction focussing on applications in psychology.
Not sure what you mean by a ‘combination distribution’
I mean that your prior probability density is given by $P(X) = w_{Pareto} P_{Pareto}(X) + w_{lognorm} P_{lognorm}(X)$ for weights $w$. (You can read LaTeX, right?)
Sure. I think a better thing to do (which I think is what Carl is suggesting) is to have a joint prior over $x$ (the effectiveness of a randomly chosen intervention) and interventionDistribution (a categorical variable over the different shapes you think the space of interventions might have). So $P(x, \text{Pareto}) = P(\text{Pareto}) P(x \mid \text{Pareto}) = w_{Pareto} P_{Pareto}(x)$ and $P(x, \text{logNormal}) = P(\text{logNormal}) P(x \mid \text{logNormal}) = w_{logNormal} P_{logNormal}(x)$. Then, for the first intervention you see, your prior density over effectiveness is indeed $P(x) = w_{Pareto} P_{Pareto}(x) + w_{logNormal} P_{logNormal}(x)$, but after measuring a bunch of interventions, you can update your beliefs about which shape the empirical distribution of effectiveness has.
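To make the updating step concrete, here is a rough sketch in plain Python. Everything specific in it is an assumption for illustration: the 50/50 prior weights, the fixed shape parameters (`alpha`, `x_min`, `mu`, `sigma`), and the function names are all hypothetical, and a fuller hierarchical model would put priors on those parameters rather than fixing them. It treats the measurements as exact, independent draws from whichever shape is the true one, and applies Bayes' rule to the weights.

```python
import math

# Hypothetical prior weights over the shape of the intervention distribution.
PRIOR_WEIGHTS = {"Pareto": 0.5, "logNormal": 0.5}


def log_pareto(x, alpha=1.5, x_min=1.0):
    """Log-density of a Pareto distribution (parameters are assumed, not fitted)."""
    if x < x_min:
        return -math.inf
    return math.log(alpha) + alpha * math.log(x_min) - (alpha + 1) * math.log(x)


def log_lognormal(x, mu=1.0, sigma=1.0):
    """Log-density of a log-normal distribution (parameters are assumed, not fitted)."""
    if x <= 0:
        return -math.inf
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


LOG_DENSITIES = {"Pareto": log_pareto, "logNormal": log_lognormal}


def mixture_density(x, weights):
    """P(x) = sum over shapes of w_shape * P_shape(x)."""
    return sum(w * math.exp(LOG_DENSITIES[k](x)) for k, w in weights.items())


def posterior_weights(observations, prior=PRIOR_WEIGHTS):
    """Updated weights: proportional to prior weight times likelihood of the data."""
    log_joint = {
        k: math.log(w) + sum(LOG_DENSITIES[k](x) for x in observations)
        for k, w in prior.items()
    }
    top = max(log_joint.values())  # subtract the max for numerical stability
    unnorm = {k: math.exp(v - top) for k, v in log_joint.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}


if __name__ == "__main__":
    measured = [2.0, 3.0, 50.0]  # made-up effectiveness measurements
    updated = posterior_weights(measured)
    print(updated)
    # Density for the next intervention uses the updated weights.
    print(mixture_density(10.0, updated))
```

With no observations the updated weights equal the prior weights, so the density for the first intervention is exactly the weighted sum above. After some measurements the weights shift toward whichever shape explains the data better, which is what a fixed weighted-sum prior on its own does not give you.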