It’s a potential solution, but I think it requires the prior to decrease quickly enough with increasing cost-effectiveness, and this isn’t guaranteed. So I’m wondering whether there is any analysis showing that the methods being used are actually robust to this problem, e.g. exploring how answers would look if the deworming RCT results had been higher or lower and whether they change sensibly.
A document that appears to give more information on the method used for deworming is here, so perhaps that can be built on. From a quick look, though, it doesn’t seem to say exactly what shape is being used for the priors in all cases, although they look quite Gaussian from the plots.
I agree. Reflecting on it, in the everything-is-Gaussian case a prior doesn’t help much. Here, your posterior mean is a weighted average of the prior mean and the likelihood mean, with the weights depending only on the variances of the two distributions. So if the likelihood mean increases while its variance stays constant, your posterior mean increases linearly. You’d probably need a bias term or something similar in your model (if you’re doing this formally).
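To make the linearity concrete, here is a minimal sketch in Python with made-up numbers (the prior mean, prior SD and standard error below are illustrative choices of mine, not GiveWell’s figures):

```python
# Minimal sketch of the everything-is-Gaussian update, with invented numbers.
# With a Normal(prior_mean, prior_sd**2) prior and a normal likelihood centred
# on the study estimate with standard error se, the posterior mean is the
# precision-weighted average of the two, so it grows linearly with the estimate.

def gaussian_posterior_mean(prior_mean, prior_sd, estimate, se):
    w_prior = 1 / prior_sd**2   # precision of the prior
    w_data = 1 / se**2          # precision of the study estimate
    return (w_prior * prior_mean + w_data * estimate) / (w_prior + w_data)

prior_mean, prior_sd, se = 1.0, 2.0, 3.0
for estimate in [5, 10, 20, 40]:   # an increasingly implausible study result
    print(estimate, round(gaussian_posterior_mean(prior_mean, prior_sd, estimate, se), 2))
# Prints 2.23, 3.77, 6.85, 13.0: the posterior mean keeps growing in proportion
# to the estimate; the prior shrinks it towards 1 but never caps it.
```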
This might actually be an argument in favour of GiveWell’s current approach, assuming they’d discount more as the study estimate becomes increasingly implausible.
exploring how answers would look if the deworming RCT results had been higher or lower and whether they change sensibly
Do you just mean that the change in the posterior expectation is in the correct direction? In that case, we know the answer from theory: yes, for any prior and a wide range of likelihood functions.
Andrews et al. 1972 (Lemma 1) shows that if the signal B is normally distributed with mean T, then, for any prior distribution over T, E[T|B=b] is increasing in b.
This was generalised by Ma 1999 (Corollary 1.3) to any likelihood function arising from a B that (i) has T as a location parameter, and (ii) is strongly unimodally distributed.
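As a quick sanity check of that result, here is a small numerical sketch in Python (the mixture prior, sigma and grid below are arbitrary choices of mine, not anything from either paper):

```python
import numpy as np
from scipy import stats

# A numerical illustration (not a proof) of the monotonicity result: with a
# normal likelihood B | T ~ N(T, sigma^2) and an arbitrary prior over T -- here
# a deliberately awkward two-component mixture -- E[T | B = b] increases in b.

t = np.linspace(-20, 80, 4001)          # uniform grid over the parameter T
prior = 0.7 * stats.norm.pdf(t, 0, 1) + 0.3 * stats.norm.pdf(t, 10, 2)
prior /= prior.sum()                    # normalise; the uniform spacing cancels

def posterior_mean(b, sigma=3.0):
    likelihood = stats.norm.pdf(b, loc=t, scale=sigma)  # density of B = b for each T
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return (t * posterior).sum()

for b in [0, 5, 10, 20, 40]:
    print(b, round(posterior_mean(b), 3))
# The printed posterior means are strictly increasing in b, as the lemma predicts.
```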
I guess it depends on what the “correct direction” is thought to be. From the reasoning quoted in my first post, it could be the case that as the study result becomes larger, the posterior expectation should actually decrease. It’s not inconceivable that, as we saw the estimate go to infinity, we should start reasoning that the study is so ridiculous as to be uninformative, so that the posterior update becomes smaller. But I don’t know. What you say seems to suggest that Bayesian reasoning could only do that for rather specific choices of likelihood functions, which is interesting.
A lognormal prior (and a normal likelihood function) might be a good starting point when adjusting for the statistical uncertainty in an effect size estimate. The resulting posterior cannot be calculated in closed form, but I have a simple website that calculates it using numerical methods. Here’s an example.
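For anyone who wants to reproduce that kind of calculation, here is a rough grid-based sketch in Python. The prior parameters, point estimate and standard error are invented for illustration, and this is not the code behind the site mentioned above:

```python
import numpy as np
from scipy import stats

# Sketch of a lognormal-prior / normal-likelihood update on a grid (the
# posterior has no closed form). All numbers are invented for illustration.

theta = np.linspace(1e-3, 30, 5000)            # uniform grid over the true effect
prior = stats.lognorm.pdf(theta, s=0.75, scale=np.exp(0.5))   # lognormal prior
prior /= prior.sum()                           # normalise on the grid

estimate, se = 6.0, 2.0                        # study point estimate and its SE
likelihood = stats.norm.pdf(estimate, loc=theta, scale=se)

posterior = prior * likelihood
posterior /= posterior.sum()

print("prior mean:    ", round((theta * prior).sum(), 2))
print("posterior mean:", round((theta * posterior).sum(), 2))
```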
Worth noting that adjusting for the statistical uncertainty in an effect size estimate is quite different from adjusting for the totality of our uncertainty in a cost-effectiveness estimate. For doing the latter, it’s unclear to me what likelihood function would be appropriate. I’d love to know if there are practical methods for choosing the likelihood function in these cases.
GiveWell does seem to be using mostly normal priors in the document you linked. I don’t have time to read the whole document and think carefully about what prior would be most appropriate. For its length (3,600 words including footnotes) the document doesn’t appear to give much reasoning for the choices of distribution families.