2) We need to know the mean of the distribution to do the standardization. After all, if an intervention was estimated to be below the mean, we should expect it to regress upwards.
This is trickier in an EA context, as the groups that do evaluation focus their efforts on what appear to be the most promising things, so there isn’t a clear handle on the ‘mean global health intervention’, which may be our distribution of interest. To some extent, though, this problem solves itself if the underlying distributions of interest are log-normal or similarly fat-tailed and you are confident your estimate lies far from the mean (whatever it is): log(X - something small) is approximately log(X).
Sadly I don’t think a log-normal distribution solves this problem for you, because to apply your model I think you are working entirely in the log domain, so you are taking log(X) - log(something small) rather than log(X - something small). The choice of the small thing can then have quite an effect on the answer.
For example, when you regressed the estimate of cost-effectiveness of malaria nets, you had an implicit mean cost-effectiveness of 1 DALY/$100,000. If you’d assumed 1 DALY/$10,000 instead, you’d have regressed to $77/DALY instead of $97/DALY.
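To make the sensitivity concrete, here is a rough sketch of regressing towards an assumed mean in the log domain. The function name, the unregressed estimate of $45/DALY and the shrinkage factor of 0.9 are my own assumptions, picked only because they give figures close to the $97 and $77 above; they are not taken from your post.

```python
import math


def regress_in_log_domain(estimate, assumed_mean, r):
    """Shrink log(estimate) towards log(assumed_mean) by factor r.

    All quantities are in $/DALY. r = 1 means no regression at all;
    r = 0 means regressing fully to the assumed mean.
    """
    log_mean = math.log(assumed_mean)
    log_regressed = log_mean + r * (math.log(estimate) - log_mean)
    return math.exp(log_regressed)


# Hypothetical inputs (assumptions, not figures from the original post).
ESTIMATE = 45.0   # unregressed estimate, $/DALY
SHRINKAGE = 0.9   # correlation-style shrinkage factor

for assumed_mean in (100_000.0, 10_000.0):
    regressed = regress_in_log_domain(ESTIMATE, assumed_mean, SHRINKAGE)
    print(f"assumed mean ${assumed_mean:,.0f}/DALY -> regressed ${regressed:.0f}/DALY")
```

Under these assumptions, a tenfold change in the assumed mean shifts the regressed log estimate by (1 - r) * log(10), whatever the estimate itself is, which is why the assumed mean does not wash out.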