If I understand correctly, all the variables are simulated freshly for each model. In particular, this applies even to underlying assumptions that are logically shared or correlated between models (say, sentience probabilities or x-risks).
I think this may cause some issues when comparing different causes. At the very least, it seems likely to understate how certain we can be that one intervention is better than another. I think it may also affect the ordering, particularly if we take some kind of risk aversion or other non-linear utility into account.
To be clear, this would be a problem in any uncertainty-based CEA model, and the situation in non-randomized models is, of course, usually much worse. The effect may also be very minor; I'm not sure.
With “variables are simulated freshly for each model”, do you mean that certain probability distributions are re-sampled when performing cause comparisons?
Yeah. If I understand correctly, everything is resampled rather than cached, so results from two models can only be compared in aggregate rather than on a sample-by-sample basis.
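To make the worry concrete, here is a minimal sketch (hypothetical numbers and value functions, not the model's actual code): two interventions whose value both scale with the same uncertain sentience probability. Drawing that probability freshly for each model inflates the variance of the estimated difference, which understates how confidently one intervention beats the other; sharing the draw lets the common uncertainty cancel.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical value models: both interventions scale with the same
# uncertain sentience probability, plus independent noise.
def value_a(p_sentience, noise):
    return 10.0 * p_sentience + noise

def value_b(p_sentience, noise):
    return 8.0 * p_sentience + noise

# Fresh draws per model: each model gets its own copy of the shared input.
p1 = rng.beta(2, 5, n)
p2 = rng.beta(2, 5, n)
a_indep = value_a(p1, rng.normal(0, 1, n))
b_indep = value_b(p2, rng.normal(0, 1, n))

# Shared draws: a single sample of the shared input feeds both models.
p = rng.beta(2, 5, n)
a_shared = value_a(p, rng.normal(0, 1, n))
b_shared = value_b(p, rng.normal(0, 1, n))

print("sd(A - B), fresh draws: ", np.std(a_indep - b_indep))    # inflated
print("sd(A - B), shared draws:", np.std(a_shared - b_shared))  # common input cancels
print("P(A > B), fresh draws:  ", np.mean(a_indep > b_indep))
print("P(A > B), shared draws: ", np.mean(a_shared > b_shared))
```

Note that in this toy setup the marginal distributions of A and B are identical under both schemes; only the joint behavior, and hence the comparison between them, differs.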
We used to have a caching layer meant to fix this; in any case, the objective is for there not to be too much inter-run variability.
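For what it's worth, cached (paired) draws are also what make sample-by-sample comparison possible. Continuing the hypothetical sketch above:

```python
# With shared/cached draws, the per-sample difference is meaningful,
# so tail quantities fall out directly rather than only aggregates:
diff = a_shared - b_shared
print("P(A beats B):", np.mean(diff > 0))
print("90% interval on the difference:", np.percentile(diff, [5, 95]))

# With independently resampled inputs, a_indep[i] and b_indep[i] share
# no underlying assumptions, so pairing them is not meaningful; only
# aggregate summaries (means, marginal quantiles) can be compared.
```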