Some ~first impressions on the writeup and implementation here. I think you have recognized these issues to an extent, but I hope another impression is useful. I hope to dig in more.
(Note, I’m particularly interested in this because I’m thinking about how to prioritize research for Unjournal.org to commission for evaluation.)
I generally agree with this approach and it seems to be really going in the right direction. The calculations here seem great as a start, mostly following what I imagine is best practice, and they seem very well documented/explained!
But there are some obstacles to practical use, I think (semi-aside: if not considered carefully, it could also instill over-certainty in the results, ~that usual 'rationalist quantify too much'/'ludic fallacy' worry that I usually think is overblown, but still).
Particularly:
1. Where do you come up with all the inputs in a real case? Like, how would I know the median (mean?) impact of the research project is a $2 reduction in cost per DALY (from $50 to $48)?
2. The simulation seems to assume all uncertainties are uncorrelated. This could matter a lot (as you note; see the sketch after this list).
3. For a lot of parameters discussed in the example, there are pretty strong reasons to anticipate negative or positive correlations, e.g.,
a. Where the benefit of a 'new target intervention' is larger, there will generally be a greater probability of discovering it (larger effect sizes coming out of the research with less data, etc.)
b. In a world where there is more funding available (perhaps because more people are contributing to the funder, bringing a broader range of values and beliefs), I would tend to expect that a larger share could be diverted.
c. “Funder is swayed by the result … 80% of the time” … more plausibly, the funder will be swayed enough to shift a smaller amount of money, more of the time (i.e., the response is graded rather than all-or-nothing).
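To make the correlation point concrete, here is a minimal sketch of how correlated parameter draws could be introduced with a Gaussian copula. This is not the model's actual implementation; the parameter names, marginal distributions, and the 0.5 correlation are purely illustrative assumptions standing in for point (a) above.

```python
# Minimal sketch (not the authors' implementation) of correlated uncertainty
# in a Monte Carlo cost-effectiveness model, via a Gaussian copula.
# All parameter names, marginals, and the 0.5 correlation are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical correlation: a bigger improvement from the new intervention
# tends to go with a higher chance that the research detects it (point 3a).
corr = np.array([[1.0, 0.5],
                 [0.5, 1.0]])

# Draw correlated standard normals, map to uniforms via the normal CDF.
z = rng.multivariate_normal(mean=np.zeros(2), cov=corr, size=n)
u = stats.norm.cdf(z)

# Push the uniforms through each parameter's marginal distribution.
improvement_per_dollar = stats.lognorm(s=0.5, scale=0.02).ppf(u[:, 0])  # DALYs/$ gained
prob_detected = stats.beta(a=8, b=2).ppf(u[:, 1])                       # chance research finds it

# Expected payoff per dollar under correlated vs. independent draws.
correlated_ev = np.mean(improvement_per_dollar * prob_detected)
independent_ev = np.mean(improvement_per_dollar) * np.mean(prob_detected)
print(correlated_ev, independent_ev)  # the gap is the cost of assuming independence
```

With a positive correlation, the mean of the product exceeds the product of the means, so treating the draws as independent would understate the expected payoff here; a negative correlation (as in point c) would cut the other way.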
Thanks for your impressions. I think your concerns largely align with ours. The model should definitely be interpreted with caution, not just because of the correlations it leaves out, but also because of the uncertainty in the inputs. For the things the model leaves out, you have to adjust its verdicts. I think it is still very useful because it gives us a better baseline to update from.
As for where we get inputs from, Marcus might have more to say. However, I can speak to the history of the app. Previously, we were using a standard percentage improvement, e.g. a 10% increase in DALYs averted per $. Switching to allowing users to choose a specific target effectiveness number gave us more flexibility. I am not sure what made us think that the percentages we had previously set were reasonable, but I suspect it came from experience with similar projects.
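For concreteness, here is a small sketch of the two parameterizations described above, with purely hypothetical numbers (illustrative arithmetic only, not the app's code):

```python
# Illustrative arithmetic (hypothetical numbers, not the app's code):
# the two ways of specifying the research payoff described above.
baseline_cost_per_daly = 50.0            # $ per DALY averted before the research

# Old parameterization: a standard percentage improvement,
# e.g. a 10% increase in DALYs averted per $.
pct_improvement = 0.10
dalys_per_dollar = (1 / baseline_cost_per_daly) * (1 + pct_improvement)
cost_after_pct_improvement = 1 / dalys_per_dollar      # ~= 45.45 $/DALY

# New parameterization: the user supplies a target effectiveness directly,
# e.g. the $50 -> $48 per DALY shift mentioned in the comment above.
target_cost_per_daly = 48.0

print(round(cost_after_pct_improvement, 2), target_cost_per_daly)
```

The target-number form lets users express small or asymmetric shifts (like $50 to $48) that do not map neatly onto a round percentage improvement.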