Did you look at the overall recap and see the takeaways there? e.g. sensitivity analysis indicates that some inputs are substantially more influential than others, and there are some plausible values of inputs which would reorder the ranking of top charities.
These are meta-conclusions though, and I'm guessing you're hoping for more direct conclusions. That's hard to do. As I mention in several places, the analysis depends on the uncertainty you feed into it. To maintain "neutrality", I just pretended to be equally uncertain about each input. Given this, any simple conclusion like "The AMF cost-effectiveness estimates have the most uncertainty." or "The relative cost-effectiveness is most sensitive to the discount rate." would be misleading at best.
The only way to get simple conclusions like that is to feed input parameters you actually believe in to the linked Jupyter notebook. Or I could put in my best guesses as to inputs and draw simple conclusions from that. But then you’d be learning about me as much as you’d be learning about the world as you see it.
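To make "feed input parameters you actually believe in" concrete, here's a toy sketch of what the notebook's Monte Carlo comparison does. Everything here is illustrative: the distributions and numbers are hypothetical stand-ins, not the actual model's inputs, and the two "charities" are caricatures.

```python
# Toy sketch (hypothetical numbers, NOT the actual GiveWell models):
# replace point estimates with distributions encoding *your* uncertainty,
# then estimate how often the ranking of the two charities flips.
import random

random.seed(0)

def p_ranking_flips(n=10_000):
    flips = 0
    for _ in range(n):
        # Hypothetical AMF-style inputs -- substitute beliefs you hold.
        cost_per_net = random.lognormvariate(1.6, 0.3)            # ~ $5, uncertain
        deaths_averted_per_1000_nets = random.gauss(2.5, 0.8)
        value_per_death_averted = random.gauss(100, 20)
        amf_value_per_dollar = (
            max(deaths_averted_per_1000_nets, 0) / 1000
            * value_per_death_averted / cost_per_net
        )

        # Hypothetical GiveDirectly-style inputs.
        transfer_share = random.betavariate(80, 20)               # share reaching recipients
        value_per_dollar_transferred = random.gauss(0.04, 0.015)
        gd_value_per_dollar = transfer_share * value_per_dollar_transferred

        if gd_value_per_dollar > amf_value_per_dollar:
            flips += 1
    return flips / n

p_reorder = p_ranking_flips()
print(f"P(ranking flips under these inputs): {p_reorder:.1%}")
```

The point is that the probability you get out is entirely a function of the distributions you put in, which is exactly why "neutral" inputs can't yield simple unconditional conclusions.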
Does that all make sense? Is there another kind of takeaway that you’re imagining?
Despite your reservations, I think it would actually be very useful for you to input your best-guess inputs (and it's likely to be more useful for you to do it than an average EA, given you've thought about this more). My thinking is this. I'm not sure I entirely followed the argument, but I took it that the thrust of what you're saying is "we should do uncertainty analysis (use Monte Carlo simulations instead of point estimates) as our cost-effectiveness might be sensitive to it". But you haven't shown that GiveWell's estimates are sensitive to a reliance on point estimates (have you?), so you haven't (yet) demonstrated it's worth doing the uncertainty analysis you propose after all. :)
More generally, if someone says "here's a new, really complicated methodology we *could* use", I think it's incumbent on them to show that we *should* use it, given the extra effort involved.
> the thrust of what you're saying is "we should do uncertainty analysis (use Monte Carlo simulations instead of point estimates) as our cost-effectiveness might be sensitive to it"
Thanks for your thoughts. Yup, this is the thrust of it.
> But you haven't shown that GiveWell's estimates are sensitive to a reliance on point estimates (have you?)
I think I have—conditionally. The uncertainty analysis shows that, if you think the neutral uncertainty I use as input is an acceptable approximation, substantially different rankings are within the bounds of plausibility. If I put in my own best estimates, the conclusion would still be conditional. It’s just that instead of being conditional upon “if you think the neutral uncertainty I use as input is an acceptable approximation” it’s conditional upon “if you think my best estimates of the uncertainty are an acceptable approximation”.
So the summary point there is that there’s really no way to escape conditional conclusions within a subjective Bayesian framework. Conclusions will always be of the form “Conclusion C is true if you accept prior beliefs B”. This makes generic, public communication hard (as we’re seeing!), but offers lots of benefits too (which I tried to demonstrate in the post—e.g. an explicit quantification of uncertainty, a sense of which inputs are most influential).
> here's a new, really complicated methodology we *could* use
If I've given the impression that it's really complicated, I think I might have misled you. One of the things I really like about the approach is that you pay a relatively modest fixed cost and then you get this kind of analysis "for free". By which I mean the complexity doesn't infect all your actual modeling code. For example, the GiveDirectly model here actually reads more clearly to me than the corresponding spreadsheet because I'm not constantly jumping around trying to figure out what a cell reference (e.g. B23) means in formulas.
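A minimal sketch of that "fixed cost" point, with illustrative names and numbers (not the notebook's actual API or GiveWell's figures): the model is an ordinary function with named inputs, and the Monte Carlo machinery wraps around it without the function itself changing.

```python
# Sketch: the model reads like prose (named inputs, no B23-style cell
# references), and uncertainty analysis is layered on top "for free".
# All names and numbers here are hypothetical stand-ins.
import random

def givedirectly_value_per_dollar(transfer_share, consumption_increase,
                                  value_per_unit_consumption):
    """Toy stand-in for a cost-effectiveness model."""
    return transfer_share * consumption_increase * value_per_unit_consumption

# Point-estimate use, exactly like a single spreadsheet cell:
point = givedirectly_value_per_dollar(0.83, 0.4, 0.1)

# Uncertainty analysis without touching the model: same function, sampled inputs.
random.seed(1)
samples = [
    givedirectly_value_per_dollar(
        random.betavariate(83, 17),   # transfer_share
        random.gauss(0.4, 0.1),       # consumption_increase
        random.gauss(0.1, 0.02),      # value_per_unit_consumption
    )
    for _ in range(5_000)
]
mc_mean = sum(samples) / len(samples)
print(point, mc_mean)
```

The modeling code stays identical whether you feed it point estimates or distributions; only the thin sampling wrapper changes.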
Admittedly, some of the stuff about delta moment-independent sensitivity analysis and different distance metrics is a bit more complicated. But the distance metric stuff is specific to this particular problem—not the methodology in general—and the sensitivity analysis can largely be treated as a black box. As long as you understand the properties of the resulting number (e.g. it ranges from 0 to 1, and 0 means independence), the internal workings aren't crucial.
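To illustrate just those black-box properties, here's a crude histogram-based index in the spirit of a moment-independent measure: it compares the output distribution against the output distribution conditional on an input, scaled to lie in [0, 1] with 0 meaning independence. This is a simplified illustration of the properties, not the estimator the post actually uses.

```python
# A crude moment-independent-style sensitivity index: average distance
# between the unconditional output distribution and the output distribution
# conditional on slices of one input. Illustrative only.
import random

def delta_like_index(xs, ys, x_bins=10, y_bins=20):
    lo, hi = min(ys), max(ys)
    width = (hi - lo) / y_bins or 1.0

    def hist(values):
        counts = [0] * y_bins
        for v in values:
            counts[min(int((v - lo) / width), y_bins - 1)] += 1
        n = len(values)
        return [c / n for c in counts]

    p_y = hist(ys)  # unconditional output distribution

    # Condition on quantile slices of the input; average the total-variation
    # distance between conditional and unconditional output distributions.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    slice_size = len(xs) // x_bins
    total = 0.0
    for b in range(x_bins):
        idx = order[b * slice_size:(b + 1) * slice_size]
        p_cond = hist([ys[i] for i in idx])
        total += 0.5 * sum(abs(p - q) for p, q in zip(p_cond, p_y))
    return total / x_bins  # in [0, 1]; ~0 when input and output are independent

random.seed(2)
x_influential = [random.gauss(0, 1) for _ in range(20_000)]
x_irrelevant = [random.gauss(0, 1) for _ in range(20_000)]
ys = [3 * a + random.gauss(0, 0.5) for a in x_influential]

d_infl = delta_like_index(x_influential, ys)
d_irr = delta_like_index(x_irrelevant, ys)
print(f"influential: {d_infl:.2f}, irrelevant: {d_irr:.2f}")
```

The influential input scores well above the irrelevant one, and the irrelevant one sits near 0 (up to sampling noise), which is all a reader needs to interpret the numbers in the post.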
> I think it would actually be very useful for you to input your best guess inputs (and it's likely to be more useful for you to do it than an average EA, given you've thought about this more)
Given the responses here, I think I will go ahead and try that approach. Though I guess even better would be getting GiveWell’s uncertainty on all the inputs (rather than just the inputs highlighted in the “User weights” and “Moral inputs” tab).
Sorry for adding even more text to what’s already a lot of text :). Hope that helps.
Did you ever get round to running the analysis with your best guess inputs?
If that revealed substantial decision uncertainty (and especially if you were very uncertain about your inputs), I’d also like to see it run with GiveWell’s inputs. They could be aggregated distributions from multiple staff members, elicited using standard methods, or in some cases perhaps ‘official’ GiveWell consensus distributions. I’m kind of surprised this doesn’t seem to have been done already, given obvious issues with using point estimates in non-linear models. Or do you have reason to believe the ranking and cost-effectiveness ratios would not be sensitive to methodological changes like this?
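One simple version of "aggregated distributions from multiple staff members" is a linear opinion pool: each person supplies a distribution for an input, and samples are drawn from a randomly chosen person's distribution. The input, numbers, and equal weights here are purely illustrative.

```python
# Sketch of a linear opinion pool over hypothetical elicited beliefs
# (mean, sd) about a single input, e.g. a discount rate.
import random

random.seed(3)

staff_beliefs = [(0.03, 0.005), (0.04, 0.01), (0.05, 0.02)]  # hypothetical

def pooled_draw():
    mu, sigma = random.choice(staff_beliefs)  # equal weights across staff
    return random.gauss(mu, sigma)

pool = [pooled_draw() for _ in range(10_000)]
pooled_mean = sum(pool) / len(pool)
print(f"pooled mean: {pooled_mean:.3f}")
```

The pooled distribution is wider than any individual's, which is the point: it carries the disagreement between staff into the downstream Monte Carlo analysis instead of collapsing it into one number.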