Kind of an odd assumption that dependence on luck varies from player to player.
If we are talking about charity evaluations, then reliability can be estimated directly, so this is no longer a predictable error.
Can you expand on how you would directly estimate the reliability of charity evaluations? I feel like there are a lot of realistic situations where this would be extremely difficult to do well.
I mean: do the adjustment for the optimizer's curse. Or whatever else is in that paper.
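If the paper in question is the standard treatment of the optimizer's curse, the adjustment amounts to treating each charity's cost-effectiveness estimate as a noisy signal of its true value and ranking options by Bayesian posterior means rather than by the raw estimates. A minimal sketch under normal-normal assumptions (the prior parameters and noise variances below are illustrative, not taken from the thread or the paper):

```python
import numpy as np

def shrunk_estimate(estimate, noise_var, prior_mean=0.0, prior_var=1.0):
    """Posterior mean of an option's true value, assuming a normal prior
    N(prior_mean, prior_var) and normally distributed estimation error
    with known variance noise_var (conjugate normal-normal update)."""
    weight = prior_var / (prior_var + noise_var)  # how much to trust the estimate
    return prior_mean + weight * (estimate - prior_mean)

# Illustrative numbers: a very noisy, speculative estimate gets shrunk hard
# toward the prior, while a well-measured one barely moves.
print(shrunk_estimate(10.0, noise_var=25.0, prior_mean=1.0, prior_var=4.0))  # ~2.24
print(shrunk_estimate(10.0, noise_var=0.5,  prior_mean=1.0, prior_var=4.0))  # ~9.0
```

The practical difficulty raised above is that `noise_var` and the prior have to come from somewhere; for many charity evaluations they can only be estimated roughly.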
I think talk of doing things "well" or "reliably" should be tabooed in this discussion, because no one has a coherent idea of what the threshold for 'well enough' or 'reliable enough' is in this context. "Better" or "more reliable" makes sense.
Intuitively, it strikes me as appropriate for some realistic situations. For example, you might try to estimate the performance of people based on quite different kinds or magnitudes of inputs; e.g. one applicant might have a long relevant track record, while for another you might just have a brief work test. Or you might compare the impact of interventions that are backed by very different kinds of evidence, say an RCT vs. a speculative, qualitative argument.
Maybe there is something I'm missing here about why the assumption is odd, or perhaps even why the examples I gave don't have the property required in the paper? (The latter would certainly be plausible, as I read the paper a while ago, and even then not very closely.)
Hmm. This made me wonder whether the paper's results depend on the decision-maker being uncertain about which options have been estimated reliably vs. unreliably. It seems possible that the effect could disappear if the reliability of my estimates varies but I know that the variance of my value estimate for option 1 is v_1, that for option 2 is v_2, etc. (even if the v_i vary a lot). (I don't have time to check the paper or get clear on this, I'm afraid.)
Is this what you were trying to say here?
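One way to probe the earlier question (whether the predictable error disappears when each option's estimate variance v_i is known) is a quick simulation. A minimal sketch, assuming normal priors and normally distributed noise with known, option-specific variances; all the numbers are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
prior_mean, prior_var = 0.0, 1.0
n_options, n_trials = 20, 20_000

# Each option's estimate has its own, known noise variance v_i (varying a lot).
noise_var = rng.uniform(0.1, 10.0, size=n_options)

bias_raw, bias_post = [], []
for _ in range(n_trials):
    true_vals = rng.normal(prior_mean, np.sqrt(prior_var), size=n_options)
    estimates = true_vals + rng.normal(0.0, np.sqrt(noise_var))

    # Naive choice: pick the option with the largest raw estimate.
    i = int(np.argmax(estimates))
    bias_raw.append(estimates[i] - true_vals[i])

    # Bayesian choice: shrink each estimate using its own known variance v_i.
    w = prior_var / (prior_var + noise_var)
    posterior = prior_mean + w * (estimates - prior_mean)
    j = int(np.argmax(posterior))
    bias_post.append(posterior[j] - true_vals[j])

print("mean post-selection error, raw estimates:  ", np.mean(bias_raw))   # clearly > 0
print("mean post-selection error, posterior means:", np.mean(bias_post))  # ~ 0
```

Under these assumptions the selected option's posterior mean is unbiased on average, while the selected raw estimate systematically overshoots, which suggests the predictable error does go away when the v_i are known and used; what the sketch does not address is the case where the v_i themselves are only rough guesses.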