Thanks for the interesting post.
One consideration that comes to mind is whether this type of evaluation further reinforces a “success to the successful” feedback loop, which is inherently sensitive to initial conditions. That is, some people might be able to produce great work given the right support and conditions but lack them at the start, while someone else is luckier, gets picked up, receives more support, and that support then reinforces further success.
Thus, it seems generally hard to use this kind of system to achieve “optimal” outcomes, or rather, you have to be careful about how you implement such rating systems and stay aware of these feedback loops.
What do you think about this?
Yeah, I agree that self-fulfilling prophecies/feedback loops can be a problem for forecasting setups, but it seems likely that they can be mitigated with a certain amount of exploration (e.g., occasionally trying things you’d expect to fail in order to test your system).
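To make the exploration idea a bit more concrete, here is a minimal toy sketch in Python (the function name, parameters, and scores are all invented for illustration, not taken from the post or the rubric): most of the time you back the option the system rates highest, but with a small probability you back a random one, which gives you occasional data on work the system would otherwise never test.

```python
import random

def select_for_support(candidates, predicted_quality, epsilon=0.1, rng=random):
    """Choose one candidate to support (toy sketch).

    Most of the time, exploit: back the candidate with the highest predicted
    quality. With probability `epsilon`, explore: back a uniformly random
    candidate, which occasionally supports work the system expects to fail
    and so provides a check on whether the predictions are calibrated
    rather than merely self-fulfilling.
    """
    if rng.random() < epsilon:
        return rng.choice(candidates)              # exploration step
    return max(candidates, key=predicted_quality)  # exploitation step

# Toy usage with hypothetical predicted-quality scores.
candidates = ["A", "B", "C"]
scores = {"A": 0.9, "B": 0.4, "C": 0.2}
pick = select_for_support(candidates, scores.get, epsilon=0.1)
```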
It’s also not clear that this type of evaluation is worse than the informal evaluation it would replace. For example, with a formal evaluation system you’d be able to pick up high-quality outputs even when they come from otherwise low-status people (and then give those people further support).
Thanks for the thoughtful answer. I agree that it’s not clear it is worse than other alternatives; after all, I didn’t give a reference solution to compare it to in my comment.
I just wanted to highlight potential problems that ought to be considered when designing such solutions. So, if you consider working more on this in the future, it might be fruitful to think about how it would influence such feedback loops.
Do you mean that people with past successes (even if due to external support) would tend to score higher here?
My intuition is that this kind of rubric would actually be more robust against such feedback loops. Most of the factors are not specific to the person, so I’d expect less bias there.
In essence, I think the act of adding quantitative measures may lend a veil of “objectivity” to assessments of people’s work, while those assessments remain intrinsically vulnerable to the “success to the successful” feedback loop.
Based on your comment, I had another look at the specific criteria of the rubric, and I agree it could plausibly help counteract the dynamic I outlined above. However, it would still have to be applied with care and with an awareness that such dynamics are possible.
The main problem I wanted to highlight is that something like this might obscure those dynamics and could be employed for political purposes, such as justifying existing status hierarchies that are merely circumstantial rather than merit-based.