I agree with Habryka here—it seems potentially very damaging to EA for arguments to be advanced with obvious holes in them, especially if the motivation for that seems to be political. In that spirit I want to find a better source to cite for the point I’m trying to make here. I think EA is really hard. I think we’ll consistently get things wrong if we relax our standards for accuracy at all.
I do think criminal justice predictive algorithms are a decent example of ML interpretability concerns and ‘what we said isn’t what we meant’ concerns. I think most people do not actually want a system which treats two identical people differently because one is black and one is white; human values include ‘reduce recidivism’ but also ‘do not evaluate people on the basis of skin color’. But because of the statistical problem (so many features correlate with race that a model can reconstruct it even when the race column is withheld), it’s actually really hard to prevent a system from either using race or guessing race from proxies and using its best guess of race. That’s illegal under current U.S. antidiscrimination law, and I do think it’s not really what we want—that is, I think we’re willing to sacrifice some predictive power in order to not use race to decide whether people remain in prison or not, just like we’re willing to sacrifice predictive power to get people lawyers and willing to sacrifice predictive power to require cops to have a warrant and willing to sacrifice predictive power to protect the right not to incriminate yourself. But none of that nebulous stuff makes it into the classifier, and so the classifier is genuinely exhibiting unintended behavior—and unintended behavior we struggle to make it stop exhibiting, since it’ll keep trying to find proxies for race and using them for prediction.
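To make the proxy problem concrete, here is a minimal synthetic sketch (invented data and numbers, not the COMPAS system or any real risk tool): even after the race column is dropped, a standard classifier recovers race from correlated features, so a downstream risk model trained on those features can implicitly condition on race without ever seeing it.

```python
# Minimal synthetic sketch of proxy discrimination; all data is invented.
# Even with the 'race' column withheld, a classifier reconstructs it from
# correlated features, so excluding the column doesn't exclude the information.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000
race = rng.integers(0, 2, n)               # the attribute we "don't use"
# Hypothetical proxies correlated with race (e.g., neighborhood,
# prior-arrest features); the model is only ever shown these.
proxy1 = race + rng.normal(0, 0.8, n)
proxy2 = race + rng.normal(0, 0.8, n)
X = np.column_stack([proxy1, proxy2])      # race itself is excluded

X_tr, X_te, r_tr, r_te = train_test_split(X, race, random_state=0)
clf = LogisticRegression().fit(X_tr, r_tr)
print(f"race recovered from proxies alone: {clf.score(X_te, r_te):.0%}")
# ~80% accuracy vs. a 50% chance baseline: any model trained on these
# features can use its "best guess of race" even though it never sees race.
```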
I’m curious if Larks/others think that this summary is decent and would avoid misleading someone who didn’t know the stats background; if so, I’ll try to write it up somewhere in more depth (or find it written up in more depth) so I can link that instead of the existing links.
I think I might disagree with the overall point? I don’t have super strong moral intuitions here (maybe because I am from Germany, which generally has a lot less culture around skin-color based discrimination because it’s a more ethnically uniform country?).
None of the examples you listed as being analogous to skin-color discrimination strike me as fundamentally moral. Let’s walk through them one by one:
> just like we’re willing to sacrifice predictive power to get people lawyers
I would guess that we give people lawyers to ensure the long-term functionality of the legal system, which will increase long-run accuracy. Lawyers help determine the truth by ensuring that the state actually needs to do its job in order to convict someone. They strike me as essential instruments that increase accuracy, not decrease accuracy.
> and willing to sacrifice predictive power to require cops to have a warrant
Again, we require cops to have a warrant in order to limit the direct costs of house searches. Without the need for warrants there would be a lot more searches, which would come at a pretty significant cost to the people being searched. Warrants just ensure that there is significant cause for a search, in order to limit the amount of collateral damage the police cause. I don’t see how this is analogous to the skin-color situation.
> and willing to sacrifice predictive power to protect the right not to incriminate yourself
I think the reason why we have this rule is mostly because the alternative isn’t really functional. Forcing people to incriminate themselves would probably lead to much less cooperation with the legal system and impose large costs on anyone involved with it. I don’t personally see any exceptional benefits that come from having this rule, outside of its instrumental effects on the direct costs and long-run accuracy of the legal system.
This doesn’t mean I think there are no good arguments for limiting what information we use about a person, but I don’t think the examples you used are illustrative of the situation with skin-color discrimination.
I am currently 80% on “there is no big problem with using skin color as a discriminator in machine learning in criminal justice, in the same way we would obviously use height or intelligence or any other mostly inherent attribute”. Given that, I think it would be a lot better to replace that whole section with something that actually has solid moral foundations for it (of which I think there are many).
Huh, yeah, I disagree. It seems to me pretty fundamental to a justice system’s credibility that it not imprison one person and free another when the only difference between them is the color of their skin (or, yes, their height), and it makes a lot of sense to me that U.S. law mandates sacrificing predictive power in order to maintain this feature of the system.
Similarly, I don’t think all of the restrictions the legal system imposes on what kinds of evidence to use are, in fact, motivated by long-term harm-reduction considerations. I think they’re motivated by wanting the system to embody the ideal of justice. EAs are mostly consequentialists (I am) and mostly only interested in harm, not in fairness, but I think it’s important to realize that the overwhelming majority of people care a lot about whether a justice system is fair in addition to whether it is harm-reducing, and that this is the actual motivation for the laws I discuss above, even if you can technically propose a defense of them in harm-reduction terms.
> but I think it’s important to realize that the overwhelming majority of people care a lot about whether a justice system is fair in addition to whether it is harm-reducing, and that this is the actual motivation for the laws I discuss above, even if you can technically propose a defense of them in harm-reduction terms.
I take the same stance towards moral arguments as I do towards epistemic ones. I would be very sad if EAs made moral arguments based on bad moral reasoning just because they appeal to the preconceived notions of some parts of society. I think most arguments in favor of naive conceptions of fairness fall into this category, and I would strongly prefer to advocate for moral stances that we feel confident in, whose consistency we have checked, and that we feel comfortable defending on our own grounds.
Hmm. I think I’m thinking of concern for justice-system outcomes as a values difference rather than a reasoning error, and so treating it as legitimate feels appropriate in the same way it feels appropriate to say ‘an AI with poorly specified goals could wirehead everyone, which is an example of optimizing for one thing we wanted at the expense of other things we wanted’ even though I don’t actually feel that confident that my preferences against wireheading everyone are principled and consistent.
I agree that most people’s conceptions of fairness are inconsistent, but that’s only because most people’s values are inconsistent in general; I don’t think it means they’d necessarily have my values if they thought about it more. I also think that ‘the U.S. government should impose the same prison sentence for the same crime regardless of the race of the defendant’ is probably correct under my value system, which probably influences me towards thinking that other people who value it would still value it if they were less confused.
Some instrumental merits of imposing the same prison sentence for the same crime regardless of the race of the defendant:
I want to gesture at something in the direction of pluralism: we agree to treat all religions the same, not because they are of equal social value or because we think they are equally correct, but because this is social technology to prevent constantly warring over whose religion is correct/of the most social value. I bet some religious beliefs predict less recidivism, but I prefer not using religion to determine sentencing because I think there are a lot of practical benefits to the pluralistic compromise the U.S. uses here. This generalizes to race.
There are ways you can greatly exacerbate an initially fairly small difference by updating on it in ways that are all technically correct. I think the classic example is a career path with lots of promotions, where one thing people are optimizing for at each level is the odds of being promoted at the next level; this will result in a very small difference in average ability producing a huge difference in odds of reaching the highest level. I think it is good for systems like the U.S. justice system to try to adopt procedures that avoid this, where this is sane and the tradeoffs are relatively small.
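As a sanity check on that claim, here is a small illustrative computation (the 0.1-standard-deviation gap and top-10% cutoffs are invented numbers, not data about any real pipeline): applying the same cutoff to everyone at each round still makes the representation ratio at the top grow with every round.

```python
# Toy promotion-cascade model: a 0.1-sd difference in mean score between
# groups A and B, with the top 10% advancing at each round. All numbers
# are invented for illustration.
from scipy.stats import norm

gap = 0.1  # group B's mean advantage, in standard deviations
# Surviving fractions after 1, 2, 3, 4 rounds of top-10% selection.
for rounds, top_frac in enumerate([0.1, 0.01, 0.001, 0.0001], start=1):
    cutoff = norm.ppf(1 - top_frac, loc=gap / 2)  # approx. pooled cutoff
    frac_a = norm.sf(cutoff, loc=0.0)   # fraction of group A surviving
    frac_b = norm.sf(cutoff, loc=gap)   # fraction of group B surviving
    print(f"after round {rounds}: B/A representation = {frac_b / frac_a:.2f}")
# The ratio climbs from ~1.2 after one round to ~1.5 after four, even
# though every cutoff treated individuals identically.
```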
(least important): Justice systems run on social trust. If they use processes which undermine social trust, even if they do this because the public is objectively unreasonable, they will work less well; people will be less likely to report crimes, cooperate with police, testify, serve on juries, make truthful decisions on juries, etc. I know that when crimes are committed against me, I weigh whether I expect the justice system to behave according to my values when deciding whether to report the crimes. If this is common, there’s reason for justice systems to use processes that people consider aligned. If we want to change what people value, we should use instruments for this other than the justice system.
Expanding on this: I don’t think ‘fairness’ is a fundamental part of morality. It’s better for good things to happen than bad ones, regardless of how they’re distributed, and it’s bad to sacrifice utility for fairness.
However, I think there are some aspects of policy where fairness is instrumentally really useful, and I think the justice system is the single place where it’s most useful, and the will/preferences of the American populace is demonstrably for a justice system to embody fairness, and so it seems to me that we’re missing a really important point if we decide that it’s not a problem for a justice system to badly fail to embody the values it was intended to embody just because we don’t non-instrumentally value fairness.
My perspective here is that many forms of fairness are inconsistent, and fall apart on significant moral introspection as you try to make your moral preferences consistent. I think the skin-color thing is one of them, which is really hard to maintain as something that you shouldn’t pay attention to, as you realize that it can’t be causally disentangled from other factors that you feel like you definitely should pay attention to (such as the person’s physical strength, or their height, or the speed at which they can run).
Not paying attention to skin color has to mean that you don’t pay attention to physical strength, since those are causally entangled in a way that makes it impossible to pay attention to one without the other. You won’t ever have full information on someone’s physical strength, so hearing about their ethnic background will always give you additional evidence. Skin color is not an isolated epiphenomenal node in the causal structure of the world, and you can’t just decide to “not discriminate on it” without also ceasing to pay attention to every single phenomenon that is correlated with it and that you can’t fully screen off, which is a really large range of things that you would definitely want to know in a criminal investigation.
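Here is a toy numeric version of that screening-off point (synthetic numbers, invented purely for illustration): as long as strength is only measured with noise, group membership still shifts the best estimate of true strength, even among people with identical measurements.

```python
# Toy screening-off demo: 'strength' correlates with group membership,
# but we only observe a noisy measurement of it. All numbers invented.
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
group = rng.integers(0, 2, n)
strength = 0.5 * group + rng.normal(0, 1.0, n)  # correlated with group
measured = strength + rng.normal(0, 1.0, n)     # the noisy proxy we see

# Among people with (nearly) the same measured strength, true strength
# still differs by group: the noisy measurement doesn't screen it off.
band = np.abs(measured - 1.0) < 0.05
for g in (0, 1):
    mask = band & (group == g)
    print(f"group {g}: mean true strength given measured ~ 1.0 = "
          f"{strength[mask].mean():.2f}")
# Prints ~0.50 for group 0 and ~0.75 for group 1: a predictor barred
# from using group membership is throwing away real information.
```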
> My perspective here is that many forms of fairness are inconsistent, and fall apart on significant moral introspection as you try to make your moral preferences consistent. I think the skin-color thing is one of them, which is really hard to maintain as something that you shouldn’t pay attention to, as you realize that it can’t be causally disentangled from other factors that you feel like you definitely should pay attention to (such as the person’s physical strength, or their height, or the speed at which they can run).
I think that a sensible interpretation of “is the justice system (or society in general) fair” is “does the justice system (or society) reward behaviors that are good overall, and punish behaviors that are bad overall”; in other words, can you count on society to cooperate with you rather than defect on you if you cooperate with it. If you get jailed based (in part) on your skin color, then if you have the wrong skin color (which you can’t affect), there’s an increased probability of society defecting on you regardless of whether you cooperate or defect. This means that you have an extra incentive to defect since you might get defected on anyway. This feels like a sensible thing to try to avoid.
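A toy expected-value calculation makes the incentive structure explicit (all payoff numbers below are arbitrary, chosen only to show the shape of the effect): if the probability of being defected on is fixed by something you can’t affect, the expected advantage of cooperating shrinks as that probability rises.

```python
# Toy payoff model of the cooperate/defect incentive. The payoffs are
# arbitrary illustrative numbers, not estimates of anything real.
REWARD_COOP = 10    # payoff from cooperating, if you aren't defected on
REWARD_DEFECT = 4   # payoff from defecting, if you aren't defected on
PUNISH = -15        # payoff whenever society defects on you

def payoffs(p_defected_on: float) -> tuple[float, float]:
    """Expected (cooperate, defect) payoffs given a baseline probability
    of being defected on that is independent of your own behavior."""
    coop = (1 - p_defected_on) * REWARD_COOP + p_defected_on * PUNISH
    defect = (1 - p_defected_on) * REWARD_DEFECT + p_defected_on * PUNISH
    return coop, defect

for p in (0.0, 0.2, 0.5):
    coop, defect = payoffs(p)
    print(f"p={p:.1f}: cooperate={coop:+.1f}, defect={defect:+.1f}, "
          f"margin for cooperating={coop - defect:+.1f}")
# The margin falls from +6.0 at p=0.0 to +3.0 at p=0.5: the higher the
# baseline chance of being defected on, the weaker the reason to cooperate.
```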
This is not for criminal investigation. This is for estimating, once a person has been convicted of a crime, when to release them (by estimating how likely they are to commit another crime).
Will write a longer reply later, since I am about to board a plane.
I was indeed thinking of a criminal investigation context, but I think the question of how likely someone is to commit further crimes is likely to be directly related to their ability to commit further crimes, which will depend on many of the variables I mentioned above, and so the same argument holds.
I expect those variables to still be highly relevant when you want to assess the likelihood of another crime, and there are many more that are more obviously relevant and also correlated with race (such as their impulsivity, their likelihood of becoming addicted to drugs, etc.). Do you think we should not take into account someone’s impulsivity when predicting whether they will commit more crimes?
> Huh, yeah, I disagree. It seems to me pretty fundamental to a justice system’s credibility that it not imprison one person and free another when the only difference between them is the color of their skin (or, yes, their height), and it makes a lot of sense to me that U.S. law mandates sacrificing predictive power in order to maintain this feature of the system.
If the crime was performed by someone who had to be at least 2m tall, and one of the suspects is 2.10m tall and the other is 1.60m tall, then it seems really obvious to me that you should use their height as evidence? I would be deeply surprised if you think otherwise.
> If the crime was performed by someone who had to be at least 2m tall, and one of the suspects is 2.10m tall and the other is 1.60m tall, then it seems really obvious to me that you should use their height as evidence?
That’s not what these articles describe—the algorithm in question wasn’t being used to determine whether a suspect had committed a crime; it was being used for risk assessment, i.e., to determine the probability that a person convicted of one crime will go on to commit another.
1. A system that will imprison a black person but not an otherwise-identical white person can be accurately described as “a racist system”
2. One example of such a system is one employing an ML algorithm that uses race as a predictive factor to determine bond amounts and sentencing
3. White people will tend to be biased towards more positive evaluations of a racist system because they have not experienced racism, so their evaluations should be given lower weight
4. Non-white people tend to evaluate racist systems very negatively, even when they improve predictive accuracy
To me, the rational conclusion is to not support racist systems, such as the use of this predictive algorithm.
It seems like many EAs disagree, which is why I’ve tried to break down my thinking to identify specific points of disagreement. Maybe people believe that #4 is false? I’m not sure where to find hard data to prove it (custom Google survey maybe?). I’m ~90% sure it’s true, and would be willing to bet money on it, but if others’ credences are lower that might explain the disagreement.
Edit: Maybe an implicit difference is epistemic modesty regarding moral theories—you could frame my argument in terms of “white people misestimating the negative utility of racial discrimination”, but I think it’s also possible for demographic characteristics to bias one’s beliefs about morality. There’s no a priori reason to expect your demographic group to have more moral insight than others; one obvious example is the correlation between gender and support for utilitarianism. I don’t see any reason why men would have more moral insight, so as a man I might want to reduce my credence in utilitarianism to correct for this bias.
Similarly, I expect the disagreement between a white EA who likes race-based sentencing and a random black person who doesn’t to be a combination of disagreement about facts (e.g. the level of harm caused by racism) and moral beliefs (e.g. importance of fairness). However, *both* disagreements could stem from bias on the EA’s part, and so I think the EA ought not discount the random guy’s point of view by assigning 0 probability to the chance that fairness is morally important.