Right—there’s still a correlation between the legible external factor and outcome, even if there is no causal relationship.
Hypothetical example: Prestigious University does not consider test scores in determining admissions at all. However, test scores happen to be strongly correlated with academic ability, and it so happens that most admitted students have scores in the 99th percentile. This would still be useful information for someone with a test score in the 80th percentile, even though there is zero direct causal relationship between test scores and admission.
Whether this information is good to include depends crucially on the causal relationships though.
In the simple test score case, academic ability causes both test scores and admission success, and we assume no other causal relationships complicate matters. Here test scores serve as a strong proxy for academic ability and are relatively innocent as an indicator of the likelihood of admission (i.e. they are a pretty good indicator of whether one is likely to succeed).
But telling people about something that is strongly associated with success, but not causally connected in the right way to the factors that determine success, would be misleading.
In a more complex (but perhaps realistic) case, where completing a PhD is causally related to a bunch of other factors, saying that most successful applicants have PhDs risks being misleading about one’s chances of success / suitability for the role and about the practical utility of getting a PhD for success.
David, I think that you’ve hit the nail on the head. I imagine how I would react if a job posting said “the majority of successful candidates grew up in families with either wealth or income in the top quartile of their home country.” Even though I know that is predictive of success (and it might be a useful data point for estimating how likely my own application would be to succeed), I wouldn’t want to see it as a candidate. We could substitute in something less controversial, such as height, and I think that my preference not to see it would remain the same.
Right, but there is definitely a way you can communicate this information without being misleading. You could say, “in previous rounds, >50% of successful applicants had a PhD, but we do not assign weight to PhDs and do not believe there is a direct causal relationship between having a PhD and receiving an offer”.
I broadly agree that such a statement, taken completely literally, would not be misleading in itself. But it raises the questions:
What useful information is being conveyed by such a statement?
If interpreted correctly, I think the applicant should not take anything actionable away from the statement. But what I suspect many will do is conclude that they probably shouldn’t apply if they don’t have a PhD, even if they meet all the requirements.
What is pragmatically implied by such a statement?
People don’t typically go out of their way to state things which they don’t think are relevant (without some reason). So if employers go out of their way to state “>50% of successful applicants had a PhD...”, even with the caveat, people are reasonably going to wonder “Why are they telling me this?” and a natural interpretation is “They want to communicate that if I don’t have a PhD, I’m probably not suited to the role, even if I meet all the requirements”, which is exactly what employers don’t want to communicate (and is not true) in the cases I’m describing.[1]
I think there are roles where unless you have a PhD, you are unlikely to meet the requirements of the role. In such cases, communicating that would be useful. But the cases I’m describing are not like that: in these cases, PhDs are really not relevant to the roles, but applicants will have very commonly undertaken PhDs. I imagine that part of the motivation for wanting to see the information is that people think things are really like the former case, not the latter.
Even if your current best guess is that it’s not causal, if having a PhD meaningfully increases your chances of getting hired conditional on having applied, that information would help candidates get a better sense of their probability of getting hired.
[edited to specify that I meant conditional on applying]
A relevant reframing here is whether having a PhD provides a high Bayes factor update to being hired. E.g., if people with and without PhDs both have a 2% chance of being hired, but “>50% of successful applicants had a PhD” because most applicants have a PhD, then you should probably not include this. But if 1 in 50 applicants are hired overall, rising to 1 in 10 if you have a PhD and falling to 1 in 100 if you don’t, then the PhD is a massive evidential update even if there is no causal effect.
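To make the arithmetic in this reframing concrete, here is a minimal sketch. The function names are my own, and I am assuming the Bayes factor is taken as the posterior odds of being hired divided by the prior odds. The only inputs are the hypothetical rates quoted above (1 in 50, 1 in 10, 1 in 100), not real hiring data.

```python
from fractions import Fraction


def odds(p):
    """Convert a probability into odds in favour."""
    return p / (1 - p)


def bayes_factor(p_hired, p_hired_given_trait):
    """Posterior odds of being hired divided by prior odds."""
    return odds(p_hired_given_trait) / odds(p_hired)


def implied_trait_share(p_hired, p_with, p_without):
    """Share of applicants with the trait implied by the three hire rates.

    Solves p_hired = s * p_with + (1 - s) * p_without for s.
    """
    return (p_hired - p_without) / (p_with - p_without)


def trait_share_among_hired(p_hired, p_with, p_without):
    """Share of successful applicants who have the trait."""
    share = implied_trait_share(p_hired, p_with, p_without)
    return share * p_with / p_hired


# Hypothetical rates from the comment above, kept exact with Fraction.
base, with_phd, without_phd = Fraction(1, 50), Fraction(1, 10), Fraction(1, 100)

print(bayes_factor(base, with_phd))        # 49/9, roughly 5.4
print(bayes_factor(base, without_phd))     # 49/99, roughly 0.49
print(bayes_factor(base, base))            # 1 (the "both 2%" case: no update)
print(implied_trait_share(base, with_phd, without_phd))      # 1/9
print(trait_share_among_hired(base, with_phd, without_phd))  # 5/9
```

Under these numbers only about 11% of applicants would have a PhD, yet about 56% of hires would, so the “>50% of successful applicants had a PhD” headline and a large evidential update can coexist. In the first scenario the same headline comes with a Bayes factor of 1.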
To further elaborate on what I think might be a crux here:
I think that where the job requirements are clearly specified, predictive proxies like having a PhD may have no additional predictive power above what is transparent in the job requirements and transparent to the applicants themselves in terms of whether they have them or not.
For example:
Knowing programming language X may be necessary for a job and may be most common among people who studied computer science. But if ‘knowing X’ is listed in the job ad and the applicant knows they know X, then knowing they have a computer science degree and knowing the % of successful applicants with such a degree adds no additional predictive power.
Having a degree in Y may be necessary for a job and, because more men than women have degrees in Y, being a man may thereby be predictive of success. But if you are a woman and know you have a degree in Y, then you don’t gain any additional predictive power from knowing the % of successful applicants who are women.
My supposition is that possession of a PhD is mostly just a case like the above for many EA roles (though I’m sure it varies by org, role and proxy). But I imagine those who want the information about PhDs to be revealed think they are likely to be proxies for latent qualities of the applicant, which the applicants themselves don’t know and which aren’t transparent in the job ad.
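A toy version of the examples above, as a rough sketch. The counts are entirely made up for illustration (they are not from any real hiring round), and `POOL` and `hire_rate` are hypothetical names. The setup assumes selection depends only on meeting the stated requirements, while PhD holders are more likely to meet them.

```python
from fractions import Fraction

# (meets_requirements, has_phd) -> (applicants, hired). All counts hypothetical.
POOL = {
    (True, True): (150, 15),
    (True, False): (50, 5),
    (False, True): (50, 0),
    (False, False): (750, 0),
}


def hire_rate(has_phd, meets_requirements=None):
    """Hire rate for applicants with or without a PhD.

    If meets_requirements is None, condition only on having applied;
    otherwise also condition on whether the requirements are met.
    """
    applicants = hired = 0
    for (meets, phd), (n, h) in POOL.items():
        if phd != has_phd:
            continue
        if meets_requirements is not None and meets != meets_requirements:
            continue
        applicants += n
        hired += h
    return Fraction(hired, applicants)


# Conditional only on applying, the PhD looks strongly predictive.
print(hire_rate(has_phd=True))    # 3/40
print(hire_rate(has_phd=False))   # 1/160

# Conditional on also meeting the requirements, it adds nothing.
print(hire_rate(has_phd=True, meets_requirements=True))    # 1/10
print(hire_rate(has_phd=False, meets_requirements=True))   # 1/10
```

In this made-up pool 75% of hires have a PhD, and PhD holders are hired at twelve times the rate of non-holders among all applicants. An applicant who already knows they meet the requirements still learns nothing further from the PhD figure.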
I think this is one piece of information you would need to include to stop such a statement being misleading, but as I argue here, there are potentially lots of other pieces of information which would need to be included to make it non-misleading (i.e. information about any and all other confounders which explain the association).
Otherwise, applicants will not know that, conditional on X, they are not less likely to be successful if they do not have a PhD (even though disproportionately many people with X have a PhD).
Edit: TLDR, if you do not also condition on satisfying the role requirements, but only on applying, then this information will still be misleading (e.g. causing people who meet the requirements but lack the confounded proxy to underestimate their chances).
Exactly
As I suggested in my first comment, you could do the same “by reporting other characteristics which play no role in selection, but which are heavily over-represented in successful applicants”: for example, you could report that >50% of successful applicants are male,[1] white, live in certain countries, >90% have liberal political beliefs, and probably a very disproportionately large number have read Harry Potter fan fic.[2] Presumably one could identify other traits which are associated with success via their association with these other traits e.g. if most successful applicants have PhDs and PhDs disproportionately tend to [drink red wine, ski etc.], then successful applicants may also disproportionately have these traits.
Of course, different people can disagree about whether or not each of these is causal. But even if they are predictive, I imagine that we would agree that at least one of these would likely mislead people. For example, having read Harry Potter fan fic is associated with being involved with communities interested in EA-related jobs for largely arbitrary historical reasons.[3]
This concern is particularly acute when we take into account the pragmatics of employers highlighting some specific fact.[4] People typically don’t offer irrelevant information for no reason. So if orgs go out of their way to say “>50% of successful applicants have PhDs”, even with the caveat about this not being causal, applicants will still reasonably wonder “Why are they telling me this?” and many will reasonably infer “What they want to convey is that this is a very competitive position and I should not apply.”
As I mentioned in the footnote of my comment above, there are jobs where this would be a reasonable inference. But I think most EA jobs are not like this.
If one wanted to provide applicants with full, non-misleading information, I think you would need to distinguish which of the cases applies, and provide a full account of the association which explains why successful applicants might often have PhDs, but that this is not the case when you control for x, y, z. That way (in theory), applicants would be able to know that conditional on them being a person who meets the requirements specified in the application (e.g. they can complete the coding test task), the fact that they don’t have a PhD does or does not imply anything about their chances of success. But I think that in practice, providing such an account for any given trait is either very difficult or impossible.[5]
Though in EA Survey data, there is no significant gender difference in likelihood of having an EA job. In fact, a slightly larger proportion of women tend to have EA jobs.
None of these reflect real numbers from any actual hiring rounds, though they do reflect general disparities observed in the wider community.
Of course, you could describe a situation where having read Harry Potter fan fic actually serves as a useful indicator of some relevant trait like involvement in the EA community. But, again, I’m not referring to cases like this. Even in cases where involvement in the EA community is of no relevance to the role at all (e.g. all you need to do to be hired is to perform some technical, testable skill, like coding very well), applicants are likely to be disproportionately interested in EA, and successful applicants may be yet further disproportionately interested in EA, even if it has nothing to do with selection.
This can happen if, for example, 50% of the applications are basically spam (e.g. applications from a large job site, who have barely read the job advert and don’t have any relevant skills but are applying for everything they can click on). In such cases, the subset of applications who are actually vaguely relevant, will be disproportionately people with an interest in EA, people with degrees etc.
In some countries there may be a norm of releasing information about certain characteristics, in which case this consideration doesn’t apply for those characteristics, but would for others.
And that is not taking into account the important question of whether all applicants would actually update on such information completely rationally, or whether many would be irrationally inclined to be negative about their chances, and just conclude that they aren’t good enough to apply if they don’t have a PhD from a fancy institution.