Thanks for writing this! I’ve personally struggled to apply academic research to my hiring, and now roughly find myself in the position of the person whom “Stubborn Reliance” criticizes, i.e. I am aware of academic research but believe it doesn’t apply to me (at least not in any useful way). I would be interested to hear more motivation/explanation for why hiring managers should take these results seriously.
Two small examples of why I think the literature is hard to apply:
GMA Tests
If you take the hiring literature seriously, my impression is that the main takeaway is that you should use GMA tests. GMA repeatedly shows up as the most predictive measure, often by a substantial margin.[1]
But even you, in this article arguing that people should use academic research, suggest not using GMA tests. I happen to think you are right that GMA tests aren’t super useful, but the fact that the primary finding of a field is dismissed in a couple of sentences seems worthy of note.
Structured versus unstructured interviews
The thing which primarily caused me to update against using academic hiring research is the question of structured versus unstructured interviews. Schmidt and Hunter 1998, which I understand to be the granddaddy of hiring-predictor meta-analyses, found that structured interviews were substantially better than unstructured ones. But the 2016 update found that they were actually almost identical – as I understand it, this reversal comes from a change in statistical technique (the handling of range restriction).
Regardless of whether structured or unstructured interviews are actually better, the fact that the result you get from academic literature depends on a fairly esoteric statistics question highlights how difficult it is to extract meaning from this research.[2]
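For anyone curious what “handling of range restriction” means concretely, here is a minimal sketch (my own illustration, not taken from any of the cited papers) using Thorndike’s Case II formula, the standard correction for direct range restriction. The numbers are made up; the point is how sensitive the corrected validity is to the assumed restriction ratio:

```python
import math

def correct_for_range_restriction(r_observed: float, u: float) -> float:
    """Thorndike Case II correction for direct range restriction.

    r_observed: correlation observed in the restricted (hired) sample
    u: ratio of applicant-pool SD to restricted-sample SD (u >= 1)
    """
    return (u * r_observed) / math.sqrt(1 + (u**2 - 1) * r_observed**2)

# The same observed validity of .20 implies quite different corrected
# validities depending on the restriction ratio the analyst assumes:
for u in (1.0, 1.5, 2.0):
    print(f"u = {u}: corrected r = {correct_for_range_restriction(0.20, u):.2f}")
# u = 1.0 -> 0.20 (no restriction assumed)
# u = 2.0 -> 0.38 (heavy restriction assumed)
```

Because meta-analyses apply corrections like this with different assumed ratios for different predictors, the rank ordering of predictors (e.g. structured vs. unstructured interviews) can shift without any new data being collected.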
My impression is that this is true of psychology more broadly: you can’t just read an abstract which says “structured > unstructured” and conclude something useful; you have to really dig into the statistical methods and data sources, and often you even need to request materials which weren’t made public to get a sense of what’s going on.
I’m not trying to criticize organizational psychologists – reality just is very complicated. But I do think this means that the average hiring manager – or even the average hiring manager who is statistically literate – can’t really get much useful information from these kinds of academic reviews.
E.g. Schmidt and Hunter 1998: “The most well-known conclusion from this research is that for hiring employees without previous experience in the job the most valid predictor of future performance and learning is general mental ability ([GMA], i.e., intelligence or general cognitive ability; Hunter & Hunter, 1984; Ree & Earles, 1992)”
I briefly skimmed the papers you cited in favor of structured interviews. I couldn’t immediately tell how they were handling range restriction; no doubt a better statistician than myself could figure this out easily, but I think it proves my point that it’s quite hard for the average hiring manager to make sense of the academic literature.
Thanks, to me this comment is a large update away from the value of structured interviews.
As someone else who casually reads literature on hiring assessment, I am also confused/not convinced by OP’s dismissals re: GMA tests.
I think that “dismissal” is a bit of a mischaracterization. I can try to explain my current stance on GMA tests a little more. All of the research I’ve read that links GMA to work performance uses data from the 1950s to the 1980s, and I haven’t seen what tools/tests they used to measure GMA. So I think my concerns come down to two factors. First, I haven’t yet read anything to suggest that the tools/tests used really do measure GMA. They might end up measuring household wealth or knowledge of particular arithmetic conventions, as the GMAT seems to do. I don’t know, because I haven’t looked at the details of these older studies. Second, my rough impression is that psychological tests/tools from that era were mainly implemented by and tested on a sample of people that doesn’t seem particularly representative of humanity as a whole, or even of the USA as a whole.
My current stance isn’t that GMA is useless, but that there are a few obstacles I’d like to see overcome before I recommend it. I also have only a vague knowledge of the legal risks, so I want to encourage caution for any organization trying to use GMA tests as part of its hiring criteria.
If you have recommended readings that you would be willing to share, especially ones that could help me clarify the current concerns, I’d be happy to see them.
For anyone who is curious, I have a bit of an update. (low priority, feel free to skip with no hard feelings)
It has puzzled me that a finding not supported by theory[1] didn’t get more press in the psychology world. It appears to have made almost no impact. I would expect that either A) there would be a bunch of articles citing it and claiming that we should change how we do hiring, or B) there would be some articles refuting the findings. I recently got the chance to ask some industrial-organizational psychologists what is going on here. Here is my little summary and excerpts of their answers:
A more recent meta-analysis has re-asserted that structured interviews are more predictive.
The paper is a book chapter (which hasn’t gone through the peer-review process), and thus professional researchers don’t give it much consideration.
The paper is an exercise in methodological minutiae that’s not terribly useful or practical. (“what are you going to do with the results of that paper, even assuming they’re right on everything? Are you REALLY going to tell people that using an unstructured interview is fine? Just throw best practices, legal defensibility, etc. out the window?”[2])
There’s a time and a place for brute-force empiricism, but this is not one of them, especially when it informs high-stakes HR practices. “It really makes no theoretical sense to me why unstructured, which by definition has so much noise introduced to it, would be more predictive than structured. I really would need a WHY and HOW to buy that argument, not just an application of statistical corrections.”
“That’s crazy that they would say unstructured interviews are better. That’s like choosing a Rorschach test over an MMPI.”
My takeaway is that (roughly speaking) I didn’t have enough domain knowledge and context to properly place and weigh that paper (thus supporting the claim that “the average hiring manager … can’t really get much useful information from these kinds of academic reviews”).
The finding that unstructured interviews have similar predictive validity to structured interviews, originally published as Rethinking the validity of interviews for employment decision making, and later cited in a 2016 working paper.
This and the other quotes are almost direct quotes, but I edited them to remove some profanity and correct grammar. People were really worked up about the issues with this paper, but I don’t think that kind of harsh language has a place on the EA Forum.
Interesting, thanks for the follow-up Joseph! It makes sense that other meta-analyses would find different outcomes.
TLDR: I agree with you. It is complicated and ambiguous and I wish it was more clear-cut.
Regarding GMA tests, my loosely held opinion at the moment is that there is a big difference between 1) GMA being a valid predictor, and 2) having a practical way to use GMA in a hiring process. All the journal articles seem to point toward #1, but what I really want is #2. I suppose we could simply require that all applicants do a test from Wonderlic/GMAT/SAT, but I’m wary of the legal risks and the biases, two topics about which I lack the knowledge to give any confident recommendations. That is roughly why my advice is “only use these if you have really done your research to make sure it works in your situation.”
I’m still exploring the area, and haven’t yet found anything that gives me confidence, but I assume there have to be solutions other than “just pay Wonderlic to do it.”
I strongly agree with you. I’ll echo a previous idea I wrote about: the gap between “this is valid” and “here are the details of how to implement this” seems fairly large. If I were a researcher, I assume I’d have mentors and more senior researchers I could bounce ideas off of, or who could point me in the right direction, but learning about these topics as an individual without that kind of structure is strange: I mostly just search on Google Scholar and use forums to ask more experienced people.
Thanks for the thoughtful response!
My anecdotal experience with GMA tests is that hiring processes already use proxies for GMA (education, standardized test scores, work experience, etc.) so the marginal benefit of doing a bona fide GMA test is relatively low.
It would be cool to have a better sense of when these tests are useful though, and an easy way to implement them in those circumstances.
Regarding structured versus unstructured interviews, I was just introduced to the 2016 update yesterday and I skimmed through it. I, too, was very surprised to see that there was so little difference [Edit: difference between structured interviews and unstructured interviews]. While I want to be wary of over-updating from a single paper, I do want to read the Rethinking the validity of interviews for employment decision making paper so that I can look at the details. Thanks for sharing this info.