They ask applicants to take a Big Five assessment. Personality tests are not predictive of job performance (at least integrity and conscientiousness tests aren’t: selval). And they can be gamed.
This is not how I read your link (p. 265 in Psychological Bulletin 1998, Vol. 124, No. 2; p. 4 in the PDF). On the relevant metrics, it seems that the incremental validity (over using intelligence tests alone) of both conscientiousness and integrity tests is very high, at levels comparable (!) to work sample tests and structured interviews.
Thanks for pointing that out. I had only looked at the validity of each method on its own, not at the validity gain numbers. Don’t the results indicate, though, that you would also have to subject the candidate to a GMA (general mental ability) test in order to get that validity gain from conscientiousness and integrity tests? And GMA tests are rarely administered in hiring processes.
Pulling up an addendum from below (added 2022-10-19):
I would explain the high incremental validity by the fact that a GMA test barely measures conscientiousness and integrity. In fact, footnote ‘c,d’ mentions that ‘the correlation between integrity and ability is zero’. But conscientiousness and integrity are important for job performance (depending on the job [and the hiring manager]). I would expect much lower incremental validity over structured interviews or work samples, because these, when done well, tell a lot about conscientiousness and integrity by themselves.
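To make that concrete, here is a minimal back-of-the-envelope sketch using the standard two-predictor multiple-correlation formula. The .51, .41, and .31 validities are the figures quoted in this thread, the zero intercorrelation is the one from the footnote, and the .5 overlap at the end is a made-up number purely for illustration; none of this is taken from the paper’s own gain column.

```python
from math import sqrt

def multiple_r(r1, r2, r12):
    """Multiple correlation of two standardized predictors with criterion
    validities r1 and r2 and predictor intercorrelation r12."""
    return sqrt((r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2))

r_gma = 0.51  # GMA validity, as quoted in this thread

for name, r_x in [("integrity", 0.41), ("conscientiousness", 0.31)]:
    # Per the footnote, the trait measure is roughly uncorrelated with ability.
    r_both = multiple_r(r_gma, r_x, r12=0.0)
    print(f"GMA + {name}: R = {r_both:.2f} (gain {r_both - r_gma:+.2f})")

# Hypothetical: if the second method overlapped heavily with what is already
# being measured (r12 = 0.5), the incremental gain mostly disappears.
print(f"overlapping method: gain {multiple_r(r_gma, 0.41, 0.5) - r_gma:+.2f}")
```

With zero overlap, roughly the whole validity of the second method stacks on top of GMA, which is why the gain column looks so favorable for integrity and conscientiousness tests; a method that already captures those traits would add much less.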
[Update: This comment of mine was wrong, but I still think the claim in the post is contradicted by the source cited; see below.]
It looks like integrity and conscientiousness tests were also the 3rd and 4th highest-rated methods on the “Validity” column itself, out of a large list? And they appear to rank above interviews and some CV-like things (e.g., reference checks and years of job experience), yet your post recommends interviews and CVs.
I’m pretty confused by this apparent misreading of the source. I think readers should treat that as a reason for being more skeptical of the rest of the post. My guess is it’d be worth you “moving a bit slower” (to check sources more carefully etc.) and stating things less confidently in future posts.
They are the third and fourth rows in the table, but the rows aren’t ordered by the validity column. When you order by that column, integrity tests come 8th and conscientiousness tests 12th, unless I’ve miscounted.
I admit that I cherry-picked this article, basically only looked at the validity numbers in the table, and don’t know anything else from that literature. For those interested, this post has a wider view: https://forum.effectivealtruism.org/posts/j9JnmEZtgKcpPj8kz/hiring-how-to-do-it-better. On the other hand, the validity table was excerpted on an 80,000 Hours page for years, which gives me some confidence by proxy. Also, the points of my post rely only lightly on this article.
Added 2022-10-19: It would be nice if you removed the bolding on the text that you found to be wrong. Otherwise the impatient reader is apt to miss the bracketed text above.
Oh crap, my bad, should’ve looked closer! (And that’s ironic given my comment, although at least I’d say the epistemic standards for comments can/should be lower than for posts.) Sorry about that.
Still, I think “not predictive” seems like a strong misrepresentation of that source. Correlations of .41 and .31 would typically be regarded as moderate to weak, and definitely not as “approximately zero”. (And the apparently high incremental validity is interesting too, though its implications are not as immediately easy to work out.)
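As a rough illustration of the practical difference between r ≈ .4 and r ≈ 0, here is a sketch assuming bivariate normality; the top-20% cutoff is an arbitrary number picked for the example, not anything from the source.

```python
from statistics import NormalDist

def expected_performance(validity, top_fraction):
    """Expected standardized job performance of hires taken from the top
    `top_fraction` of predictor scores, under a bivariate-normal model."""
    nd = NormalDist()
    z_cut = nd.inv_cdf(1.0 - top_fraction)
    mean_z_of_selected = nd.pdf(z_cut) / top_fraction  # truncated-normal mean
    return validity * mean_z_of_selected

for r in (0.54, 0.41, 0.31, 0.0):
    print(f"validity {r:.2f}: top-20% hires average "
          f"{expected_performance(r, 0.20):+.2f} SD in expected performance")
```

On that (simplified) view, a .41 predictor gets you a substantial fraction of what the strongest single predictors in the table get you, which is hard to square with “not predictive”.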
I agree that “the points of [your] post rely only lightly on this article”. But I still think this is a sufficiently strong & (in my view, seemingly) obvious* misrepresentation of the source that it would make sense to see this issue as a reason for readers to be skeptical of any other parts of the post that they haven’t closely checked.
(I also find it surprising that in your two replies in this thread you didn’t note that the table indeed seems inconsistent with “not predictive”.)
*I.e., it doesn’t seem like it requires a very close/careful reading of the source to determine that it’s saying something very different to “not predictive”. (Though I may be wrong—I only jumped to the table, and I guess it’s possible other parts of the paper are very misleadingly written.)
Particularly when the most predictive things in that table were .51 and .54.
Okay, you’ve convinced me. I’ve rewritten that item.
The reason I didn’t note that the table is inconsistent with ‘not predictive’ is that I was unconsciously equating ‘not predictive’ with ‘not sufficiently predictive to be worth the candidate’s time’. Only your insistence made me think about it more carefully. Given that unconscious reading, it’s not a strong misrepresentation of the source either. But of course it’s sloppy, and you were right to point it out.
I hope this somewhat restores your confidence in my epistemic integrity. Also, I think the article contains not just evidence against but also evidence for epistemic integrity, which should factor into how ‘skeptical’ readers ought to be. Examples: the last paragraph of the introduction; the fact that I edit the article based on comments once I’m convinced a comment is correct; the fact that I call out edits rather than silently rewriting history; the fact that it’s well-structured overall (if not always at the paragraph level), which makes it easy to respond to claims; and the fact that I include and address possible objections to my points.
Addendum: I would explain the high incremental validity by the fact that a GMA test barely measures conscientiousness and integrity. In fact, footnote ‘c,d’ mentions that ‘the correlation between integrity and ability is zero’. But conscientiousness and integrity are important for job performance (depending on the job). I would expect much lower incremental validity over structured interviews or work samples, because these, when done well, tell a lot about conscientiousness and integrity by themselves.