Hiring: a couple of lessons

Background: I work in recruitment for CEA. Here are two aspects of recruitment where I worry some organisations may be making costly mistakes.

Target your role vision (carefully)

When a hiring manager identifies the need to hire an additional staff member, I think the process they ought to follow is to figure out that person’s anticipated key responsibilities and then carefully draft a list of competencies they endorse as predictive of success at those specific tasks. Then, to the extent possible, they should evaluate candidates only on those (ideally pre-registered) predictors of success.

What I think sometimes happens instead is something like one of the following two situations:

  • The hiring manager imagines a person they would like to fill that role and writes down a list of traits for this imagined individual.

  • The hiring manager picks a title for the role and writes down a list of traits and abilities that are usually associated with that title.

I think this causes problems.

Toy example:

Imagine you are hiring for a new executive assistant. You’ve never had an executive assistant before. You imagine your ideal executive assistant: they are personable, detail-oriented, they write extremely clear professional emails, and they are highly familiar with the EA community. So you test candidates on writing, on how warm and personable they are, and on knowledge of EA cause areas. Maybe you hire a recent grad who is an extremely strong writer.

Once you hire them, you notice that things aren’t working out: while they write elegant emails, they do so slowly. More importantly, they are struggling to keep track of all the competing priorities you hoped they’d help you handle, and they have trouble working autonomously on novel tasks. It turns out you should have focused instead on someone who can keep track of competing priorities, who is highly productive, and who can work autonomously. Maybe some of the criteria you originally listed were nice-to-haves, but for this particular role at this particular moment, it’s actually correct to prefer a seasoned deadline ninja who is a non-native speaker and writes slightly clunky emails.

The person you hired might have been an excellent executive assistant for some people in some contexts; they just weren’t the puzzle piece you most needed. If you had taken the prediction exercise seriously, I claim you would have noticed at least some of the mismatch and adjusted.

In my opinion, it’s worth investing significant time into targeting your role vision.

Other benefits:

  • Facilitating stakeholder alignment. Often there are a variety of stakeholders in a given hiring round (e.g. other team members, leadership). If you create a thoughtful vision document in advance of launching the round, you can share it with those stakeholders. This can surface, and hopefully allow you to resolve, previously invisible disagreements that would otherwise have snuck up on you midway through the recruitment round.

  • (Ideally) fighting bias

    • If you don’t pre-register the traits you endorse, you are more likely to be (even more) subconsciously swayed by candidates’ similarity to you, or by status-related traits that aren’t relevant to the role (e.g. confidence in an interview setting).

    • Done effectively, pre-registering endorsed traits should also be good for diversity.

  • Noticing/acknowledging confusion earlier in the process. Even if, after attempting this exercise, you don’t have a strongly held list of traits, at least you know that you don’t know.

Evaluate the traits you care about (and not related traits)

I suspect many EA orgs are using candidate evaluation tools in a way that produces a harmful rate of false negatives.

I don’t think this problem is remotely specific to EA; I would guess we’re significantly better at candidate assessment than the average organisation. But it might be especially harmful in EA. If our recruitment rounds prioritise even slightly for EA alignment, or even just EA context, then unlike most hiring rounds we are limited to a relatively tiny hiring pool, compared to e.g. someone hiring for a job at a bank. For many roles, organisations also seem to think that the best candidate might be far more impactful than a median candidate. If that’s true, and if the starting set of possible people is small, we should weight false negatives more heavily relative to other tradeoffs; if we’re predictably and preventably hiring in a way that risks ruling out the best candidates for a given role, we are likely filling many potentially impactful positions with people who will be significantly less capable of generating that impact.

Watching friends go through recruitment rounds at EA orgs, I’m often suspicious that the recruiters are filtering people out based on traits correlated with a competency they (should) care about for the role, rather than on the competency itself. For a real-world example, I know of someone applying for a “research engineering” job who was tested on software engineering generally, rather than on skills specific to the particular type of ML engineering work that would probably have been the bread and butter of the actual role.

Another common mistake is grading/rejecting candidates based on whether they can display excellence in a trait that’s useful for the role, when excellence isn’t actually required: you just need a baseline of that trait. A hiring manager might, for example, filter candidates based on their performance on a written task reliant on strategic problem-solving because they think “oh, it would be nice if this candidate had that skill”, when the actual role is much more execution-focused. If the hiring manager rejects people who are kinda meh on that writing task, they might lose the person who would have been by far the best at executing the actual role. This is true even if that written problem-solving skill would have been of some benefit for the role. It’s also true if a low level of that skill was indeed necessary, but the higher bar they set for progression in the hiring round was not.

I think there’s also a sort of “searching for keys under the lamppost” failure mode at play sometimes, where people put candidates through a particular evaluation because it’s easy, rather than because it’s laser-targeted at the competencies they should most care about.

In an ideal world, you can figure out an evaluation tool that measures the trait you are after extremely effectively. You can test it, become confident in it, use it, and then check that it predicts success in the role for the people you actually hire. For the weird, often unique roles that exist at EA orgs, this is often a distant pipe dream. If you don’t have a test you can be confident in, I think simply knowing your tool and its limitations is a huge improvement.

If you suspect your test is, e.g., too broad, too specific, or (mostly?) testing a trait related to your actual target, you can be strategic about how you interpret the results. A couple of ways this might change how you treat task submissions:

  • Confidence: If you aren’t confident in your tool, you may want to be more generous in who you pass on to the next stage.

  • Rule in vs. rule out: sometimes you can be confident in your tool’s ability to rule out the lowest performers, but not confident that it reliably distinguishes between medium and high performers. If that’s true, my suggestion is to make the task pass/fail, so that you try not to update on strong performance. I once wrote an event-related task for an operational role. I endorsed a prediction that very poor submissions should indeed be ruled out. But given that I was not hiring for a role that closely matched the task, I wanted the graders to treat all the reasonable submissions ~equally; otherwise, skills that I did not endorse as key to strong execution of the role (e.g. extremely high writing skill, event production knowledge) might have biased us towards the wrong candidates.