In the short term, senior hires are most likely to come from finding and onboarding people who already have the required skills, experience, credentials and intrinsic motivation to reduce x-risks.
Can you be more specific about what the required skills and experience are?
Skimming the report, you say “All senior hires require exceptionally good judgement and decision-making.” Can you be more specific about what that means and how it can be assessed?
It seems to me that in many cases the specific skills that are needed are both extremely rare and not well captured by the standard categories.
For instance, Paul Christiano seems to me to be an enormous asset to solving the core problems of AI safety. If “we didn’t have a Paul” I would be willing to trade huge amounts of EA resources to have him working on AI safety, and I would similarly trade huge resources to get another Paul-equivalent working on the problem.
But it doesn’t seem like Paul’s skillset is one that I can easily select for. He’s knowledgeable about ML, but there are many people with ML knowledge (about 100 new ML PhDs each year). That isn’t the thing that distinguishes him.
Nevertheless, Paul has some qualities, above and beyond his technical familiarity, that allow him to do original and insightful thinking about AI safety. I don’t understand what those qualities are, or know how to assess them, but they seem to me to be much more critical than having object level knowledge.
I have close to no idea how to recruit more people who can do the sort of work that Paul can do. (I wish I did. As I said, I would give up way more than my left arm to get more Pauls.)
But I'm afraid there's a tendency here to goodhart on the easily measurable virtues, like technical skill or credentials.
There aren’t many people with PhD-level research experience in relevant fields who are focusing on AI safety, so I think it’s a bit early to conclude these skills are “extremely rare” amongst qualified individuals.
AI safety research spans a broad range of areas, but for the more ML-oriented research the skills are, unsurprisingly, not that different from other fields of ML research. There are two main differences I’ve noticed:
- In AI safety you often have to turn ill-defined, messy intuitions into formal problem statements before you can start working on them. In other areas of AI, people are more likely to have already formalized the problem for you.
- It's important to be your own harshest critic. This is cultivated in some other fields, such as computer security and (in a different way) in mathematics. But ML tends to encourage a sloppy attitude here.
That said, I think both of these are fairly easy to assess by looking at someone's past work and talking to them.
Identifying highly capable individuals is indeed hard, but I don't think this is any more of a problem in AI safety research than in other fields. I've been involved in screening in two different industries (financial trading and, more recently, AI research). In both cases there's always been a lot of guesswork involved, and I don't get the impression it's any better in other sectors. If anything I've found screening in AI easier: at least you can actually read the person's work, rather than everything being behind an NDA (common in many industries).
Quite. I think that my model of Eli was setting the highest standard possible: not merely a good researcher, but a great one, the sort of person who can bring whole new paradigms/subfields into existence (Kahneman & Tversky, Von Neumann, Shannon, Einstein, etc.), and then noting that, because the tails come apart (aka regressional goodharting), optimising for the normal metrics used in standard hiring practices won't get you these researchers. (I realise that probably wasn't true for Von Neumann, but I think it was true for all the others.)
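To make the "tails come apart" point concrete, here is a toy simulation. This is only an illustrative sketch, not anything from the report: the pool size, the top-10 cutoff and the 0.6 proxy-target correlation are assumptions chosen for the example. The idea is that even when a measurable proxy (credentials, technical skill) correlates fairly well with the quality you actually care about (research judgement), the very top of the proxy distribution mostly misses the very top of the target distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate pool: a measurable proxy (credentials / technical skill) is
# positively but imperfectly correlated with the true quality we care about
# (research judgement). Both are standard normal here for simplicity.
n_candidates = 100_000
correlation = 0.6  # assumed proxy-target correlation, illustrative only

true_quality = rng.standard_normal(n_candidates)
noise = rng.standard_normal(n_candidates)
proxy_score = correlation * true_quality + np.sqrt(1 - correlation**2) * noise

# Credential-style filter: take the top 10 candidates by the proxy.
top_by_proxy = set(np.argsort(proxy_score)[-10:])
top_by_quality = set(np.argsort(true_quality)[-10:])

print("Overlap of top-10 by proxy and top-10 by true quality:",
      len(top_by_proxy & top_by_quality))
print("Best true quality overall:              %.2f" % true_quality.max())
print("Best true quality among proxy-selected: %.2f" %
      true_quality[list(top_by_proxy)].max())
```

In this toy model the proxy-selected candidates are all well above average, but the overlap with the true top 10 is typically tiny (often zero), and the single best candidate on the underlying quality is almost never the one a pure credential filter would pick. That is the regressional-goodharting worry in miniature.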
I like the breakdown in those two bullet points a lot, and I want to think more about them.
I bet that you could do that, yes. But that seems like a different question than making a scalable system that can do it.
In any case, Ben's comment articulates the view that generated the comment above.
How about this: you, as someone already grappling with these problems, present some existing problems to a candidate and ask them to come up with one-paragraph descriptions of original solutions. You read these and introspect on whether they give you a sense of traction/quality, or whether they match solutions that have been proposed by experts you trust (and that the candidate hasn't heard of).
I’m looking to do a pilot for this. If anyone would like to join, message me.
The required skills and experience of senior hires vary between fields and roles; senior x-risk staff are probably best placed to specify these requirements in their respective domains of work. You can look at x-risk job ads and the recruitment webpages of leading x-risk orgs for some reasonable guidance. (We are developing a set of profiles for prospective high-impact talent, to give a more nuanced picture of who is required.)
“Exceptionally good judgement and decision-making”, for senior x-risk talent, I believe requires:
- a thorough and nuanced understanding of EA concepts and how they apply to the context
- good pragmatic foresight—an intuitive grasp of the likely and possible implications of one's actions
- a conscientious risk-aware attitude, with the ability to think clearly and creatively to identify failure modes
Assessing good judgement and decision-making is hard; it's particularly hard to assess the consistency of a person's judgement without knowing/working with them over at least several months. Some methods:
- Speaking to a person can quickly clarify their level of knowledge of EA concepts and how they apply to the context of their role.
- Speaking to references could be very helpful, to get a picture of how a person updates their beliefs and actions.
- Actually working with them (perhaps via a work trial, partnership or consultancy project) is probably the best way to test whether a person is suitable for the role.
- A critical thinking psychometric test may plausibly be a good preliminary filter, but is perhaps more relevant for junior talent. A low score would be a big red flag, but a high score is far from sufficient to imply overall good judgement and decision-making.