Interesting research. One thing I’d take into account is that talent need is a somewhat limited measure of impact. I expect that there would be decreasing marginal returns as you add more people to the same research direction. So for example, if you already have 100 people doing interpretability research, I expect that they’d already be picking most of the low-hanging fruit, especially if you’re adding more iterators. However, adding more people might be worthwhile anyway if you believe that we’re in a short-timeline world and that one of the most important things is producing usable research fast.
This is a good point, and something that I definitely had in mind when putting this post together. There are a few thoughts, though, that would temper my phrasing of a similar claim:
Many interviewees said things like “I want 50 more iterators, 10 amplifiers to manage them, and 1-2 connectors.” Interviewees were also working on diverse research agendas, and each of these agendas could probably absorb >100 iterators if not for managerial bottlenecks and, to a lesser extent, funding constraints. This is even more true if those iterators have sufficient research taste (experience) to design their own follow-up experiments.
This points toward abundant low-hanging fruit and a massive experimental backlog field-wide. For this reason and others, I’d probably bump up the 100 number in your hypothetical by a few orders of magnitude. Given that the field’s growth is fast in an absolute sense but slow relative to our actual needs and funds, that probably means the need for iterators holds even in long timelines, particularly if read as “for at least a few months, please prioritize making more iterators and amplifiers” and not “for all time, no more connectors are needed.”
If we just keep tasting the soup, and figuring out what it needs as we go, we’ll get better results than if any one-time appraisal or cultural mood becomes dogma.
There’s a line I hear paraphrased a lot by the ex-physicists around here, from Paul Dirac, about physics in the years right after quantum mechanics arrived: it was a time when “second-rate physicists could do first-rate work.” The AI safety situation seems similar: the rate of growth, the large number of folks who’ve made meaningful contributions, the immaturity of the paradigm, and the proliferation of divergent conceptual models all point to a landscape in which a lot of dry scientific churning needs doing.
I definitely agree that marginal ‘more-of-the-same’ talent has diminishing returns. But I also think diverse teams have a multiplicative effect, and my intention in the post is to advocate for a diversified talent portfolio (as in the numbered takeaways section, which is in some sense a list of priorities, but in another sense a list of considerations that I would personally refuse to trade off against one another if I were the Dictator of AI Safety Field-building). That is, you get more from 5 iterators, one amplifier, and one connector working together on mech interp than you do from 30 iterators working on it alone. But I wasn’t thinking about building the mech interp talent pool from scratch in a frictionless vacuum; I was looking at the current mech interp talent pool, trying to see how far it is, right now, from its ideal composition, and then trying to fill those gaps (where job openings, especially at small safety orgs, and the preferences of grant makers are a decent proxy for the gaps).
Sorry to go so hard in this response! I’ve just been living inside this topic for 4-5 months, and a lot of this type of background was cut from the initial post for concision and legibility (neither of which is particularly native to me). I’d hoped the comment section might be a good place for me to provide more context and tempering, so thanks so much for engaging!