“the top 1% stay on the New York Times bestseller list more than 25 times longer than the median author in that group.”
FWIW my intuition is not that this author is 25x more talented, but rather that the author and their marketing team are a little bit more talented in a winner-takes-most market.
I wanted to point this out because I regularly see numbers like this used to justify claims that individuals vary significantly in talent or productivity. It’s important to keep the business model in mind if you’re claiming talent based on sales!
(Research citations are also a winner-takes-most market; people end up citing the same paper even if it’s not much better than the next best paper.)
I fully agree with this, and think we essentially say as much in the post/document. This is e.g. why we’ve raised different explanations in the 2nd paragraph, immediately after referring to the phenomenon to be explained.
Curious if you think we could have done a better job at clarifying that we don’t think differences in outcomes can only be explained by differences in talent?
Let me try a different framing and see if that helps. Economic factors mediate how individual task performance translates into firm success. In industries with winner-takes-most effects, small differences in task performance cause huge differences in payoffs. “The Economics of Superstars” is a classic 1981 paper on this. But many industries aren’t like that.
Knowing your industry tells you how important it is to hire the right people. If you’re hiring someone to write an economics textbook (an example from the “Superstars” paper), you’d better hire the best textbook-writer you can find, because almost no one buys the tenth-best economics textbook. But if you’re running a local landscaping company, you don’t need the world’s best landscaper. And if your industry has incumbent “superstar” firms protected by first-mover advantages, economies of scale, or network effects, it may not matter much who you hire.
So in what kind of “industry” are the EA organizations you want to help with hiring? Is there some factor that multiplies or negates small individual differences in task performance?
I think that’s a good summary, but it’s not only winner-takes-all effects that generate heavy-tailed outcomes.
You can get heavy-tailed outcomes if performance is the product of two normally distributed factors (e.g. intelligence x effort).
It can also arise from the other factors that Max lists in another comment (e.g. scalable outputs, complex production).
Luck can also produce heavy-tailed outcomes if it amplifies outcomes or is itself heavy-tailed.
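To make these mechanisms concrete, here is a minimal toy simulation (my own sketch; the distributions and parameters are arbitrary assumptions, not taken from the post). It compares the share of total output captured by the top 1% under an additive baseline, a product of two normal factors, and compounding multiplicative luck; the exact numbers depend entirely on the made-up parameters, so only the qualitative ordering matters.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

def top_share(x, pct=0.01):
    """Fraction of total output produced by the top `pct` of performers."""
    cutoff = np.quantile(x, 1 - pct)
    return x[x >= cutoff].sum() / x.sum()

# (a) Additive / Gaussian baseline: thin-tailed output.
additive = np.clip(rng.normal(1.0, 0.3, n), 0, None)

# (b) Product of two normally distributed factors (e.g. intelligence x effort).
product = (np.clip(rng.normal(1.0, 0.4, n), 0, None)
           * np.clip(rng.normal(1.0, 0.4, n), 0, None))

# (c) Multiplicative luck: many small proportional shocks compound into an
#     approximately log-normal, heavy-tailed outcome.
multiplicative = np.exp(rng.normal(0.0, 0.3, (n, 10)).sum(axis=1))

for name, x in [("additive", additive), ("product", product),
                ("multiplicative", multiplicative)]:
    print(f"{name:>14}: top 1% of performers produce {top_share(x):.1%} of total output")
```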
My point is more “context matters,” even if you’re talking about a specific skill like programming, and that the contexts that generated the examples in this post may be meaningfully different from the contexts that EA organizations are working in.
I don’t necessarily disagree with anything you and Max have written; it’s just a difference of emphasis, especially when it comes to advising people who are making hiring decisions.
I was going to raise a similar comment to what others have said here. I hope this adds something.
I think we need to distinguish quality and quantity of ‘output’ from ‘success’ (the outcome of their output). I am deliberately not using ‘performance’ as it’s unclear, in common language, which one of the two it refers to. Some outputs are highly reproducible—anyone can listen to the same music track, or read the same academic paper. There are often huge rewards to being the best vs second best—e.g. winning in sports. And sometimes success generates further success (the ‘Matthew effect’)—more people want to work with you, etc. Hence, I don’t find it at all weird to think that small differences in outputs, as measured on some cardinal scale, sometimes generate huge differences in outcomes.
I’m not sure exactly what follows from this. I’m a bit worried you’re concentrating on the wrong metric—success—when it’s outputs that are more important. Can you explain why you focus on outcomes?
Let’s say you’re thinking about funding research. How much does it matter to fund the best person? I mean, they will get most of the credit, but if you fund the less-than-best, that person’s work is probably not much worse and ends up being used by the best person anyway. If the best person gets 1,000 more citations, should you be prepared to spend 1,000 more to fund their work? Not obviously.
I’m suspicious you can do a good job of predicting ex ante outcomes. After all, that’s what VCs would want to do and they have enormous resources. Their strategy is basically to pick as many plausible winners as they can fund.
It might be interesting to investigate differences in quality and quantity of outputs separately. Intuitively, it seems the best people do produce lots more work than the good people, but it’s less obvious that the quality of the best people is much higher than that of the good. I recognise all these terms are vague.
On your main point, this was the kind of thing we were trying to make clearer, so it’s disappointing that hasn’t come through.
Just on the particular VC example:
Most VCs only pick from the top 1-5% of startups. E.g. YC’s acceptance rate is 1%, and very few startups they reject make it to Series A. More data on VC acceptance rates here: https://80000hours.org/2014/06/the-payoff-and-probability-of-obtaining-venture-capital/
So, while I think it’s mostly luck once you get down to the top 1-5%, there are a lot of predictors before that.
Also see more on predictors of startup performance here: https://80000hours.org/2012/02/entrepreneurship-a-game-of-poker-not-roulette/
YC having a low acceptance rate could mean they are highly confident in their ability to predict ex ante outcomes. It could also mean that they get a lot of unserious applications. Essays such as this one by Paul Graham bemoaning the difficulty of predicting ex ante outcomes make me think it is more the latter. (“it’s mostly luck once you get down to the top 1-5%” makes it sound to me like ultra-successful startups should have elite founders, but my take on Graham’s essay is that ultra-successful startups tend to be unusual, often in a way that makes them look non-elite according to traditional metrics—I tend to suspect this is true of exceptionally innovative people more generally)
Hello Ben.
I’m not trying to be obtuse, it wasn’t super clear to me on a quick-ish skim; maybe if I’d paid more attention I’d have clocked it.
Yup, I was too hasty on VCs. It seems like they are pretty confident they know who the top ~5% are, but not that they can say anything more precise than that. (Although I wonder what evidence indicates they can reliably tell the top 5% from those below, rather than just thinking they can.)
The Canadian Inventor’s Assistance Program (IAP) provides inventors, for a nominal fee, with a rating of how good their invention is. A large fraction of the people who get a bad rating try to make a company anyway, so we can judge the accuracy of the program’s evaluations.
55% of the inventions given the highest rating achieve commercial success, compared to 0% of those given the lowest rating.
https://www.researchgate.net/publication/227611370_Profitable_Advice_The_Value_of_Information_Provided_by_Canadas_Inventors_Assistance_Program
Ah, this is great. Evidence the selectors could tell the top 2% from the rest, but 2%-20% was much of a muchness. Shame that it doesn’t give any more information on ‘commercial success’.
This is amazing data, and not what I would have expected—I’ve just had my mind changed on the predictability of invention success. Thanks!
This is really cool, thank you!
That’s very interesting, thanks for sharing!
ETA: I’ve added this to our doc acknowledging your comment.
FWIW I think it’s the authors’ job to anticipate how their audience is going to engage with their writing, where they’re coming from, etc. You were not the only one who reacted by pushing back against our framing, as is evident e.g. from Khorton’s much-upvoted comment.
So no matter what we tried to convey, and what info is in the post or document if one reads closely enough, I think this primarily means that I (as main author of the wording in the post) could have done a better job, not that you or anyone else is being obtuse.
I agree that looking at e.g. VC practices is relevant evidence. However, it seems to me that if VCs thought they couldn’t predict anything, they would allocate their capital by a uniform lottery among all applicants, or something like that. I’m not aware of a VC adopting such a strategy (though possible I just haven’t heard of it); to the extent that they can distinguish “plausible” from “implausible” winners, this does suggest some amount of ex-ante predictability. Similarly, my vague impression is that VCs and other investors often specialize by domain/sector, which suggests they think they can utilize their knowledge and network when making decisions ex ante.
Sure, predictability may be “low” in some sense, but I’m not sure we’re saying anything that would commit us to denying this.
Yeah, I’d be interested to know if VCs were better than chance. Not quite sure how you would assess this, but probably someone’s tried.
But here’s where it seems relevant. If you want to pick the top 1% of people, as they provide so much of the value, but you can only pick the top 10%, then your efforts to pick are much less cost-effective and you would likely want to rethink how you did it.
I think it’s plausible that VCs aren’t better than chance when choosing between a suitably restricted “population”, i.e. investment opportunities that have passed some bar of “plausibility”.
I don’t think it’s plausible that they are no better than chance simpliciter. In that case I would expect to see a lot of VCs who cut costs by investing literally zero time into assessing investment opportunities and fund on a first-come, first-served or lottery basis.
And yes, I totally agree that how well we can predict (rather than just the question whether predictability is zero or nonzero) is relevant in practice.
If the ex-post distribution is heavy-tailed, there are a bunch of subtle considerations here I’d love someone to tease out. For example, if you have a prediction method that is very good for the bottom 90% but biased toward ‘typical’ outcomes, i.e. the median, then you might be better off in expectation to allocate by a lottery over the full population (b/c this gets you the mean, which for heavy-tailed distributions will be much higher than the median).
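Here is a rough sketch of that lottery point (my own toy example; the log-normal population and the stylised predictor are arbitrary assumptions, not something from the post or document):

```python
import numpy as np

rng = np.random.default_rng(0)
trials, k = 20_000, 50            # funding/hiring rounds, candidates per round
sigma = 3.0                       # very heavy-tailed log-normal outcomes

true = rng.lognormal(mean=0.0, sigma=sigma, size=(trials, k))
median = np.exp(0.0)              # population median of the log-normal
p90 = np.exp(sigma * 1.2816)      # population 90th percentile

# Stylised predictor: perfect below the 90th percentile, but predicts only
# the 'typical' (median) outcome for anyone in the top 10%.
pred = np.where(true < p90, true, median)

picked = true[np.arange(trials), pred.argmax(axis=1)]   # select best-looking candidate
lottery = true.mean()                                    # a uniform lottery gets the mean

print(f"average outcome, predictor-based selection: {picked.mean():7.1f}")
print(f"average outcome, lottery over everyone:     {lottery:7.1f}")
print(f"median outcome (for reference):             {np.median(true):7.1f}")
```

With these made-up numbers the lottery comes out ahead, but the magnitudes are artefacts of the chosen parameters.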
Data from the IAP indicates that they can identify the top few percent of successful inventions with pretty good accuracy. (Where “success” is a binary variable – not sure how they perform if you measure financial returns.)
I’m not sure I agree that outputs are more important. I think it depends a lot on the question or decision we’re considering, which is why I highlighted a careful choice of metric as one of the key pieces of advice.
So e.g. if our goal is to set performance incentives (e.g. salaries), then it may be best to reward people for things that are under their control. E.g. pay people more if they work longer hours (inputs), or if there are fewer spelling mistakes in their report (cardinal output metric) or whatever. At other times, paying more attention to inputs or outputs rather than outcomes or things beyond the individual performer’s control may be justified by considerations around e.g. fairness or equality.
All of these things are of course really important to get right within the EA community as well, whether or not we care about them instrumentally or intrinsically. There are a lot of tricky and messy questions here.
But if we can say anything general, then I think that especially in EA contexts we care more, or more often, about outcomes/success/impact on the world, and less about inputs and outputs, than usual. We want to maximize well-being, and from ‘the point of view of the universe’ it doesn’t ultimately matter if someone is happy because someone else produced more outputs or because the same outputs had greater effects. Nor does it ultimately matter if impact differences are due to differences in talent, resource endowments, motivation, luck, or …
Another way to see this is that often actors that care more about inputs or outputs do so because they don’t internalize all the benefits from outcomes. But if a decision is motivated by impartial altruism, there is a sense in which there are no externalities.
Of course, we need to make all the usual caveats against ‘naive consequentialism’. But I do think there is something important in this observation.
I was thinking the emphasis on outputs might be the important part as those are more controllable than outcomes, and so the decision-relevant bit, even though we want to maximise impartial value (outcomes).
I can imagine someone thinking the following way: “we must find and fund the best scientists because they have such outsized outcomes, in terms of citations.” But that might be naive if it’s really just the top scientist who gets the citations and the work of all the good scientists has a more or less equal contribution to impartial value.
FWIW, it’s not clear we’re disagreeing!
Thanks for this comment!
I’m sympathetic to the point that we’re lumping together quite different things under the vague label “performance”, perhaps stretching it beyond its common use. That’s why I said in bold that we’re using a loose notion of performance. But it’s possible it would have been better if I had spent more time coming up with better terminology.
Okay good! Yeah, I would be curious to see how much the analysis would change if it distinguished outputs from outcomes and, further, between different types of outputs.
I think the language of a person who “achieves orders of magnitude more” suggests that their output (research, book, etc) is orders of magnitude better, instead of just being more popular. Sometimes more popular is better, but often in EA that’s not what we’re focused on.
I also believe you’re talking about hiring individuals in this piece(?), but most of your examples are about successful teams, which have different qualities to successful individuals.
I thought your example of Math Olympiad scores correlating with Nobel Prize wins was a useful exception to this trend, because it is about individuals and isn’t just about popularity.
Thanks for clarifying!
FWIW I think I see the distinction between popularity and other qualities as less clear than you seem to. For instance, I would expect that book sales and startup returns are also affected by how “good” in whatever other sense the book or startup product is. Conversely, I would guess that realistically Nobel Prizes and other scientific awards are also about popularity and not just about the quality of the scientific work by other standards. I’m happy to concede that, in some sense, book sales seem more affected by popularity than Nobel Prizes, but it seems a somewhat important insight to me that neither is “just about popularity” nor “just about achievement/talent/quality/whatever”.
It’s also not that clear to me whether there is an obviously more adequate standard of overall “goodness” here: how much joy the book brings readers? What literary critics would say about the book? I think the ultimate lesson here is that the choice of metric is really important, and depends a lot on what you want to know or decide, which is why “Carefully choose the underlying population and the metric for performance” is one of our key points of advice. I can see that saying something vague and general like “some people achieve more” and then giving examples of specific metrics pushes against this insight by suggesting that these are the metrics we should generally most care about. FWIW I still feel OK about our wording here since I feel like in an opening paragraph we need to balance nuance/detail and conciseness / getting the reader interested.
As an aside, my vague impression is that it’s somewhat controversial to what extent successful teams have different qualities to successful individuals. In some sense this is of course true since there are team properties that don’t even make sense for individuals. However, my memory is that for a while there was some more specific work in psychology that was allegedly identifying properties that predicted team success better than the individual abilities of its members, which then largely didn’t replicate.
Woolley et al. (2010) was an influential paper arguing that individual intelligence doesn’t predict collective intelligence well. Here’s one paper criticising them. I’m sure there are plenty of other relevant papers (I seem to recall one paper providing positive evidence that individual intelligence predicted group performance fairly well, but can’t find it now).
Great, thank you! I do believe work by Woolley was what I had in mind.
Fwiw, I wrote a post explaining such dynamics a few years ago.
Agreed. The slight initial edge that drives the eventual enormous success in the winner-takes-most market can also be provided by something other than talent — that is, by something other than people trying to do things and succeeding at what they tried to do. For example, the success of Fifty Shades of Grey seems best explained by luck.
I was going to comment something to this effect, too. The authors write:
“For instance, we find ‘heavy-tailed’ distributions (e.g. log-normal, power law) of scientific citations, startup valuations, income, and media sales. By contrast, a large meta-analysis reports ‘thin-tailed’ (Gaussian) distributions for ex-post performance in less complex jobs such as cook or mail carrier: the top 1% account for 3-3.7% of the total.”
But there’s an important difference between these groups – the products involved in the first group are cheaply reproducible (any number of people can read the same papers, invest in the same start-up or read the same articles – I don’t know how to interpret income here) & those in the second group are not (not everyone can use the same cook or mail carrier).
So I propose that the difference there has less to do with the complexity of the jobs & more to do with how reproducible the products involved are.
I think you’re right that complexity at the very least isn’t the only cause/explanation for these differences.
E.g. Aguinis et al. (2016) find that, based on an analysis of a very large number of productivity data sets, the following properties make a heavy-tailed output distribution more likely:
Multiplicity of productivity,
Monopolistic productivity,
Job autonomy,
Job complexity,
No productivity ceiling (I guess your point is a special case of this: if the marginal cost of increasing output becomes too high too soon, there will effectively be a ceiling; but there can also e.g. be ceilings imposed by the output metric we use, such as when a manager gives a productivity rating on a 1-10 scale)
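As a quick toy illustration of the ceiling point (my own sketch, with arbitrary numbers): capping what the metric can record makes an underlying heavy-tailed productivity distribution look much thinner-tailed in the data.

```python
import numpy as np

rng = np.random.default_rng(0)
true_output = rng.lognormal(mean=0.0, sigma=1.5, size=1_000_000)  # heavy-tailed

def top1_share(x):
    """Share of total recorded output credited to the top 1% of people."""
    k = max(1, int(0.01 * len(x)))
    return np.sort(x)[-k:].sum() / x.sum()

# A measurement ceiling: nothing above the 95th percentile can be recorded
# (think of a bounded rating scale or a capped output metric).
ceiling = np.quantile(true_output, 0.95)
capped = np.minimum(true_output, ceiling)

print(f"top 1% share, true output:   {top1_share(true_output):.1%}")
print(f"top 1% share, capped metric: {top1_share(capped):.1%}")
```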
As we explain in our document, I have some open questions about the statistical approach in the Aguinis et al. paper, so I currently don’t take their analysis to be strong evidence that this is in fact right. However, these factors also sound right to me just based on priors and theoretical considerations (such as the ones in our section on why we expect heavy-tailed ex-ante performance to be widespread).
In the part you quoted, I wrote “less complex jobs” because the data I’m reporting is from a paper that explicitly distinguishes low-, medium-, and high-complexity jobs, and finds that only the first two types of job potentially have a Gaussian output distribution (this is Hunter et al. 1990). [TBC, I understand that the reader won’t know this, and I do think my current wording is a bit sloppy/bad/will predictably lead to the valid pushback you made.]
[References in the doc linked in the OP.]
Thanks for the clarification & references!
This is cool.
One theoretical point in favour of complexity as an explanation is that complex production often looks like an ‘O-ring’ process, which will create heavy-tailed outcomes.
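As a toy illustration of that O-ring intuition (my own sketch, not Kremer’s actual model; the per-task quality distribution is an arbitrary assumption): when output is the product of the quality of many tasks, output becomes increasingly concentrated in the top performers as the number of tasks, i.e. the ‘complexity’, grows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_workers = 200_000

def top1_share(output):
    """Share of total output produced by the top 1% of workers."""
    cutoff = np.quantile(output, 0.99)
    return output[output >= cutoff].sum() / output.sum()

for n_tasks in [1, 5, 20, 50]:
    # Per-task quality is a random factor in (0, 1); overall output is the
    # product across tasks, so one weak link drags the whole product down.
    quality = rng.beta(8, 2, size=(n_workers, n_tasks))
    output = quality.prod(axis=1)
    print(f"{n_tasks:2d} tasks: top 1% of workers produce {top1_share(output):.1%} of output")
```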