The “human range” at various tasks is much larger than one would naively think because most people don’t obsess over becoming good at any metric, let alone the types of metrics on which GPT-4 seems impressive. Most people don’t obsessively practice anything. There’s a huge difference between “the entire human range at some skill” and “the range of top human experts at some skill.” (By “top experts,” I mean people who’ve practiced the skill for at least a decade and consider it their life’s primary calling.)
GPT-4 hasn’t entered the range of “top human experts” for domains that are significantly more complex than stuff for which it’s easy to write an evaluative function. If we made a list of the world’s best 10,000 CEOs, GPT-4 wouldn’t make the list. Once it ranks at 9,999, I’m pretty sure we’re <1 year away from superintelligence. (The same argument goes for top-10,000 LessWrong-type generalists or top-10,000 machine learning researchers who occasionally make new discoveries. Also top-10,000 novelists, but maybe not on a gamifiable metric like “popular appeal.”)
I think it’s plausible that the top-10,000 human range is fairly wide. E.g., in a pretty quantifiable domain like chess, the 10,000th position is not even international master (the level below GM) and will be ~2400 Elo,[1] while the top player is ~2850, for a difference of ~450 Elo points. If I’m reading the AI Impacts analysis correctly, chess AI took on the order of 7-8 years to make this much progress.
The bottom 1% of chess players is maybe 500 rating.[2] So measured by Elo, the entire practical human range is about 2400 points (2900-500), or about 5.3x the difference within the top 10,000.
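To make the arithmetic above easy to check, here is a minimal sketch in Python. The rating figures are the rough estimates quoted in this comment, not official data, and the variable names are just illustrative. The `expected_score` helper uses the standard logistic Elo formula, which is my addition rather than part of the original argument. It is only there to give a feel for what a 450-point gap means in practice.

```python
# Rough figures quoted in the comment (estimates, not official data).
TOP_PLAYER = 2850        # approximate rating of the top player
RANK_10K = 2400          # approximate rating at 10,000th place
HUMAN_MAX = 2900         # rounded ceiling used for the full human range
BOTTOM_1_PERCENT = 500   # rough rating of the bottom 1% of players


def expected_score(rating_gap: float) -> float:
    """Expected score of the higher-rated player under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


top_10k_gap = TOP_PLAYER - RANK_10K          # spread within the top 10,000
full_range = HUMAN_MAX - BOTTOM_1_PERCENT    # spread across nearly all players
ratio = full_range / top_10k_gap             # how many top-10k gaps fit in the full range

print(f"top-10k gap: {top_10k_gap} Elo")
print(f"full practical range: {full_range} Elo")
print(f"ratio: {ratio:.1f}x")
print(f"expected score across the top-10k gap: {expected_score(top_10k_gap):.2f}")
```

With these inputs, the top player's expected score against the 10,000th-ranked player comes out at roughly 0.93. The bottom-1% figure comes from chess.com while the top figures are FIDE-based, so the ratio is only a ballpark comparison across two rating pools.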
Estimates put the number of people who have ever played chess at hundreds of millions, so this isn’t an artifact of too few people playing chess.
I wouldn’t be surprised if there are similarly large differences at the top in other domains.
For example, the difference between the 10,000th ML researcher and the best is probably the difference between a recent PhD graduate from a decent-but-not-amazing ML university and a Turing award winner.
Great points! I think “top-1,000” would’ve worked better for the point I wanted to convey.
I had the intuition that there are more (aspiring) novelists than competitive game players, but on reflection, I’m not sure that’s correct.
I think the AI history for chess is somewhat unusual compared to the games where AI made headlines more recently, because AI spent a lot longer within the range of human chess professionals. We can try to tell various stories about why that is. On the “hard takeoff” side of arguments, maybe chess is particularly suited for AI, and maybe humans, including Kasparov, simply weren’t that good before chess AI helped them understand better strategies. On the “slow(er) takeoff” side, maybe the progress in Go or poker or Diplomacy looks more rapid mostly because there was a hardware overhang and researchers didn’t bother to put a lot of effort into these games before it became clear that AI could beat human experts.
Yeah, I think I sort of Aumann-absorbed the idea that AIs would skip over the human level because it has no special significance for them without wondering exactly how wide that human level should be. I think what I had in mind was competence greater than what I thought was humanly possible in one intellectual field that I respected, so more like physics or infosec than climbing. I think my prior was that it would be easier to build specialized systems, so that it had probably not crossed my mind that an AI could be superhuman by being a bit above average in way more fields than any human could.
Eliezer mentioned in a recent interview that he also considers himself to have been wrong about that. This could be a bit of a silver lining. If AI goes from capybara levels of smart straight to NZT-48 levels of smart, no one will be prepared. As it stands, no one will be prepared either, but it’s at least a bit less dignified to not be prepared now.
[1] Deduced from the table at https://en.wikipedia.org/wiki/FIDE_titles#cite_ref-fideratings_5-0. By assumption, almost all top chess players have a FIDE rating.
[2] https://i.redd.it/s5smrjgqhjnx.png (using chess.com figures rather than FIDE because I assume FIDE ratings are pretty truncated: very casual players do not go to clubs).