Scott Alexander comments on Scoring forecasts from the 2016 “Expert Survey on Progress in AI”

Scott Alexander 6 Mar 2023 21:20 UTC
7 points
Update: I think Bing passes the high school essay bar, based on the section “B- Essays No More” at https://oneusefulthing.substack.com/p/i-hope-you-werent-getting-too-comfortable
- PatrickL 8 Mar 2023 12:23 UTC
  3 points
  Yeah, good find. I also think that passes the bar, although I do think people have generally overestimated GPT’s essay-writing ability compared to humans, and I might be falling for that here.
  
  I’m not planning to change the doc because Bing’s AI wasn’t released by Feb 23, but if you think it should be included (which would be reasonable given OpenAI pretty obviously made this before Feb 23), it would mean:
  - Experts expected 9 milestones to be met, versus the 11 actually met
  - The calibration curve looks four percentage points worse at the 10% mark
  - Bulls’ Brier score: 0.29
  - Experts’ Brier score: 0.24
  - Bears’ Brier score: 0.29
  I’ve added it to this tracker of milestones (feel free to request edit access).
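
For readers unfamiliar with the metric, here is a minimal sketch of how a Brier score like the ones listed above can be computed. It assumes the standard binary formulation (mean squared difference between the forecast probability and the 0/1 outcome); the thread does not say exactly which variant was used. The function name `brier_score` and the example numbers are hypothetical and are not taken from the survey data.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Assumes the standard binary Brier score: lower is better, 0 is perfect.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    if not forecasts:
        raise ValueError("need at least one forecast")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical example: four milestones, two of which were met.
example_forecasts = [0.9, 0.5, 0.1, 0.7]  # illustrative probabilities only
example_outcomes = [1, 0, 0, 1]           # 1 = milestone met, 0 = not met
example_score = brier_score(example_forecasts, example_outcomes)

# Reference point: always forecasting 50% scores 0.25 on any set of outcomes.
coin_flip_score = brier_score([0.5, 0.5, 0.5], [1, 0, 1])
```

Under this formulation, a forecaster who always answers 50% scores exactly 0.25, which is a common baseline for reading scores like the ones in the list.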