Our World in Base Rates
Epistemic Institutions
Our World In Data are excellent; they provide world-class data and analysis on a bunch of subjects. Their COVID coverage made it obvious that this is a great public good.
So far, they haven’t included data on base rates; but from Tetlock we know that base rates are the king of judgmental forecasting (EAs generally agree). Making them easily available can thus help people think better about the future. Here’s a cool corporate example.
e.g.
“85% of big data projects fail”;
“10% of people refuse to be vaccinated because they fear needles (a pre-COVID figure, so you can compare it with COVID vaccine hesitancy)”;
“11% of ballot initiatives pass”;
“7% of Emergent Ventures applications are granted”;
“50% of applicants get 80k advice”;
“x% of applicants get to the 3rd round of OpenPhil hiring”, “which takes y months”;
“x% of graduates from country [y] start a business”.
MVP:
come up with hundreds of base rates relevant to EA causes (a possible record format is sketched after this list)
scrape Wikidata or diffbot.com for them
recurse: get people to forecast the true value, or a later value (put them in a private competition on Foretold, index them on metaforecast.org)
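To make the MVP concrete, here is a minimal sketch of what one crowdsourced base-rate record could look like, assuming a simple Python schema. The field names, the example numbers, and the source URL are illustrative placeholders, not a real spec.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class BaseRate:
    """One base-rate entry (illustrative schema, not a real OWID/QURI format)."""
    question: str     # e.g. "What share of ballot initiatives pass?"
    rate: float       # point estimate, between 0 and 1
    sample_size: int  # size of the reference class behind the estimate
    source: str       # citation or URL for the underlying data
    as_of: date       # when the estimate was made

    def age_in_days(self, today: date) -> int:
        # Supports "prominently mark the age of the estimate" below.
        return (today - self.as_of).days

# Illustrative entry using a figure from the example list above; source is a placeholder.
ballots = BaseRate(
    question="What share of ballot initiatives pass?",
    rate=0.11,
    sample_size=2500,
    source="https://example.org/ballot-initiative-dataset",
    as_of=date(2020, 6, 1),
)
print(f"{ballots.rate:.0%} pass; estimate is {ballots.age_in_days(date.today())} days old")
```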
Later, QURI-style innovations: add methods to combine multiple estimates and do proper Bayesian inference on them. If we go the crowdsourcing route, we could use the infrastructure used for graphclasses (voting on edits). Prominently mark the age of the estimate.
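As a hedged sketch of the "combine multiple estimates" step: one simple option is to treat a base rate as a Beta-distributed proportion and fold in each crowdsourced count with a conjugate Beta-Binomial update. This is just one possible aggregation method, not QURI's actual machinery, and the study counts below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BetaEstimate:
    """Posterior over a base rate, represented as Beta(alpha, beta)."""
    alpha: float = 1.0  # prior pseudo-count of events
    beta: float = 1.0   # prior pseudo-count of non-events

    def update(self, events: int, total: int) -> "BetaEstimate":
        # Conjugate Beta-Binomial update: add observed counts to the pseudo-counts.
        return BetaEstimate(self.alpha + events, self.beta + (total - events))

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

# Pool three hypothetical datasets of ballot-initiative outcomes.
posterior = BetaEstimate()
for passed, total in [(11, 100), (9, 80), (14, 120)]:
    posterior = posterior.update(passed, total)

print(f"Pooled pass rate: {posterior.mean:.1%}")  # ~11.6%
```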
PS: We already sympathise with the many people who critique the use of base rates for personal probabilities.
I think this is neat.
Perhaps-minor note: if you did it at scale, I imagine you’d want something more sophisticated than coarse base rates. More like, “For a project with these parameters, our model estimates that you have an 85% chance of failure.”
I of course see this as basically a bunch of estimation functions, but you get the idea.
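To gesture at what a "bunch of estimation functions" could look like in code, here is a toy logistic model that maps project parameters to a failure probability instead of quoting one coarse base rate. The features and coefficients are made up for illustration, not fitted to any data.

```python
import math

def failure_probability(budget_musd: float, team_size: int, prior_launches: int) -> float:
    """Toy parameterised estimator; coefficients are invented, not fitted."""
    # Log-odds of failure as a linear function of (hypothetical) project features.
    log_odds = 1.95 - 0.05 * budget_musd - 0.02 * team_size - 0.3 * prior_launches
    return 1 / (1 + math.exp(-log_odds))

# A small, inexperienced project comes out near the quoted 85% failure figure.
print(f"{failure_probability(budget_musd=2, team_size=5, prior_launches=0):.0%}")  # ~85%
```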