Working at Metaculus. Formerly a data analyst at the Forecasting Research Institute. Member of the Samotsvety forecasting team. Currently ranked 2nd in INFER (formerly CSET-Foretell) all-time leaderboard (7th in current season). Interests include prediction markets, technology policy, and traditional music.
Molly Hickman
A Case for Nuanced Risk Assessment
Conditional Trees: Generating Informative Forecasting Questions (FRI) -- AI Risk Case Study
Forecasting: the way I think about it
@Bob Fischer, my understanding is the recovery from liver donation is quite a bit worse than recovery from kidney, which makes intuitive sense to me… The liver has to grow back (painfully, iiuc) versus the remaining kidney just gradually works a little harder. I don’t know how different the incisions are. FWIW, I was able to stop taking my prescribed opioid less than a week post-op, and didn’t even need acetaminophen shortly thereafter. I’m happy to tell you more about my kidney donation if that would be helpful!
Per Scott Alexander:
I donated my kidney, but I’m probably not going to donate a lobe of my liver (even though this is also mostly safe and also helps people in need). This isn’t because there’s a real distinction about which parts of my body are vs. aren’t sacred, it’s just that I guess I’m ethical enough to do something moderately hard and painful, but not to do something very hard and painful. If anyone gives you grief about admitting this, ask them how much of the axiological law they’re following.
FRI went further and quantitatively estimated how important each crux was—a great starting point towards an adversarially-collaborated synthesis.
And you can too! We evaluated cruxes on two axes: “value of information” (VOI) and “value of discrimination” (VOD). Essentially: VOI is how much someone expects to gain by finding out the answer to a given crux question (with respect to an ultimate question), and VOD is how much two people expect to converge on the ultimate question when they find out the answer to the crux question.
There’s a Google Sheets calculator, as well as an R library, which will be released on CRAN at some point.
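To make the two axes concrete, here is a minimal Python sketch. It is a deliberately simplified toy and not FRI's actual formulas: it measures "gain" as expected absolute movement in probability on the ultimate question, and it assumes the two forecasters' crux probabilities can be averaged when weighting outcomes. The names `Forecaster`, `toy_voi`, and `toy_vod` are hypothetical and are not taken from the calculator or the R library.

```python
from dataclasses import dataclass


@dataclass
class Forecaster:
    """One person's beliefs about a crux question and the ultimate question."""
    p_crux: float        # P(crux resolves yes)
    p_ult_if_yes: float  # P(ultimate question | crux resolves yes)
    p_ult_if_no: float   # P(ultimate question | crux resolves no)

    def p_ult(self) -> float:
        # Current forecast on the ultimate question (law of total probability).
        return (self.p_crux * self.p_ult_if_yes
                + (1 - self.p_crux) * self.p_ult_if_no)


def toy_voi(f: Forecaster) -> float:
    """Toy VOI: expected absolute shift in f's ultimate forecast
    once the crux resolves. ASSUMPTION: absolute shift stands in for
    'how much someone expects to gain'; FRI's metric may differ."""
    prior = f.p_ult()
    return (f.p_crux * abs(f.p_ult_if_yes - prior)
            + (1 - f.p_crux) * abs(f.p_ult_if_no - prior))


def toy_vod(a: Forecaster, b: Forecaster) -> float:
    """Toy VOD: expected reduction in the gap between a and b on the
    ultimate question once the crux resolves. ASSUMPTION: outcomes are
    weighted by the average of the two crux probabilities."""
    gap_now = abs(a.p_ult() - b.p_ult())
    p_yes = (a.p_crux + b.p_crux) / 2
    gap_later = (p_yes * abs(a.p_ult_if_yes - b.p_ult_if_yes)
                 + (1 - p_yes) * abs(a.p_ult_if_no - b.p_ult_if_no))
    return gap_now - gap_later
```

In this toy version, two people who agree on the conditional forecasts but disagree about the crux itself get a VOD equal to their whole current gap, because resolving the crux removes the only source of disagreement.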
Hi @Vasco Gril, thanks for the question. That is the standard deviation in percentage points. The distribution is decidedly un-Gaussian so the standard deviation is a little misleading.
We limited the y-axis range on the box-and-dot plots like the one on page 272 -- they’re all truncated at the 95th percentile of tournament participants + a 5% cushion (footnote on page 18) -- so the max for Stage 1 for supers was actually 21.9%.
Here are a couple more summary stats for the superforecasters, for the 2030 question. The raw data are available here if you want to explore in more detail!
  stage count   mean    sd   min  median   max
  <int> <int>  <dbl> <dbl> <dbl>   <dbl> <dbl>
1     1    88 0.510  2.60      0 0.0001   21.9
2     2    57 0.378  1.74      0 0.0001   12
3     3    16 0.0392 0.125     0 0.00075   0.5
4     4    69 0.180  1.20      0 0.0001   10
Results from an Adversarial Collaboration on AI Risk (FRI)
Poor power quality is a bottleneck to global health and development
Very excited that you’re doing this. The reading and listening list looks terrific. Here are a few suggestions, which you can take or leave!
Some perspectives from sociology and related fields:
Tacit Knowledge, Weapons Design, and the Uninvention of Nuclear Weapons. Donald MacKenzie and Graham Spinardi, 1995. (This is essential reading IMO. Extremely well-written, and a perspective I hadn’t read anywhere else. I think about it a lot.)
From ‘Inherently Safe’ to ‘Proliferation Resistant’: New Perspectives on Reactor Designs. Sonja D. Schmid, 2021.
Reimagining Nuclear Engineering. Aditi Verma and Denia Djokic, 2021.
From Accountants to Detectives: How Nuclear Safeguards Inspectors Make Knowledge at the IAEA. Anna Weichselbraun, 2020.
And a moving piece from the New Yorker (accounts from Hiroshima survivors), and the speech Eisenhower gave to the UN General Assembly:
Hiroshima. John Hersey. New Yorker, 1946.
Eisenhower’s “Atoms for Peace” speech. 1953. (Recording.)
Thanks Ivan! More work on actually articulating our models of the world: incoming...! I don’t think point forecasts accomplish this very well.