80% * 5% * 10% * 1% ≈ 4 x 10^-5. Hmm, yeah, this sounds right. About a 3x difference.
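As a minimal sketch of how I read the four factors (the interpretation of the 80% as P(AGI) and the 1% as the fraction of risk averted by the marginal spending is my guess, not necessarily your model):

```python
# Rough sketch of the four-factor product above. The meaning attached to each
# factor is an assumption inferred from the thread, not the original model.

p_agi = 0.80                            # assumed: P(AGI)
p_catastrophe_given_agi = 0.05          # P(existential catastrophe | AGI)
p_extinction_given_catastrophe = 0.10   # P(human extinction | existential catastrophe)
p_risk_reduced_by_spending = 0.01       # assumed: fraction of risk averted by the marginal spending

p_life_saved = (
    p_agi
    * p_catastrophe_given_agi
    * p_extinction_given_catastrophe
    * p_risk_reduced_by_spending
)
print(p_life_saved)  # ~4e-05, matching the ~4 x 10^-5 above
```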
I agree that at those numbers the case is not clear either way (a slight change in the numbers can flip the conclusion; also, not all uncertainties are created alike: a 3x edge in a highly speculative calculation might not be enough to swing you to prefer it over the much more validated and careful estimates from GiveWell).
Some numbers I disagree with:
P(existential catastrophe | AGI) = 5%. This number feels somewhat low to me, though I think it’s close to the median numbers that AI experts (not AI safety experts) put out.
P(human extinction | existential catastrophe) = 10%. This also feels low to me. Incidentally, if your probability of (extinction | existential catastrophe) is relatively low, you should also have a rough estimate of the number of expected lives saved from non-extinction existential catastrophe scenarios, because those might be significant (see the sketch after these points).
Your other 2 numbers seem reasonable at first glance. One caveat is that you might expect the next $X of spending by Open Phil on alignment to be less effective than the first $X.
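To make the non-extinction point concrete, here is an illustrative sketch. The 0.5 death fraction for non-extinction catastrophes is a pure placeholder of mine, not a number from the thread:

```python
# Illustrative only: why non-extinction existential catastrophes can matter when
# P(extinction | catastrophe) is low. Placeholder numbers, not estimates.

p_risk_chain = 0.80 * 0.05 * 0.01        # 80% * 5% * 1%, before conditioning on extinction
p_extinction_given_catastrophe = 0.10

# Placeholder assumption: fraction of present people who die in a typical
# non-extinction existential catastrophe.
death_fraction_non_extinction = 0.5

extinction_term = p_risk_chain * p_extinction_given_catastrophe   # ~4e-5, as above
non_extinction_term = (
    p_risk_chain
    * (1 - p_extinction_given_catastrophe)
    * death_fraction_non_extinction
)                                                                 # ~1.8e-4

print(extinction_term, non_extinction_term)
# With these placeholders the non-extinction term is several times larger than
# the extinction term, so dropping it could meaningfully understate the estimate.
```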
Agree the case is not very clear-cut. I remember doing some other quick modeling before and coming to a similar conclusion: under some pretty fuzzy empirical assumptions, x-safety interventions are very slightly better than global health charities for present people, assuming zero risk/ambiguity aversion, but the case is pretty unclear overall.
Yep this is roughly the right process!