If k=0—if the people at the lab consider life without AI to be as bad an outcome as an AI-induced existential catastrophe—then the probability of survival
here equals e^{-1}, which is independent of S. The safety benefits of doing more safety work S are fully offset by inducing the lab to choose higher C. S has no effect on the probability of survival.
If k>0—the (presumably more typical) case in which people at the lab consider life without AI to be a better outcome than an AI-induced existential catastrophe—then the probability of survival here equals 1 if S≤k (because then C=0), and e^{-(1-k/S)} if S>k, which decreases in S. Safety work backfires.
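For concreteness, here is a quick numerical sketch of both cases (the function name p_survival and the sample values of S and k are mine; the formula is just the piecewise expression above):

```python
import numpy as np

def p_survival(S, k):
    # Piecewise survival probability from the expression above:
    # 1 if S <= k (the lab then sets C = 0), else exp(-(1 - k/S)).
    S = np.asarray(S, dtype=float)
    return np.where(S <= k, 1.0, np.exp(-(1.0 - k / np.maximum(S, 1e-12))))

S_grid = np.array([0.5, 1.0, 2.0, 5.0, 50.0])
print(p_survival(S_grid, k=0.0))  # constant at exp(-1) ~ 0.368: more S doesn't help
print(p_survival(S_grid, k=1.0))  # 1, 1, then falling toward exp(-1): more S backfires
```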
Nice post, Phil!
e^{-1} is missing after "here equals".
e^{-(1-k/S)} is missing before "if S>k".
Whoops, thanks! Issues importing from the Google doc… fixing now.