This is interesting. I’m strongly in favor of having rough models like this in general. Thanks for sharing!
Edit suggestions:
STI says “what percent of bad scenarios should we expect this to avert”, but the formula uses it as a fraction. Probably best to keep the formula and change the wording (the quick check after these suggestions shows how much the mismatch matters).
Would help to clarify that TXR is a probability of X-risk. (This is clear after a little thought/inspection, but might as well make it as easy to use as possible.)
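To show how much the percent-vs-fraction reading matters, here’s a quick check (illustrative numbers only; the formula TXR × STI / (2 × SOT) is my reading of the model, and all values are made up):

```python
TXR = 0.10  # made-up probability of x-risk
SOT = 1000  # made-up expected size of the safety field
STI_as_fraction = 0.30  # "averts 30% of bad scenarios", as the formula expects
STI_as_percent = 30     # the same belief, entered the way the wording suggests

# Marginal chance of averting x-risk per extra researcher, as I read the formula:
print(TXR * STI_as_fraction / (2 * SOT))  # 1.5e-05
print(TXR * STI_as_percent / (2 * SOT))   # 1.5e-03 -- silently inflated 100x
```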
Quick thoughts:
It might be helpful to talk in terms of research-years rather than researchers.
It’s slightly strange that the model assumes 1 − P(x-risk) is linear in researchers, but then only estimates the coefficient from TXR × STI / (2 × SOT), when (1 − TXR)/SOT should also be an estimate of the same coefficient (see the comparison below). It does make sense that risk would be “more nonlinear” for lower n_researchers, though.
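To spell out the mismatch, a quick comparison of the two estimates (values made up; reading (1 − TXR)/SOT as the slope implied by going from near-certain doom at zero researchers to risk TXR at SOT researchers is my interpretation):

```python
TXR = 0.10  # made-up probability of x-risk at the expected field size
STI = 0.30  # made-up fraction of bad scenarios a marginal solution averts
SOT = 1000  # made-up expected number of safety researchers

# The coefficient the model actually uses for d(1 - P(xrisk))/d(researchers):
slope_model = TXR * STI / (2 * SOT)  # 1.5e-05 per researcher

# The coefficient implied by linearity through the endpoints
# (0 researchers, survival ~ 0) and (SOT researchers, survival = 1 - TXR):
slope_endpoints = (1 - TXR) / SOT    # 9.0e-04 per researcher

print(slope_endpoints / slope_model)  # ~60x apart, so the curve can't really be linear
```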
A clear problem with this model is that, AFAICT, it assumes that (i) the size of the research community working on safety when AI is developed is independent of (ii) the degree to which adding a researcher now will change the total number of researchers.
Both (i) and (ii) can vary by orders of magnitude, at least on my model, but they are very correlated, because both depend on timelines. This means I get an oddly high chance of averting existential risk. If the questions were combined into “by what fraction will the AI safety community be enlarged by adding an extra person?”, then I think my chance of averting existential risk would come out much lower. The toy simulation below illustrates the gap.
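A toy Monte Carlo makes the gap concrete (every distribution and functional form here is made up purely for illustration; the real numbers would come from one’s timelines model):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Made-up driver: AI timelines in years, heavy-tailed around ~25.
timelines = rng.lognormal(mean=np.log(25), sigma=1.0, size=n)

# (i) Safety field size when AI arrives: grows with timelines (made up).
field_size = 300 * (timelines / 25) ** 1.5

# (ii) Counterfactual researchers added by joining now: also grows with
# timelines (more time for field-building effects), again made up.
added = timelines / 25

# Fraction of the field one extra person amounts to, computed two ways:
joint = np.mean(added / field_size)                     # correlated, as in reality
independent = np.mean(added) * np.mean(1 / field_size)  # what independent sampling gives

print(joint, independent)  # independent comes out several times larger here
```

With these made-up inputs, sampling the two variables independently inflates the answer severalfold; with more extreme tails the gap grows.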
Yes, I think this is a significant concern with this version of the model (somewhat less so with the original, cruder version using something like medians, though that version also fails to pick up on legitimate effects of “what if these variables are all in the tails”). Combining the variables as you suggest is the easiest way to patch it. A more complex fix would be to add explicit time-dependency.
When I visit the model page I see errors about improper syntax. (I assume this is because it’s publicly editable and someone accidentally messed up the syntax?)