Thanks for recording these thoughts! Here are a few responses to the criticisms.

> I think RP underrates the extent to which their default values will end up being the defaults for model users (particularly some of the users they most want to influence)
This is a fair criticism. We started this project planning to provide somewhat authoritative numbers, but we discovered that to be more difficult than we initially expected and instead opted to express significant skepticism about the default choices. Where there was controversy (for instance, over how many years forward we should look), we went with middle-of-the-road values. I agree that reasonable, well-thought-out defaults would add a lot of value. Perhaps the best way to handle controversy would be to offer several sets of parameter defaults, reflecting what different people in the community think, that users could toggle between.
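To make the toggling idea concrete, here is a rough sketch of what named default profiles might look like. This is purely hypothetical: the profile names, parameter names, and values are made up for illustration and are not the tool's actual interface.

```python
# Hypothetical named sets of parameter defaults that a user could toggle between.
# Parameter names and values are illustrative assumptions, not the model's real ones.
DEFAULT_PROFILES = {
    "conservative":       {"years_considered": 100,    "population_per_star": 0.0},
    "middle_of_the_road": {"years_considered": 10_000, "population_per_star": 1e10},
    "ambitious":          {"years_considered": 1e9,    "population_per_star": 1e15},
}

def load_defaults(profile_name: str) -> dict:
    """Return a copy of the chosen profile so user edits don't mutate the shared defaults."""
    return dict(DEFAULT_PROFILES[profile_name])

params = load_defaults("middle_of_the_road")
params["years_considered"] = 50_000   # a user can still override any single value
```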
> I found it difficult to provide very large numbers on future population per star—I think with current rates of economic and compute growth, the number of digital people could be extremely high very quickly.
The option to represent digital people via populations per star was a last-minute choice; we originally intended that parameter to represent only human populations. (It isn't even completely obvious to me that stars are the limiting factor on the number of digital people.) However, I also think this doesn't matter much, since the main aim of the project isn't really affected by exactly how valuable x-risk projects are in expectation. If you think there may be very large populations, the model is going to imply incredibly high rates of return on extinction risk work. Whether those projects are the obvious choice depends not on exactly how high the return is, but on how you feel about the risk, and the riskiness won't change with massively higher populations.
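As a very rough illustration of that last point (not RP's actual model; every number below is a made-up assumption), the risk-neutral expected value scales with populations per star, but the probability of the project making a difference does not:

```python
# Illustrative assumptions only, not the model's parameters.
REACHABLE_STARS = 1e11   # assumed number of reachable star systems
YEARS = 1e9              # assumed duration of the future, in years
P_AVERT = 1e-6           # assumed chance the project averts extinction

def expected_life_years_saved(pop_per_star: float) -> float:
    """Risk-neutral expected value of the project, in life-years."""
    return P_AVERT * REACHABLE_STARS * pop_per_star * YEARS

for pop_per_star in (1e4, 1e12):   # roughly: biological vs. digital-person scenarios
    ev = expected_life_years_saved(pop_per_star)
    print(f"pop/star = {pop_per_star:.0e} -> EV = {ev:.1e} life-years")
```

Under either population assumption the expected value is astronomical, so a risk-neutral evaluator endorses the work; what can change the verdict is how you weigh a one-in-a-million chance of mattering, and that probability is the same in both rows.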
> I think some x-risk interventions could plausibly have very long run effects on x-risk (e.g. by building an aligned super intelligence)
If you think we'll likely have an aligned superintelligence within 100 years, then you might try to model this by setting risks very low after the next century and treating your project as just a small boost toward its eventual development. However, you might not think that either aligned superintelligence or extinction is inevitable. One thing we don't try to do is model trajectory changes; those seem potentially hugely significant, but also rather difficult to model with any degree of confidence.
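Here is a minimal sketch of that modeling move, under entirely made-up risk numbers: treat the next century as risky, set later per-century risk very low (the aligned-superintelligence assumption), and represent the project as a small reduction in near-term risk.

```python
def expected_centuries_survived(risk_first_century: float,
                                risk_later_per_century: float,
                                horizon_centuries: int = 10_000) -> float:
    """Expected number of centuries survived under a two-phase risk profile."""
    survival_prob, expected = 1.0, 0.0
    for century in range(horizon_centuries):
        risk = risk_first_century if century == 0 else risk_later_per_century
        survival_prob *= 1.0 - risk
        expected += survival_prob
    return expected

baseline   = expected_centuries_survived(0.20, 1e-4)   # assumed 20% near-term risk
with_boost = expected_centuries_survived(0.19, 1e-4)   # project shaves a point off near-term risk
print(f"expected gain from the project: {with_boost - baseline:.0f} centuries")
```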
> The x-risk model seems to confuse existential risk and extinction risk (medium confidence—maybe this was explained somewhere, and I missed it)
We distinguish extinction risk from the risk of sub-extinction catastrophes, but we don't model non-extinction outcomes that would be as bad as extinction.
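A purely illustrative sketch of that outcome space (not RP's code, and the probabilities are made-up assumptions): each period either proceeds normally, suffers a recoverable sub-extinction catastrophe, or ends in extinction, and there is deliberately no branch for as-bad-as-extinction outcomes.

```python
import random

P_EXTINCTION  = 0.001   # assumed per-period extinction probability
P_CATASTROPHE = 0.010   # assumed per-period sub-extinction catastrophe probability

def simulate_future_value(periods: int = 1_000, value_per_period: float = 1.0) -> float:
    """Total value accrued in one simulated future."""
    total = 0.0
    for _ in range(periods):
        roll = random.random()
        if roll < P_EXTINCTION:
            return total                          # extinction: value stops accruing
        if roll < P_EXTINCTION + P_CATASTROPHE:
            total += 0.5 * value_per_period       # catastrophe: diminished value, then recovery
        else:
            total += value_per_period
    return total
```

So the model tracks extinction and recoverable catastrophes, but not unrecoverable non-extinction outcomes.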