I guess my main point is that I’d like to see some applications of this framework (and some of the other frameworks you mention too) to important longtermist problems, before I accept it as useful.
I 100% fully agree. Totally. I think a key point I want to make is that we should be testing all our frameworks against the real world and seeing what is useful. I would love to sit down with CLR or FHI or other organisations and see how this framework can be applied. (Also, expect a future post with details of some of the policy work I have been doing that uses some of this.)
(I would also love people who have alternative frameworks to be similarly testing them in terms of how much they lead to real world outputs or changes in decisions.)
I’m still unsure if this would be the best approach to reducing existential risk
The aim here is to provide a tool kit that folk can use when needed.
For example, these tools are not that useful where solutions are technical and fairly obvious. I don’t think you need to go through all these steps to conclude that we should be doing interpretability research on AI systems. But if you want to make plans to ensure that the incentives of future researchers who develop a transformative AI are aligned to the global good, then you have a complex, high-uncertainty, long-term problem, and I expect these kinds of tools become the sort of thing you would want to use.
Also, as I say in the post, more bespoke tools beat more general tools. Even in specific cases there will be other toolboxes to use: organisational design methods for aligning future actors’ incentives, vulnerability assessments for identifying risks, etc. The tools above are the most general form, for anyone to pick up and use.
I’m also sceptical about the claim that we can’t affect probabilities of lock-in events that may happen beyond the next few decades. As I also say here, what about growing the Effective Altruism/longtermist community, or saving/investing money for the future, or improving values?
I think this is a misunderstanding. You totally can affect those events. (I gave the example of patient philanthropy, which has non-negligible expected value even in 300 years.) But in most cases a good way of having an impact more than a few decades out is to map out high-level goals on shorter, decade-long timelines. On climate change we are trying to prevent disaster in 2100, but we do it by setting targets for 2050. The forestry commission might plant oak trees that will grow for hundreds of years, but they will make planting plans on 10-year cycles. Etc.
What would the 30 year vision be? What would intermediate targets be?
Some examples here if helpful.