I provided some comments on a draft of this post where I said that I was skeptical of the use of many of these tools for EA longtermists, although I felt they would be very useful for policymakers looking to improve the future over a shorter timeframe. On a second read I feel more optimistic about their use for EA longtermists, but am still slightly uncertain.
For example, you suggest setting a vision for a good future in 20-30 years and then designing a range of 10-year targets that move the world towards that vision. This seems reasonable a lot of the time, but I’m still unsure if this would be the best approach to reducing existential risk (which is currently the most widely accepted approach among EAs to improving the far future in expectation).
Take the existential risk of misaligned AI for example. What would the 30 year vision be? What would intermediate targets be? What is wrong with the current approach of “shout about how hard the alignment problem is to make important people listen, while also carrying out alignment research and getting people into influential positions so that they can act on this research”?
I guess my main point is that I’d like to see some applications of this framework (and some of the other frameworks you mention too) to important longtermist problems, before I accept it as useful. I think the framework does work well for more general goals like “let’s make a happier world in the next few decades” which is very vague and needs to be broken down systematically, but I’m unsure it would work well for more specific goals such as “let’s not let AI / nuclear weapons etc. destroy us”. I’m not saying the framework won’t work, but I’d like to see someone try to apply it.
Longtermists are going to have to make plans along the lines of: let’s minimise the chance we fall into a bad attractor state and maximise the chance we fall into a good attractor state within the length of time that we can reasonably influence, which is on the order of 10s to 100s of years.
I’m also sceptical about the claim that we can’t affect probabilities of lock-in events that may happen beyond the next few decades. As I also say here, what about growing the Effective Altruism/longtermist community, or saving/investing money for the future, or improving values? These are all things that many EAs think can be credible longtermist interventions and could reasonably affect chances of lock-in beyond the next few decades as they essentially increase the number of thoughtful/good people in the future or the amount of resources such people have at their disposal. I do think it is important for us to carefully consider how we can affect lock-in events over longer timescales.
I guess my main point is that I’d like to see some applications of this framework (and some of the other frameworks you mention too) to important longtermist problems, before I accept it as useful.
I 100% agree. I think a key point I want to make is that we should be testing all our frameworks against the real world and seeing what is useful. I would love to sit down with CLR or FHI or other organisations and see how this framework can be applied. (Also, expect a future post with details of some of the policy work I have been doing that uses some of this.)
(I would also love people who have alternative frameworks to similarly test them in terms of how much they lead to real-world outputs or changes in decisions.)
I’m still unsure if this would be the best approach to reducing existential risk
The aim here is to provide a toolkit that folks can use when needed.
For example, these tools are not that useful where solutions are technical and fairly obvious. I don’t think you need to go through all these steps to conclude that we should be doing interpretability research on AI systems. But if you want to make plans to ensure that the incentives of the future researchers who develop a transformative AI are aligned with the global good, then you have a complex, high-uncertainty, long-term problem, and I expect these kinds of tools become the sort of thing you would want to use.
Also, as I say in the post, more bespoke tools beat more general tools. Even in specific cases there will be other toolboxes to use: organisational design methods for aligning future actors’ incentives, vulnerability assessments for identifying risks, etc. The tools above are the most general form, for anyone to pick up and use.
I’m also sceptical about the claim that we can’t affect probabilities of lock-in events that may happen beyond the next few decades. As I also say here, what about growing the Effective Altruism/longtermist community, or saving/investing money for the future, or improving values?
I think this is a misunderstanding. You totally can affect those events. (I gave the example of patient philanthropy, which has non-negligible expected value even over 300 years.) But in most cases a good way of having an impact more than a few decades out is to map out high-level goals on shorter, decade-long timelines. On climate change we are trying to prevent disaster in 2100, but we do it by setting targets for 2050. The forestry commission might plant oak trees that will grow for hundreds of years, but it will make planting plans on 10-year cycles. Etc.
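To make the patient philanthropy point concrete, here is a minimal sketch of the compounding intuition. The 5% real return, 0.5% annual risk of losing the fund, and 300-year horizon are illustrative assumptions of mine, not figures from the post; the point is only that plausible numbers leave a large expected value even over centuries.

```python
# Minimal sketch of the patient-philanthropy intuition: invested resources
# compound, so even a far-future payout can have large expected value.
# All parameter values below are illustrative assumptions, not claims from the post.

def expected_future_value(initial: float, real_return: float,
                          annual_loss_risk: float, years: int) -> float:
    """Expected real value of an invested fund after `years`,
    allowing for a constant annual chance the fund is lost entirely."""
    survival_prob = (1 - annual_loss_risk) ** years
    grown_value = initial * (1 + real_return) ** years
    return survival_prob * grown_value

if __name__ == "__main__":
    ev = expected_future_value(initial=1.0, real_return=0.05,
                               annual_loss_risk=0.005, years=300)
    print(f"Expected value per $1 invested after 300 years: ${ev:,.0f}")
    # With these made-up numbers, each $1 is worth roughly $500,000 in expectation,
    # ignoring value drift, changing returns, and other complications.
```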
What would the 30 year vision be? What would intermediate targets be?
Some examples here if helpful.