Thanks, this is a really important topic and it’s a nice overview. Exploring classifications of the kinds of intervention available to us is great.
The two pieces I know of which are closest to this are Nick Bostrom’s paper Existential Risk Prevention as Global Priority, and Toby Ord’s article The timing of labour aimed at reducing existential risk. Ord’s “course setting” is a slightly broader bucket than your “futures research”. I wonder if it’s a more natural one? If not, where would you put the other course-setting activities (which could include either of those pieces, your article, or my paper on allocating risk mitigation for risks at different times)?
I think that the relative value of the different types of intervention changes according to when the risk is coming: the longer we have before ‘crunch time’, the better the start of the list looks, and the worse the end looks. This is complicated by uncertainty over how long we may have.
Thanks for the feedback and the references!
I think Ord’s “course setting” is very close to my type II. The activities you mentioned belong to type II inasmuch as they consider specific scenarios, or to type I inasmuch as they raise general awareness of the subject.
Regarding relative value vs. time: I absolutely agree! This is part of the point I was trying to make.
Btw, I was somewhat surprised by Ord’s assessment of the value of current type III interventions in AI; I have a very different view. In particular, the 25–35 year time window he mentions strikes me as very short because of what Ord calls “serial depth effects”. He mentions examples from the business literature on the time scale of several years, but I think the time scale for this type of research is larger by orders of magnitude. AI safety research seems to me similar to fundamental research in science and mathematics: driven mostly by a small pool of extremely skilled individuals, with many dependent steps, and thus very difficult to scale up.
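To make the serial-depth worry slightly more concrete, here is a toy Amdahl-style bound (my own illustration; Ord’s article doesn’t use this formalism). Suppose the total work is $W$ researcher-years, a fraction $s$ of which consists of steps that must be done in sequence, and $n$ researchers work in parallel:

\[
T(n) \;=\; sW + \frac{(1-s)\,W}{n} \;\ge\; sW .
\]

No amount of extra hiring pushes the calendar time below the serial core $sW$; if $sW$ for AI safety is measured in decades rather than years, a 25–35 year window is tight.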
I agree that AI safety has some similarities to those fields, but:
- I guess you may be overestimating the effect of serial depth in those fields. While there is quite a lot of material that builds on other material, there are also a lot of different directions that get pushed on simultaneously.
- AI safety as a field is currently tiny. It could absorb many more (extremely skilled) researchers before they started seriously treading on each other’s toes by researching the same things at the same time.
I think some type III interventions are valuable now, but mostly for their instrumental effects in helping type I and type II, or for helping with scenarios where AI comes surprisingly soon.
I think the distance between our current understanding of AI safety and the required one is of a similar order of magnitude to the distance between the invention of the Dirac sea in 1930 and the discovery of asymptotic freedom in non-Abelian gauge theory in 1973. That is 43 years of well-funded research by the top minds of mankind, and that is without taking the engineering part of the project into account.
If the remaining time frame for solving FAI is 25 years, then:
- We’re probably screwed anyway.
- We need to invest all possible effort into FAI, since the tail of the probability distribution is probably fast-falling.
On the other hand, my personal estimate of the time to human-level AI is about 80 years. This is still not that long.
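For what it’s worth, here is one way to formalize the “fast-falling tail” point above (my own sketch, not a calculation from any of the references). Let $T$ be the serial research time required, with CDF $F$ and density $f$, let $D$ be the calendar time available, and suppose extra effort buys an effective speed-up factor $k$, so that

\[
\Pr[\text{success}] = F(kD), \qquad \frac{\partial}{\partial k}\Pr[\text{success}] = D\,f(kD) .
\]

If $D = 25$ years sits deep in the left tail of $f$ (say $T$ is roughly lognormal with a median several decades out), then $F(D)$ is small, which is the first point; but $D\,f(kD)/F(kD)$ is large there, so each increment of effort multiplies the success probability substantially, which is the second.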
Could you say something about why your subjective probability distribution for the difficulty is so tight? I think it is very hard to predict in advance how difficult these problems are; witness the distribution of solution times for Hilbert’s problems.
Even if you’re right, I think that says we should try to get quickly to the point of having a serious, large programme. It’s not clear that the route to that means focusing on direct work at the margin now. It will involve some, but mostly because of the instrumental benefits of helping to grow the pool of people working on it, and because it’s hard to scale up overnight later.
My distribution isn’t tight, I’m just saying there is a significant probability of large serial depth. You are right that much of the benefit of current work is “instrumental”: interesting results will convince other people to join the effort.
Right now my guess is that a combination of course-setting (Type II?) and some relatively targeted sociocultural intervention (things like movement growth—perhaps this is better classed with course-setting) are the best activities. But I think Type I and Type III are both at least plausible.