I’ve been increasingly hearing advice to the effect that “stories” are an effective way for an AI x-safety researcher to figure out what to work on: that sketching scenarios for how you think things could go well or poorly, and then doing backward induction from them to derive a research question, is better than traditional methods of finding one. Do you agree with this? It seems like the uncertainty in such scenarios is so massive that one couldn’t make a dent in it, but do you think it’s valuable for AI x-safety researchers to make significant (i.e. more than 30% of their time) investments in both 1. doing this directly, by telling stories and attempting backward induction, and 2. training so that their stories better reflect reality (by studying forecasting, for instance)?
I would love to see more stories of this form, and I think writing them is a worthwhile area of research in its own right, one that could help inform strategy at Open Phil and elsewhere. With that said, I wouldn’t advise everyone trying to do technical AI alignment to choose their research questions based on an exercise like this. Doing it well can be very laborious, and the technical research route that makes the most sense for you will probably be shaped by a lot of considerations the exercise doesn’t capture: your existing background, your native research intuitions and aesthetic (which often determine which approaches you’ll be able to find any purchase on), what mentorship opportunities are available to you and what your potential mentors are interested in, and so on.