I don’t think of our strategy as having changed much in the last year. For example, in the last AMA I said that the plan was to work on some big open problems (I named 5 here: asymptotically good reasoning under logical uncertainty, identifying the best available decision with respect to a predictive world-model and utility function, performing induction from inside an environment, identifying the referents of goals in realistic world-models, and reasoning about the behavior of smarter reasoners), and that I’d be thrilled if we could make serious progress on any of these problems within 5 years. Scott Garrabrant then promptly developed logical induction, which represents serious progress on two (maybe three) of the big open problems. I consider this to be a good sign of progress, and that set of research priorities remains largely unchanged.
Jessica Taylor is now leading a new research program, and we’re splitting our research time between this agenda and our 2014 agenda. I see this as a natural consequence of us bringing on new researchers with their own perspectives on various alignment problems, rather than as a shift in organizational strategy. Eliezer, Benya, and I drafted the agent foundations agenda when we were MIRI’s only full-time researchers; Jessica, Patrick, and Critch co-wrote a new agenda with their take once they were added to the team. The new agenda reflects a number of small changes: some updates that we’ve all made in response to evidence over the last couple of years, some writing-up of problems that we’d been thinking about for some time but which hadn’t made the cut into the previous agenda, and some legitimate differences in intuition and perspective brought to the table by Jessica, Patrick, and Critch. The overall strategy is still “do research that we think others won’t do,” and the research methods and intuitions we rely on continue to have a MIRI-ish character.
Regarding success probability, I think MIRI has a decent chance of success compared to other potential AI risk interventions, but AI risk is a hard problem. I’d guess that humanity as a whole has a fairly low probability of success, with wide error bars.
Unless I’m missing context, I think the “medium probability of success” language comes from old discussions on LessWrong about how to respond to Pascal’s mugging. (See Rob’s note about Pascalian reasoning here.) In that context, I think the main dichotomy Eliezer had in mind was between “tiny” probabilities that can be practically ignored (like the odds of winning the Powerball) and strategically relevant probabilities like 1% or 10%. See Eliezer’s post here. I’m fine with calling the latter probabilities “medium-sized” in the context of avoiding lottery-style errors, and calling them “small” in other contexts. With respect to ensuring that the first AGI designs developed by AI scientists are easy to align, I don’t think MIRI’s odds are stellar, though I do feel comfortable saying that they’re higher than 1%. Let me know if I’ve misunderstood the question you had in mind here.
I’d guess that humanity as a whole has a fairly low probability of success, with wide error bars.
Just out of curiosity, how would your estimate update if you had enough resources to do anything you deemed necessary, but not enough to affect the current trajectory of the field?
I’m not sure I understand the hypothetical—most of the actions that I deem necessary are aimed at affecting the trajectory of the AI field in one way or another.
Ok, that’s informative. So the dominant factor is not the ability to get to the finish line faster (which kind of makes sense).