Thanks a lot for this great post! I think the part I like the most, even more than the awesome deconstruction of arguments and their underlying hypotheses, is the sheer number of times you said “I don’t know” or “I’m not sure” or “this might be false”. I feel it places you at the same level as your audience (including me), in the sense that you have more experience and technical competence than the rest of us, but you still don’t know THE TRUTH, or sometimes even good approximations to it. And the standard way to present ideas and research clearly is to structure them so that the points we don’t know are not the focus. So that was refreshing.
On the more technical side, I had a couple of questions and remarks concerning your different positions.
One underlying hypothesis that was not explicitly pointed out, I think, is that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (it might be so obvious in an EA meeting or on the EA Forum that it’s not worth exploring, but I like making the obvious hypotheses explicit). But that’s different from whether we should do AI safety research at all. That is one common criticism I have of taking effective altruism career recommendations at face value: we would not have, for example, pure mathematicians, because pure mathematics is never the priority. Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist. (Note that this is not an argument for having a lot of mathematicians, just an argument for having some.)
For the problems-that-solve-themselves arguments, I feel like your examples have very “good” qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems become obvious, you can still solve them. I would argue that not all of these properties hold for AGI. What are your thoughts on that?
About the “big deal” argument, I’m not sure that another big deal before AGI would invalidate the value of current AI safety research. What seems weird in your definition of a big deal is that if I assume the big deal, then I can still make informed guesses and plans about the world after it, no? Something akin to The Age of Em by Hanson, where he starts with ems (whole-brain emulations) and then tries to derive what our current understanding of the various sciences can tell us about this future. I don’t see why you couldn’t do this even if there is another big deal before AGI. Maybe the only cost is more and more uncertainty.
The arguments you point out against the value of research now, compared to research closer to AGI, seem to forget about incremental research. Not all research is a breakthrough, and most if not all breakthroughs build on previous decades or centuries of quiet research work. In this sense, working on it now might be the only way to ensure the necessary breakthroughs closer to the deadline.
For the problems-that-solve-themselves arguments, I feel like your examples have very “good” qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems become obvious, you can still solve them. I would argue that not all of these properties hold for AGI. What are your thoughts on that?
I agree that it’s an important question whether AGI has the right qualities to “solve itself”. To go through the ones you named:
“Personal and economic incentives are aligned against them”—I think AI safety has somewhat good properties here. Basically no-one wants to kill everyone, and AI systems that aren’t aligned with their users are much less useful. On the other hand, it might be the case that people are strongly incentivised to be reckless and deploy things quickly.
“they are obvious when one is confronted with the situation”—I think that alignment problems might be fairly obvious, especially if there’s a long process of continuous AI progress where unaligned non-superintelligent AI systems do non-catastrophic damage. So this comes down to questions about how rapid AI progress will be.
“at the point where the problems become obvious, you can still solve them”—If the problems become obvious because non-superintelligent AI systems are behaving badly, then we can maybe still put more effort into aligning increasingly powerful AI systems after that, and hopefully we won’t lose that much of the value of the future.
One underlying hypothesis that was not explicitly pointed out, I think, is that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (it might be so obvious in an EA meeting or on the EA Forum that it’s not worth exploring, but I like making the obvious hypotheses explicit).
This is a good point.
Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist.
I feel pretty unsure on this point; for a contradictory perspective you might enjoy this article.
I’m curious about the article, but the link points to nothing. ^^