I think you might be engaging in a bit of motte-and-baileying here. Throughout this comment, you’re stating MIRI’s position as things like “it will be hard to make ASI safe”, that AI will “win”, and that it will be hard for an AI to be perfectly aligned with “human flourishing”. Those statements seem pretty reasonable.
But the actual stance of MIRI, which you just released a book about, is that there is an extremely high chance that building powerful AI will result in everybody on planet Earth being killed. That’s a much narrower and more specific claim. You can imagine a lot of scenarios where AI is unsafe, but not in a way that kills everyone. You can imagine cases where AI “wins” but decides to cut a deal with us. You can imagine cases where an AI doesn’t care about human flourishing because it doesn’t care about anything, and ends up acting like a tool that we can direct as we please.
I’m aware that you have counterarguments for all of these cases (which I will probably disagree with). But those counterarguments will have to be rooted in the nuts-and-bolts details of how actual, physical AI works. And if you’re trying to reason about future machines, you need to be able to make good predictions about their actual characteristics.
I think in this context, it’s totally reasonable for people to look at your (in my opinion poor) track record of prediction and adjust their credence in your effectiveness as an institution.
I disagree re: motte and bailey; the above is not at all in conflict with the position of the book (which, to be clear, I endorse and agree with; it is also my position).
re: “you can imagine,” I strongly encourage people to be careful about leaning too hard on their own ability to imagine things; it’s often fraught, and a huge chunk of the work MIRI does is poking at those imaginings to see where they collapse.
I’ll note that core MIRI predictions about e.g. how machines will be misaligned at current levels of sophistication are being borne out—things we have been saying for years about e.g. emergent drives and deception and hacking and brittle proxies. I’m pretty sure that’s not “rooted in the actual nuts and bolts details” in the way you’re wanting, but it still feels … relevant.
Thanks @Duncan Sabien for this excellent explanation. Don’t undersell yourself; I rate your communication here as at least as good as (if not better than) that of other senior MIRI people in recent years.