But EA/XR folks don’t seem to be primarily advocating for specific safety measures. Instead, what I hear (or think I’m hearing) is a kind of generalized fear of progress. Again, that’s where I get lost. I think that (1) progress is too obviously valuable and (2) our ability to actually predict and control future risks is too low.
I think there’s a fear of progress in specific areas (e.g. AGI and certain kinds of bio) but not a general one? At least I’m in favor of progress generally and against progress in some specific areas where we have good object-level arguments for why progress in those areas in particular could be very risky.
(I also think EA/XR folks are primarily advocating for the development of specific safety measures, and not for us to stop progress, but I agree there is at least some amount of “stop progress” in the mix.)
Re: (2), I’m somewhat sympathetic to this, but all the ways I’m sympathetic to it seem to also apply to progress studies (i.e. I’d be sympathetic to “our ability to influence the pace of progress is too low”), so I’m not sure how this becomes a crux.
That’s interesting, because I think it’s much more obvious that we could successfully, say, accelerate GDP growth by 1-2 percentage points per year, than it is that we could successfully, say, stop an AI catastrophe.
The former is something we have tons of experience with: there’s history, data, economic theory… and we can experiment and iterate. The latter is something almost completely in the future, where we don’t get any chances to get it wrong and course-correct.
(Again, this is not to say that I’m opposed to AI safety work: I basically think it’s a good thing, or at least it can be if pursued intelligently. I just think there’s a much greater chance that we look back on it and realize, too late, that we were focused on entirely the wrong things.)
I just think there’s a much greater chance that we look back on it and realize, too late, that we were focused on entirely the wrong things.
If you mean like 10x greater chance, I think that’s plausible (though larger than I would say). If you mean 1000x greater chance, that doesn’t seem defensible.
In both fields you basically ~can’t experiment with the actual thing you care about (you can’t just build a superintelligent AI and check whether it is aligned; you mostly can’t run an intervention on the entire world and check whether world GDP went up). You instead have to rely on proxies.
In some ways it is a lot easier to run proxy experiments for AI alignment: you can train AI systems right now, implement actual proposals in code, run them on those systems, and see what they do; this usually takes somewhere between hours and weeks. It seems a lot harder to do this for “improving GDP growth” (though perhaps there are techniques I don’t know about).
I agree that PS has an advantage with historical data (though I don’t see why economic theory is particularly better than AI theory), and this is a pretty major difference. Still, I don’t think it goes from “good chance of making a difference” to “basically zero chance of making a difference”.
The latter is something almost completely in the future, where we don’t get any chances to get it wrong and course-correct.
Fwiw, I think AI alignment is relevant to current AI systems with which we have experience even if the catastrophic versions are in the future, and we do get chances to get it wrong and course-correct, but we can set that aside for now, since I’d probably still disagree even if I changed my mind on that. (Like, it is hard to do armchair theory without experimental data, but it’s not so hard that you should conclude that you’re completely doomed and there’s no point in trying.)