Your first point in your summary of my position is:
The overwhelming majority of potential moral value exists in the distant future. This implies that even immense suffering occurring in the near-term future could be justified if it leads to at least a slight improvement in the expected value of the distant future.
Here’s how I’d say it:
The overwhelming majority of potential moral value exists in the distant future. This means that the risk of wide-scale rights violations or suffering should sometimes not be an overriding consideration when avoiding that risk conflicts with protecting the long-term future.
You continue:
Enslaving AIs, or more specifically, adopting measures to control AIs that significantly raise the risk of AI enslavement, could indeed produce immense suffering in the near term. Nevertheless, according to your reasoning in point (1), these actions would still be justified if such control measures marginally increase the long-term expected value of the future.
I don’t think that it’s very likely that the experience of AIs in the five years around when they first are able to automate all human intellectual labor will be torturously bad, and I’d be much more uncomfortable with the situation if I expected it to be.
I think that rights violations are much more likely than welfare violations over this time period.
I think the use of powerful AI in this time period will probably involve less suffering than factory farming currently does. Obviously “less of a moral catastrophe than factory farming” is a very low bar; as I’ve said, I’m uncomfortable with the situation and if I had total control, we’d be a lot more careful to avoid AI welfare/rights violations.
I don’t think that control measures are likely to increase the extent to which AIs are suffering in the near term. I think the main effect control measures have from the AI’s perspective is that the AIs are less likely to get what they want.
I don’t think that my reasoning here requires placing overwhelming value on the far future.
Firstly, I think your argument creates an unjustified asymmetry: it compares short-term harms against long-term benefits of AI control, rather than comparing potential long-run harms alongside long-term benefits. To be more explicit, if you believe that AI control measures can durably and predictably enhance existential safety, thus positively affecting the future for billions of years, you should equally acknowledge that these same measures could cause lasting, negative consequences for billions of years.
I don’t think AI control techniques will remain in use for very long, because they impose much more overhead than aligning the AIs. The only reason I think control techniques might be important is that people might want to make use of powerful AIs before figuring out how to choose the goals/policies of those AIs. But if you could directly choose the AI’s goals and policies, that would be way better and cheaper.
I think maybe you’re using the word “control” differently from me—maybe you’re saying “it’s bad to set the precedent of treating AIs as unpaid slave labor whose interests we ignore/suppress, because then we’ll do that later—we will eventually suppress AI interests by directly controlling their goals instead of applying AI-control-style security measures, but that’s bad too.” I agree, I think it’s a bad precedent to create AIs while not paying attention to the possibility that they’re moral patients.
Secondly, this reasoning, if seriously adopted, directly conflicts with basic, widely-held principles of morality. These moral principles exist precisely as safeguards against rationalizing immense harms based on speculative future benefits.
Yeah, as I said, I don’t think this is what I’m doing, and if I thought that I was working to impose immense harms for speculative massive future benefit, I’d be much more concerned about my work.