A lot of these points seem like arguments that it's possible that unaligned AI takeover will go well, e.g. there's no reason not to think that AIs are conscious, or that they will have interesting moral values, and so on.
My stance is that we (more-or-less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good. AIs may be conscious and may have welfare-promoting values, but we don't know that yet. We should try to better understand whether AIs are worthy successors before transitioning power to them.
Probably a core point of disagreement here is whether, presented with a "random" intelligent actor, we should expect it to promote welfare or prevent suffering "by default". My understanding is that some accelerationists believe that we should. I believe that we shouldn't. Moreover, I believe that being substantially uncertain about whether this is or isn't the default is itself enough reason to take a slower and more careful approach.
My stance is that we (more-or-less) know humans are conscious and have moral values that, while they have failed to prevent large amounts of harm, seem to have the potential to be good.
I claim there's a weird asymmetry here where you're happy to put trust in humans because they have the "potential" to do good, but you're not willing to say the same for AIs, even though they seem to have the same type of "potential".
Whatever your expectations about AIs, we already know that humans are not blank slates that may or may not be altruistic in the future: we actually have a ton of evidence about the quality and character of human nature, and it doesn't make humans look great. Humans cannot accurately be described as mainly altruistic creatures. I mentioned factory farming in my original comment, but one can examine the way people spend their money (i.e. not mainly on charitable causes), or the history of genocides, war, slavery, and oppression, for additional evidence.
Probably a core point of disagreement here is whether, presented with a "random" intelligent actor, we should expect it to promote welfare or prevent suffering "by default".
I don't expect humans to "promote welfare or prevent suffering" by default either. Look at the current world. Have humans, on net, reduced or increased suffering? Even if you think humans have been good for the world, it's not obvious. Sure, it's easy to dismiss the value of unaligned AIs if you compare against some idealistic baseline; but I'm asking you to compare against a realistic baseline, i.e. actual human nature.
It seems like you're just substantially more pessimistic than I am about humans. I think factory farming will be ended, and though it seems like humans have caused more suffering than happiness so far, I think their default trajectory will be to eventually stop doing that, and to ultimately do enough good to outweigh their ignoble past. I don't think this is certain by any means, but I think it's a reasonable extrapolation. (I maybe don't expect you to find it a reasonable extrapolation.)
Meanwhile, I expect the typical unaligned AI may seize power for some purpose that seems to us entirely trivial, may be uninterested in doing any kind of moral philosophy, and/or may not place any terminal (rather than instrumental) value on attending to other sentient experiences in any capacity. I do think humans, even with their kind of terrible track record, are more promising than that baseline, though I can see why other people might think differently.
Sure, it's easy to dismiss the value of unaligned AIs if you compare against some idealistic baseline; but I'm asking you to compare against a realistic baseline, i.e. actual human nature.
I haven't read your entire post about this, but I understand you believe that if we created aligned AI, it would get essentially "current" human values, rather than e.g. some improved / more enlightened iteration of human values. If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?
If instead you believed the latter, that would set a significantly higher bar for unaligned AI, right?
That's right: if I thought human values would improve greatly in the face of enormous wealth and advanced technology, I'd definitely be open to seeing humans as special and extra valuable from a total utilitarian perspective. Note that many routes through which values could improve in the future could apply to unaligned AIs too. So, for example, I'd need to believe that humans would be more likely to reflect, and more likely to do the right type of reflection, relative to the unaligned baseline. In other words, it's not sufficient to argue that humans would reflect a little bit; that wouldn't really persuade me at all.