I can currently observe humans, which screens off a bunch of the comparison and lets me do direct analysis.
I’m in agreement that this consideration makes it hard to do a direct comparison. But I think this consideration should mostly make us more uncertain, rather than making us think that humans are better than the alternative. Analogy: if you rolled a die, and I didn’t see the result, the expected value is not low just because I am uncertain about what happened. What matters here is the expected value, not necessarily the variance.
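To make the die analogy concrete, here is the standard fair-die arithmetic (just an illustration of the expected-value point, nothing specific to the AI case):

\[
\mathbb{E}[X] = \sum_{k=1}^{6} \tfrac{k}{6} = 3.5,
\qquad
\operatorname{Var}(X) = \sum_{k=1}^{6} \tfrac{(k-3.5)^2}{6} \approx 2.92 .
\]

Not observing the roll leaves the distribution, and hence the expectation, at 3.5; the remaining uncertainty shows up only in the variance.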
I can directly observe AIs and make predictions about future training methods, and their values seem to result from a much more heavily optimized and precise process, with less “slack” in some sense. (Perhaps this is related to the genetic bottleneck; I’m unsure.)
I guess I am having trouble understanding this point.
AIs will primarily be trained on things which look extremely different from “cooperatively achieving high genetic fitness”.
Sure, but the question is why being different makes it worse along the relevant axes that we were discussing. The question is not just “will AIs be different than humans?” to which the answer would be “Obviously, yes”. We’re talking about why the differences between humans and AIs make AIs better or worse in expectation, not merely different.
Current AIs seem to use the vast, vast majority of their reasoning power for purposes which aren’t directly related to their final applications. I predict this will also apply to the internal high-level reasoning of future AIs. This doesn’t seem true for humans.
I am having a hard time parsing this claim. What do you mean by “final applications”? And why won’t this be true for future AGIs that are at human-level intelligence or above? And why does this make a difference to the ultimate claim that you’re trying to support?
Humans seem optimized for something which isn’t that far off from utilitarianism, from some perspective? Make yourself survive, make your kin group survive, make your tribe survive, etc.? I think utilitarianism is often a natural generalization of “I care about the experience of XYZ; it seems arbitrary/dumb/bad to draw the boundary narrowly, so I should extend this further.” (This is how I get to utilitarianism.) I think the AI optimization looks considerably worse than this by default.
This consideration seems very weak to me. Early AGIs will presumably be directly optimized for something like consumer value, which looks a lot closer to “utilitarianism” to me than the implicit values in gene-centered evolution. When I talk to GPT-4, I find that it’s way more altruistic and interested in making others happy than most humans are. That seems at least a little bit like utilitarianism to me, and at least more so than your description of what human evolution was optimizing for. But maybe I’m just not understanding the picture you’re painting well enough. Or maybe my model of AI is wrong.
I’m in agreement that this consideration makes it hard to do a direct comparison. But I think this consideration should mostly make us more uncertain, rather than making us think that humans are better than the alternative.
Actually, I was just trying to say “I can see what humans are like, and it seems pretty good relative to my current guesses about AIs, in ways that don’t just update me up about AIs.” Sorry about the confusion.