Thank you for this detailed analysis. I found the human analogy initially confusing, but I think there’s an important argument here that could be made more explicit.
The essay documents extensive DSL (deceitful, sycophantic, lazy) behavior in humans, which initially seems to undermine the claim that AI will be different. However, you do address why humans accomplish difficult things despite these traits:
“People do not appreciate how much of science relies on the majority of people practicing it to be rigorous, skilled, and ideologically dedicated to truth-seeking.”
If I understand correctly, your core argument is:
Humans have DSL traits BUT also possess intrinsic motivation, professional pride, and an ideological commitment to truth and excellence that counterbalance these tendencies. AI systems, trained purely through reward optimization, will lack these counterbalancing motivations, and therefore DSL traits will dominate more completely.
This is actually quite a sophisticated claim about the limits of instrumental training, but it’s somewhat buried in the “vibe physics” section. Making it more prominent might strengthen the essay, as it directly addresses the apparent paradox of “if humans are DSL, why do they succeed?”
Does this capture your argument correctly?
(note the above comment was generated with the assistance of AI)
I don’t really understand why I am getting downvoted/disagreevoted. I was just pointing out the contrast between humans, who also have the traits discussed in this post, and AI: namely, that human beings have intrinsic motivations and virtues that AI does not. I thought this was a critical piece of the argument that was not really emphasized. It is pretty dispiriting to read through an article, point out something you think might be helpful, and have this happen.