Evolution is chaotic and messy, but so is stochastic gradient descent (the word ‘stochastic’ is in the name!). The objective function might be clean, but the process we use to search for an optimal model is not.
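To make the “clean function, messy search” point concrete, here is a minimal, purely illustrative sketch in plain NumPy (the toy least-squares problem and all the numbers are my own invention, not anything from the essay): the loss is a fixed, well-defined function, but each SGD step only sees a noisy gradient estimate from a small random minibatch.

```python
import numpy as np

# Toy illustration: the loss (mean squared error on a fixed dataset) is a clean,
# well-defined function, but SGD never evaluates it exactly. Each update uses a
# noisy gradient estimate computed from a small random minibatch.

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))            # made-up dataset, purely illustrative
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(5)                           # model parameters
lr, batch = 0.05, 32

for step in range(2000):
    idx = rng.choice(len(X), size=batch, replace=False)    # random minibatch
    Xb, yb = X[idx], y[idx]
    grad = 2 * Xb.T @ (Xb @ w - yb) / batch   # noisy estimate of the true gradient
    w -= lr * grad                            # one of many tiny updates

print(np.round(w - true_w, 3))            # ends up near the optimum despite the noise
```

The search wanders, but it is still being nudged, step after step, towards whatever the chosen metric rewards.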
If AGI emerges from the field of machine learning in the state it’s in today, then it won’t be “designed” to pursue a goal, any more than humans were designed. Instead it will emerge from a random process, through billions of tiny updates, and this process will just have been rigged to favour things which do well on some chosen metric.
This seems extremely similar to how humans were created, through evolution by natural selection. In the case of humans, the metric being optimised for was the ability to spread our genes. In AIs, it might be accuracy at predicting the next word, or human helpfulness scores.
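As a concrete (and again purely illustrative) example of what “accuracy at predicting the next word” means as a training metric, here is a hypothetical sketch of the cross-entropy loss a language model is typically trained to minimise, written from scratch in NumPy with made-up toy numbers:

```python
import numpy as np

def next_token_loss(logits, target_ids):
    """Average cross-entropy between the model's predicted distribution over the
    vocabulary and the token that actually came next. Lower is better; this (or
    something very like it) is the 'chosen metric' the updates are rigged to favour."""
    logits = logits - logits.max(axis=-1, keepdims=True)                 # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(target_ids)), target_ids].mean()

# Toy usage: 4 positions in a sentence, a vocabulary of 10 tokens.
rng = np.random.default_rng(1)
logits = rng.normal(size=(4, 10))     # made-up model outputs
targets = np.array([3, 7, 0, 2])      # made-up "actual next tokens"
print(next_token_loss(logits, targets))
```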
The closest things to AGI we have so far do not act with “strict logical efficiency”, or always behave rationally. In fact, logic puzzles are one of the things they particularly struggle with!
The key difference is that SGD is not evolution; it’s a guided optimisation process. Evolution has no goal beyond survival and reproduction, while SGD explicitly optimises toward a defined function chosen by human designers. Yes, the search process is stochastic, but the selection criteria are rigidly defined in a way that natural selection’s are not.
The fact that current AI systems don’t act with strict efficiency is not evidence that AGI will behave irrationally—it’s just a reflection of their current limitations. If anything, their errors today are an argument for why they won’t develop morality by accident: their behaviour is driven entirely by the training data and reward signals they are given. When they improve, they will become better at pursuing those goals, not more human-like.
Yes, if AGI emerges from researchers simply trying to create it for its own sake, then it will have no real objectives. If it emerges instead from an AI tool being used to optimise something within a business, a government, or a military, then it will have one. I argue in my first essay that this is the real threat AGI poses: when developed in a competitive system, it will disregard safety and morality in order to gain a competitive edge.
The crux of the issue is this: humans evolved morality as an unintended byproduct of thousands of competing pressures over millions of years. AGI, by contrast, will be shaped by a much narrower and more deliberate selection process. The randomness in training doesn’t mean AGI will stumble into morality—it just means it will be highly optimised for whatever function we define, whether that aligns with human values or not.