No, the title wasn’t a definitional claim; it was pointing out that we’re using the word “software” as a hidden inference, in ways that are counterproductive, and so I argued that we should stop assuming it’s similar to software.
Also, no, AI models aren’t executing code line by line; they are using software to encode the input, then doing matrix math, and feeding the result into software that provides this as human-readable output. The software bits are perfectly understandable; it’s the actual model, which isn’t software, that I’m trying to discuss.
And how is the “matrix math” calculated?
By executing code line by line. The code, in this case, is carrying out linear algebra calculations.
It’s totally fine to isolate that bit of code, and point out “hey, this bit of code is way way more inscrutable than the other bits of code we generally use, and that has severe implications for things”. But don’t let that hide the similarities as well. If you run the same neural network twice with the same input (including seeds for random numbers), you will get the same output. You can stop the neural network halfway through, fiddle with the numbers, and see what happens, etc.
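To make the determinism and "fiddle with the numbers" points concrete, here is a minimal sketch using a toy two-layer network in plain Python. Everything in it (`make_network`, `forward`, `sample`, the layer sizes, the seeds) is a hypothetical illustration, not how any real model is implemented, and it ignores sources of nondeterminism that can show up on real hardware.

```python
import math
import random


def make_network(seed, n_in=3, n_hidden=4, n_out=2):
    """Build toy weight matrices from a seeded generator (illustrative only)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def matvec(matrix, vector):
    """Plain matrix-vector product, computed one row at a time."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def forward(network, x, edit_hidden=None):
    """Run the network; optionally pause halfway and edit the hidden values."""
    w1, w2 = network
    hidden = relu(matvec(w1, x))
    if edit_hidden is not None:
        hidden = edit_hidden(hidden)  # the "fiddle with the numbers" step
    return matvec(w2, hidden)


def sample(logits, seed):
    """Pick an output index at random, but reproducibly for a given seed."""
    rng = random.Random(seed)
    weights = [math.exp(value) for value in logits]
    return rng.choices(range(len(logits)), weights=weights)[0]


net = make_network(seed=0)
x = [0.5, -1.0, 2.0]

first = forward(net, x)
second = forward(net, x)  # same input, same weights: identical output
zeroed = forward(net, x, edit_hidden=lambda h: [0.0] * len(h))  # intervention
```

Running the network twice gives `first == second`, sampling with the same seed picks the same index both times, and zeroing the hidden layer changes the output in a way you can inspect directly, just as you could with any other program.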
When you say something like “AI is not software”, I hear a request that I should refer to Stockfish (not a neural network) as software, but AlphaZero (a neural network) as “not software”. This just seems like a bad definition. From the perspective of the user they act identically (spitting out good chess moves). Sure, they are different from the programmer side of things, but it’s not like they can do the math that Stockfish is doing either.
There is clearly a difference between neural networks and regular code, but being “software” is not it.
The bits of code aren’t inscrutable; the matrices the code operates on are.
The code for Google Meet represents instructions written by humans; the actual image that you see on your screen and the sound that you hear are a result of something else interacting with these instructions. The words from your speaker or headphones are not intended by the Google Meet designers.
Similarly, the code for GPT-4 represents instructions designed (mostly?) by humans; the actual outputs of GPT-4 are not intended by its designers and depend on the contents of the inscrutable arrays of numbers humans have found.
We understand that we’re multiplying and taking sums of specific matrices in a specific order; but we have no idea how this is able to lead to the results that we see.
The important difference here is that normal software implements algorithms designed by humans, run on hardware designed by humans; AI models, in contrast, are algorithms blindly designed by an optimisation process designed by humans and run on software designed by humans, but we have no understanding of the algorithms implemented by the numbers our optimisation algorithms find.
It’s like the contrast between CPUs designed by humans and assembly code we don’t understand, sent to us by aliens, that we run on CPUs that we do understand.
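A rough sketch of that split, assuming a deliberately tiny example: the training loop below is an ordinary human-written algorithm (gradient descent on mean squared error), while the values of `w` and `b` it ends up with are found by the optimisation rather than typed in by anyone. The function name, data, and hyperparameters are made up for illustration. With only two numbers we can still read off what was learned; the point in the comment above is that this stops being true when the optimiser finds billions of them.

```python
def train(data, steps=2000, lr=0.05):
    """Human-designed optimisation process: gradient descent on squared error."""
    w, b = 0.0, 0.0  # the "numbers" start out meaningless
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1; nobody writes "2" or "1"
# into the model, the optimiser has to find them.
data = [(x, 2.0 * x + 1.0) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]
w, b = train(data)
```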
I think I agree with this explanation much more than with the original post.
I do too!
Stockfish has included a neural network since v. 12, and the classical eval was actually removed in v. 16. So this analogy seems mostly outdated.
https://github.com/official-stockfish/Stockfish/commit/af110e02ec96cdb46cf84c68252a1da15a902395