I don’t think you can complain about people engaging in definitional discussions when the title of the post is a definitional claim.
Sure, generative AI has a lot of differences from regular software, but it has a lot of similarities as well. You are still executing code line by line, it’s still written in Python or another regular language, you run it on the same hardware and operating systems, etc. Sure, the output of the code is unpredictable, but wouldn’t that also apply to something like a weather forecasting package?
Ultimately you can call it software or not if you want, depending on whether you want to emphasize the similarities with other software or the differences.
No, the title wasn’t a definitional claim. It was pointing out that we’re using the word “software” as a hidden inference, in ways that are counterproductive, and so I argued that we should stop assuming it’s similar to software.
Also, no, AI models aren’t executing code line by line. They use software to encode the input, then do matrix math, and feed the result into software that presents it as human-readable output. The software bits are perfectly understandable; it’s the actual model, which isn’t software, that I’m trying to discuss.
And how is the “matrix math” calculated?
By executing code line by line. The code in this case is performing linear algebra calculations.
It’s totally fine to isolate that bit of code, and point out “hey, this bit of code is way way more inscrutable than the other bits of code we generally use, and that has severe implications for things”. But don’t let that hide the similarities as well. If you run the same neural network twice with the same input (including seeds for random numbers), you will get the same output. You can stop the neural network halfway through, fiddle with the numbers, and see what happens, etc.
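To make the last two claims concrete, here is a minimal sketch using only the Python standard library. It is a toy, not any real model: the network shape, the seed, and the helper names (`make_weights`, `run`, `patch`) are all made up for illustration, and the weights are random rather than trained. It shows that the same seed and input give the same output, and that you can pause after the hidden layer, overwrite one number, and watch the output change.

```python
import math
import random


def make_weights(rows, cols, rng):
    # Random weights stand in for trained ones (an assumption for this toy).
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    # Plain matrix-vector product: multiply, then sum.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def run(seed, inputs, patch=None):
    rng = random.Random(seed)      # fixing the seed fixes every weight
    w1 = make_weights(4, 3, rng)   # input (3) -> hidden (4)
    w2 = make_weights(2, 4, rng)   # hidden (4) -> output (2)
    hidden = [math.tanh(v) for v in matvec(w1, inputs)]
    if patch is not None:
        # "Stop halfway and fiddle with the numbers": overwrite one activation.
        index, value = patch
        hidden[index] = value
    return matvec(w2, hidden)


x = [0.5, -0.2, 0.1]
first = run(seed=0, inputs=x)
second = run(seed=0, inputs=x)
patched = run(seed=0, inputs=x, patch=(0, 1.0))

print("same seed, same input, same output:", first == second)
print("original:", first)
print("patched: ", patched)
```

Real frameworks add complications this toy ignores (for example, some GPU operations are not bit-for-bit reproducible by default), so treat it as an illustration of the principle rather than a guarantee about any particular system.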
When you say something like “AI is not software”, I hear a request that I should refer to Stockfish (non-neural-network) as software, but AlphaZero (neural network) as “not software”. This just seems like a bad definition. From the perspective of the user they act identically (spitting out good chess moves). Sure, they are different from the programmer side of things, but it’s not like they can do the math that Stockfish is doing either.
There is clearly a difference between neural networks and regular code, but being “software” is not it.
The bits of code aren’t inscrutable; the matrices the code operates on are.
The code for Google Meet represents instructions written by humans; the actual image that you see on your screen and the sound that you hear are a result of something else interacting with these instructions. The words from your speaker or headphones are not intended by the Google Meet designers.
Similarly, the code for GPT-4 represents instructions designed (mostly?) by humans; the actual outputs of GPT-4 are not intended by its designers and depend on the contents of the inscrutable arrays of numbers humans have found.
We understand that we’re multiplying and taking sums of specific matrices in a specific order; but we have no idea how this is able to lead to the results that we see.
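A toy illustration of that point, with every name and number below chosen by hand for the example (nothing here comes from a real model): the forward-pass code is a few readable lines and is identical in both cases, yet what it computes depends entirely on the numbers it is handed. Here the two weight sets are small enough to reverse-engineer by eye; the claim above is that with billions of numbers found by an optimiser, we can no longer do that.

```python
def step(value):
    # Threshold activation: fire (1) if the weighted sum is positive.
    return 1 if value > 0 else 0


def forward(weights, a, b):
    # The "code": multiply, sum, threshold. Fully legible.
    hidden = [step(w_a * a + w_b * b + bias)
              for w_a, w_b, bias in weights["hidden"]]
    w1, w2, bias = weights["output"]
    return step(w1 * hidden[0] + w2 * hidden[1] + bias)


# The "model": just numbers. Hand-picked for illustration, not trained.
WEIGHTS_A = {"hidden": [(1, 1, -0.5), (1, 1, -1.5)], "output": (1, -2, -0.5)}
WEIGHTS_B = {"hidden": [(1, 1, -1.5), (0, 0, -1.0)], "output": (1, 0, -0.5)}

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
table_a = [forward(WEIGHTS_A, a, b) for a, b in inputs]
table_b = [forward(WEIGHTS_B, a, b) for a, b in inputs]

print(table_a)  # [0, 1, 1, 0] -> XOR
print(table_b)  # [0, 0, 0, 1] -> AND
```

Reading `forward` alone tells you nothing about whether you are looking at XOR or AND; that information lives only in the weights.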
The important difference here is that normal software implements algorithms designed by humans, run on hardware designed by humans; AI systems, in contrast, are algorithms blindly designed by an optimisation process designed by humans, run on software designed by humans, but with no human understanding of the algorithms implemented by the numbers our optimisation algorithms find.
It’s like the contrast between CPUs designed by humans and assembly code sent to us by aliens: we don’t understand the code, but we run it on CPUs that we do understand.
I think I agree with this explanation much more than with the original post.
I do too!
Stockfish has included a neural network since v. 12, and the classical eval was actually removed in v. 16. So this analogy seems mostly outdated.
https://github.com/official-stockfish/Stockfish/commit/af110e02ec96cdb46cf84c68252a1da15a902395