I think the original commenter is referring to this paragraph in the post:
But even if deep learning is a fad, the field of AI has existed for less than 70 years! And it takes 10-30 years to go through a paradigm. It seems highly plausible that we produce human-level AI with some other paradigm within my lifetime (though reducing risk from an unknown future paradigm of AI does seem much less tractable)
Ok. So there’s a lot to unpack and I think it’s worth commenting on the word or concept of “Paradigm” independently.
“Paradigm” is a pretty loose word and concept. Paradigm basically means a pattern/process/system, or set of them, that seems to work well together and is accepted.
If that definition doesn’t sound tangible or specific, it isn’t. The concept of a paradigm is vague. As evidence, even the Wikipedia article on it is somewhat hard to parse.
Worse still, the concept of paradigm is really meta: someone talking about paradigms often presents you with giant, convoluted writing whose only connection to reality is commentary on other convoluted writing.
Because of this, the concept of “paradigms” can be misused or at least serve as a lazy shortcut for thinking.
Personally, when someone talks about paradigms a lot, I find it useful to see that as a yellow flag. (Without being too arch, if someone made an org that had “paradigm” in its name, which actually happened, that’s an issue. Further comment, which gets pretty off track, is in a footnote[1].)
To be clear, the underlying concept of paradigms is important, and Kuhn, who originated it, is right. However, in general, you can address the “object-level issue” (the actual problem) with an existing paradigm directly.
To give an example of a “paradigm shift” (note that such “shifts” are often contested), consider the transition from classical physics to special relativity.
If you think that classical Newtonian physics is wrong, you can literally show it’s wrong: develop a new theory, make predictions, and win a Nobel Prize, like Einstein did.
You don’t spend your time writing treatises on paradigms.
Suggesting that there can be a new paradigm, or that something is “preparadigmatic”, doesn’t offer any evidence that a new “paradigm” will actually happen.
It’s worth pointing out that, in the case of a malign entity that uses language like “paradigms”, the ploys involved can be sophisticated (because these entities are smart and this is essentially all they do).
These people or entities can use the word in a way that circumvents or co-opts warnings. Specifically:
The malign entities can use this language for the consequent controversy/attention, which is valuable to them. For example, a conversation that starts off “Hey, I heard this thing about Paradigm?” can lead to more interest and engagement. The value from this publicity can outweigh the negatives from the controversy. This can work extremely well in communities that see themselves as marginal and unconventional.
They use this language/branding as a filter, which provides stability and control. The fact that this language gives a correct warning causes stronger people to bounce off the entity. The people who remain can be more pliable and dedicated (in language that was actually used, not “too big”).
Ok, with that out of the way, let’s get closer to the substance of the original question about paradigms of AI.
Something that isn’t talked about much is what’s known as “classical AI” or “symbolic AI”.
Symbolic AI or classical AI was the original general-AI paradigm. (This is written up elsewhere, including on Wikipedia, which is the source of my knowledge for all further comments.)
Here’s my take on why symbolic AI is relevant.
Ok, close your eyes, and imagine for a second that everything in the world is the same, but that deep learning or neural networks don’t exist yet. They just haven’t been discovered.
In this situation, if you told a genius in math or physics that there was a danger from building an evil, world-conquering AI, what would that genius think about?
I honestly bet that they wouldn’t think about feeding giant spreadsheets of data into a bunch of interconnected logistic functions to run argmax, which is essentially what deep learning, RNNs, transformers, etc. do today.
Instead, this genius would think about the danger of directly teaching or programming computers with logic, such as giving them symbols and rules that seem to allow reasoning. They would think that you could directly evolve these reasoning systems, growing them toward general cognition until they become competitive with humans.
If you thought computers could think or use logic, this approach seems really obvious.
(According to my 150 seconds of skimming Wikipedia) my guess is that the above way of thinking is exactly what symbolic or classical AI did. According to Wikipedia, this dominated until 1985[1].
Note that this classical or symbolic approach was a full-on, serious effort. This wasn’t something “very online”.
Symbolic AI was studied at MIT, Stanford, CMU, RAND, etc. It was a big deal, sort of like how molecular biology, or any mainstream approach in machine learning today, is a big deal.
To be really concrete and tangible about the “symbolic reasoning” programme, one instance of it is probably the project “Cyc”: https://en.wikipedia.org/wiki/Cyc.
If you take a look at that article, you get a sense of the approach: store a lot of information as facts, and try to build reasoning on top of it from rules.
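To make that recipe tangible, here is a minimal toy sketch of my own (not Cyc’s actual machinery or rule language, just an illustration of the general idea): store facts, store if-then rules, and “reason” by applying the rules over and over until nothing new can be derived.

```python
# Toy symbolic reasoner: hand-written facts plus hand-written if-then rules,
# with inference done by forward-chaining. Facts and rules are my own made-up
# examples, not anything from Cyc's actual knowledge base.

facts = {("Socrates", "is_a", "human")}

rules = [
    # (pattern, conclusion): if ?x is_a human, then ?x is_a mammal
    (("?x", "is_a", "human"), ("?x", "is_a", "mammal")),
    # if ?x is_a mammal, then ?x is_a mortal
    (("?x", "is_a", "mammal"), ("?x", "is_a", "mortal")),
]

def apply_rule(rule, facts):
    """Return the new facts implied by one if-then rule.

    Simplification: the only variable is ?x, and it only appears in the
    subject position of both the pattern and the conclusion.
    """
    (_, p, o), (_, cp, co) = rule
    derived = set()
    for (fs, fp, fo) in facts:
        if fp == p and fo == o:        # the fact matches the rule's pattern
            derived.add((fs, cp, co))  # bind ?x to the fact's subject
    return derived

# Forward-chain: keep applying rules until the fact base stops growing.
changed = True
while changed:
    changed = False
    for rule in rules:
        new = apply_rule(rule, facts) - facts
        if new:
            facts |= new
            changed = True

print(facts)
# Now also contains ("Socrates", "is_a", "mammal") and ("Socrates", "is_a", "mortal").
```

Real systems like Cyc are vastly larger and more sophisticated, but the basic shape, hand-written knowledge plus hand-written inference rules, is the same programme.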
Ok, the story continues, but with a twist that is directly relevant to the original question of paradigms.
(According to my 5 minutes of Wikipedia reading) a major critic of symbolic AI was this dude, Hubert Dreyfus. Dreyfus was a Harvard-trained philosopher and MIT faculty member when he was hired by RAND.
There, Dreyfus sort of turned on his new employer (“treacherous turn!”) and wrote a paper, later expanded into a book, against the establishment’s approach to AI.
Dreyfus basically predicted that the symbolic approach was a dead end.
“The paper “caused an uproar”, according to Pamela McCorduck. The AI community’s response was derisive and personal. Seymour Papert dismissed one third of the paper as “gossip” and claimed that every quotation was deliberately taken out of context. Herbert A. Simon accused Dreyfus of playing “politics” so that he could attach the prestigious RAND name to his ideas. Simon said, “what I resent about this was the RAND name attached to that garbage”.”
Ultimately, Dreyfus was essentially “proved right” (although there’s probably a lot more color here; most contrarians tend to be excessive, dogmatic, and fixated themselves, and are sometimes right for the wrong reasons).
Dreyfus did make a splash, but note that even if Dreyfus was correct, it’s unlikely that his impact was causal. It’s more likely the field petered out as the lack of results became too obvious to ignore (based on 3 paragraphs of Wikipedia and mostly my own beliefs).
But ultimately, the field of classical/symbolic AI fell, pretty bigly, to a degree that I don’t think many fields ever fall.
This is directly relevant to the “paradigm” issue in the original question.
This now-fallen symbolic AI reasoning faction/programme/agenda is a “paradigm”. And this “paradigm” seemed extremely reasonable.
These figures were vastly influential and established; they dotted the top CS schools and the top think tanks, and they were connected with the strongest companies of the time. This AI community was probably vastly more influential than the AI safety community currently is, by a factor of 20 or more.
It still seems like “Symbolic logic”, with some sort of evolution/training, is a reasonable guess for how to build general, strong AI.
In contrast, what exactly are neural networks doing? To many people, deep learning, using a 400B parameter model to hallucinate images and text, seems like a labored and quixotic approach to general AI. So how are these deep learning systems dangerous in any way? (Well, I think there are groups of principled people who have written up important content on this.)
(I’m not a critic, and I’m not stealthily undermining AI safety; this series of comments isn’t a proxy for critiquing AI safety. These lessons just seem generally useful.)
The takeaways from this comment are:
It’s very likely that this episode of symbolic AI still hangs over the general AI field (the “AI winter”). It’s very possible that the reaction was an overshoot that hampers AI safety today in some way. (But 10 minutes of reading Wikipedia isn’t enough for me to speculate more.)
This is a perfect example of a paradigm shift. Symbolic AI was a major commitment by the establishment; the field resisted intensely, but it ultimately fell hard.
There are probably other “paradigms” (ugh, I hate using this phrase and framework) that might be relevant to AI safety. Like, I guess maybe someone should take a look at this as AI safety continues.
Final comment: a quick take on the “current approach in AI”
If you’re still here, still reading this comment chain (my guess is that there is a 90% chance the original commenter is gone, and most forum readers are gone too), you might be confused because I haven’t mentioned the “current paradigm of ML or AI”.
For completeness, it’s worth filling this in. So, here it is:
Basically, the current paradigm of ML or deep learning is a lot less deep than it seems.
Essentially, people have models, like a GPT-3 or BERT transformer, that they tweak, run data through, and evaluate on the resulting performance.
To build on these models and make new ones, people basically extend and modify existing models with new architectures or data, and look at how the results perform on established benchmarks (e.g. language translation, object detection).
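To make that workflow concrete, here is a minimal toy sketch (my own construction in PyTorch; the data, model, and “benchmark” are all stand-ins, not anyone’s actual research code) of the tweak / run-data-through / score-on-a-benchmark loop:

```python
import torch
from torch import nn

# A toy stand-in for an established benchmark (e.g. translation or detection).
def benchmark_score(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

x = torch.randn(256, 32)             # fake "giant spreadsheet" of features
y = torch.randint(0, 4, (256,))      # fake labels

model = nn.Sequential(               # stand-in for a real architecture
    nn.Linear(32, 64), nn.Sigmoid(), # the "interconnected logistic functions"
    nn.Linear(64, 4),
)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):              # run the data through and nudge the weights
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print("benchmark score:", benchmark_score(model, x, y))
# In practice you would now tweak the architecture or the data and repeat,
# keeping whatever scores better on the benchmark.
```

The specific numbers don’t matter; the point is that the day-to-day workflow really is this kind of fit-and-measure loop, repeated with small variations.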
Yeah, this doesn’t sound super principled, and it isn’t.
To give an analogy, imagine bridge building in civil engineering. My guess is that when designing a bridge, engineers use intricate knowledge of materials and physics, and choose every single square meter of every component of the bridge in accordance with what is needed, so that the resulting whole stands up and every component supports every other component.
In contrast, another approach to building a bridge is to just bolt and weld a lot of pieces together, in accordance with intuition and experience. This approach would probably copy existing designs, and there would be a lot of tacit knowledge and rules of thumb (e.g. when you want to make a bridge 2x as big, you usually have to use more than 2x the material). With many iterations, over time this process would work and be pretty efficient.
The second approach is basically a lot of how deep learning works. People try different architectures and add layers, making moderate innovations over the last model.
There are a lot more details. Things like unsupervised learning, encoder-decoder models, latent spaces, and more are super interesting and involve real insights or new approaches.
I’m basically too dumb to build a whole new deep learning architecture, but many people are smart enough, and major advances can be made through new insights. But still, a lot of it is this iteration and trial and error. The biggest enablers are large amounts of data and computing capacity, and, I guess, lots of investor money.