Ok, so thinking about this, one trouble with answering your comment is that you have a self-consistent worldview with implications contrary to some of the stuff I hold, and I feel that you are not giving answers with reference to stuff that I already hold, but rather to stuff that further references that worldview.
Let me know if this feels way off.
So I’m going to just pick one object-level argument and dig in to that:
As deep learning attains more and more success, I think that some of the old concerns port over. But I am not sure which ones, to what extent, and in which context. This leads me to reduce some of my probability.
On net I think the deep learning revolution increases p(doom), mostly because it’s a surprisingly opaque and indirect way of building intelligent systems, that gives you relatively few levers to control internal properties of the reasoner you SGDed your way to.
Well, I think that the question is: increased p(doom) compared to what? E.g., what were your default expectations before the DL revolution?
Compared to equivalent progress in a seed AI which has a utility function
Compared to no progress in deep learning at all
Compared to something else
Deep learning seems like it has some advantages, e.g.: it is [doing the kinds of things that were reinforced during its training in the past], which seems safer than [optimizing a utility function programmed into its core, where we don’t really know how to program utility functions].
E.g., GPT-3 seems wildly more safe than a seed AGI that had already reached that level of capabilities
“Oh yes, we just had to put in a desire to predict the world, an impulse for curiosity, in addition to the standard self-preservation drive and our experimental caring for humans module, and then just let it explore the internet” sounds fairly terrifying.
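To make the contrast above concrete, here is a toy sketch. Everything in it (function names, state variables, numbers) is invented for illustration, and it caricatures both designs rather than describing anyone's actual proposal: the "seed AI" agent acts by maximizing a hand-coded utility function, so any error in that formula gets competently optimized for, while the "trained policy" agent just repeats whatever was reinforced in similar past situations, without consulting any explicit goal.

```python
# Purely illustrative toy agents; every name and number here is made up.

def simulate(state, action):
    """Stand-in world model: the state that `action` would lead to."""
    new = dict(state)
    new["paperclips"] = new.get("paperclips", 0) + (100 if action == "build" else 0)
    new["disruption"] = new.get("disruption", 0) + (5 if action == "build" else 0)
    return new

def hand_coded_utility(state):
    """A programmer's attempt to write down what we value.
    The seed-AI worry: if this formula is subtly wrong, the agent
    competently optimizes the wrong thing."""
    return state.get("paperclips", 0) - 0.01 * state.get("disruption", 0)

def seed_ai_act(state, actions):
    """Explicit optimizer: takes whatever action maximizes the coded utility."""
    return max(actions, key=lambda a: hand_coded_utility(simulate(state, a)))

def trained_policy_act(state, actions, history):
    """Learned-policy caricature: repeats whichever action was most reinforced
    in similar past situations. No explicit goal is consulted; the behavior is
    an echo of the training signal (for better and for worse)."""
    rewards = {a: [] for a in actions}
    for past_state, action, reward in history:
        if past_state.get("context") == state.get("context") and action in rewards:
            rewards[action].append(reward)
    scored = {a: sum(r) / len(r) for a, r in rewards.items() if r}
    return max(scored, key=scored.get) if scored else "wait"

# Same situation, two very different decision procedures.
state = {"context": "factory", "paperclips": 0, "disruption": 0}
actions = ["build", "wait"]
history = [({"context": "factory"}, "wait", 1.0),
           ({"context": "factory"}, "build", 0.2)]
print(seed_ai_act(state, actions))                  # "build": whatever the formula favors
print(trained_policy_act(state, actions, history))  # "wait": whatever was reinforced
```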
I feel that you are not giving answers with reference to stuff that I already hold, but rather to stuff that further references that worldview.
Sounds right to me! I don’t know your worldview, so I’m mostly just reporting my thoughts on stuff, not trying to do anything particularly sophisticated.
what were your default expectations before the DL revolution?
I personally started thinking about ML and AGI risk in 2013, and I didn’t have much of a view of “how are we likely to get to AGI?” at the time.
My sense is that MIRI-circa-2010 wasn’t confident about how humanity would get to AI, but expected it would involve gaining at least some more (object-level, gearsy) insight into how intelligence works. “Just throw more compute at a slightly tweaked version of one of the standard old failed approaches to AGI” wasn’t MIRI’s top-probability scenario.
From my perspective, humanity got “unlucky” in three different respects:
AI techniques started working really well early, giving us less time to build up an understanding of alignment.
Techniques started working for reasons other than us acquiring and applying gearsy new insights into how reasoning works, so the advances in AI didn’t help us understand how to do alignment.
And the specific methods that worked are more opaque than most pre-deep-learning AI, making it hard to see how you’d align the system even in principle.
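For concreteness on the "relatively few levers" and opacity points: below is a minimal, generic supervised-training sketch, not a description of any real system. The only things the programmer writes down are the data, the targets, the shape of the network, the loss, and the step size; the final weights, and whatever internal computation they end up implementing, fall out of the gradient updates and are not specified anywhere.

```python
import numpy as np

rng = np.random.default_rng(0)

# The levers you actually get to set:
X = rng.normal(size=(256, 8))                 # training data (which examples it sees)
y = (X[:, 0] * X[:, 1] > 0).astype(float)     # training targets (the task, stated by example)
hidden = 16                                   # architecture shape
lr = 0.5                                      # optimizer step size

# What you do NOT write down: the final weights, or the computation they implement.
W1 = rng.normal(size=(8, hidden)) * 0.1
W2 = rng.normal(size=(hidden, 1)) * 0.1

def forward(X):
    h = np.tanh(X @ W1)                       # whatever internal features training finds useful
    return 1 / (1 + np.exp(-(h @ W2)))        # sigmoid output

for step in range(3000):
    h = np.tanh(X @ W1)
    p = 1 / (1 + np.exp(-(h @ W2)))
    grad_logits = (p - y[:, None]) / len(X)   # gradient of mean cross-entropy w.r.t. logits
    grad_W2 = h.T @ grad_logits
    grad_hidden = grad_logits @ W2.T * (1 - h ** 2)   # backprop through the tanh layer
    grad_W1 = X.T @ grad_hidden
    W1 -= lr * grad_W1                        # gradient descent nudges the weights
    W2 -= lr * grad_W2                        # toward lower loss, and nothing more

# After training, the "reasoner" is just these arrays of numbers; reading goals or
# other internal properties off of them is the hard part.
print("train accuracy:", ((forward(X)[:, 0] > 0.5) == (y > 0.5)).mean())
```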
E.g., GPT-3 seems wildly more safe than a seed AGI that had already reached that level of capabilities
Seems like the wrong comparison; the question is whether AGI built by deep learning (that’s at the “capability level” of GPT-3) is safer than seed AGI (that’s at the “capability level” of GPT-3).
I don’t think GPT-3 is an AGI, or has the same safety profile as baby AGIs built by deep learning (assuming there’s an efficient humanly-reachable way to achieve AGI via deep learning at all). So an apples-to-apples comparison would either think about hypothetical deep-learning AGI vs. hypothetical seed AGI, or it would look at GPT-3 vs. hypothetical narrow AI built on the road to seed AGI.
If we can use GPT-3 or something very similar to GPT-3 to save the world, then it of course matters that GPT-3 is way safer than seed AGI. But then the relevant argument would look something like “maybe the narrow AI tech that you get on the path to deep-learning AGI is more powerful and/or more safe than the narrow AI tech that you get on the path to seed AGI”, as opposed to “GPT-3 is safer than a baby god” (the latter being something that’s true whether or not the baby god is deep-learning-based).
Compared to no progress in deep learning at all
Sure, I agree.
Compared to something else
Depends on the something else.