I feel that you are not giving answers with reference to stuff that I already hold, but rather to stuff that further references that worldview.
Sounds right to me! I don’t know your worldview, so I’m mostly just reporting my thoughts on stuff, not trying to do anything particularly sophisticated.
what were your default expectations before the DL revolution?
I personally started thinking about ML and AGI risk in 2013, and I didn’t have much of a view of “how are we likely to get to AGI?” at the time.
My sense is that MIRI-circa-2010 wasn’t confident about how humanity would get to AI, but expected it would involve gaining at least some more (object-level, gearsy) insight into how intelligence works. “Just throw more compute at a slightly tweaked version of one of the standard old failed approaches to AGI” wasn’t MIRI’s top-probability scenario.
From my perspective, humanity got “unlucky” in three different respects:
AI techniques started working really well early, giving us less time to build up an understanding of alignment.
Techniques started working for reasons other than us acquiring and applying gearsy new insights into how reasoning works, so the advances in AI didn’t help us understand how to do alignment.
And the specific methods that worked are more opaque than most pre-deep-learning AI, making it hard to see how you’d align the system even in principle.
E.g., GPT-3 seems wildly safer than a seed AGI that had already reached that level of capabilities.
Seems like the wrong comparison; the question is whether AGI built by deep learning (that’s at the “capability level” of GPT-3) is safer than seed AGI (that’s at the “capability level” of GPT-3).
I don’t think GPT-3 is an AGI, or has the same safety profile as baby AGIs built by deep learning (assuming there’s an efficient, humanly-reachable way to achieve AGI via deep learning at all). So an apples-to-apples comparison would either weigh hypothetical deep-learning AGI against hypothetical seed AGI, or weigh GPT-3 against hypothetical narrow AI built on the road to seed AGI.
If we can use GPT-3 or something very similar to GPT-3 to save the world, then it of course matters that GPT-3 is way safer than seed AGI. But then the relevant argument would look something like “maybe the narrow AI tech that you get on the path to deep-learning AGI is more powerful and/or more safe than the narrow AI tech that you get on the path to seed AGI”, as opposed to “GPT-3 is safer than a baby god” (the latter being something that’s true whether or not the baby god is deep-learning-based).