Hi Ada, I’m glad you wrote this post! Although what you’ve written here is pretty different from my own experience with AI safety in many ways, I think I got some sense of your concerns from reading this.
I also read Superintelligence as my first introduction to AI safety, and I remember pretty much buying into the arguments right away.[1] Although I think I understand that modern-day ML systems do dumb things all the time, this intuitively weighs less on my mind than the idea that AI can in principle be much smarter than humans, and that sooner or later this will happen. When I look specifically at the cutting edge of modern AI tech like GPT-3, I feel like this supports my view pretty strongly, but I don’t think I could give you a knockdown explanation for why typical modern AI doing dumb things seems less important; this is just my intuition. Usually, intuitions can be tested by seeing how well they make predictions, but the really inconvenient thing about statements about TAI is that they can never be validated.
As I’ve talked to people at EAGxBoston and EAG London, I’ve started to realize that my intuitions seem to be doing a lot of heavy lifting that I don’t feel fully able to explain. Ironically, the more I learn about AI safety, the less I feel that I have principled inside views on questions like “what research avenues are the most important” and “what year will transformative AI happen.” I’ve realized that I pretty much just defer to the weighted average opinion of various EA people who I respect. This heuristic is intuitive to me, but it also seems kind of bad.
I feel like if I really knew what I was talking about, I would be able to come up with novel and clever arguments for my beliefs and talk about them with utmost confidence, like Eliezer Yudkowsky with his outspoken conviction that we’re all doomed; or I’d have a unique and characteristic view on what we can do to decrease AI risk, like Chris Olah with interpretability. Instead, I just have a bunch of intuitions, which to the extent they can be put into words, just boil down to silly-sounding things like, “GPT-3 seems really impressive, and AlexNet happened just 10 years ago and was less impressive. ‘An AI that can do competent AI research’ is really, really impressive, so maybe that will happen in… eh, I want to be conservative, so 20 years?”
Based on your post, I’m guessing maybe you have a similar perspective, but are coming at it from the opposite direction: you have intuitions that AI is not so big of a deal, but aren’t really sure of the reasons for your views. Does that seem accurate?
Maybe my best-guess takeaway for now is that a lot of the difference between people who disagree about speculative things like this comes down to differing priors, which might not be based in specific, articulable, and concrete arguments. For instance, maybe I’m optimistic about the value of space colonization because I read The Long Way to a Small Angry Planet, which presents a vision of a utopian interspecies galactic civilization that appeals to me, but doesn’t make logical arguments for how it would work. Maybe I think that a sufficient amount of intelligence will be able to do really crazy things because I spent a lot of time as a kid trying to prove to people that I was smart, and it’s important to my identity. Or maybe I just believe these things because they’re correct. I’m not sure I can tell.
I believe that as a community, we should really try to encourage a wide range of intuitions (as long as those intuitions haven’t clearly been invalidated by evidence). The value of diverse perspectives in EA isn’t a new idea, but if it’s true that priors do a lot of the work in whether people believe speculative arguments, it could be all the more important. Otherwise, there could be a strong self-selection effect for people who find EA’s current speculations intuitive, since people who don’t have articulable reasons for disagreement won’t have much with which to defend their beliefs, even if their priors are in fact well-founded.
The claim that simulating all of physics would be “more easily implementable” than a standard friendly AI does seem pretty ridiculous to me now, though I’m not sure it accurately reflects his original point? I think the argument had more to do with considering counterfactuals rather than actually carrying out a simulation. I would still agree that this is pretty weird and abstract, though I don’t think this point is that relevant anyway.
Hi Caleb! Very nice to read your reflection on what might make you think what you think. I related to many things you mentioned, such as wondering how much I think intelligence matters because of having wanted to be smart as a kid.
You understood correctly that, intuitively, I think AI is less of a big deal than some people feel. This probably has a lot to do with my job, because it includes estimating whether problems can be solved with current technology given certain constraints, and there it is better to err on the side of caution. Previously, one of my tasks was also to explain to people why AI is not a silver bullet, and that modern ML solutions require things like training data and interfaces in order to be created and integrated into systems. Obviously, if the task is to figure out everything that future AI systems might be able to do at some point, you should take quite a different attitude than when estimating what you yourself can implement right now. This is why I try to take a less conservative approach than would come naturally to me, but I think it still comes across as pretty conservative compared to many AI safety folks.
I also find GPT-3 fascinating, but the feeling I get from it is not “wow, this thing seems actually intelligent” but rather “wow, statistics can really encompass so many different properties of language”. I love language, so it makes me happy. But to me, GPT-3 is ultimately a cool showcase of the current data-centered ML approach (“take a model that is based on a relatively non-complex idea[1], pour a huge amount of data into it, use the model”). I don’t see it as a direct stepping stone to science-automating AI, because my intuition is that “doing science well” is not that well captured in the available training data. (I should probably reflect more on what the concrete difference is.)
Importantly, this does not mean I believe there can be no risks (or benefits!) from large language models and the models that will be developed in the near future.
I think it is very hard to be aware of your intuitions, incorporate new valid information into your worldview, and communicate with others all at the same time. But I agree that it is better for everyone if we create better opportunities to do that, because otherwise we will lose information.
Not to say that non-complexity would make the model somehow insignificant; quite the opposite, it is fascinating what attention mechanisms accomplish, not only in NLP but in other domains as well.
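(If it helps to see why I call the underlying idea relatively non-complex: below is a minimal, illustrative sketch of single-head scaled dot-product attention in plain numpy. The function name and shapes are my own choices, and it leaves out the learned projections, multiple heads, masking, and everything else a real Transformer needs; it is only meant to show how small the core operation is.)

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query is compared against every key; the output for that query
    is a softmax-weighted average of the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted sum of values

# Toy self-attention over 3 token positions with 4-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(x, x, x).shape)  # (3, 4)
```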