I think this proposition could do with some refinement. AI safety should be a superset of both AGI safety and narrow-AI safety. Then we don’t run into problematic sentences like “AI safety may not help much with AGI Safety”, which contradicts how we currently use ‘AI safety’.
To address the point on these terms, then:
I don’t think AI safety runs the risk of being so attractive that misallocation becomes a big problem. Even if we consider the risk of funding misallocation significant, ‘AI risk’ seems like a worse term, since it permits more conflation of work areas.
Yes, it’s of course useful to have two different concepts for these two types of work, but this conceptual distinction doesn’t go away with a shift toward ‘AI accidents’ as the subject of these two fields. I don’t think a move toward ‘AI accidents’ awkwardly merges all AI safety work.
But if it did: The outcome we want to avoid is AGI safety getting too little funding. This outcome seems more likely in a world that makes two fields of N-AI safety and AGI safety, given the common dispreference for work on AGI safety. Overflow seems more likely in the N-AI Safety → AGI Safety direction when they are treated as the same category than when they are treated as different. It doesn’t seem beneficial for AGI safety to market the two as separate types of work.
Ultimately, though, I place more weight on the other reasons why I think it’s worth reconsidering the terms.
I agree it is worth reconsidering the terms!
The AGI/narrow-AI distinction is beside the point a bit, so I’m happy to drop it. I also have an AI/IA bugbear, so I’m used to not liking how things are talked about.
Part of the trouble is we have lost the marketing war before it even began, every vaguely advanced technology we have currently is marketing itself as AI, that leaves no space for anything else.
‘AI accidents’ brings to my mind trying to prevent robots crashing into things. 90% of robotics work could be classed as AI accident prevention, because robots are always crashing into things.
It is not just funding confusion that might be a problem. If I’m reading a journal on AI safety or taking a class on AI safety what should I expect? Robot mishaps or the alignment problem? How will we make sure the next generation of people can find the worthwhile papers/courses?
‘AI risk’ is not perfect, but at least it is not that.
Perhaps we should take a hard left and say that we are looking at studying Artificial Intelligence Motivation? People know that an incorrectly motivated person is bad and that figuring out how to motivate AIs might be important. It covers the alignment problem and the control problem.
Most AI doesn’t look like it has any form of motivation and is harder to rebrand as such, so it is easier to steer funding to the right people and tell people what research to read.
It doesn’t cover my IA gripe, which briefly is: AI makes people think of separate entities with their own goals/moral worth. I think we want to avoid that as much as possible. General intelligence augmentation requires its own motivation work, but work aimed at ensuring that the motivation of the human is inherited by the computer augmenting that human. I think my best hope is that AGI work might move in that direction.
I take the point. This is a potential outcome, and I see the apprehension, but I think it’s probably a low risk that users will grow to mistake robotics and hardware accidents for AI accidents (and the work that mitigates each) - sufficiently low that I’d argue expected value favours the accident frame. Of course, I recognize that I’m probably invested in that direction.
I think this steers close to an older debate on AI “safety” vs “control” vs “alignment”. I wasn’t a member of that discussion, so I am hesitant to reenact concluded debates (I’ve found it difficult to find resources on that topic other than what I’ve linked - I’d be grateful to be directed to more). I personally disfavour ‘motivation’ on grounds of its risk of anthropomorphism.
I would do some research into how well sciences that have suffered brand dilution fare.
As far as I understand it, research institutions have strong incentives to:
- Find funding
- Pump out tractable, digestible papers
See this kind of article for other worries in this vein.
You have to frame things with that in mind: set up incentives so that people do the hard stuff and can be recognized for doing the hard stuff.
Nanotech is a classic case of a diluted research path. If you have contacts, maybe try to talk to Erik Drexler; he is interested in AI safety, so he might be interested in how AI safety research is framed.
Fair enough, I’m not wedded to ‘motivation’ (I see animals as having motivation as well, so it’s not strictly human). It doesn’t seem to cover phototaxis, though, which seems like the simplest thing we want to worry about - so that is an argument against ‘motivation’. I’m worded out at the moment; I’ll see if my brain thinks of anything better in a bit.