Hi Remmelt,
Thanks for sharing your concerns, both with us privately and here on the forum. These are tricky issues and we expect people to disagree about how to weigh all the considerations — so it’s really good to have open conversations about them.
Ultimately, we disagree with you that it’s net harmful to do technical safety research at AGI labs. In fact, we think it can be the best career step for some of our readers to work in labs, even in non-safety roles. That’s the core reason why we list these roles on our job board.
We argue for this position extensively in my article on the topic (and we only list roles consistent with the considerations in that article).
Some other things we’ve published on this topic in the last year or so:
A range of opinions from anonymous experts about the upsides and downsides of working on AI capabilities
How policy roles in AI companies can be valuable for career capital and for direct impact (as well as the potential downsides)
We recently released a podcast episode with Nathan Labenz on some of the controversy around OpenAI, including his concerns about some of their past safety practices, whether ChatGPT’s release was good or bad, and why its mission of developing AGI may be too risky.
Benjamin
Hi Remmelt,
Just following up on this — I agree with Benjamin’s message above, but I want to add that we actually did add links to the “working at an AI lab” article in the org descriptions for leading AI companies after we published that article last June.
It turns out that these links were accidentally removed a few weeks ago while we were making some related changes in Airtable, and we didn’t notice they were missing — thanks for bringing this to our attention. We’ve added them back in and think they give good context for job board users, and we’re certainly happy for more people to read our articles.
We also decided to remove the prompt engineer / librarian role from the job board, since we concluded it’s not above the current bar for inclusion. I don’t expect everyone will always agree with the judgement calls we make about these decisions, but we take them seriously, and we think it’s important for people to think critically about their career choices.
Hi Conor,
Thank you.
I’m glad to see that you had already linked to clarifications before, and that you graciously took the feedback and removed the prompt engineer role. I feel grateful for your openness here.
It makes me feel less like I’m hitting a brick wall. We can have more of a conversation.
~ ~ ~
The rest is addressed to people on the team, and not to you in particular:
There are grounded reasons why 80k’s approach of recommending work at AGI labs – with the hope of steering their trajectory – has supported AI corporations in scaling, while disabling efforts that may actually prevent AI-induced extinction.
This concerns work on the problem you list as the #1 most pressing. It is a crucial consideration that can flip your perceived total impact from positive to negative.
I noticed that 80k staff responses so far started by stating disagreement (with my view) or agreement (with a colleague’s view).
This doesn’t do the discussion justice. It’s like responding to someone’s explicit reasons for concern by saying they must be “less optimistic about alignment”. This ends reasoned conversations rather than opening them up.
Something I would like to see more of is individual 80k staff engaging with the reasoning.
Ben, it is very questionable that 80k is promoting non-safety roles at AGI labs as ‘career steps’.
Consider that your model of this situation may be wrong (account for model error).
The upside is that you enabled some people to skill up and gain connections.
The downside is that you are literally helping AGI labs to scale commercially (as well as indirectly supporting capability research).
I did read that compilation of advice, and responded to that in an email (16 May 2023):
“Dear [a],
People will drop in and look at job profiles without reading your other materials on the website. I’d suggest just writing a do-your-research cautionary line about OpenAI and Anthropic in the job descriptions themselves.
Also suggest reviewing whether to trust advice on whether to take jobs that contribute to capability research.
Particularly advice by nerdy researchers paid/funded by corporate tech.
Particularly by computer-minded researchers who might not be aware of the limitations of developing complicated control mechanisms to contain complex machine-environment feedback loops.
Totally up to you of course.
Warm regards,
Remmelt”
This is what the article says:
“All that said, we think it’s crucial to take an enormous amount of care before working at an organisation that might be a huge force for harm. Overall, it’s complicated to assess whether it’s good to work at a leading AI lab — and it’ll vary from person to person, and role to role.”
So you are saying that people are making a decision about working for an AGI lab that might be (or actually is) a huge force for harm. And that whether it’s good (or bad) to work at an AGI lab depends on the person – i.e. people need to figure this out for themselves.
Yet you are openly advertising various jobs at AGI labs on the job board. People are clicking through and applying. Do you know how many read your article beforehand?
~ ~ ~
Even if they did read through the article, both the content and the framing of the advice seem misguided. Notice what is emphasised in your considerations.
Here are the first sentences of each consideration section:
(i.e. what readers are most likely to read, and what you might most want to convey).
“We think that a leading — but careful — AI project could be a huge force for good, and crucial to preventing an AI-related catastrophe.”
Is this your opinion about DeepMind, OpenAI and Anthropic?
“Top AI labs are high-performing, rapidly growing organisations. In general, one of the best ways to gain career capital is to go and work with any high-performing team — you can just learn a huge amount about getting stuff done. They also have excellent reputations more widely. So you get the credential of saying you’ve worked in a leading lab, and you’ll also gain lots of dynamic, impressive connections.”
Is this focussing on gaining prestige and (nepotistic) connections as an instrumental power move, with the hope of improving things later...?
Instead of on actually improving safety?
“We’d guess that, all else equal, we’d prefer that progress on AI capabilities was slower.”
Why is only this part stated as a guess?
I did not read “we’d guess that a leading but careful AI project, all else equal, could be a force for good”.
Or inversely: “we think that continued scaling of AI capabilities could be a huge force for harm.”
Notice how those framings come across very differently.
Wait, reading this section further is blowing my mind.
“But that’s not necessarily the case. There are reasons to think that advancing at least some kinds of AI capabilities could be beneficial. Here are a few”
“This distinction between ‘capabilities’ research and ‘safety’ research is extremely fuzzy, and we have a somewhat poor track record of predicting which areas of research will be beneficial for safety work in the future. This suggests that work that advances some (and perhaps many) kinds of capabilities faster may be useful for reducing risks.”
Did you just argue for working on some capabilities because it might improve safety? This is blowing my mind.
“Moving faster could reduce the risk that AI projects that are less cautious than the existing ones can enter the field.”
Are you saying we should consider moving faster because there are people less cautious than us?
Do you notice how a similarly flavoured argument can be – and probably is being – used by staff at the three leading AGI labs, which are all competing with each other?
Did OpenAI moving fast with ChatGPT prevent Google from starting new AI projects?
“It’s possible that the later we develop transformative AI, the faster (and therefore more dangerously) everything will play out, because other currently-constraining factors (like the amount of compute available in the world) could continue to grow independently of technical progress.”
How would compute grow independently of AI corporations deciding to scale up capability?
The AGI labs were buying up GPUs to the point of shortage. Nvidia was not able to supply them fast enough. How is that not getting Nvidia and other producers to increase production of GPUs?
More comments on the hardware overhang argument here.
“Lots of work that makes models more useful — and so could be classified as capabilities (for example, work to align existing large language models) — probably does so without increasing the risk of danger”
What is this claim based on?
“As far as we can tell, there are many roles at leading AI labs where the primary effects of the roles could be to reduce risks.”
As far as I can tell, this is not the case.
For technical research roles, you can go by what I just posted.
For policy, I note that you wrote the following:
“Labs also often don’t have enough staff… to figure out what they should be lobbying governments for (we’d guess that many of the top labs would lobby for things that reduce existential risks).”
I’d guess that AI corporations use lobbyists to open up markets for profit, and to avoid being actually restricted by regulations (maybe by shifting the focus to somewhere hypothetical in the future, maybe by removing upstart competitors who can’t deal with the extra compliance overhead – but don’t restrict us now!).
On priors, that is what you should expect, because that is what tech corporations do everywhere. We shouldn’t expect, on priors, that AI corporations are benevolent entities unshaped by the forces of competition. That would be naive.
~ ~ ~
After that, there is a new section titled “How can you mitigate the downsides of this option?”
That section reads as thoughtful and reasonable.
How about linking to that section from each AGI lab job listing on the job board, just above the ‘VIEW JOB DETAILS’ button?
For example, you could append and hyperlink ‘Suggestions for mitigating downsides’ to the organisational descriptions of Google DeepMind, OpenAI and Anthropic.
That would help guide potential applicants to AGI lab positions to think through their decision.
“This distinction between ‘capabilities’ research and ‘safety’ research is extremely fuzzy, and we have a somewhat poor track record of predicting which areas of research will be beneficial for safety work in the future. This suggests that work that advances some (and perhaps many) kinds of capabilities faster may be useful for reducing risks.”
This seems like an absurd claim. Are 80k actually making it?
EDIT: the claim is made by Benjamin Hilton, one of 80k’s analysts and the person the OP is replying to.
It is an extreme claim to make in that context, IMO.
I think Benjamin made it in order to be nuanced. But the nuance in that article is rather one-sided.
If anything, the nuance should be on the side of identifying any ways you might accidentally support the development of dangerous auto-scaling technologies.
First, do no harm.
Note that we are focussing here on decisions at the individual level.
There are limitations to that.
See my LessWrong comment.
I would agree with Remmelt here. While upskilling people is helpful, if those people then go on to increase the rate of capability gains at AI companies, that reduces the time the world has available to find solutions for alignment and AI regulation.
While, as a rule, I don’t disagree with industries increasing their capabilities, I do disagree when those capabilities knowingly lead to human extinction.