aog comments on Information security considerations for AI and the long term future

aog 3 May 2022 17:00 UTC
10 points
Great overview of an important field for AI safety, thanks for sharing. A few questions if you have the time:
First, what secrets would be worth keeping? Most AI research today is open source, with methods described in detailed papers and code released on GitHub. That which is not released is often quickly reverse-engineered: OpenAI’s GPT-3 and DALL-E 2 systems, for example, both have performant open-source implementations. On the other hand, many government and military applications seemingly must be confidential.
What kinds of AI research are kept secret today, and are you happy about it? Do you expect the field to become much more secretive as we move towards AGI? In what areas is this most important?
Second, do you view infosec as a dual-use technology? That is, if somebody spends their career developing better infosec methods, can those methods be applied by malicious actors (e.g. totalitarian governments) just as easily as they can be applied by value-aligned actors? This would make sense if the key contributions were papers and inventions that the whole field can adopt. But if infosec is engineering work that each organization pursuing AGI must do for itself, then individuals working in the field could choose which actors they’d be willing to work with.
Finally, a short plug: I brainstormed why security engineering for nuclear weapons could be an impactful career path. The argument is that AGI’s easiest path to x-risk runs through existing WMDs such as nuclear and bio weapons, so we should secure those weapons from cyber attacks and other ways an advanced AI could take control of them. Do you think infosec for WMDs would be high expected impact? How would you change my framing?
- Jeffrey Ladish 5 May 2022 5:33 UTC
  5 points
  I agree that a lot of the research today by leading labs is being published. I think the norms are slowly changing, at least for some labs. Deciding not to (initially) release the model weights of GPT-2 was a big change in norms iirc, and I think the trend towards being cautious with large language models has continued. I expect that as these systems get more powerful, and the ways they can be misused get more obvious, norms will naturally shift towards less open publishing. That being said, I’m not super happy with where we’re at now, and I think a lot of labs are being pretty irresponsible with their publishing.
  
  The dual-use question is a good one, I think. Offensive security knowledge is pretty dual-use, yes. Pen testers can use their knowledge to illegally hack if they want to. But the incentives in the US are pretty good regarding legal vs. illegal hacking, less so in other countries. I’m not super worried about people learning hacking skills to protect AGI systems only to use those skills to cause harm—mostly because the offensive security area is already very big / well resourced. In terms of using AI systems to create hacking tools, that’s an area where I think dual-use concerns can definitely come into play, and people should be thoughtful & careful there.
  
  I liked your shortform post. I’d be happy to see people apply infosec skills towards securing nuclear weapons (and in the biodefense area as well). I’m not very convinced this would mitigate risk from superintelligent AI, since nuclear weapons would greatly damage infrastructure without killing everyone, and thus not be very helpful for eliminating humans imo. You’d still need some kind of manufacturing capability in order to create more compute, and if you have the robotics capability to do this then wiping out humans probably doesn’t take nukes; you could do it with drones or bioweapons or whatever. But this is all highly speculative, of course, and I think there is a case for securing nuclear weapons without looking at risks from superintelligence. Improving the security of nuclear weapons may increase the stability of nuclear weapons states, and that seems good for their ability to negotiate with one another, so I could see there being some route to AI existential risk reduction via that avenue.