Thanks for the post! This seems like a clearly important and currently quite neglected area and I’d love to see more work on it.
My current hot-take is that it seems viable to make AGI research labs a sufficiently hardened target that most actors cannot exploit them. But I don’t really see a path to preventing the most well-resourced state actors from at least exfiltrating source code. There are just so many paths to this: getting insiders to defect, supply chain attacks, etc. Because of this, I suspect it’ll be necessary to get major state actors to play ball by other mechanisms (e.g. international treaties, mutually assured destruction). I’m curious if you agree or are more optimistic on this point?
I also want to note that espionage can reduce x-risk in some cases: e.g. actors may be less tempted to cut corners on safety if they have intelligence that their competitors are still far away from transformative AI. Similarly, it could be used as an (admittedly imperfect) mechanism for monitoring compliance with treaties or more informal agreements. I do still expect better infosec to be net-positive, though.
I think it’s an open question right now. I expect it’s possible with the right resources and environment, but I might be wrong. I think it’s worth treating it as an untested hypothesis (that we can secure X kind of system for Y application of resources) and trying to get more information to test that hypothesis. If AGI development is impossible to secure, that cuts off a lot of potential alignment strategies. So it seems really worth trying to find out if it’s possible.
Great overview of an important field for AI safety, thanks for sharing. A few questions if you have the time:
First, what secrets would be worth keeping? Most AI research today is open source, with methods described in detailed papers and code released on GitHub. That which is not released is often quickly reverse-engineered: OpenAI’s GPT-3 and DALL-E 2 systems, for example, both have performant open-source implementations. On the other hand, many government and military applications seemingly must be confidential.
What kinds of AI research are kept secret today, and are you happy about that? Do you expect the field to become much more secretive as we move towards AGI? In what areas is this most important?
Second, do you view infosec as a dual-use technology? That is, if somebody spends their career developing better infosec methods, can those methods be applied by malicious actors (e.g. totalitarian governments) just as easily as they can be applied by value-aligned actors? This would make sense if the key contributions were papers and inventions that the whole field can adopt. But if infosec is an engineering job that must be built individually by each organization pursuing AGI, then individuals working in the field could choose which actors they’d be willing to work with.
Finally, a short plug: I brainstormed why security engineering for nuclear weapons could be an impactful career path. The argument is that AGI’s easiest path to x-risk runs through existing WMDs such as nuclear and bio weapons, so we should secure those weapons from cyber attacks and other ways an advanced AI could take control of them. Do you think infosec for WMDs would be high expected impact? How would you change my framing?
I agree that a lot of the research today by leading labs is being published. I think the norms are slowly changing, at least for some labs. Deciding not to (initially) release the model weights of GPT-2 was a big change in norms iirc, and I think the trend towards being cautious with large language models has continued. I expect that as these systems get more powerful, and the ways they can be misused gets more obvious, norms will naturally shift towards less open publishing. That being said, I’m not super happy with where we’re at now, and I think a lot of labs are being pretty irresponsible with their publishing.
The dual-use question is a good one, I think. Offensive security knowledge is pretty dual-use, yes. Pen testers can use their knowledge to illegally hack if they want to. But the incentives in the US are pretty good regarding legal vs. illegal hacking, less so in other countries. I’m not super worried about people learning hacking skills to protect AGI systems only to use those skills to cause harm—mostly because the offensive security area is already very big / well resourced. In terms of using AI systems to create hacking tools, that’s an area where I think dual-use concerns can definitely come into play, and people should be thoughtful & careful there.
I liked your shortform post. I’d be happy to see people apply infosec skills towards securing nuclear weapons (and in the biodefense area as well). I’m not very convinced this would mitigate risk from superintelligent AI, since nuclear weapons would greatly damage infrastructure without killing everyone, and thus would not be very helpful for eliminating humans imo. You’d still need some kind of manufacturing capability in order to create more compute, and if you have the robotics capability to do that, then wiping out humans probably doesn’t take nukes; you could do it with drones or bioweapons or whatever. But this is all highly speculative, of course, and I think there is a case for securing nuclear weapons without looking at risks from superintelligence. Improving the security of nuclear weapons may increase the stability of nuclear weapons states, and that seems good for their ability to negotiate with one another, so I could see there being some route to AI existential risk reduction via that avenue.
Did you end up writing this, or have a draft of it you’d be willing to share?
There is a nice post about this at 80k: https://80000hours.org/career-reviews/information-security/
Thanks for this, Jeffrey and Lennart! Very interesting, and I broadly agree. It’s a good area for people to gain skills/expertise, and private companies should beef up their infosec to make themselves harder to hack and stop some adversaries.
However, I think it’s worth being humble/realistic. IMO a small/medium tech company (or even Big Tech) is not going to be able to stop a motivated state-linked actor from the P5. Would you broadly agree?
I don’t think an ordinary small/medium tech company can succeed at this. I think it’s possible with significant (extraordinary) effort, but that sort of remains to be seen.
As I said in another thread:
>> I think it’s an open question right now. I expect it’s possible with the right resources and environment, but I might be wrong. I think it’s worth treating it as an untested hypothesis (that we can secure X kind of system for Y application of resources) and trying to get more information to test that hypothesis. If AGI development is impossible to secure, that cuts off a lot of potential alignment strategies. So it seems really worth trying to find out if it’s possible.