Matthijs Maas
Senior Research Fellow (Law & Artificial Intelligence), Legal Priorities Project
Research Affiliate, Centre for the Study of Existential Risk.
https://www.matthijsmaas.com/ | https://linktr.ee/matthijsmaas
This is awful—Nathan was such an engaging and bright scholar, generous with his comments and insights. I had been hoping to see much more of his work in this field. Thank you for sharing this.
To some extent, I'd prefer not to anchor people too much before finishing the entire sequence. I'll aim to circle back later and reflect more deeply on my own commitments. In fact, one reason I'm doing this project is that I notice I have rather large uncertainties over these different theories myself, and want to think through their assumptions and tradeoffs.
Still, while I'll go into more detail later, I think it's fair to provide some disclaimers about my own preferences, for those who wish to know them before going in:
[preferences below the break]
… … … …
TLDR: my current (weakly held) perspective is something like: "(a) by default, pursue a portfolio approach consisting of interventions from the Exploratory, Prosaic Engineering, Path-setting, Adaptation-enabling, Network-building, and Environment-shaping perspectives; (b) under extremely short timelines and reasonably good alignment chances, switch to Anticipatory and Pivotal Engineering; (c) under extremely low alignment success probability, switch to Containing."
This seems grounded in a set of predispositions / biases / heuristics that are something like:
Given that I have quite a lot of uncertainty about key (technical and governance) parameters, I'm hesitant to commit to any one perspective and prefer portfolio approaches. That means I lean towards strategic perspectives that are more information-providing (Exploratory), more robustly compatible with and supportive of many others (Network-building, Environment-shaping), and/or more option-preserving and flexible (Adaptation-enabling). Conversely, for the same reasons, I may have less affinity for perspectives that potentially recommend far-reaching, hard-to-reverse actions under conditions of limited information (Pivotal Engineering, Containing, Anticipatory).
My academic and research background (governance; international law) probably gives me a bias towards the more explicitly 'regulatory' perspectives (Anticipatory, Path-setting, Adaptation-enabling), especially in their multilateral versions (Coalitional), and a bias against perspectives that focus more exclusively on the technical side alone (e.g. both Engineering perspectives), pursue more unilateral actions (Pivotal Engineering, Partisan), or seek to completely break with or go beyond existing systems (System-changing).
There are some perspectives (Adaptation-enabling, Containing) that have remained relatively underexplored within our community. While I personally am not yet convinced that there’s enough ground to adopt these as major pillars for direct action, from an Exploratory meta-perspective I am eager to see these options studied in more detail.
I am aware that under very short timelines, many of these perspectives fall away or begin looking less actionable;
[ED: I probably ended up being more explicit here than I intended to; I'd be happy to discuss these predispositions, but would also prefer to keep discussion of specific approaches concentrated in the perspective-specific posts (coming soon).]
Thanks Nuño! I don't think I have well-thought-out views on the relative importance or ranking of these work streams; I'm mostly focused on understanding scenarios in which my own work might be more or less impactful. (I should also note that if some lines of research mentioned here seem much more impactful, that may be more a result of my being more familiar with them, and so being able to give a more detailed account of what the research is trying to get at and what threat models and policy goals it connects to.)
On your second question: as with other academic institutes, I believe it is both doable and common for donors or funders to support some of CSER's themes or lines of work but not others. Institutional funders (e.g. for large academic grants) will often focus on particular themes or risks (rather than, say, 'x-risk' as a general class), and therefore want to ensure their funding goes to just that work. The same has been the case, I believe, for individual donations in support of certain projects we've done.
[ED: see link to CSER donation form. Admittedly, this web form doesn't clearly allow you to specify different lines of work to support, but in practice this could be arranged in a bespoke way, by sending an email to director@cser.cam.ac.uk indicating what area of work one would like to support.]
+1 to this proposal and focus.
On 'technical levers to make AI coordination/regulation enforceable': there is a fair amount of work suggesting that arms control agreements have often depended on, or been enabled by, new technological avenues for unilateral monitoring, or for cooperative but non-intrusive monitoring (e.g. sensors at missile factories as part of the US-USSR INF Treaty) (see Coe and Vaynman 2020).
That doesn't mean such technology is always an unalloyed good: there are indeed cases where new monitoring capabilities can introduce new security or escalation risks (e.g. Vaynman 2021), and they can also perversely hold up negotiations. For instance, Richard Burns (link, introduction) discusses a case where the involvement of engineers in designing a monitoring system for the Comprehensive Test Ban Treaty actually held up negotiation of the regime, essentially because the engineers focused excessively on the technical perfection of the monitoring system (beyond the level of assurance the contracting parties strictly required politically), which enabled opponents of the treaty to paint it as not offering sufficiently good guarantees.
Still, beyond improving enforcement, there’s interesting work on ways that AI technology could speed up and support the negotiation of treaty regimes (Deeks 2020, 2020b, Maas 2021), both for AI governance specifically, and in supporting international cooperation more broadly.
The Legal Priorities Project's research agenda also includes consideration of s-risks, alongside x-risks and other types of trajectory change, though I do agree this remains somewhat under-integrated with other parts of the longtermist AI governance landscape (in part, I speculate, because the perspective might face [even] more inferential distance from the concerns of AI policymakers than x-risk-focused work does).
Thanks for the overview! You might also be interested in this (forthcoming) report and lit review: https://docs.google.com/document/d/12AoyaISpmhCbHOc2f9ytSfl4RnDe5uUEgXwzNJhF-fA/edit?usp=drivesdk
Thanks for collating this, Zach! Just to note, my ‘TAI Governance: a Literature Review’ is publicly shareable—but since we’ll be cleaning up the main doc as a report the coming week, could you update the link to this copy? https://docs.google.com/document/d/1CDj_sdTzZGP9Tpppy7PdaPs_4acueuNxTjMnAiCJJKs/edit#heading=h.5romymfdade3
A few additional papers that look into this topic, that might be of interest: https://dl.acm.org/doi/10.1145/3278721.3278766
https://www.mdpi.com/2409-9287/6/3/53
https://www.tandfonline.com/doi/abs/10.1080/13523260.2019.1576464?journalCode=fcsp20
And (more narrowly focused on NAT in LAWS) https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3161446
(apologies for very delayed reply)
Broadly, I’d see this as:
‘anticipatory’ if it is directly tied to a specific policy proposal or project we want to implement (‘we need to persuade everyone of the risk, so they understand the need to implement this specific governance solution’),
‘environment-shaping’ (aimed at shaping key actors’ norms and/or perceptions), if we do not have a strong sense of what policy we want to see adopted, but we would like to inform these actors to come up with the right choices themselves, once convinced.
Thanks for this post, I found it very interesting.
There's more I'd like to write after reflection, but briefly, on further possible scenario variables, on either the technical or governance side: I'm working out a number of these here https://docs.google.com/document/d/1Mlt3rHcxJCBCGjSqrNJool0xB33GwmyH0bHjcveI7oc/edit# , and would be interested to discuss.
NC3 early warning systems are susceptible to error signals, and the chain of command hasn't always been very secure (and may not be today), so it wouldn't necessarily be that hard for a relatively unsophisticated AGI to spoof signals and trigger a nuclear war:* certainly easier than many other avenues that would involve cracking scientific problems.
(*which is a different matter from hacking to the level of "controlling" the arsenal and being able to retarget it at will; that would probably require more advanced capabilities, at which point the risk from the nuclear avenue might be redundant compared to risks from other, direct avenues.)
Incidentally, at CSER I've been working with co-authors on a draft chapter that explores "military AI as cause or compounder of global catastrophic risk", and one of the avenues also involves discussion of what we call "weapons/arsenal overhang", so this is an interesting topic that I'd love to discuss more.