One thing you can do is collect some demographic variables on non-respondents and check whether there is self-selection bias on those. You could then check whether the variables that show self-selection correlate with particular answers. Baobao Zhang and Noemi Dreksler did some of this work for the 2019 survey (see D1/page 32 here: https://arxiv.org/pdf/2206.04132.pdf ).
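As a rough illustration of the first step, here is a minimal sketch of a two-proportion z-test comparing how common a demographic trait is among respondents versus non-respondents. The function name and all counts are hypothetical and are not taken from any of the surveys; a real analysis would likely use several variables and correct for multiple comparisons.

```python
import math


def two_proportion_z_test(k_resp, n_resp, k_non, n_non):
    """Compare the share of a trait among respondents and non-respondents.

    k_resp / n_resp: respondents with the trait / all respondents.
    k_non / n_non: non-respondents with the trait / all non-respondents.
    Returns (z statistic, two-sided p-value) using the pooled standard error.
    """
    p_resp = k_resp / n_resp
    p_non = k_non / n_non
    pooled = (k_resp + k_non) / (n_resp + n_non)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_resp + 1 / n_non))
    z = (p_resp - p_non) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 160 of 400 respondents work in industry,
# versus 600 of 2000 non-respondents.
z, p = two_proportion_z_test(160, 400, 600, 2000)
print(f"z = {z:.2f}, p = {p:.5f}")
```

A small p-value here would only suggest that respondents differ from non-respondents on that variable; whether that matters depends on the second step, i.e. whether the variable also predicts answers.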
Really excited to see this!
I noticed the survey featured the MIRI logo fairly prominently. Is there a way to tell whether that caused some self-selection bias?
In the post, you say “Zhang et al ran a followup survey in 2019 (published in 2022) however they reworded or altered many questions, including the definitions of HLMI, so much of their data is not directly comparable to that of the 2016 or 2022 surveys, especially in light of large potential for framing effects observed.” Just to make sure you haven’t missed this: we had the 2016 respondents who also responded to the 2019 survey receive the exact same question they were asked in 2016, including re HLMI and milestones. (I was part of the Zhang et al team)
Hi Lexley, Good question. Kirsten’s suggestions are all great. To that, I’d add:
Try to work as a research assistant to someone who you think is doing interesting work. More so than other roles, RA roles are quite often not advertised and are set up on a more ad hoc basis. Perhaps the best route in is to read someone’s work and
Another thing you could do is to try to take a stab independently at some important-seeming question. You could e.g. pick a research question hinted at in a paper/piece (some have a section specifically with suggestions for further work), mentioned in a research agenda (e.g. Dafoe 2018), or in lists of research ideas (GovAI collated one here and Michael Aird, I think, sporadically updates this collection of lists of EA-relevant research questions).
My impression is that you can join the AGI Safety Fundamentals as an undergrad.
You could also look into the various “ERIs”: SERI, CHERI, CERI, and so on.
As for GovAI, we have in the past engaged undergrads as research assistants and I could imagine us taking on particularly promising undergrads for the GovAI Fellowship. However, overall, I expect our comparative advantage will be working with folks who either have significant context on AI governance or who have relevant experience from some other domain. It may also lie in producing writing that can help people navigate the field.
Thanks Jeffrey! I hope we’re a community where it doesn’t matter so much whether you think we suck. If you think the EA community should engage more with nuclear security issues and should do so in different ways, I’m sure people would love to hear it. I would! Especially if you’d help answer questions like: How much can work on nuclear security reduce existential risk? What kind of nuclear security work is most important from an x-risk perspective?
I’d love to hear more about what your concerns and criticisms are. For example, I’d love to know: Is the Scoblic post the main thing that’s informing your impression? Do you have views on this set of posts about the severity of a US-Russia nuclear exchange from Luisa Rodriguez (https://forum.effectivealtruism.org/s/KJNrGbt3JWcYeifLk)? Is there effective altruist funding or activity in the nuclear security space that you think has been misguided?
All things being equal, I’d recommend you publish in journals that are prestigious in your particular field (though it might not be worth the effort). In international relations / political science (which I know best) that might be e.g.: International Organization, International Security, American Journal of Political Science, PNAS.
Other journals that are less prestigious but more likely to be keen on AI governance work include: Nature Machine Intelligence, Global Policy, Journal of AI Research, AI & Society. There are also a number of conferences to consider: AIES, FAccT, workshops at big ML conferences like NeurIPS or ICML. Another thing to look out for is journals with AI governance/policy special issues.
I find that one good strategy for finding a suitable journal is looking for articles similar to what you want to publish and seeing where they’ve been published. You can then e.g. refer to those in your letter to the editors, highlighting how your work is relevant to their interests.
Overall, I think it’s not that surprising that this change is being proposed and I think it’s fairly reasonable. However, I do think it should be complemented with duties to avoid e.g. AI systems being put to high-risk uses without going through a conformity assessment, and that it should be made clear that certain parts of the conformity assessment will require changes on the part of the producer of a general system if that’s used to produce a system for a high-risk use.
In more detail, my view is that the following changes should be made:
Goal 1: Avoid general systems being used without the appropriate regulatory burdens kicking in. There are two kinds of cases one might worry about:
(i) general systems might make it easier to produce a system that should either be covered by the transparency requirements (e.g. if your system is a chatbot, you need to tell the user that) or the high-risk requirements, leading to more such systems being put on the market without them being registered.
Proposed solution: Make it the case that providers of general systems must do certain checks on how their model is being used and whether it is being used for high risk uses without that AI system having been registered or having gone through the conformity assessment. Perhaps this would be done by giving the market surveillance authorities (MSAs) the right to ask providers of general models about certain information about how the model is being used. In practice, it could look as follows: the provider of the general system could have various ways to try to detect whether someone is using their system for something high risk (companies like OpenAI are already developing tools and systems to do this). If they detect such a use, they are required to check that against the database of high risk AI systems deployed on the EU market. If there’s a discrepancy, they must report it to the MSA and share some of the relevant information as evidence.
(ii) There’s a chance that individuals using general systems for high-risk uses without placing anything on the market will not be covered by the regulation. That is, as the regulation is currently designed, if a company were to use public CCTV footage to assess the number of women vs. men walking down a street, I believe that would be a high risk use. But if an individual does it, it might not count as a high risk use because nothing is placed on the market. This could end up being an issue, especially if word about these kinds of use cases spreads. Perhaps a more compelling example would be people starting to use large language models as personal chat bots. The proposed regulation wouldn’t require the provider of the LLM to add any warnings about how this is simply a chatbot, even if the user starts e.g. using it as a therapist or for medical advice.
Proposed solution: My guess is that the solution is that the provision suggested above is expanded to also look for individuals using the systems for high risk or limited risk uses and that they are required to stop such use.
Goal 2: (perhaps most important) Try to make it the case that crucial and appropriate parts of the conformity assessment will require changes on the part of the producer of the general system.
This could be done by e.g. making it the case that the technical documentation requires information that only the producer of the general model would have. It would plausibly already be the case with regards to the data requirements. It would also plausibly be the case regarding robustness. It seems worth making sure of those things. I don’t know if that’s a matter of changing the text of the legislation itself or about how the legislation will end up being interpreted.
One way to make sure that this is the case is to require that deployers only use general models that have gone through a certification process or that have also passed the conformity assessment (or perhaps a lighter version). I’m currently excited about the latter.
Why am I not excited about something more onerous on the part of the provider of the general system?
Introducing requirements on all general systems that can be used on the EU market seems hugely onerous to me. So much so that it would probably be a bad idea. I think that companies could fairly easily go from offering a general system on the EU market to offering a general-system-that-you’re-not-allowed-to-use-for-high-risk-uses. This could for example be done by adjusting the terms and conditions (OpenAI’s API usage guidelines already disallow most if not all high-risk uses as defined in the AI Act) or writing in big font somewhere “Not intended for high-risk uses as defined by the EU’s AI Act”. I worry that introducing requirements on general systems en masse would lead to that being the default response and that it wouldn’t deliver much benefit beyond what we’d get if the changes I gestured at above were made.
We’ve now relaunched. We wrote up our current principles with regards to conflicts of interest and governance here: https://www.governance.ai/legal/conflict-of-interest. I’d be curious if folks have thoughts, in particular @ofer.
Thanks for the post! I was interested in what the difference between “Semiconductor industry amortize their R&D cost due to slower improvements” and “Sale price amortization when improvements are slower” is. Would the decrease in price stem from the decrease in cost as companies no longer need to spend as much on R&D?
Thanks! What happens to your doubling times if you exclude the outliers from efficient ML models?
I really appreciated the extension of “AI and Compute”. Do you have a sense of the extent to which the difference between your doubling-time estimate and that of “AI and Compute” stems from differences in selection criteria vs new data since its publication in 2018? Have you done analysis on what the trend looks like if you only include data points that fulfil their inclusion criteria?
For reference, it seems like their criteria are “… results that are relatively well known, used a lot of compute for their time, and gave enough information to estimate the compute used.” Whereas yours are “important publication within the field of AI OR lots of citations OR performance record on common benchmark”. The “… used a lot of compute for their time” criterion would probably do a whole lot of work to select data points that will show a faster doubling time.
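To make the kind of check I’m asking about concrete, here is a minimal sketch of how one might recompute a doubling time on a subset of data points. It assumes the doubling time is estimated as the inverse slope of an ordinary least-squares fit of log2(compute) against year, which may not be exactly what either analysis did. The function name and every data point are hypothetical and are not drawn from either dataset.

```python
import math


def doubling_time_years(points):
    """Estimate the doubling time from (year, compute) pairs.

    Fits log2(compute) = a + b * year by ordinary least squares
    and returns 1 / b, i.e. the number of years per doubling.
    """
    xs = [year for year, _ in points]
    ys = [math.log2(compute) for _, compute in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # doublings per year
    return 1 / slope


# Hypothetical systems: (year, training compute in FLOP, low-compute outlier?)
systems = [
    (2012, 1.0e15, False),
    (2013, 4.0e15, False),
    (2014, 1.6e16, False),
    (2015, 6.4e16, False),
    (2015, 1.0e15, True),  # an "efficient" model using little compute for its time
]

all_points = [(year, flop) for year, flop, _ in systems]
filtered = [(year, flop) for year, flop, outlier in systems if not outlier]

print("all points:       ", doubling_time_years(all_points))
print("outliers excluded:", doubling_time_years(filtered))
```

In this toy data, dropping the late low-compute point shortens the estimated doubling time, which is the direction of effect I’d expect from a “used a lot of compute for their time” criterion.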
Thanks for this! I really look forward to seeing the rest of the sequence, especially on the governance bits.
Came here to say the same thing :)
Thanks for the question. I agree that managing these kinds of issues is important and we aim to do so appropriately.
GovAI will continue to do research on regulation. To date, most of our work has been fairly foundational, though the past 1-2 years have seen an increase in research that may provide some fairly concrete advice to policymakers. This is primarily because the field is maturing, policymakers are increasingly seeking to put in place AI regulation, and some folks at GovAI have had an interest in pursuing more policy-relevant work.
My view is that most of our policy work to date has been fairly (small c) conservative and has seldom passed judgment on whether there should be more or less regulation or praised specific actors. You can sample some of that previous work here:
We’re not yet decided on how we’ll manage potential conflicts of interest. Thoughts on what principles to adopt are welcome. Below is a subset of things that are likely to be put in place:
We’re aiming for a board that does not have a majority of folks from any of: industry, policy, academia.
Allan will be the co-lead of the organisation. We hope to be able to announce others soon.
Whenever someone has a clear conflict of interest regarding a candidate or a piece of research – say we were to publish a ranking of how responsible various AI labs were being – we’ll have the person recuse themselves from the decision.
For context, I expect most folks who collaborate with GovAI to not be directly paid by GovAI. Most folks will be employed elsewhere and not closely line managed by the organization.
Thanks! I agree that using a term like “socially beneficial” might be better. On the other hand, it might be helpful to couch self-governance proposals in terms of corporate social responsibility, as it is a term already in wide use.
Some brief thoughts (just my quick takes. My guess is that others might disagree, including at GovAI):
Overall, I think the situation is quite different compared to 2018, when I think the talk was recorded. AI governance / policy issues are much more prominent in the media, in politics, etc. The EU Commission has proposed some pretty comprehensive AI legislation. As such, there’s more pressure on companies as well as governments to take action. I think there’s also better understanding of what AI policy is sensible. All these things update me against 1 (insofar as we are still in the formative stages) and 2. They also update me in favour of thinking something like: governments will want to take a bunch of actions related to AI and so we should try to steer those actions in positive directions.
I think the AI policy / governance field is mature enough at this point that it’s not that helpful to think of an AI governance regime as one unitary thing. I much prefer thinking about specific areas of AI governance. Depending on the area, I’d likely have different views on 1-3. For example, it seems likely that companies are best placed to help develop standards that may be used to inform legislation further down the line. I wouldn’t expect companies to be best placed to figure out what the US should do wrt updates to antitrust regulation.
On 3, I think it’s true that companies have incentives in favour of acting prosocially and that we can boost these incentives. I’m not sure those incentives outweigh their other incentives, though. The view is not that e.g. Facebook, Amazon, Google, are all-things-considered going to act in the public interest. I also don’t think Jade-2018 held that view.
Happy to give my view. Could you say something about what particular views or messages you’re curious about? (I don’t have time to reread the script atm)