CSET report on AI and Compute does not acknowledge Alignment
It was disappointing to see that in this recent report by CSET, the default (mainstream) assumption that continued progress in AI capabilities is important was never questioned. Indeed, AI alignment/safety/x-risk is not mentioned once, and all the policy recommendations are to do with accelerating/maintaining the growth of AI capabilities! This coming from an org that OpenPhil has given over $50M to set up.
UK Gov consultation on AI closing in ~36 hours: https://www.gov.uk/government/publications/ai-regulation-a-pro-innovation-approach. I’m limited on time, but I dashed off a very quick 2-min response just answering the couple of questions in the “foundation models” section (stressing need for moratorium on further development, and global limits on data and compute to ensure that models larger than GPT-4 aren’t trained).
Who else thinks we should be aiming for a global moratorium on AGI research at this point? I’m considering ending every comment I make with “AGI research cessandum est”, or “Furthermore, AGI research must be stopped”.
Strong agreement that a global moratorium would be great.
I’m unsure if aiming for a global moratorium is the best thing to aim for rather than a slowing of the race-like behaviour—maybe a relevant similar case is whether to aim directly for the abolition of factory farms or just incremental improvements in welfare standards.
This post from last year, “What an actually pessimistic containment strategy looks like”, has some good discussion on the topic of slowing down AGI research.
Loudly and publicly calling for a global moratorium should have the effect of slowing down race-like behaviour, even if it is ultimately unsuccessful. We can at least buy some more time; it’s not all or nothing in that sense. And more time can be used to buy yet more time, etc.
Factory farming is an interesting analogy, but the trade-off is different. You can think about whether abolitionism or welfarism has higher EV over the long term, but the stakes aren’t literally the end of the world if factory farming continues to gain power for 5-15 more years (i.e. humanity won’t end up in them).
The linked post is great, thanks for the reminder of it (and good to see it so high up the All Time top LW posts now). Who wants to start the institution lc talks about at the end? Who wants to devote significant resources to working on convincing AGI capabilities researchers to stop?
Isn’t it possible that calling for a complete stop to AI development actually counterfactually speeds up AI development?
The scenario I’m thinking of is something like:
There’s a growing anti-AI movement calling for a complete stop
A lot of people in that movement are ignorant about AI, and about the nature of AI risks
It’s therefore easy for pro-AI people to dismiss these concerns, because the reasons given for the stop are in fact wrong/bad
Any other, well-grounded calls for AI slowdown aren’t given the time of day, because they are assumed to be the same as the others
Rather than thoughtful debate, the discourse turns into just attacking the other group
I’m not sure how exactly you’re proposing to advocate for a complete stop, but my worry would be that coming across as alarmist and not being able to give compelling specific reasons that AI poses a serious existential threat would poison the well.
I think it’s great that you’re trying to act seriously on your beliefs, Greg, but I am worried about a dynamic like this.
Well, I’ve articulated what I think are compelling, specific reasons that AI poses a serious existential threat in my new post: AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now. Hope this can positively impact the public discourse toward informed debate. (And action!)
Yes, thank you for that! I’m probably going to write an object level comment there.
[Edit: I tweeted this]
Really great to see the FLI open letter with some big names attached, so soon after posting the above. Great to see some sense prevailing on this issue at a high level. This is a big step in pushing the global conversation on AGI forward toward a much needed moratorium. I’m much more hopeful than I was yesterday! But there is still a lot of work to be done in getting it to actually happen.
GPT-4 is advanced enough that it will be used to meaningfully speed up the development of GPT-5. If GPT-5 can make GPT-6 on its own, it’s game over.
I don’t see how we could implement a moratorium on AGI research that does stop capabilities research but doesn’t stop alignment research?
Cap the model size and sophistication somewhere near where it is now? Seems like there’s easily a decade worth of alignment research that could be done on current models (and other theoretical work), which should be done before capabilities are advanced further. A moratorium would help bridge that gap. Demis Hassabis has talked about hitting the pause button as we get closer to the “grey zone”. Now is the time!
A variant on your proposal could be a moratorium on training new large models (e.g. OpenAI would be forbidden from training GPT-5).
That would be more enforceable, because you need lots of compute to train a new model. I don’t know how we would stop an academic thinking up new ideas on how to structure AI models better, and even if we could, it would be hard to disentangle this from alignment research.
It would probably achieve most of what you want. For someone who’s worried about short timelines, reducing the scope for the scaling hypothesis to apply is probably pretty powerful, at least in the short term.
Interesting, yes, such a moratorium on training new LLMs could help. But we also need to make the research morally unacceptable. I think stigmatisation of AGI capabilities research could go a long way. No one is working on human genetic enhancement or cloning, mainly because of the taboos around them. It’s not like there is a lot of underground research there. (I’m thinking this is needed because any limits on compute that are imposed could easily be got around.)
A limit on compute designed to constrain OpenAI, Anthropic, or Google from training a new model sounds like a very high bar. I don’t understand why that could easily be got around?
Spoofing accounts to combine several of them (as in the Clippy story linked, but I’m imagining humans doing it). The kind of bending of the rules that happens when something is merely regulated but not taboo. It’s not just Microsoft and Google we need to worry about. If the techniques and code are out there (open source models are not far behind cutting edge research), many actors will be trying to run them at scale.
These will still be massive, and massively expensive, training runs though—big operations that will constitute very big strategic decisions only available to the best-resourced actors.
In the post-AutoGPT world, this seems like it will no longer be the case. There is enough fervour among AGI accelerationists that the required resources could be quickly amassed by crowdfunding (cf. crypto projects raising similar amounts to those needed).
Yes, but they will become increasingly cheap. A taboo is far stronger than regulation.
Shorthand for hedging statements?
A lot of EA writing contains many hedging statements (along the lines of “I’m uncertain about this, but”, “my best guess is”, “this is potentially a good idea”, “it might be good”, “I’m tentatively going to say”, “I’m not confident in this assertion, but”, “I’m unsure of the level of support/evidence base for this” etc).
To make things more concise, perhaps [ha!] a shorthand could be developed, where (rough) probabilities are given for statements. Maybe [haha] it could take the form of a subscript with a number, with the statements bounded by apostrophes (’), except the apostrophes are also subscript. To be as minimal as possible, the numbers could be [lol] written as 9 for 0.9 or 90%, 75 for 0.75 or 75%, 05 for 0.05 or 5%, 001 for 0.001 or 0.1% etc (basically just taking the decimal probability and omitting the decimal point). Footnotes could be added for explanations where appropriate.
Maybe the statements (or numbers) could be colour coded for ease of spotting whether something is regarded as highly likely or highly unlikely, or somewhere in the middle. Although maybe all of this will disrupt the flow of reading too much?
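For anyone who wants to play with the idea, here is a minimal sketch of the number shorthand in Python. The function names (`encode`, `decode`, `band`) and the 0.2 / 0.8 cut-offs for the colour bands are my own assumptions for illustration and are not part of the proposal above; the only rule taken from it is "take the decimal probability and omit the decimal point".

```python
from decimal import Decimal


def encode(p) -> str:
    """Turn a probability such as 0.75 (or "0.75") into the shorthand "75"."""
    d = Decimal(str(p))
    if not (0 < d < 1):
        # The shorthand drops the leading "0.", so 0 and 1 have no encoding.
        raise ValueError("shorthand only covers probabilities strictly between 0 and 1")
    # normalize() strips trailing zeros; the "f" format avoids exponent notation.
    text = format(d.normalize(), "f")  # e.g. "0.05"
    return text.split(".")[1]  # keep only the digits after the point


def decode(code: str) -> Decimal:
    """Turn the shorthand "05" back into the probability 0.05."""
    if not (code.isascii() and code.isdigit()):
        raise ValueError("shorthand must consist of digits only")
    return Decimal("0." + code)


def band(p: Decimal) -> str:
    """Rough bucket for colour coding; the 0.2 and 0.8 thresholds are hypothetical."""
    if p < Decimal("0.2"):
        return "unlikely"
    if p > Decimal("0.8"):
        return "likely"
    return "middle"


for prob in ("0.9", "0.75", "0.05", "0.001"):
    code = encode(prob)
    print(prob, "->", code, "->", band(decode(code)))
```

Note that the leading zeros carry information ("05" is 5%, "5" is 50%), so the shorthand has to be kept as text rather than stored as a number.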
Words of estimative probability from the intelligence world is a related concept.
The problem with these is getting everyone on the same page about what the words mean. I recall Toby Ord, in The Precipice, not liking the IPCC’s use of them for this reason.
I agree that words are quite imprecise and usually having numbers is superior.
AGI x-risk timelines: 10% chance (by year X) estimates should be the headline, not 50%.
Given the stakes involved (the whole world/future light cone), we should regard timelines of ≥10% probability of AGI in ≤10 years as crunch time and, given that there is already an increasingly broad consensus around this[1], be treating AGI x-risk as an urgent immediate priority (not something to mull over leisurely as part of a longtermist agenda).
Of course it’s not just time to AGI that is important. It’s also P(doom|AGI & alignment progress). I think most people in AI Alignment would regard this as >50% given our current state of alignment knowledge and implementation[2].
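As a rough illustration of how the two numbers combine, here is a back-of-the-envelope sketch. It treats the lower bounds quoted in this post as point values, which is an assumption made purely for the arithmetic, and the variable names are just for illustration.

```python
# Lower-bound figures from the post, treated as point values (an assumption).
p_agi_within_10_years = 0.10  # ">=10% probability of AGI in <=10 years"
p_doom_given_agi = 0.50       # ">50%" given current alignment knowledge

# P(doom within 10 years) >= P(AGI within 10 years) * P(doom | AGI)
p_doom_within_10_years = p_agi_within_10_years * p_doom_given_agi

print(f"Implied chance of doom within 10 years: at least {p_doom_within_10_years:.0%}")
```

On these inputs the implied risk is at least 5% within the decade; anyone with different estimates can swap in their own numbers.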
To borrow from Stuart Russell’s analogy: if there was a 10% chance of aliens landing in the next 10-15 years[3], we would be doing a lot more than we are currently doing[4]. AGI is akin to an alien species more intelligent than us that is unlikely to share our values.
[1] Note that Holden Karnofsky’s all-things-considered (and IMO conservative) estimate for the advent of AGI is >10% chance in (now) 14 years. Anecdotally, the majority of people I’ve spoken to on the current AGISF course have estimates of a 10% chance within 10 years or less.
[2] Correct me if you think this is wrong; would be interesting to see a recent survey on this. Maybe there is more optimism factoring in extra progress before the advent of AGI.
[3] This is different to the original analogy, which was an email saying: “People of Earth: We will arrive on your planet in 50 years. Get ready.” Say astronomers spotted something that looked like a spacecraft, heading in approximately our direction, and estimated there was a 10% chance that it was indeed a spacecraft heading to Earth.
[4] Although perhaps we wouldn’t. Maybe people would endlessly argue about whether the evidence is strong enough to declare a >10% probability. Or flatly deny it.
I agree with this, and think maybe this should just be a top-level post
Done :)
[Half-baked global health idea based on a conversation with my doctor: earlier cholesterol checks and prescription of statins]
I’ve recently found out that I’ve got high (bad) cholesterol, and have been prescribed statins. What surprised me was that my doctor said that they normally wait until the patient has a 10% chance of heart attack or stroke in the next 10 years before they do anything(!) This seems crazy in light of the amount of resources put into preventing things with similar (or lower) risk profiles, such as Covid, or road traffic accidents. Would reducing that to, say, 5%* across the board (i.e. worldwide), be a low-hanging fruit? Say by adjusting things set at a high level. Or have I just got this totally wrong? (I’ve done ~zero research, apart from searching givewell.org for “statins”, from which I didn’t find anything relevant).
*My risk is currently at 5%, and I was pro-active about getting my blood tested.
Romeo Stevens writes about cholesterol here.
Companies like thriva.co offer cheap at-home lipid tests.
Here are a few recent papers on new drugs:
https://academic.oup.com/eurjpc/article/28/11/1279/5898664
https://www.sciencedirect.com/science/article/pii/S0735109721061131?via%3Dihub
Cardiovascular disease is on the rise in emerging economies, so maybe it’d be competitive in the future.
Saturated fat seems to be a main culprit:
https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD011737.pub3/full
Public health interventions might be a fat tax:
https://en.wikipedia.org/wiki/Fat_tax
https://www.cochranelibrary.com/cdsr/doi/10.1002/14651858.CD012415.pub2/full
Or donating to the Good Food Institute on human health grounds.
Surprisingly, globally, high cholesterol might kill 4m per year, 50% of them in emerging economies. I think OPP is looking into air pollution, which kills 7m per year, so maybe this is indeed something to look into.
Thanks for sharing. I’m adding this to my potential research agenda, kept here: https://airtable.com/shrGF5lAwSZpQ7uhP/tblJR9TaKLT41AoSL and https://airtable.com/shrQdonZuU20cpGR4
My petition to UK Parliament: Seek a global moratorium on development of AI technology due to extinction risk. If you agree, please share and sign (open to UK citizens). Let’s build momentum ahead of Rishi Sunak’s AI summit! (Tweet)
CEEALAR is hiring for a full-time Operations Manager (again), please share with anyone you think may be interested: https://ceealar.org/job-operations-manager/
Blackpool, UK. To start mid-late September. £31,286 – £35,457 per year (40 hours a week). Applications due by 31st August.
CEEALAR is hiring for a full-time Community Manager, please share with anyone you think may be interested: https://ceealar.org/job-community-manager/
To start mid-late September. £31,286 – £35,457 per year (full time, 40 hours a week).
CEEALAR is hiring
Job: Operations Manager – We are seeking a part-time Operations Manager to begin late September. Deadline to apply: 16th August.
Expression of Interest: Executive Director – We are hoping to hire an Executive Director to start around November. While we await funding, we are seeking expressions of interest in the role. Tentative deadline for expressing interest: end of September.
Voluntary role: Trustee – We are seeking an additional trustee for CEEALAR. Applications will close on 30th September.
CEEALAR is hiring for a full-time Operations Manager, please share with anyone you think may be interested: https://ceealar.org/job-operations-manager
To start mid-late February. £31,286 – £35,457 per year (full time, 40 hours a week).
CEEALAR is hiring for a full-time Operations Manager, please share with anyone you think may be interested: https://ceealar.org/job-operations-manager/ Blackpool, UK.
To start mid-late July. £31,286 – £35,457 per year (40 hours a week).