AI risk

Last edit: 20 Jan 2023 13:49 UTC by Will Howard

An AI risk is a catastrophic or existential risk arising from the creation of advanced artificial intelligence (AI).

Developments in AI have the potential to enable people around the world to flourish in hitherto unimagined ways. Such developments might also give humanity tools to address other sources of risk.

At the same time, AI poses risks of its own. AI systems sometimes behave in ways that surprise people. At present, such systems are usually narrow in their capabilities: they may be excellent at Go, or at minimizing power consumption in a server facility, but they cannot do other tasks. If people designed a machine intelligence that was a sufficiently good general reasoner, or even a better general reasoner than humans, it might become difficult for human agents to interfere with its functioning. If such a system then behaved in ways that did not reflect human values, it could use its intellectual superiority to gain a decisive strategic advantage; and if its goals were incompatible with human flourishing, it could pose an existential risk.

Note that AI could pose an existential risk without being sentient, gaining consciousness, or having any ill will towards humanity.

Further reading

Bostrom, Nick (2014) Superintelligence: Paths, Dangers, Strategies, Oxford: Oxford University Press.
Offers a detailed analysis of risks posed by AI.

Christiano, Paul (2019) What failure looks like, LessWrong, March 17.

Dewey, Daniel (2015) Three areas of research on the superintelligence control problem, Global Priorities Project, October 20.
Provides an overview of AI risk and suggested further reading.

Karnofsky, Holden (2016) Potential risks from advanced artificial intelligence: the philanthropic opportunity, Open Philanthropy, May 6.
Explains why the Open Philanthropy Project regards risks from AI as an area worth exploring.

Dai, Wei & Daniel Kokotajlo (2019) The main sources of AI risk?, AI Alignment Forum, March 21.
An attempt to list all the significant sources of AI risk.

Related entries

AI alignment | AI governance | AI forecasting | AI safety | instrumental convergence thesis | orthogonality thesis

Prevent­ing an AI-re­lated catas­tro­phe—Prob­lem profile

Benjamin Hilton29 Aug 2022 18:49 UTC
132 points
17 comments4 min readEA link
(80000hours.org)

Without spe­cific coun­ter­mea­sures, the eas­iest path to trans­for­ma­tive AI likely leads to AI takeover

Ajeya18 Jul 2022 19:07 UTC
215 points
12 comments75 min readEA link
(www.lesswrong.com)

AGI Ruin: A List of Lethalities

EliezerYudkowsky6 Jun 2022 23:28 UTC
160 points
55 comments30 min readEA link
(www.lesswrong.com)

Re­sources I send to AI re­searchers about AI safety

Vael Gates11 Jan 2023 1:24 UTC
42 points
0 comments1 min readEA link

AI Could Defeat All Of Us Combined

Holden Karnofsky10 Jun 2022 23:25 UTC
142 points
13 comments17 min readEA link

A cen­tral AI al­ign­ment prob­lem: ca­pa­bil­ities gen­er­al­iza­tion, and the sharp left turn

So8res15 Jun 2022 14:19 UTC
51 points
2 comments7 min readEA link

Katja Grace: Let’s think about slow­ing down AI

peterhartree23 Dec 2022 0:57 UTC
81 points
7 comments2 min readEA link
(worldspiritsockpuppet.substack.com)

Draft re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_Carlsmith28 Apr 2021 21:41 UTC
87 points
34 comments1 min readEA link

My Most Likely Rea­son to Die Young is AI X-Risk

AISafetyIsNotLongtermist4 Jul 2022 15:34 UTC
232 points
62 comments4 min readEA link
(www.lesswrong.com)

AI Timelines: Where the Ar­gu­ments, and the “Ex­perts,” Stand

Holden Karnofsky7 Sep 2021 17:35 UTC
83 points
3 comments11 min readEA link

AI ethics: the case for in­clud­ing an­i­mals (my first pub­lished pa­per, Peter Singer’s first on AI)

Fai12 Jul 2022 4:14 UTC
76 points
4 comments1 min readEA link
(link.springer.com)

AI Risk is like Ter­mi­na­tor; Stop Say­ing it’s Not

skluug8 Mar 2022 19:17 UTC
184 points
44 comments10 min readEA link
(skluug.substack.com)

AGI Safety Fun­da­men­tals cur­ricu­lum and application

richard_ngo20 Oct 2021 21:45 UTC
123 points
20 comments8 min readEA link
(docs.google.com)

2019 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks19 Dec 2019 2:58 UTC
147 points
28 comments64 min readEA link

2018 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks18 Dec 2018 4:48 UTC
118 points
28 comments64 min readEA link

Why AI al­ign­ment could be hard with mod­ern deep learning

Ajeya21 Sep 2021 15:35 UTC
140 points
16 comments14 min readEA link
(www.cold-takes.com)

Ben Garfinkel: How sure are we about this AI stuff?

bgarfinkel9 Feb 2019 19:17 UTC
123 points
19 comments18 min readEA link

Align­ing the Align­ers: En­sur­ing Aligned AI acts for the com­mon good of all mankind

timunderwood16 Jan 2023 11:13 UTC
33 points
2 comments4 min readEA link

Coun­ter­ar­gu­ments to the ba­sic AI risk case

Katja_Grace14 Oct 2022 20:30 UTC
276 points
23 comments34 min readEA link

AI X-Risk: In­te­grat­ing on the Shoulders of Giants

TD_Pilditch1 Nov 2022 16:07 UTC
34 points
0 comments47 min readEA link

On how var­i­ous plans miss the hard bits of the al­ign­ment challenge

So8res12 Jul 2022 5:35 UTC
125 points
13 comments27 min readEA link

[linkpost] Chris­ti­ano on agree­ment/​dis­agree­ment with Yud­kowsky’s “List of Lethal­ities”

Owen Cotton-Barratt19 Jun 2022 22:47 UTC
130 points
1 comment1 min readEA link
(www.lesswrong.com)

A tale of 2.5 or­thog­o­nal­ity theses

Arepo1 May 2022 13:53 UTC
138 points
31 comments15 min readEA link

Biolog­i­cal An­chors ex­ter­nal re­view by Jen­nifer Lin (linkpost)

peterhartree30 Nov 2022 13:06 UTC
36 points
0 comments1 min readEA link
(docs.google.com)

Disagree­ments about Align­ment: Why, and how, we should try to solve them

ojorgensen8 Aug 2022 22:32 UTC
16 points
6 comments16 min readEA link

There are no co­her­ence theorems

EJT20 Feb 2023 21:52 UTC
83 points
49 comments19 min readEA link

AGI x-risk timelines: 10% chance (by year X) es­ti­mates should be the head­line, not 50%.

Greg_Colbourn1 Mar 2022 12:02 UTC
69 points
22 comments1 min readEA link

[Question] Is AI safety still ne­glected?

Coafos30 Mar 2022 9:09 UTC
13 points
14 comments1 min readEA link

Video and Tran­script of Pre­sen­ta­tion on Ex­is­ten­tial Risk from Power-Seek­ing AI

Joe_Carlsmith8 May 2022 3:52 UTC
96 points
7 comments30 min readEA link

Me­diocre AI safety as ex­is­ten­tial risk

Gavin16 Mar 2022 11:50 UTC
52 points
12 comments3 min readEA link

AGI and the EMH: mar­kets are not ex­pect­ing al­igned or un­al­igned AI in the next 30 years

basil.halperin10 Jan 2023 16:05 UTC
334 points
172 comments26 min readEA link

On Defer­ence and Yud­kowsky’s AI Risk Estimates

bgarfinkel19 Jun 2022 14:35 UTC
262 points
188 comments17 min readEA link

(Even) More Early-Ca­reer EAs Should Try AI Safety Tech­ni­cal Research

levin30 Jun 2022 21:14 UTC
86 points
38 comments11 min readEA link

The next decades might be wild

mariushobbhahn15 Dec 2022 16:10 UTC
130 points
31 comments1 min readEA link

How many peo­ple are work­ing (di­rectly) on re­duc­ing ex­is­ten­tial risk from AI?

Benjamin Hilton17 Jan 2023 14:03 UTC
117 points
3 comments4 min readEA link
(80000hours.org)

Messy per­sonal stuff that af­fected my cause pri­ori­ti­za­tion (or: how I started to care about AI safety)

Julia_Wise5 May 2022 17:59 UTC
262 points
14 comments2 min readEA link

The case for be­com­ing a black-box in­ves­ti­ga­tor of lan­guage models

Buck6 May 2022 14:37 UTC
89 points
7 comments3 min readEA link

How I Formed My Own Views About AI Safety

Neel Nanda27 Feb 2022 18:52 UTC
130 points
12 comments13 min readEA link
(www.neelnanda.io)

What suc­cess looks like

mariushobbhahn28 Jun 2022 14:30 UTC
107 points
20 comments19 min readEA link

My highly per­sonal skep­ti­cism brain­dump on ex­is­ten­tial risk from ar­tifi­cial in­tel­li­gence.

NunoSempere23 Jan 2023 20:08 UTC
432 points
115 comments14 min readEA link
(nunosempere.com)

Four rea­sons I find AI safety emo­tion­ally compelling

Kat Woods28 Jun 2022 14:01 UTC
32 points
5 comments4 min readEA link

Large Lan­guage Models as Fi­du­cia­ries to Humans

johnjnay24 Jan 2023 19:53 UTC
25 points
0 comments34 min readEA link
(papers.ssrn.com)

On pre­sent­ing the case for AI risk

Aryeh Englander8 Mar 2022 21:37 UTC
114 points
12 comments4 min readEA link

Disen­tan­gling ar­gu­ments for the im­por­tance of AI safety

richard_ngo23 Jan 2019 14:58 UTC
63 points
14 comments8 min readEA link

My per­sonal cruxes for work­ing on AI safety

Buck13 Feb 2020 7:11 UTC
135 points
35 comments45 min readEA link

Blake Richards on Why he is Skep­ti­cal of Ex­is­ten­tial Risk from AI

Michaël Trazzi14 Jun 2022 19:11 UTC
63 points
14 comments4 min readEA link
(theinsideview.ai)

Part 1: The AI Safety com­mu­nity has four main work groups, Strat­egy, Gover­nance, Tech­ni­cal and Move­ment Building

PeterSlattery25 Nov 2022 3:45 UTC
72 points
7 comments6 min readEA link

Graph­i­cal Rep­re­sen­ta­tions of Paul Chris­ti­ano’s Doom Model

Nathan Young7 May 2023 13:03 UTC
47 points
2 comments1 min readEA link

AI safety uni­ver­sity groups: a promis­ing op­por­tu­nity to re­duce ex­is­ten­tial risk

mic30 Jun 2022 18:37 UTC
50 points
1 comment11 min readEA link

A grand strat­egy to re­cruit AI ca­pa­bil­ities re­searchers into AI safety research

Peter S. Park15 Apr 2022 17:11 UTC
22 points
13 comments4 min readEA link

Con­nor Leahy on Con­jec­ture and Dy­ing with Dignity

Michaël Trazzi22 Jul 2022 19:30 UTC
34 points
0 comments10 min readEA link
(theinsideview.ai)

Why AGI Timeline Re­search/​Dis­course Might Be Overrated

Miles_Brundage3 Jul 2022 8:04 UTC
115 points
30 comments10 min readEA link

My thoughts on nan­otech­nol­ogy strat­egy re­search as an EA cause area

Ben Snodin2 May 2022 9:41 UTC
135 points
17 comments33 min readEA link

Slow­ing down AI progress is an un­der­ex­plored al­ign­ment strategy

Michael Huang13 Jul 2022 3:22 UTC
89 points
11 comments3 min readEA link
(www.lesswrong.com)

Why poli­cy­mak­ers should be­ware claims of new “arms races” (Bul­letin of the Atomic Scien­tists)

christian.r14 Jul 2022 13:38 UTC
55 points
1 comment1 min readEA link
(thebulletin.org)

Data col­lec­tion for AI al­ign­ment—Ca­reer review

Benjamin Hilton3 Jun 2022 11:44 UTC
34 points
1 comment5 min readEA link
(80000hours.org)

Eli’s re­view of “Is power-seek­ing AI an ex­is­ten­tial risk?”

elifland30 Sep 2022 12:21 UTC
58 points
3 comments1 min readEA link

Com­mon mis­con­cep­tions about OpenAI

Jacob_Hilton25 Aug 2022 14:08 UTC
51 points
2 comments1 min readEA link
(www.lesswrong.com)

[Question] How will the world re­spond to “AI x-risk warn­ing shots” ac­cord­ing to refer­ence class fore­cast­ing?

Ryan Kidd18 Apr 2022 9:10 UTC
18 points
1 comment1 min readEA link

There should be an AI safety pro­ject board

mariushobbhahn14 Mar 2022 16:08 UTC
24 points
3 comments1 min readEA link

[linkpost] “What Are Rea­son­able AI Fears?” by Robin Han­son, 2023-04-23

Arjun Panickssery14 Apr 2023 23:26 UTC
40 points
3 comments4 min readEA link
(quillette.com)

Public Ex­plainer on AI as an Ex­is­ten­tial Risk

AndrewDoris7 Oct 2022 19:23 UTC
13 points
4 comments15 min readEA link

Why EAs are skep­ti­cal about AI Safety

Lukas Trötzmüller18 Jul 2022 19:01 UTC
279 points
31 comments30 min readEA link

Vael Gates: Risks from Ad­vanced AI (June 2022)

Vael Gates14 Jun 2022 0:49 UTC
45 points
5 comments30 min readEA link

The ba­sic rea­sons I ex­pect AGI ruin

RobBensinger18 Apr 2023 3:37 UTC
55 points
13 comments1 min readEA link

Ex­pected eth­i­cal value of a ca­reer in AI safety

Jordan Taylor14 Jun 2022 14:25 UTC
36 points
16 comments11 min readEA link

AI safety starter pack

mariushobbhahn28 Mar 2022 16:05 UTC
120 points
11 comments6 min readEA link

Grokking “Semi-in­for­ma­tive pri­ors over AI timelines”

anson12 Jun 2022 22:15 UTC
60 points
1 comment11 min readEA link

[Question] How would a lan­guage model be­come goal-di­rected?

David Mears16 Jul 2022 14:50 UTC
113 points
19 comments1 min readEA link

How to pur­sue a ca­reer in tech­ni­cal AI alignment

CharlieRS4 Jun 2022 21:36 UTC
247 points
8 comments39 min readEA link

New US Se­nate Bill on X-Risk Miti­ga­tion [Linkpost]

Evan R. Murphy4 Jul 2022 1:28 UTC
22 points
12 comments1 min readEA link
(www.hsgac.senate.gov)

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

frances_lorenz16 May 2022 21:19 UTC
127 points
13 comments6 min readEA link

What if we don’t need a “Hard Left Turn” to reach AGI?

Eigengender15 Jul 2022 9:49 UTC
39 points
7 comments4 min readEA link

“The Race to the End of Hu­man­ity” – Struc­tural Uncer­tainty Anal­y­sis in AI Risk Models

Froolow19 May 2023 12:03 UTC
36 points
3 comments21 min readEA link

‘Dis­solv­ing’ AI Risk – Pa­ram­e­ter Uncer­tainty in AI Fu­ture Forecasting

Froolow18 Oct 2022 22:54 UTC
105 points
63 comments39 min readEA link

“Tech com­pany sin­gu­lar­i­ties”, and steer­ing them to re­duce x-risk

Andrew Critch13 May 2022 17:26 UTC
51 points
5 comments4 min readEA link

How we could stum­ble into AI catastrophe

Holden Karnofsky16 Jan 2023 14:52 UTC
78 points
0 comments31 min readEA link
(www.cold-takes.com)

Tips for con­duct­ing wor­ld­view investigations

lukeprog12 Apr 2022 19:28 UTC
80 points
4 comments2 min readEA link

NYT: Google will ‘re­cal­ibrate’ the risk of re­leas­ing AI due to com­pe­ti­tion with OpenAI

Michael Huang22 Jan 2023 2:13 UTC
168 points
8 comments1 min readEA link
(www.nytimes.com)

Samotsvety’s AI risk forecasts

elifland9 Sep 2022 4:01 UTC
170 points
30 comments3 min readEA link

Ex­plor­ing Me­tac­u­lus’s AI Track Record

Peter Scoblic1 May 2023 21:02 UTC
41 points
5 comments7 min readEA link
(www.metaculus.com)

Longevity re­search as AI X-risk intervention

DirectedEvolution6 Nov 2022 17:58 UTC
25 points
0 comments9 min readEA link

Deep Deceptiveness

So8res21 Mar 2023 2:51 UTC
38 points
1 comment1 min readEA link

AGI mis­al­ign­ment x-risk may be lower due to an over­looked goal speci­fi­ca­tion technology

johnjnay21 Oct 2022 2:03 UTC
20 points
1 comment1 min readEA link

AI timelines by bio an­chors: The de­bate in one place

Will Aldred30 Jul 2022 23:04 UTC
89 points
6 comments2 min readEA link

An­nounc­ing The Most Im­por­tant Cen­tury Writ­ing Prize

michel31 Oct 2022 21:37 UTC
46 points
0 comments2 min readEA link

My Ob­jec­tions to “We’re All Gonna Die with Eliezer Yud­kowsky”

Quintin Pope21 Mar 2023 1:23 UTC
172 points
17 comments39 min readEA link

Tran­scripts of in­ter­views with AI researchers

Vael Gates9 May 2022 6:03 UTC
140 points
14 comments2 min readEA link

An­nounc­ing Epoch: A re­search or­ga­ni­za­tion in­ves­ti­gat­ing the road to Trans­for­ma­tive AI

Jaime Sevilla27 Jun 2022 13:39 UTC
183 points
11 comments2 min readEA link
(epochai.org)

The in­or­di­nately slow spread of good AGI con­ver­sa­tions in ML

RobBensinger29 Jun 2022 4:02 UTC
59 points
2 comments6 min readEA link

Po­ten­tial Risks from Ad­vanced Ar­tifi­cial In­tel­li­gence: The Philan­thropic Opportunity

Holden Karnofsky6 May 2016 12:55 UTC
2 points
0 comments23 min readEA link
(www.openphilanthropy.org)

Steer­ing AI to care for an­i­mals, and soon

Andrew Critch14 Jun 2022 1:13 UTC
207 points
38 comments1 min readEA link

Strate­gic Per­spec­tives on Trans­for­ma­tive AI Gover­nance: Introduction

MMMaas2 Jul 2022 11:20 UTC
107 points
18 comments4 min readEA link

England & Wales & Windfalls

John Bridge3 Jun 2022 10:26 UTC
13 points
1 comment26 min readEA link

How might we al­ign trans­for­ma­tive AI if it’s de­vel­oped very soon?

Holden Karnofsky29 Aug 2022 15:48 UTC
156 points
17 comments44 min readEA link

We are fight­ing a shared bat­tle (a call for a differ­ent ap­proach to AI Strat­egy)

Gideon Futerman16 Mar 2023 14:37 UTC
57 points
11 comments15 min readEA link

In­tent al­ign­ment should not be the goal for AGI x-risk reduction

johnjnay26 Oct 2022 1:24 UTC
7 points
1 comment1 min readEA link

Skill up in ML for AI safety with the In­tro to ML Safety course (Spring 2023)

james5 Jan 2023 11:02 UTC
36 points
3 comments2 min readEA link

Trans­for­ma­tive AI is­sues (not just mis­al­ign­ment): an overview

Holden Karnofsky6 Jan 2023 2:19 UTC
31 points
0 comments22 min readEA link
(www.cold-takes.com)

AI Safety Camp, Vir­tual Edi­tion 2023

Linda Linsefors6 Jan 2023 0:55 UTC
30 points
0 comments1 min readEA link

[Linkpost] Jan Leike on three kinds of al­ign­ment taxes

Akash6 Jan 2023 23:57 UTC
29 points
0 comments1 min readEA link

Some thoughts on risks from nar­row, non-agen­tic AI

richard_ngo19 Jan 2021 0:07 UTC
36 points
2 comments8 min readEA link

We should say more than “x-risk is high”

OllieBase16 Dec 2022 22:09 UTC
49 points
12 comments4 min readEA link

Ex­is­ten­tial AI Safety is NOT sep­a­rate from near-term applications

stecas13 Dec 2022 14:47 UTC
28 points
9 comments1 min readEA link

AGI Safety Needs Peo­ple With All Skil­lsets!

Severin25 Jul 2022 13:30 UTC
33 points
7 comments2 min readEA link

Is this com­mu­nity over-em­pha­siz­ing AI al­ign­ment?

Lixiang8 Jan 2023 6:23 UTC
2 points
5 comments1 min readEA link

Nearcast-based “de­ploy­ment prob­lem” anal­y­sis (Karnofsky, 2022)

Will Aldred9 Jan 2023 16:57 UTC
36 points
0 comments4 min readEA link
(www.alignmentforum.org)

An­nounc­ing the GovAI Policy Team

MarkusAnderljung1 Aug 2022 22:46 UTC
107 points
11 comments2 min readEA link

[Question] How to cre­ate cur­ricu­lum for self-study to­wards AI al­ign­ment work?

OIUJHKDFS7 Jan 2023 19:53 UTC
10 points
5 comments1 min readEA link

12 ca­reer ad­vis­ing ques­tions that may (or may not) be helpful for peo­ple in­ter­ested in al­ign­ment research

Akash12 Dec 2022 22:36 UTC
14 points
0 comments1 min readEA link

8 pos­si­ble high-level goals for work on nu­clear risk

MichaelA29 Mar 2022 6:30 UTC
46 points
4 comments13 min readEA link

Sort­ing Peb­bles Into Cor­rect Heaps: The Animation

Writer10 Jan 2023 15:58 UTC
12 points
0 comments1 min readEA link

Against us­ing stock prices to fore­cast AI timelines

basil.halperin10 Jan 2023 16:04 UTC
22 points
4 comments2 min readEA link

En­cul­tured AI, Part 1: En­abling New Benchmarks

Andrew Critch8 Aug 2022 22:49 UTC
17 points
0 comments5 min readEA link

Have your timelines changed as a re­sult of ChatGPT?

Chris Leong5 Dec 2022 15:03 UTC
30 points
18 comments1 min readEA link

We don’t trade with ants

Katja_Grace12 Jan 2023 0:48 UTC
134 points
8 comments1 min readEA link

VIRTUA: a novel about AI alignment

Karl von Wendt12 Jan 2023 9:37 UTC
21 points
0 comments1 min readEA link

Vic­to­ria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi12 Jan 2023 17:09 UTC
16 points
0 comments1 min readEA link

Be­ware safety-washing

Lizka13 Jan 2023 10:39 UTC
128 points
6 comments4 min readEA link

Rea­sons I’ve been hes­i­tant about high lev­els of near-ish AI risk

elifland22 Jul 2022 1:32 UTC
202 points
16 comments7 min readEA link
(www.foxy-scout.com)

The aca­demic con­tri­bu­tion to AI safety seems large

Gavin30 Jul 2020 10:30 UTC
117 points
28 comments9 min readEA link

Soft­ware en­g­ineer­ing—Ca­reer review

Benjamin Hilton8 Feb 2022 6:11 UTC
92 points
19 comments8 min readEA link
(80000hours.org)

Can GPT-3 pro­duce new ideas? Par­tially au­tomat­ing Robin Han­son and others

NunoSempere16 Jan 2023 15:05 UTC
82 points
6 comments10 min readEA link

The Parable of the Boy Who Cried 5% Chance of Wolf

Kat Woods15 Aug 2022 14:22 UTC
75 points
8 comments2 min readEA link

A Bare­bones Guide to Mechanis­tic In­ter­pretabil­ity Prerequisites

Neel Nanda29 Nov 2022 18:43 UTC
50 points
1 comment3 min readEA link
(neelnanda.io)

AGISF adap­ta­tion for in-per­son groups

Sam Marks17 Jan 2023 18:33 UTC
30 points
0 comments3 min readEA link
(www.lesswrong.com)

Col­lin Burns on Align­ment Re­search And Dis­cov­er­ing La­tent Knowl­edge Without Supervision

Michaël Trazzi17 Jan 2023 17:21 UTC
21 points
3 comments1 min readEA link

AMA: Ought

stuhlmueller3 Aug 2022 17:24 UTC
41 points
52 comments1 min readEA link

How I failed to form views on AI safety

Ada-Maaria Hyvärinen17 Apr 2022 11:05 UTC
207 points
71 comments40 min readEA link

List of tech­ni­cal AI safety ex­er­cises and projects

Jakub Kraus19 Jan 2023 9:35 UTC
15 points
0 comments1 min readEA link

Sup­ple­ment to “The Brus­sels Effect and AI: How EU AI reg­u­la­tion will im­pact the global AI mar­ket”

MarkusAnderljung16 Aug 2022 20:55 UTC
107 points
7 comments8 min readEA link

Hereti­cal Thoughts on AI | Eli Dourado

𝕮𝖎𝖓𝖊𝖗𝖆19 Jan 2023 16:11 UTC
137 points
15 comments1 min readEA link

Raphaël Millière on the Limits of Deep Learn­ing and AI x-risk skepticism

Michaël Trazzi24 Jun 2022 18:33 UTC
20 points
0 comments4 min readEA link
(theinsideview.ai)

What’s go­ing on with ‘crunch time’?

rosehadshar20 Jan 2023 9:38 UTC
83 points
5 comments4 min readEA link

[TIME mag­a­z­ine] Deep­Mind’s CEO Helped Take AI Main­stream. Now He’s Urg­ing Cau­tion (Per­rigo, 2023)

Will Aldred20 Jan 2023 20:37 UTC
93 points
0 comments1 min readEA link
(time.com)

Gen­eral vs spe­cific ar­gu­ments for the longter­mist im­por­tance of shap­ing AI development

Sam Clarke15 Oct 2021 14:43 UTC
44 points
7 comments2 min readEA link

The right to pro­tec­tion from catas­trophic AI risk

Jack Cunningham9 Apr 2022 23:11 UTC
11 points
0 comments5 min readEA link

Spread­ing mes­sages to help with the most im­por­tant century

Holden Karnofsky25 Jan 2023 20:35 UTC
122 points
20 comments18 min readEA link
(www.cold-takes.com)

Ex­cerpts from “Do­ing EA Bet­ter” on x-risk methodology

BrownHairedEevee26 Jan 2023 1:04 UTC
19 points
5 comments6 min readEA link
(forum.effectivealtruism.org)

AI Risk Man­age­ment Frame­work | NIST

𝕮𝖎𝖓𝖊𝖗𝖆26 Jan 2023 15:27 UTC
50 points
0 comments1 min readEA link

The AI Mes­siah

ryancbriggs5 May 2022 16:58 UTC
69 points
44 comments2 min readEA link

Restrict­ing brain organoid re­search to slow down AGI

freedomandutility9 Nov 2022 13:01 UTC
8 points
2 comments1 min readEA link

The case for tak­ing AI se­ri­ously as a threat to humanity

EA Handbook10 Nov 2020 0:00 UTC
10 points
0 comments1 min readEA link
(www.vox.com)

Tech­nolog­i­cal de­vel­op­ments that could in­crease risks from nu­clear weapons: A shal­low review

MichaelA9 Feb 2023 15:41 UTC
79 points
3 comments5 min readEA link
(bit.ly)

Fore­sight for AGI Safety Strategy

jacquesthibs5 Dec 2022 16:09 UTC
6 points
1 comment1 min readEA link

What is it like do­ing AI safety work?

Kat Woods21 Feb 2023 19:24 UTC
95 points
2 comments10 min readEA link

Jobs that can help with the most im­por­tant century

Holden Karnofsky12 Feb 2023 18:19 UTC
52 points
2 comments32 min readEA link
(www.cold-takes.com)

Pod­cast: Shoshan­nah Tekofsky on skil­ling up in AI safety, vis­it­ing Berkeley, and de­vel­op­ing novel re­search ideas

Akash25 Nov 2022 20:47 UTC
14 points
0 comments1 min readEA link

AGI in sight: our look at the game board

Andrea_Miotti18 Feb 2023 22:17 UTC
30 points
18 comments1 min readEA link

[MLSN #8]: Mechanis­tic in­ter­pretabil­ity, us­ing law to in­form AI al­ign­ment, scal­ing laws for proxy gaming

ThomasW20 Feb 2023 16:06 UTC
25 points
0 comments4 min readEA link
(newsletter.mlsafety.org)

Les­sons from Three Mile Is­land for AI Warn­ing Shots

NickGabs26 Sep 2022 2:47 UTC
42 points
0 comments12 min readEA link

A Cal­ifor­nia Effect for Ar­tifi­cial Intelligence

henryj9 Sep 2022 14:17 UTC
73 points
2 comments4 min readEA link
(docs.google.com)

AI Risk In­tro 1: Ad­vanced AI Might Be Very Bad

LRudL11 Sep 2022 10:57 UTC
22 points
0 comments30 min readEA link

AI strat­egy nearcasting

Holden Karnofsky26 Aug 2022 16:25 UTC
61 points
3 comments9 min readEA link

Com­mu­nity Build­ing for Grad­u­ate Stu­dents: A Tar­geted Approach

Neil Crawford29 Mar 2022 19:47 UTC
13 points
0 comments3 min readEA link

Ap­ply for the ML Win­ter Camp in Cam­bridge, UK [2-10 Jan]

Nathan_Barnard2 Dec 2022 19:33 UTC
50 points
11 comments2 min readEA link

AGI with feelings

Nicolai Meberg7 Dec 2022 16:00 UTC
−13 points
0 comments1 min readEA link
(twitter.com)

An­nounc­ing the Open Philan­thropy AI Wor­ld­views Contest

Jason Schukraft10 Mar 2023 2:33 UTC
139 points
33 comments3 min readEA link
(www.openphilanthropy.org)

A ten­ta­tive di­alogue with a Friendly-boxed-su­per-AGI on brain uploads

Ramiro12 May 2022 21:55 UTC
5 points
0 comments4 min readEA link

[Link post] How plau­si­ble are AI Takeover sce­nar­ios?

SammyDMartin27 Sep 2021 13:03 UTC
26 points
0 comments1 min readEA link

How bad a fu­ture do ML re­searchers ex­pect?

Katja_Grace13 Mar 2023 5:47 UTC
164 points
20 comments1 min readEA link

[Question] I’m interviewing Nova Das Sarma about AI safety and information security. What should I ask her?

Robert_Wiblin25 Mar 2022 15:38 UTC
17 points
14 comments1 min readEA link

“Aligned with who?” Re­sults of sur­vey­ing 1,000 US par­ti­ci­pants on AI values

Holly Morgan21 Mar 2023 22:07 UTC
40 points
0 comments2 min readEA link
(www.lesswrong.com)

[Question] Are there any AI Safety labs that will hire self-taught ML en­g­ineers?

Tomer_Goloboy6 Apr 2022 23:32 UTC
5 points
12 comments1 min readEA link

Con­ti­nu­ity Assumptions

Jan_Kulveit13 Jun 2022 21:36 UTC
42 points
4 comments4 min readEA link
(www.alignmentforum.org)

AI and Evolution

Dan H30 Mar 2023 13:09 UTC
41 points
1 comment2 min readEA link
(arxiv.org)

Re­sults for a sur­vey of tool use and work­flows in al­ign­ment research

jacquesthibs19 Dec 2022 15:19 UTC
29 points
0 comments1 min readEA link

Two con­trast­ing mod­els of “in­tel­li­gence” and fu­ture growth

Magnus Vinding24 Nov 2022 11:54 UTC
63 points
29 comments29 min readEA link

Wi­den­ing Over­ton Win­dow—Open Thread

Prometheus31 Mar 2023 10:06 UTC
12 points
5 comments1 min readEA link
(www.lesswrong.com)

[Question] If FTX is liqui­dated, who ends up con­trol­ling An­thropic?

Ofer15 Nov 2022 15:04 UTC
63 points
8 comments1 min readEA link

Refine: An In­cu­ba­tor for Con­cep­tual Align­ment Re­search Bets

adamShimi15 Apr 2022 8:59 UTC
47 points
0 comments4 min readEA link

It’s OK not to go into AI (for stu­dents)

ruthgrace14 Jul 2022 15:16 UTC
59 points
18 comments2 min readEA link

A con­cern about the “evolu­tion­ary an­chor” of Ajeya Co­tra’s re­port on AI timelines.

NunoSempere16 Aug 2022 14:44 UTC
75 points
43 comments5 min readEA link
(nunosempere.com)

Poster Ses­sion on AI Safety

Neil Crawford12 Nov 2022 3:50 UTC
8 points
0 comments4 min readEA link

Mis­gen­er­al­iza­tion as a misnomer

So8res6 Apr 2023 20:43 UTC
45 points
0 comments1 min readEA link

New sur­vey: 46% of Amer­i­cans are con­cerned about ex­tinc­tion from AI; 69% sup­port a six-month pause in AI development

Akash5 Apr 2023 1:26 UTC
138 points
33 comments1 min readEA link

Read­ing the ethi­cists 2: Hunt­ing for AI al­ign­ment papers

Charlie Steiner6 Jun 2022 15:53 UTC
9 points
0 comments1 min readEA link
(www.lesswrong.com)

How dath ilan co­or­di­nates around solv­ing AI alignment

Thomas Kwa14 Apr 2022 1:53 UTC
12 points
1 comment5 min readEA link

Let’s think about slow­ing down AI

Katja_Grace23 Dec 2022 19:56 UTC
320 points
7 comments1 min readEA link

You Un­der­stand AI Align­ment and How to Make Soup

Leen Armoush28 May 2022 6:22 UTC
0 points
2 comments5 min readEA link

Ap­ply to the Ma­chine Learn­ing For Good boot­camp in France

Alexandre Variengien17 Jun 2022 9:13 UTC
9 points
0 comments1 min readEA link
(www.lesswrong.com)

FLI re­port: Poli­cy­mak­ing in the Pause

Zach Stein-Perlman15 Apr 2023 17:01 UTC
28 points
4 comments1 min readEA link

In­for­ma­tion in risky tech­nol­ogy races

nemeryxu2 Aug 2022 23:35 UTC
15 points
2 comments3 min readEA link

In­ter­gen­er­a­tional trauma im­ped­ing co­op­er­a­tive ex­is­ten­tial safety efforts

Andrew Critch3 Jun 2022 17:27 UTC
82 points
2 comments3 min readEA link

An­nounc­ing the AI Safety Nudge Com­pe­ti­tion to Help Beat Procrastination

Marc Carauleanu1 Oct 2022 1:49 UTC
24 points
1 comment2 min readEA link

[Question] How/​When Should One In­tro­duce AI Risk Ar­gu­ments to Peo­ple Un­fa­mil­iar With the Idea?

Harrison Durland9 Aug 2022 2:57 UTC
12 points
4 comments1 min readEA link

AGI Bat­tle Royale: Why “slow takeover” sce­nar­ios de­volve into a chaotic multi-AGI fight to the death

titotal22 Sep 2022 15:00 UTC
36 points
9 comments15 min readEA link

Per­sua­sion Tools: AI takeover with­out AGI or agency?

kokotajlod20 Nov 2020 16:56 UTC
15 points
5 comments10 min readEA link

Disagree­ment with bio an­chors that lead to shorter timelines

mariushobbhahn16 Nov 2022 14:40 UTC
80 points
1 comment1 min readEA link

Black Box In­ves­ti­ga­tions Re­search Hackathon

Esben Kran15 Sep 2022 10:09 UTC
23 points
0 comments2 min readEA link

[Question] What are the best ideas of how to reg­u­late AI from the US ex­ec­u­tive branch?

Jack Cunningham2 Apr 2022 21:53 UTC
10 points
0 comments1 min readEA link

In­tro­duc­tion to Prag­matic AI Safety [Prag­matic AI Safety #1]

ThomasW9 May 2022 17:02 UTC
68 points
0 comments6 min readEA link

[Linkpost] ‘The God­father of A.I.’ Leaves Google and Warns of Danger Ahead

Darius11 May 2023 19:54 UTC
42 points
3 comments3 min readEA link
(www.nytimes.com)

Why aren’t more of us work­ing to pre­vent AI hell?

Dawn Drescher4 May 2023 17:47 UTC
63 points
41 comments1 min readEA link

[Question] What to in­clude in a guest lec­ture on ex­is­ten­tial risks from AI?

Aryeh Englander13 Apr 2022 17:06 UTC
6 points
3 comments1 min readEA link

Chain­ing the evil ge­nie: why “outer” AI safety is prob­a­bly easy

titotal30 Aug 2022 13:55 UTC
20 points
11 comments10 min readEA link

AI Risk & Policy Fore­casts from Me­tac­u­lus & FLI’s AI Path­ways Workshop

Will Aldred16 May 2023 8:53 UTC
40 points
0 comments8 min readEA link

Slightly against al­ign­ing with neo-luddites

Matthew_Barnett26 Dec 2022 23:27 UTC
70 points
17 comments4 min readEA link

Fu­ture Mat­ters #4: AI timelines, AGI risk, and ex­is­ten­tial risk from cli­mate change

Pablo8 Aug 2022 11:00 UTC
59 points
0 comments17 min readEA link

Anal­y­sis of AI Safety sur­veys for field-build­ing insights

Ash Jafari5 Dec 2022 17:37 UTC
24 points
7 comments5 min readEA link

Bandgaps, Brains, and Bioweapons: The limi­ta­tions of com­pu­ta­tional sci­ence and what it means for AGI

titotal26 May 2023 15:57 UTC
38 points
0 comments18 min readEA link

Con­crete Steps to Get Started in Trans­former Mechanis­tic Interpretability

Neel Nanda26 Dec 2022 13:00 UTC
18 points
0 comments12 min readEA link

New book on s-risks

Tobias_Baumann26 Oct 2022 12:04 UTC
289 points
27 comments1 min readEA link

NIST AI Risk Man­age­ment Frame­work re­quest for in­for­ma­tion (RFI)

Aryeh Englander31 Aug 2021 22:24 UTC
7 points
0 comments2 min readEA link

A pseudo math­e­mat­i­cal for­mu­la­tion of di­rect work choice be­tween two x-risks

Joseph Bloom11 Aug 2022 0:28 UTC
7 points
0 comments4 min readEA link

AI Safety Seems Hard to Measure

Holden Karnofsky11 Dec 2022 1:31 UTC
89 points
2 comments14 min readEA link
(www.cold-takes.com)

The Ri­val AI De­ploy­ment Prob­lem: a Pre-de­ploy­ment Agree­ment as the least-bad response

HaydnBelfield23 Sep 2022 9:28 UTC
38 points
1 comment13 min readEA link

Re­view: What We Owe The Future

Kelsey Piper21 Nov 2022 21:41 UTC
165 points
3 comments1 min readEA link
(asteriskmag.com)

20 Cri­tiques of AI Safety That I Found on Twitter

Daniel Kirmani23 Jun 2022 15:11 UTC
14 points
13 comments1 min readEA link

Grokking “Fore­cast­ing TAI with biolog­i­cal an­chors”

anson6 Jun 2022 18:56 UTC
43 points
0 comments12 min readEA link

Red­wood Re­search is hiring for sev­eral roles (Oper­a­tions and Tech­ni­cal)

JJXWang14 Apr 2022 15:23 UTC
45 points
0 comments1 min readEA link

Open Prob­lems in AI X-Risk [PAIS #5]

ThomasW10 Jun 2022 2:22 UTC
44 points
1 comment36 min readEA link

Warn­ing Shots Prob­a­bly Wouldn’t Change The Pic­ture Much

So8res6 Oct 2022 5:15 UTC
88 points
20 comments2 min readEA link

Seek­ing par­ti­ci­pants for study of AI safety researchers

Gardner14 Dec 2022 9:38 UTC
18 points
3 comments1 min readEA link

[Question] Re­spon­si­ble/​fair AI vs. benefi­cial/​safe AI?

tae2 Jun 2022 19:37 UTC
6 points
10 comments1 min readEA link

[$20K In Prizes] AI Safety Ar­gu­ments Competition

ThomasW26 Apr 2022 16:21 UTC
71 points
134 comments3 min readEA link

Con­crete Ad­vice for Form­ing In­side Views on AI Safety

Neel Nanda17 Aug 2022 23:26 UTC
57 points
4 comments9 min readEA link
(www.alignmentforum.org)

Yud­kowsky and Chris­ti­ano on AI Take­off Speeds [LINKPOST]

aogara5 Apr 2022 0:57 UTC
15 points
0 comments11 min readEA link

[Question] Why not offer a multi-mil­lion /​ billion dol­lar prize for solv­ing the Align­ment Prob­lem?

Aryeh Englander17 Apr 2022 16:08 UTC
15 points
9 comments1 min readEA link

Im­por­tant, ac­tion­able re­search ques­tions for the most im­por­tant century

Holden Karnofsky24 Feb 2022 16:34 UTC
288 points
15 comments19 min readEA link

Re­views of “Is power-seek­ing AI an ex­is­ten­tial risk?”

Joe_Carlsmith16 Dec 2021 20:50 UTC
69 points
4 comments1 min readEA link

How likely are ma­lign pri­ors over ob­jec­tives? [aborted WIP]

David Johnston11 Nov 2022 6:03 UTC
6 points
0 comments1 min readEA link

Collection of work on ‘Should you focus on the EU if you’re interested in AI governance for longtermist/x-risk reasons?’

MichaelA6 Aug 2022 16:49 UTC
40 points
1 comment1 min readEA link

Why does no one care about AI?

Olivia Addy7 Aug 2022 22:04 UTC
55 points
47 comments1 min readEA link

Con­crete ac­tions to im­prove AI gov­er­nance: the be­havi­our sci­ence approach

AlexanderSaeri1 Dec 2022 21:34 UTC
31 points
0 comments11 min readEA link

AGI ruin sce­nar­ios are likely (and dis­junc­tive)

So8res27 Jul 2022 3:24 UTC
54 points
5 comments6 min readEA link

[Question] Is trans­for­ma­tive AI the biggest ex­is­ten­tial risk? Why or why not?

BrownHairedEevee5 Mar 2022 3:54 UTC
9 points
11 comments1 min readEA link

List #3: Why not to as­sume on prior that AGI-al­ign­ment workarounds are available

Remmelt24 Dec 2022 9:54 UTC
6 points
0 comments1 min readEA link

AI Safety Micro­grant Round

Chris Leong14 Nov 2022 4:25 UTC
81 points
1 comment3 min readEA link

Are al­ign­ment re­searchers de­vot­ing enough time to im­prov­ing their re­search ca­pac­ity?

Carson Jones4 Nov 2022 0:58 UTC
11 points
1 comment1 min readEA link

ML Safety Schol­ars Sum­mer 2022 Retrospective

ThomasW1 Nov 2022 3:09 UTC
56 points
2 comments21 min readEA link

Align­ment’s phlo­gis­ton

Eleni_A18 Aug 2022 1:41 UTC
18 points
1 comment2 min readEA link

Early-warn­ing Fore­cast­ing Cen­ter: What it is, and why it’d be cool

Linch14 Mar 2022 19:20 UTC
57 points
8 comments11 min readEA link

Up­date on Har­vard AI Safety Team and MIT AI Alignment

Xander Davies2 Dec 2022 6:09 UTC
70 points
3 comments1 min readEA link

The op­ti­mal timing of spend­ing on AGI safety work; why we should prob­a­bly be spend­ing more now

Tristan Cook24 Oct 2022 17:42 UTC
88 points
11 comments36 min readEA link

[Link] GCRI’s Seth Baum re­views The Precipice

Aryeh Englander6 Jun 2022 19:33 UTC
21 points
0 comments1 min readEA link

AGI Isn’t Close—Fu­ture Fund Wor­ld­view Prize

Toni MUENDEL18 Dec 2022 16:03 UTC
−8 points
24 comments13 min readEA link

De­cep­tion as the op­ti­mal: mesa-op­ti­miz­ers and in­ner al­ign­ment

Eleni_A16 Aug 2022 3:45 UTC
19 points
0 comments5 min readEA link

Is AI fore­cast­ing a waste of effort on the mar­gin?

Emrik5 Nov 2022 0:41 UTC
9 points
6 comments3 min readEA link

Spicy takes about AI policy (Clark, 2022)

Will Aldred9 Aug 2022 13:49 UTC
43 points
0 comments3 min readEA link
(twitter.com)

An­nounc­ing AI safety Men­tors and Mentees

mariushobbhahn23 Nov 2022 15:21 UTC
62 points
0 comments1 min readEA link

13 Very Differ­ent Stances on AGI

Ozzie Gooen27 Dec 2021 23:30 UTC
84 points
27 comments3 min readEA link

High-level hopes for AI alignment

Holden Karnofsky20 Dec 2022 2:11 UTC
118 points
14 comments19 min readEA link
(www.cold-takes.com)

[Question] What are the num­bers in mind for the su­per-short AGI timelines so many long-ter­mists are alarmed about?

Evan_Gaensbauer19 Apr 2022 21:09 UTC
41 points
2 comments1 min readEA link

[Question] What is the best source to ex­plain short AI timelines to a skep­ti­cal per­son?

trevor123 Nov 2022 5:20 UTC
2 points
3 comments1 min readEA link

In­tro­duc­ing the Fund for Align­ment Re­search (We’re Hiring!)

AdamGleave6 Jul 2022 2:00 UTC
74 points
3 comments4 min readEA link

AGI and Lock-In

Lukas_Finnveden29 Oct 2022 1:56 UTC
124 points
28 comments10 min readEA link
(docs.google.com)

What Should We Op­ti­mize—A Conversation

Johannes C. Mayer7 Apr 2022 14:48 UTC
1 point
0 comments15 min readEA link

In­for­ma­tion se­cu­rity con­sid­er­a­tions for AI and the long term future

Jeffrey Ladish2 May 2022 20:53 UTC
123 points
7 comments11 min readEA link

Ques­tions about AI that bother me

Eleni_A31 Jan 2023 6:50 UTC
33 points
6 comments2 min readEA link

[Question] Do EA folks want AGI at all?

Noah Scales16 Jul 2022 5:44 UTC
8 points
10 comments1 min readEA link

[Question] Which pos­si­ble AI im­pacts should re­ceive the most ad­di­tional at­ten­tion?

David Johnston31 May 2022 2:01 UTC
10 points
10 comments1 min readEA link

$20K in Boun­ties for AI Safety Public Materials

ThomasW5 Aug 2022 2:57 UTC
45 points
11 comments6 min readEA link

Dis­cussing how to al­ign Trans­for­ma­tive AI if it’s de­vel­oped very soon

elifland28 Nov 2022 16:17 UTC
36 points
0 comments1 min readEA link

Part 2: AI Safety Move­ment Builders should help the com­mu­nity to op­ti­mise three fac­tors: con­trib­u­tors, con­tri­bu­tions and coordination

PeterSlattery15 Dec 2022 22:48 UTC
34 points
0 comments6 min readEA link

A new­comer’s guide to the tech­ni­cal AI safety field

zeshen4 Nov 2022 14:29 UTC
12 points
0 comments1 min readEA link

Three pillars for avoid­ing AGI catas­tro­phe: Tech­ni­cal al­ign­ment, de­ploy­ment de­ci­sions, and co­or­di­na­tion

alexlintz3 Aug 2022 21:24 UTC
90 points
4 comments11 min readEA link

Call For Distillers

johnswentworth6 Apr 2022 3:03 UTC
69 points
6 comments3 min readEA link

My take on What We Owe the Future

elifland1 Sep 2022 18:07 UTC
351 points
51 comments26 min readEA link

Fu­ture Mat­ters #3: digi­tal sen­tience, AGI ruin, and fore­cast­ing track records

Pablo4 Jul 2022 17:44 UTC
70 points
2 comments19 min readEA link

Ap­pli­ca­tions open for AGI Safety Fun­da­men­tals: Align­ment Course

Jamie Bernardi13 Dec 2022 10:50 UTC
75 points
0 comments2 min readEA link

Fu­ture Mat­ters #5: su­per­vol­ca­noes, AI takeover, and What We Owe the Future

Pablo14 Sep 2022 13:02 UTC
31 points
5 comments18 min readEA link

Please provide feed­back on AI-safety grant pro­posal, thanks!

Alex Long11 Dec 2022 23:29 UTC
8 points
1 comment2 min readEA link

Race to the Top: Bench­marks for AI Safety

isaduan4 Dec 2022 22:50 UTC
51 points
8 comments1 min readEA link

The an­i­mals and hu­mans anal­ogy for AI risk

freedomandutility13 Aug 2022 15:35 UTC
5 points
2 comments1 min readEA link

Les­sons learned from talk­ing to >100 aca­demics about AI safety

mariushobbhahn10 Oct 2022 13:16 UTC
138 points
21 comments1 min readEA link

List #1: Why stop­ping the de­vel­op­ment of AGI is hard but doable

Remmelt24 Dec 2022 9:52 UTC
24 points
2 comments1 min readEA link

Differ­en­tial tech­nol­ogy de­vel­op­ment: preprint on the concept

Hamish_Hobbs12 Sep 2022 13:52 UTC
61 points
0 comments2 min readEA link

Hu­man­ity’s vast fu­ture and its im­pli­ca­tions for cause prioritization

BrownHairedEevee26 Jul 2022 5:04 UTC
35 points
3 comments4 min readEA link
(sunyshore.substack.com)

Key Papers in Lan­guage Model Safety

aogara20 Jun 2022 14:59 UTC
19 points
0 comments22 min readEA link

Fu­ture Mat­ters #6: FTX col­lapse, value lock-in, and coun­ter­ar­gu­ments to AI x-risk

Pablo30 Dec 2022 13:10 UTC
57 points
2 comments21 min readEA link

Prob­a­bly good pro­jects for the AI safety ecosystem

Ryan Kidd5 Dec 2022 3:24 UTC
20 points
0 comments1 min readEA link

Rac­ing through a minefield: the AI de­ploy­ment problem

Holden Karnofsky31 Dec 2022 21:44 UTC
74 points
1 comment13 min readEA link
(www.cold-takes.com)

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony Barrett14 Oct 2022 20:18 UTC
13 points
0 comments2 min readEA link

Pre-An­nounc­ing the 2023 Open Philan­thropy AI Wor­ld­views Contest

Jason Schukraft21 Nov 2022 21:45 UTC
291 points
26 comments1 min readEA link

An­nounc­ing: Mechanism De­sign for AI Safety—Read­ing Group

Rubi J. Hudson9 Aug 2022 4:25 UTC
35 points
1 comment4 min readEA link

Con­crete ac­tion­able poli­cies rele­vant to AI safety (writ­ten 2019)

weeatquince16 Dec 2022 18:41 UTC
48 points
0 comments22 min readEA link

Ways to buy time

Akash12 Nov 2022 19:31 UTC
47 points
1 comment1 min readEA link

New Se­quence—Towards a wor­ld­wide, wa­ter­tight Wind­fall Clause

John Bridge7 Apr 2022 15:02 UTC
25 points
4 comments8 min readEA link

Belief Bias: Bias in Eval­u­at­ing AGI X-Risks

Remmelt2 Jan 2023 8:59 UTC
5 points
0 comments1 min readEA link

Reflec­tions on the PIBBSS Fel­low­ship 2022

nora11 Dec 2022 22:03 UTC
69 points
4 comments18 min readEA link

Why Would AI “Aim” To Defeat Hu­man­ity?

Holden Karnofsky29 Nov 2022 18:59 UTC
19 points
0 comments32 min readEA link
(www.cold-takes.com)

Rea­sons for my nega­tive feel­ings to­wards the AI risk discussion

fergusq1 Sep 2022 7:33 UTC
41 points
9 comments4 min readEA link

“Tech­nolog­i­cal un­em­ploy­ment” AI vs. “most im­por­tant cen­tury” AI: how far apart?

Holden Karnofsky11 Oct 2022 4:50 UTC
15 points
1 comment3 min readEA link
(www.cold-takes.com)

Miti­gat­ing x-risk through modularity

Toby Newberry17 Dec 2020 19:54 UTC
96 points
6 comments14 min readEA link

Large Lan­guage Models as Cor­po­rate Lob­by­ists, and Im­pli­ca­tions for So­cietal-AI Alignment

johnjnay4 Jan 2023 22:22 UTC
10 points
6 comments8 min readEA link

When you plan ac­cord­ing to your AI timelines, should you put more weight on the me­dian fu­ture, or the me­dian fu­ture | even­tual AI al­ign­ment suc­cess? ⚖️

Jeffrey Ladish5 Jan 2023 1:55 UTC
16 points
2 comments2 min readEA link

Com­plex Sys­tems for AI Safety [Prag­matic AI Safety #3]

ThomasW24 May 2022 0:04 UTC
49 points
6 comments21 min readEA link

AI Gover­nance Needs Tech­ni­cal Work

Mauricio5 Sep 2022 22:25 UTC
94 points
3 comments7 min readEA link

Ar­tifi­cial In­tel­li­gence and Nu­clear Com­mand, Con­trol, & Com­mu­ni­ca­tions: The Risks of Integration

Peter Rautenbach18 Nov 2022 13:01 UTC
60 points
3 comments50 min readEA link

AI Safety Un­con­fer­ence NeurIPS 2022

Orpheus_Lummis7 Nov 2022 15:39 UTC
13 points
5 comments1 min readEA link
(aisafetyevents.org)

2022 AI ex­pert sur­vey results

Zach Stein-Perlman4 Aug 2022 15:54 UTC
88 points
7 comments2 min readEA link
(aiimpacts.org)

AGI as a Black Swan Event

Stephen McAleese4 Dec 2022 23:35 UTC
5 points
2 comments7 min readEA link
(www.lesswrong.com)

Fol­lowup on Terminator

skluug12 Mar 2022 1:11 UTC
32 points
0 comments9 min readEA link
(skluug.substack.com)

Why I think that teach­ing philos­o­phy is high impact

Eleni_A19 Dec 2022 23:00 UTC
17 points
2 comments2 min readEA link

Thoughts on AGI or­ga­ni­za­tions and ca­pa­bil­ities work

RobBensinger7 Dec 2022 19:46 UTC
77 points
7 comments5 min readEA link

BERI, Epoch, and FAR will ex­plain their work & cur­rent job open­ings on­line this Sunday

Rockwell19 Aug 2022 20:34 UTC
7 points
0 comments1 min readEA link

Sce­nario Map­ping Ad­vanced AI Risk: Re­quest for Par­ti­ci­pa­tion with Data Collection

Kiliank27 Mar 2022 11:44 UTC
14 points
0 comments5 min readEA link

Toby Ord’s new re­port on les­sons from the de­vel­op­ment of the atomic bomb

Ishan Mukherjee22 Nov 2022 10:37 UTC
65 points
3 comments1 min readEA link
(www.governance.ai)

[Question] How does one find out their AGI timelines?

Yadav7 Nov 2022 22:34 UTC
19 points
4 comments1 min readEA link

How to en­gage with AI 4 So­cial Jus­tice ac­tors

TomWestgarth26 Apr 2022 8:39 UTC
14 points
5 comments1 min readEA link

Two rea­sons we might be closer to solv­ing al­ign­ment than it seems

Kat Woods24 Sep 2022 17:38 UTC
38 points
18 comments4 min readEA link

Catholic the­olo­gians and priests on ar­tifi­cial intelligence

anonymous614 Jun 2022 18:53 UTC
21 points
3 comments1 min readEA link

The miss­ing link to AGI

Yuri Barzov28 Sep 2022 16:37 UTC
1 point
7 comments1 min readEA link

[Question] By how much should Meta’s Blen­derBot be­ing re­ally bad cause me to up­date on how jus­tifi­able it is for OpenAI and Deep­Mind to be mak­ing sig­nifi­cant progress on AI ca­pa­bil­ities?

Sisi10 Aug 2022 6:40 UTC
24 points
8 comments1 min readEA link

Why I think strong gen­eral AI is com­ing soon

porby28 Sep 2022 6:55 UTC
14 points
1 comment1 min readEA link

[Question] Is there any re­search or fore­casts of how likely AI Align­ment is go­ing to be a hard vs. easy prob­lem rel­a­tive to ca­pa­bil­ities?

Jordan Arel14 Aug 2022 15:58 UTC
8 points
1 comment1 min readEA link

On Ar­tifi­cial Gen­eral In­tel­li­gence: Ask­ing the Right Questions

Heather Douglas2 Oct 2022 5:00 UTC
−1 points
7 comments3 min readEA link

AGI Safety Com­mu­ni­ca­tions Initiative

Ines11 Jun 2022 16:30 UTC
33 points
5 comments1 min readEA link

Differ­ence, Pro­jec­tion, and Adaptation

YOG10 Nov 2022 10:46 UTC
0 points
0 comments3 min readEA link

What if AI de­vel­op­ment goes well?

RoryG3 Aug 2022 8:57 UTC
25 points
7 comments12 min readEA link

Mas­sive Scal­ing Should be Frowned Upon

harsimony17 Nov 2022 17:44 UTC
9 points
0 comments5 min readEA link

Don’t leave your finger­prints on the future

So8res8 Oct 2022 0:35 UTC
86 points
4 comments1 min readEA link

Is there a demo of “You can’t fetch the coffee if you’re dead”?

Ram Rachum10 Nov 2022 11:03 UTC
8 points
3 comments1 min readEA link

Si­mu­la­tors and Mindcrime

𝕮𝖎𝖓𝖊𝖗𝖆9 Dec 2022 15:20 UTC
1 point
0 comments1 min readEA link

In­sti­tu­tions Can­not Res­train Dark-Triad AI Exploitation

Remmelt27 Dec 2022 10:34 UTC
8 points
0 comments1 min readEA link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Tech­nol­ogy Policy, USA (2022)

rodeo_flagellum5 Oct 2022 16:48 UTC
15 points
0 comments1 min readEA link

AI Safety in a Vuln­er­a­ble World: Re­quest­ing Feed­back on Pre­limi­nary Thoughts

Jordan Arel6 Dec 2022 22:36 UTC
5 points
4 comments3 min readEA link

AI coöper­a­tion is more pos­si­ble than you think

42317524 Sep 2022 23:04 UTC
2 points
0 comments1 min readEA link

The Hap­piness Max­i­mizer: Why EA is an x-risk

Obasi Shaw30 Aug 2022 4:29 UTC
8 points
6 comments29 min readEA link

Effec­tive Per­sua­sion For AI Align­ment Risk

Brian Lui9 Aug 2022 23:55 UTC
5 points
7 comments4 min readEA link

Against Agents as an Ap­proach to Aligned Trans­for­ma­tive AI

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 0:47 UTC
4 points
0 comments1 min readEA link

[MLSN #6]: Trans­parency sur­vey, prov­able ro­bust­ness, ML mod­els that pre­dict the future

Dan H12 Oct 2022 20:51 UTC
21 points
1 comment6 min readEA link

In­tro­duc­ing Gen­er­ally In­tel­li­gent: an AI re­search lab fo­cused on im­proved the­o­ret­i­cal and prag­matic understanding

joshalbrecht21 Oct 2022 8:20 UTC
8 points
0 comments1 min readEA link

“AGI timelines: ig­nore the so­cial fac­tor at their peril” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:45 UTC
10 points
0 comments12 min readEA link
(trevorklee.substack.com)

Dist­in­guish­ing test from training

So8res29 Nov 2022 21:41 UTC
27 points
0 comments1 min readEA link

Prov­ably Hon­est—A First Step

Srijanak De5 Nov 2022 21:49 UTC
1 point
0 comments1 min readEA link

Law-Fol­low­ing AI 4: Don’t Rely on Vi­car­i­ous Liability

Cullen2 Aug 2022 23:23 UTC
13 points
0 comments3 min readEA link

Prize and fast track to al­ign­ment re­search at ALTER

Vanessa18 Sep 2022 9:15 UTC
38 points
0 comments3 min readEA link

Public-fac­ing Cen­sor­ship Is Safety Theater, Caus­ing Rep­u­ta­tional Da­m­age

Yitz23 Sep 2022 5:08 UTC
49 points
7 comments1 min readEA link

How to store hu­man val­ues on a computer

oliver_siegel4 Nov 2022 19:36 UTC
1 point
2 comments1 min readEA link

Un­der­stand­ing the diffu­sion of large lan­guage mod­els: summary

Ben Cottier21 Dec 2022 13:49 UTC
124 points
18 comments22 min readEA link

Epoch is hiring a Re­search Data Analyst

merilalama22 Nov 2022 17:34 UTC
21 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

How Open Source Ma­chine Learn­ing Soft­ware Shapes AI

Max Langenkamp28 Sep 2022 17:49 UTC
11 points
3 comments14 min readEA link
(maxlangenkamp.me)

Ba­hamian Ad­ven­tures: An Epic Tale of En­trepreneur­ship, AI Strat­egy Re­search and Potatoes

Jaime Sevilla9 Aug 2022 8:37 UTC
67 points
9 comments4 min readEA link

A New York Times ar­ti­cle on AI risk

Eleni_A6 Sep 2022 0:46 UTC
20 points
0 comments1 min readEA link
(www.nytimes.com)

AI Safety Ex­ec­u­tive Summary

Sean Osier6 Sep 2022 8:26 UTC
20 points
2 comments5 min readEA link
(seanosier.notion.site)

“Develop An­thro­po­mor­phic AGI to Save Hu­man­ity from It­self” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:57 UTC
19 points
6 comments7 min readEA link

Pos­si­ble di­rec­tions in AI ideal gov­er­nance research

RoryG10 Aug 2022 8:36 UTC
5 points
0 comments3 min readEA link

Re­sults from the lan­guage model hackathon

Esben Kran10 Oct 2022 8:29 UTC
23 points
2 comments1 min readEA link

Ber­lin AI Safety Open Meetup July 2022

Isidor Regenfuß22 Jul 2022 16:26 UTC
1 point
0 comments1 min readEA link

[Question] AI Safety Pitches post ChatGPT

ojorgensen5 Dec 2022 22:48 UTC
6 points
2 comments1 min readEA link

Mili­tary Ar­tifi­cial In­tel­li­gence as Con­trib­u­tor to Global Catas­trophic Risk

MMMaas27 Jun 2022 10:35 UTC
40 points
0 comments54 min readEA link

MIRI Con­ver­sa­tions: Tech­nol­ogy Fore­cast­ing & Grad­u­al­ism (Distil­la­tion)

TheMcDouglas13 Jul 2022 10:45 UTC
27 points
9 comments19 min readEA link

Align­ing AI with Hu­mans by Lev­er­ag­ing Le­gal Informatics

johnjnay18 Sep 2022 7:43 UTC
20 points
11 comments3 min readEA link

Is the time crunch for AI Safety Move­ment Build­ing now?

Chris Leong8 Jun 2022 12:19 UTC
14 points
10 comments2 min readEA link

[Question] Does China have AI al­ign­ment re­sources/​in­sti­tu­tions? How can we pri­ori­tize cre­at­ing more?

Jakub Kraus4 Aug 2022 19:23 UTC
18 points
9 comments1 min readEA link

Align­ment is hard. Com­mu­ni­cat­ing that, might be harder

Eleni_A1 Sep 2022 11:45 UTC
17 points
1 comment3 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

MichaelStJules3 Oct 2021 16:55 UTC
48 points
19 comments1 min readEA link

[Question] Slow­ing down AI progress?

Eleni_A26 Jul 2022 8:46 UTC
14 points
9 comments1 min readEA link

Stress Ex­ter­nal­ities More in AI Safety Pitches

NickGabs26 Sep 2022 20:31 UTC
31 points
13 comments2 min readEA link

Ap­pli­ca­tions are now open for In­tro to ML Safety Spring 2023

Joshc4 Nov 2022 22:45 UTC
49 points
1 comment2 min readEA link

AI ac­cel­er­a­tion from a safety per­spec­tive: Trade-offs and con­sid­er­a­tions

mariushobbhahn19 Jan 2022 9:44 UTC
12 points
1 comment7 min readEA link

[Question] Why not to solve al­ign­ment by mak­ing su­per­in­tel­li­gent hu­mans?

Pato16 Oct 2022 21:26 UTC
9 points
12 comments1 min readEA link

UK AI Policy Re­port: Con­tent, Sum­mary, and its Im­pact on EA Cause Areas

Algo_Law21 Jul 2022 17:32 UTC
9 points
1 comment9 min readEA link

Hacker-AI and Digi­tal Ghosts – Pre-AGI

Erland Wittkotter19 Oct 2022 7:49 UTC
4 points
0 comments1 min readEA link

It’s (not) how you use it

Eleni_A7 Sep 2022 13:28 UTC
6 points
3 comments2 min readEA link

An­nounc­ing: What Fu­ture World? - Grow­ing the AI Gover­nance Community

DavidCorfield2 Nov 2022 0:31 UTC
4 points
0 comments1 min readEA link

[linkpost] When does tech­ni­cal work to re­duce AGI con­flict make a differ­ence?: Introduction

antimonyanthony16 Sep 2022 14:35 UTC
31 points
0 comments1 min readEA link
(www.lesswrong.com)

The al­ign­ment prob­lem from a deep learn­ing perspective

richard_ngo11 Aug 2022 3:18 UTC
58 points
0 comments21 min readEA link

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKee17 Jun 2022 11:52 UTC
32 points
1 comment2 min readEA link

Nice­ness is unnatural

So8res13 Oct 2022 1:30 UTC
20 points
1 comment1 min readEA link

An­nounc­ing the Fu­ture Fund’s AI Wor­ld­view Prize

Nick_Beckstead23 Sep 2022 16:28 UTC
255 points
130 comments13 min readEA link
(ftxfuturefund.org)

[Question] Graph of % of tasks AI is su­per­hu­man at?

Denkenberger15 Nov 2022 5:59 UTC
9 points
0 comments1 min readEA link

aisafety.com­mu­nity—A liv­ing doc­u­ment of AI safety communities

zeshen20 Oct 2022 22:08 UTC
24 points
13 comments1 min readEA link

My (Lazy) Longter­mism FAQ

Devin Kalish24 Oct 2022 16:44 UTC
28 points
6 comments27 min readEA link

[Question] How much will pre-trans­for­ma­tive AI speed up R&D?

Ben Snodin31 May 2021 20:20 UTC
23 points
0 comments1 min readEA link

Which Post Idea Is Most Effec­tive?

Jordan Arel25 Apr 2022 4:47 UTC
26 points
6 comments2 min readEA link

Re­sources that (I think) new al­ign­ment re­searchers should know about

Akash28 Oct 2022 22:13 UTC
20 points
2 comments1 min readEA link

The AIA and its Brus­sels Effect

Kathryn O'Rourke27 Dec 2022 16:01 UTC
14 points
0 comments5 min readEA link

Safety timelines: How long will it take to solve al­ign­ment?

Esben Kran19 Sep 2022 12:51 UTC
41 points
9 comments6 min readEA link

Wor­ld­view iPeo­ple—Fu­ture Fund’s AI Wor­ld­view Prize

Toni MUENDEL28 Oct 2022 7:37 UTC
0 points
5 comments1 min readEA link

Like­li­hood of an anti-AI back­lash: Re­sults from a pre­limi­nary Twit­ter poll

Geoffrey Miller27 Sep 2022 22:01 UTC
27 points
13 comments1 min readEA link

How tech­ni­cal safety stan­dards could pro­mote TAI safety

Cullen8 Aug 2022 16:57 UTC
127 points
15 comments7 min readEA link

What are cur­rent smaller prob­lems re­lated to top EA cause ar­eas (eg deep­fake poli­cies for AI risk, on­go­ing covid var­i­ants for bio risk) and would it be benefi­cial for these small and not-catas­trophic challenges to get more EA re­sources, as a way of de­vel­op­ing ca­pac­ity to pre­vent the catas­trophic ver­sions?

nonzerosum13 Jun 2022 17:32 UTC
7 points
0 comments2 min readEA link

Let’s talk about un­con­trol­lable AI

Karl von Wendt9 Oct 2022 10:37 UTC
12 points
2 comments1 min readEA link

Anti-squat­ted AI x-risk do­mains index

plex12 Aug 2022 12:00 UTC
52 points
9 comments1 min readEA link

Slides: Po­ten­tial Risks From Ad­vanced AI

Aryeh Englander28 Apr 2022 2:18 UTC
9 points
0 comments1 min readEA link

In­tro­duc­tion: Bias in Eval­u­at­ing AGI X-Risks

Remmelt27 Dec 2022 10:27 UTC
4 points
0 comments1 min readEA link

CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram Rachum29 Dec 2022 16:09 UTC
4 points
0 comments1 min readEA link

The case for tak­ing AI se­ri­ously as a threat to hu­man­ity (Kel­sey Piper)

EA Handbook15 Oct 2020 7:00 UTC
11 points
1 comment1 min readEA link
(www.vox.com)

Re­ac­tive de­val­u­a­tion: Bias in Eval­u­at­ing AGI X-Risks

Remmelt30 Dec 2022 9:02 UTC
2 points
9 comments1 min readEA link

My thoughts on OpenAI’s al­ign­ment plan

Akash30 Dec 2022 19:34 UTC
16 points
0 comments1 min readEA link

Curse of knowl­edge and Naive re­al­ism: Bias in Eval­u­at­ing AGI X-Risks

Remmelt31 Dec 2022 13:33 UTC
5 points
0 comments1 min readEA link

Self-Limit­ing AI in AI Alignment

The_Lord's_Servant_28031 Dec 2022 19:07 UTC
2 points
1 comment1 min readEA link

Challenge to the no­tion that any­thing is (maybe) pos­si­ble with AGI

Remmelt1 Jan 2023 3:57 UTC
−17 points
3 comments1 min readEA link

Sum­mary of 80k’s AI prob­lem profile

Jakub Kraus1 Jan 2023 7:48 UTC
19 points
0 comments5 min readEA link
(www.lesswrong.com)

Re­sults from the AI test­ing hackathon

Esben Kran2 Jan 2023 15:46 UTC
35 points
4 comments5 min readEA link
(alignmentjam.com)

AI Safety Doesn’t Have to be Weird

Mica White2 Jan 2023 21:56 UTC
11 points
1 comment2 min readEA link

Sta­tus quo bias; Sys­tem justification

Remmelt3 Jan 2023 2:50 UTC
4 points
1 comment1 min readEA link

[Question] How have shorter AI timelines been af­fect­ing you, and how have you been re­spond­ing to them?

Liav.Koren3 Jan 2023 4:20 UTC
33 points
16 comments1 min readEA link

Nor­malcy bias and Base rate ne­glect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt4 Jan 2023 3:16 UTC
5 points
0 comments1 min readEA link

AI al­ign­ment re­search links

Holden Karnofsky6 Jan 2022 5:52 UTC
16 points
0 comments6 min readEA link
(www.cold-takes.com)

“AI” is an indexical

ThomasW3 Jan 2023 22:00 UTC
23 points
2 comments1 min readEA link

Holden Karnofsky In­ter­view about Most Im­por­tant Cen­tury & Trans­for­ma­tive AI

Dwarkesh Patel3 Jan 2023 17:31 UTC
29 points
2 comments1 min readEA link

Illu­sion of truth effect and Am­bi­guity effect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt5 Jan 2023 4:05 UTC
1 point
1 comment1 min readEA link

ChatGPT un­der­stands, but largely does not gen­er­ate Span­glish (and other code-mixed) text

Milan Weibel4 Jan 2023 22:10 UTC
5 points
0 comments4 min readEA link
(www.lesswrong.com)

Me­tac­u­lus Year in Re­view: 2022

christian6 Jan 2023 1:23 UTC
25 points
2 comments4 min readEA link
(metaculus.medium.com)

Is any­one else also get­ting more wor­ried about hard take­off AGI sce­nar­ios?

JonCefalu9 Jan 2023 6:04 UTC
19 points
11 comments3 min readEA link

Misha Yagudin and Ozzie Gooen Dis­cuss LLMs and Effec­tive Altruism

Ozzie Gooen6 Jan 2023 22:59 UTC
47 points
3 comments14 min readEA link
(quri.substack.com)

An­chor­ing fo­cal­ism and the Iden­ti­fi­able vic­tim effect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt7 Jan 2023 9:59 UTC
4 points
1 comment1 min readEA link

[Dis­cus­sion] How Broad is the Hu­man Cog­ni­tive Spec­trum?

𝕮𝖎𝖓𝖊𝖗𝖆7 Jan 2023 0:59 UTC
16 points
1 comment1 min readEA link

David Krueger on AI Align­ment in Academia and Coordination

Michaël Trazzi7 Jan 2023 21:14 UTC
32 points
1 comment3 min readEA link
(theinsideview.ai)

Learn­ing as much Deep Learn­ing math as I could in 24 hours

Phosphorous8 Jan 2023 2:19 UTC
57 points
5 comments7 min readEA link

Big list of AI safety videos

Jakub Kraus9 Jan 2023 6:09 UTC
9 points
0 comments1 min readEA link
(docs.google.com)

Went­worth and Larsen on buy­ing time

Akash9 Jan 2023 21:31 UTC
48 points
0 comments1 min readEA link

[Question] What AI Take-Over Movies or Books Will Scare Me Into Tak­ing AI Se­ri­ously?

Jordan Arel10 Jan 2023 8:30 UTC
11 points
7 comments1 min readEA link

ea.do­mains—Do­mains Free to a Good Home

plex12 Jan 2023 13:32 UTC
48 points
8 comments4 min readEA link

[Ru­mour] Microsoft to in­vest $10B in OpenAI, will re­ceive 75% of prof­its un­til they re­coup in­vest­ment: GPT would be in­te­grated with Office

𝕮𝖎𝖓𝖊𝖗𝖆10 Jan 2023 23:43 UTC
25 points
2 comments1 min readEA link

An­nounc­ing the 2023 PIBBSS Sum­mer Re­search Fellowship

Dušan D. Nešić (Dushan)12 Jan 2023 21:38 UTC
26 points
3 comments1 min readEA link

[Question] Con­cerns about AI safety ca­reer change

mmKALLL13 Jan 2023 20:52 UTC
45 points
15 comments4 min readEA link

EA rele­vant Fore­sight In­sti­tute Work­shops in 2023: WBE & AI safety, Cryp­tog­ra­phy & AI safety, XHope, Space, and Atom­i­cally Pre­cise Manufacturing

elteerkers16 Jan 2023 14:02 UTC
20 points
2 comments3 min readEA link

Prepar­ing for AI-as­sisted al­ign­ment re­search: we need data!

CBiddulph17 Jan 2023 3:28 UTC
11 points
0 comments11 min readEA link

An­nounc­ing aisafety.training

JJ Hepburn17 Jan 2023 1:55 UTC
108 points
4 comments1 min readEA link

[Question] Should AI writ­ers be pro­hibited in ed­u­ca­tion?

Eleni_A16 Jan 2023 22:29 UTC
3 points
2 comments1 min readEA link

Les­sons learned and re­view of the AI Safety Nudge Competition

Marc Carauleanu17 Jan 2023 17:13 UTC
5 points
0 comments5 min readEA link

Emerg­ing Paradigms: The Case of Ar­tifi­cial In­tel­li­gence Safety

Eleni_A18 Jan 2023 5:59 UTC
16 points
0 comments19 min readEA link

[Question] Any Philos­o­phy PhD recom­men­da­tions for stu­dents in­ter­ested in Align­ment Efforts?

rickyhuang.hexuan18 Jan 2023 5:54 UTC
7 points
6 comments1 min readEA link

Help me to un­der­stand AI al­ign­ment!

britomart18 Jan 2023 9:13 UTC
3 points
11 comments1 min readEA link

6-para­graph AI risk in­tro for MAISI

Jakub Kraus19 Jan 2023 9:22 UTC
12 points
0 comments1 min readEA link

An­nounc­ing Cavendish Labs

dyusha19 Jan 2023 20:00 UTC
106 points
6 comments2 min readEA link

Why peo­ple want to work on AI safety (but don’t)

Emily Grundy24 Jan 2023 6:41 UTC
69 points
10 comments7 min readEA link

What a com­pute-cen­tric frame­work says about AI take­off speeds—draft report

Tom_Davidson23 Jan 2023 4:09 UTC
186 points
5 comments16 min readEA link
(www.lesswrong.com)

Ex­is­ten­tial Risk of Misal­igned In­tel­li­gence Aug­men­ta­tion (Par­tic­u­larly Us­ing High-Band­width BCI Im­plants)

Damian Gorski24 Jan 2023 17:02 UTC
1 point
0 comments9 min readEA link

[Linkpost] Hu­man-nar­rated au­dio ver­sion of “Is Power-Seek­ing AI an Ex­is­ten­tial Risk?”

Joe_Carlsmith31 Jan 2023 19:19 UTC
7 points
0 comments1 min readEA link

Alexan­der and Yud­kowsky on AGI goals

Scott Alexander31 Jan 2023 23:36 UTC
29 points
1 comment1 min readEA link

Launch­ing The Col­lec­tive In­tel­li­gence Pro­ject: Whitepa­per and Pilots

jasmine_wang6 Feb 2023 17:00 UTC
37 points
8 comments2 min readEA link
(cip.org)

In­ter­view with Ro­man Yam­polskiy about AGI on The Real­ity Check

Darren McKee18 Feb 2023 23:29 UTC
27 points
0 comments1 min readEA link
(www.trcpodcast.com)

Com­ments on OpenAI’s “Plan­ning for AGI and be­yond”

So8res3 Mar 2023 23:01 UTC
115 points
7 comments1 min readEA link

2023 Stan­ford Ex­is­ten­tial Risks Conference

elizabethcooper24 Feb 2023 17:49 UTC
29 points
5 comments1 min readEA link

Seek­ing in­put on a list of AI books for broader audience

Darren McKee27 Feb 2023 22:40 UTC
48 points
14 comments5 min readEA link

What does Bing Chat tell us about AI risk?

Holden Karnofsky28 Feb 2023 18:47 UTC
99 points
8 comments2 min readEA link
(www.cold-takes.com)

[Cross­post] Why Un­con­trol­lable AI Looks More Likely Than Ever

Otto8 Mar 2023 15:33 UTC
49 points
6 comments4 min readEA link
(time.com)

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yud­kowsky and the Dangers of AI (Please RSVP)

David N8 Mar 2023 20:40 UTC
11 points
2 comments1 min readEA link

Paper Sum­mary: The Effec­tive­ness of AI Ex­is­ten­tial Risk Com­mu­ni­ca­tion to the Amer­i­can and Dutch Public

Otto9 Mar 2023 10:40 UTC
96 points
11 comments4 min readEA link

Every­thing’s nor­mal un­til it’s not

Eleni_A10 Mar 2023 1:42 UTC
6 points
0 comments3 min readEA link

The Power of In­tel­li­gence—The Animation

Writer11 Mar 2023 16:15 UTC
56 points
0 comments1 min readEA link

Yud­kowsky on AGI risk on the Ban­kless podcast

RobBensinger13 Mar 2023 0:42 UTC
52 points
2 comments75 min readEA link

On tak­ing AI risk se­ri­ously

Eleni_A13 Mar 2023 5:44 UTC
51 points
4 comments1 min readEA link
(www.nytimes.com)

The Over­ton Win­dow widens: Ex­am­ples of AI risk in the media

Akash23 Mar 2023 17:10 UTC
111 points
11 comments1 min readEA link

My at­tempt at ex­plain­ing the case for AI risk in a straight­for­ward way

JulianHazell25 Mar 2023 16:32 UTC
24 points
7 comments18 min readEA link
(muddyclothes.substack.com)

[Linkpost] Shorter ver­sion of re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_Carlsmith22 Mar 2023 18:06 UTC
49 points
1 comment1 min readEA link

[Question] What are the ar­gu­ments that sup­port China build­ing AGI+ if Western com­pa­nies de­lay/​pause AI de­vel­op­ment?

DMMF29 Mar 2023 18:53 UTC
32 points
9 comments1 min readEA link

Longter­mism and short­ter­mism can dis­agree on nu­clear war to stop ad­vanced AI

David Johnston30 Mar 2023 23:22 UTC
2 points
0 comments1 min readEA link

Nu­clear brinks­man­ship is not a good AI x-risk strategy

titotal30 Mar 2023 22:07 UTC
11 points
8 comments5 min readEA link

“Dangers of AI and the End of Hu­man Civ­i­liza­tion” Yud­kowsky on Lex Fridman

𝕮𝖎𝖓𝖊𝖗𝖆30 Mar 2023 15:44 UTC
28 points
0 comments1 min readEA link

[Question] What are the biggest ob­sta­cles on AI safety re­search ca­reer?

jackchang11031 Mar 2023 14:53 UTC
2 points
1 comment1 min readEA link

[Question] How much should states in­vest in con­tin­gency plans for wide­spread in­ter­net out­age?

Kinoshita Yoshikazu (pseudonym)7 Apr 2023 16:05 UTC
2 points
0 comments1 min readEA link

Hu­man Values and AGI Risk | William James

William James31 Mar 2023 22:30 UTC
1 point
0 comments12 min readEA link

Pes­simism about AI Safety

Max_He-Ho2 Apr 2023 7:57 UTC
5 points
0 comments25 min readEA link
(www.lesswrong.com)

Re­search Sum­mary: Fore­cast­ing with Large Lan­guage Models

Damien Laird2 Apr 2023 10:52 UTC
4 points
0 comments7 min readEA link
(damienlaird.substack.com)

[Question] Pre­dic­tions for fu­ture AI gov­er­nance?

jackchang1102 Apr 2023 16:43 UTC
4 points
1 comment1 min readEA link

[Question] De­bates on re­duc­ing long-term s-risks?

jackchang1106 Apr 2023 1:26 UTC
12 points
2 comments1 min readEA link

Risks from GPT-4 Byproduct of Re­cur­sively Op­ti­miz­ing AIs

ben hayum6 Apr 2023 5:52 UTC
84 points
4 comments10 min readEA link
(www.lesswrong.com)

Reli­a­bil­ity, Se­cu­rity, and AI risk: Notes from in­fosec text­book chap­ter 1

Akash7 Apr 2023 15:47 UTC
15 points
0 comments1 min readEA link

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down

EliezerYudkowsky9 Apr 2023 15:53 UTC
45 points
3 comments1 min readEA link

Pod­cast/​video/​tran­script: Eliezer Yud­kowsky—Why AI Will Kill Us, Align­ing LLMs, Na­ture of In­tel­li­gence, SciFi, & Rationality

PeterSlattery9 Apr 2023 10:37 UTC
32 points
2 comments137 min readEA link
(www.youtube.com)

Mea­sur­ing ar­tifi­cial in­tel­li­gence on hu­man bench­marks is naive

Ward A11 Apr 2023 11:28 UTC
3 points
2 comments1 min readEA link

[US] NTIA: AI Ac­countabil­ity Policy Re­quest for Comment

Kyle J. Lucchese13 Apr 2023 16:12 UTC
47 points
4 comments1 min readEA link
(ntia.gov)

Un-un­plug­ga­bil­ity—can’t we just un­plug it?

Oliver Sourbut15 May 2023 13:23 UTC
14 points
0 comments1 min readEA link

AI Takeover Sce­nario with Scaled LLMs

simeon_c16 Apr 2023 23:28 UTC
28 points
1 comment1 min readEA link

Sum­mary: The Case for Halt­ing AI Devel­op­ment—Max Teg­mark on the Lex Frid­man Podcast

Madhav Malhotra16 Apr 2023 22:28 UTC
37 points
4 comments4 min readEA link
(youtu.be)

Prevenire una catas­trofe legata alle IA

EA Italy17 Jan 2023 11:07 UTC
1 point
0 comments4 min readEA link

L’im­por­tanza delle IA come pos­si­bile mi­nac­cia per l’umanità

EA Italy17 Jan 2023 22:24 UTC
1 point
0 comments1 min readEA link
(www.vox.com)

Perché il deep learn­ing mod­erno potrebbe ren­dere diffi­cile l’al­linea­mento delle IA

EA Italy17 Jan 2023 23:29 UTC
1 point
0 comments16 min readEA link

Le Tem­p­is­tiche delle IA: il di­bat­tito e il punto di vista degli “es­perti”

EA Italy17 Jan 2023 23:30 UTC
1 point
0 comments11 min readEA link

Ricerca sulla sicurezza delle IA: panoram­ica delle carriere

EA Italy17 Jan 2023 11:06 UTC
1 point
0 comments7 min readEA link

Ap­profondi­menti sui rischi dell’IA (ma­te­ri­ali in in­glese)

EA Italy18 Jan 2023 11:16 UTC
1 point
0 comments2 min readEA link

Orthog­o­nal: A new agent foun­da­tions al­ign­ment organization

Tamsin Leake19 Apr 2023 20:17 UTC
36 points
0 comments1 min readEA link

Notes on “the hot mess the­ory of AI mis­al­ign­ment”

Jakub Kraus21 Apr 2023 10:07 UTC
37 points
3 comments1 min readEA link

Stu­dent com­pe­ti­tion for draft­ing a treaty on mora­to­rium of large-scale AI ca­pa­bil­ities R&D

Nayanika24 Apr 2023 13:15 UTC
35 points
4 comments2 min readEA link

FT: We must slow down the race to God-like AI

Angelina Li24 Apr 2023 11:57 UTC
27 points
2 comments2 min readEA link
(www.ft.com)

AGI ruin mostly rests on strong claims about al­ign­ment and de­ploy­ment, not about society

RobBensinger24 Apr 2023 13:07 UTC
14 points
4 comments1 min readEA link

Refram­ing the bur­den of proof: Com­pa­nies should prove that mod­els are safe (rather than ex­pect­ing au­di­tors to prove that mod­els are dan­ger­ous)

Akash25 Apr 2023 18:49 UTC
34 points
1 comment1 min readEA link

Im­pli­ca­tions of the White­house meet­ing with AI CEOs for AI su­per­in­tel­li­gence risk—a first-step to­wards evals?

Jamie Bernardi7 May 2023 17:33 UTC
76 points
3 comments7 min readEA link

Why “just make an agent which cares only about bi­nary re­wards” doesn’t work.

Lysandre Terrisse9 May 2023 16:51 UTC
3 points
1 comment3 min readEA link

Un­veiling the Amer­i­can Public Opinion on AI Mo­ra­to­rium and Govern­ment In­ter­ven­tion: The Im­pact of Me­dia Exposure

Otto8 May 2023 10:49 UTC
27 points
5 comments6 min readEA link

A re­quest to keep pes­simistic AI posts ac­tion­able.

tcelferact11 May 2023 15:35 UTC
26 points
9 comments1 min readEA link

Con­fu­sions and up­dates on STEM AI

Eleni_A19 May 2023 21:34 UTC
7 points
0 comments1 min readEA link

Oc­to­ber 2022 AI Risk Com­mu­nity Sur­vey Results

Froolow24 May 2023 10:37 UTC
18 points
0 comments7 min readEA link

New s-risks au­dio­book available now

Alistair Webster24 May 2023 20:27 UTC
76 points
1 comment1 min readEA link
(centerforreducingsuffering.org)

Will AI end ev­ery­thing? A guide to guess­ing | EAG Bay Area 23

Katja_Grace25 May 2023 17:01 UTC
69 points
1 comment21 min readEA link

The Case for AI Adap­ta­tion: The Per­ils of Liv­ing in a World with Aligned and Well-De­ployed Trans­for­ma­tive Ar­tifi­cial Intelligence

HTC30 May 2023 18:29 UTC
3 points
1 comment7 min readEA link

Sum­maries: Align­ment Fun­da­men­tals Curriculum

Leon_Lang19 Sep 2022 15:43 UTC
25 points
1 comment1 min readEA link
(docs.google.com)

Loss of con­trol of AI is not a likely source of AI x-risk

squek9 Nov 2022 5:48 UTC
8 points
0 comments1 min readEA link

Govern­ments pose larger risks than cor­po­ra­tions: a brief re­sponse to Grace

David Johnston19 Oct 2022 11:54 UTC
11 points
3 comments2 min readEA link

AGI Risk: How to in­ter­na­tion­ally reg­u­late in­dus­tries in non-democracies

Timothy_Liptrot16 May 2022 22:45 UTC
9 points
2 comments9 min readEA link

Why do we post our AI safety plans on the In­ter­net?

Peter S. Park31 Oct 2022 16:27 UTC
14 points
22 comments11 min readEA link

We Did AGISF’s 8-week Course in 3 Days. Here’s How it Went

ag400024 Jul 2022 16:46 UTC
26 points
7 comments5 min readEA link

Ap­pli­ca­tions Open: GovAI Sum­mer Fel­low­ship 2023

GovAI21 Dec 2022 15:00 UTC
28 points
0 comments2 min readEA link

I am a Me­moryless System

NicholasKross23 Oct 2022 17:36 UTC
4 points
0 comments9 min readEA link
(www.thinkingmuchbetter.com)

[Question] How long does it take to understand AI X-Risk from scratch so that I have a confident, clear mental model of it from first principles?

Jordan Arel27 Jul 2022 16:58 UTC
29 points
6 comments1 min readEA link

Rood­man’s Thoughts on Biolog­i­cal Anchors

lukeprog14 Sep 2022 12:23 UTC
72 points
8 comments1 min readEA link
(docs.google.com)

Drivers of large lan­guage model diffu­sion: in­cre­men­tal re­search, pub­lic­ity, and cascades

Ben Cottier21 Dec 2022 13:50 UTC
21 points
0 comments29 min readEA link

SERI MATS Pro­gram—Win­ter 2022 Cohort

Ryan Kidd8 Oct 2022 19:09 UTC
50 points
5 comments1 min readEA link

Pre­sump­tive Listen­ing: stick­ing to fa­mil­iar con­cepts and miss­ing the outer rea­son­ing paths

Remmelt27 Dec 2022 15:40 UTC
3 points
0 comments1 min readEA link

Don’t ex­pect AGI any­time soon

cveres10 Oct 2022 22:38 UTC
0 points
19 comments1 min readEA link

How Josiah be­came an AI safety researcher

Neil Crawford29 Mar 2022 19:47 UTC
10 points
0 comments1 min readEA link

[Question] Should I force my­self to work on AGI al­ign­ment?

Isaac Benson24 Aug 2022 17:25 UTC
19 points
17 comments1 min readEA link

[Question] Do EA folks think that a path to zero AGI de­vel­op­ment is fea­si­ble or worth­while for safety from AI?

Noah Scales17 Jul 2022 8:47 UTC
8 points
3 comments1 min readEA link

Estab­lish­ing Oxford’s AI Safety Stu­dent Group: Les­sons Learnt and Our Model

Wilkin123421 Sep 2022 7:57 UTC
71 points
3 comments1 min readEA link

We Are Con­jec­ture, A New Align­ment Re­search Startup

Connor Leahy9 Apr 2022 15:07 UTC
31 points
0 comments1 min readEA link

AI Safety Career Bottlenecks Survey Responses

Linda Linsefors28 May 2021 10:41 UTC
34 points
1 comment5 min readEA link

Which of these ar­gu­ments for x-risk do you think we should test?

Wim9 Aug 2022 13:43 UTC
3 points
2 comments1 min readEA link

Win­ners of the AI Safety Nudge Competition

Marc Carauleanu15 Nov 2022 1:06 UTC
22 points
0 comments1 min readEA link

[Question] Mu­tual As­sured Destruc­tion used against AGI

L3opard8 Oct 2022 9:35 UTC
4 points
5 comments1 min readEA link

An­nounc­ing the SPT Model Web App for AI Governance

Paolo Bova4 Aug 2022 10:45 UTC
36 points
0 comments3 min readEA link

Join the in­ter­pretabil­ity re­search hackathon

Esben Kran28 Oct 2022 16:26 UTC
48 points
0 comments5 min readEA link

Don’t worry, be happy (liter­ally)

Yuri Zavorotny5 Oct 2022 1:55 UTC
0 points
1 comment2 min readEA link

[Question] Best in­tro­duc­tory overviews of AGI safety?

Jakub Kraus13 Dec 2022 19:04 UTC
21 points
8 comments2 min readEA link
(www.lesswrong.com)

“In­tro to brain-like-AGI safety” se­ries—halfway point!

Steven Byrnes9 Mar 2022 15:21 UTC
8 points
0 comments2 min readEA link

[Question] Track­ing Com­pute Stocks and Flows: Case Stud­ies?

Cullen5 Oct 2022 17:54 UTC
34 points
1 comment1 min readEA link

A stub­born un­be­liever fi­nally gets the depth of the AI al­ign­ment problem

aelwood13 Oct 2022 15:16 UTC
32 points
7 comments1 min readEA link

Distil­la­tion of “How Likely is De­cep­tive Align­ment?”

NickGabs1 Dec 2022 20:22 UTC
10 points
1 comment10 min readEA link

The prob­lem of ar­tifi­cial suffering

mlsbt24 Sep 2021 14:43 UTC
49 points
3 comments9 min readEA link

In­tro to AI Safety

Madhav Malhotra19 Oct 2022 23:45 UTC
4 points
0 comments1 min readEA link

An ex­per­i­ment elic­it­ing rel­a­tive es­ti­mates for Open Philan­thropy’s 2018 AI safety grants

NunoSempere12 Sep 2022 11:19 UTC
111 points
16 comments12 min readEA link

AI Twit­ter ac­counts to fol­low?

Adrian Salustri10 Jun 2022 6:19 UTC
1 point
2 comments1 min readEA link

Sys­temic Cas­cad­ing Risks: Rele­vance in Longter­mism & Value Lock-In

Richard Ren2 Sep 2022 7:53 UTC
52 points
10 comments16 min readEA link

What we owe the microbiome

TeddyW17 Dec 2022 16:17 UTC
17 points
2 comments1 min readEA link

Dis­cov­er­ing Lan­guage Model Be­hav­iors with Model-Writ­ten Evaluations

evhub20 Dec 2022 20:09 UTC
25 points
0 comments1 min readEA link

[Question] I’m in­ter­view­ing pro­lific AI safety re­searcher Richard Ngo (now at OpenAI and pre­vi­ously Deep­Mind). What should I ask him?

Robert_Wiblin29 Sep 2022 0:00 UTC
45 points
11 comments1 min readEA link

Mere ex­po­sure effect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt27 Dec 2022 14:05 UTC
4 points
1 comment1 min readEA link

[Cause Ex­plo­ra­tion Prizes] Ex­pand­ing com­mu­ni­ca­tion about AGI risks

Ines22 Sep 2022 5:30 UTC
13 points
0 comments11 min readEA link

a ca­sual in­tro to AI doom and alignment

Tamsin Leake2 Nov 2022 9:42 UTC
8 points
2 comments1 min readEA link

Main paths to im­pact in EU AI Policy

JOMG_Monnet8 Dec 2022 16:17 UTC
69 points
2 comments8 min readEA link

GPT-3-like mod­els are now much eas­ier to ac­cess and de­ploy than to develop

Ben Cottier21 Dec 2022 13:49 UTC
22 points
3 comments19 min readEA link

AI Fore­cast­ing Re­search Ideas

Jaime Sevilla17 Nov 2022 17:37 UTC
71 points
1 comment1 min readEA link
(docs.google.com)

[Question] Why is “Ar­gu­ment Map­ping” Not More Com­mon in EA/​Ra­tion­al­ity (And What Ob­jec­tions Should I Ad­dress in a Post on the Topic?)

Harrison Durland23 Dec 2022 21:55 UTC
15 points
5 comments1 min readEA link

What I’m doing

Chris Leong19 Jul 2022 11:31 UTC
28 points
0 comments5 min readEA link

Why we need a new agency to reg­u­late ad­vanced ar­tifi­cial intelligence

Michael Huang4 Aug 2022 13:38 UTC
25 points
0 comments1 min readEA link
(www.brookings.edu)

AI Alter­na­tive Fu­tures: Ex­plo­ra­tory Sce­nario Map­ping for Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion [Linkpost]

Kiliank9 May 2022 19:53 UTC
17 points
2 comments8 min readEA link

Ap­ply to the Red­wood Re­search Mechanis­tic In­ter­pretabil­ity Ex­per­i­ment (REMIX), a re­search pro­gram in Berkeley

Max Nadeau27 Oct 2022 1:39 UTC
95 points
5 comments12 min readEA link

ChatGPT can write code! ?

Miguel10 Dec 2022 5:36 UTC
6 points
15 comments1 min readEA link
(www.whitehatstoic.com)

What are the risks of an or­a­cle AI?

Griffin Young5 Oct 2022 6:18 UTC
6 points
2 comments1 min readEA link

An en­tire cat­e­gory of risks is un­der­val­ued by EA [Sum­mary of pre­vi­ous fo­rum post]

Richard Ren5 Sep 2022 15:07 UTC
70 points
5 comments5 min readEA link

On the cor­re­spon­dence be­tween AI-mis­al­ign­ment and cog­ni­tive dis­so­nance us­ing a be­hav­ioral eco­nomics model

Stijn1 Nov 2022 9:15 UTC
11 points
0 comments6 min readEA link

Seek­ing Stu­dent Sub­mis­sions: Edit Your Source Code Contest

Aris Richardson26 Aug 2022 2:06 UTC
24 points
6 comments2 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

Aris Richardson20 Sep 2022 18:13 UTC
70 points
18 comments1 min readEA link

Pod­cast: Tam­era Lan­ham on AI risk, threat mod­els, al­ign­ment pro­pos­als, ex­ter­nal­ized rea­son­ing over­sight, and work­ing at Anthropic

Akash20 Dec 2022 21:39 UTC
14 points
1 comment1 min readEA link

“The Physi­cists”: A play about ex­tinc­tion and the re­spon­si­bil­ity of scientists

Lara_TH29 Nov 2022 16:53 UTC
28 points
1 comment8 min readEA link

An ap­praisal of the Fu­ture of Life In­sti­tute AI ex­is­ten­tial risk program

PabloAMC11 Dec 2022 13:36 UTC
28 points
0 comments1 min readEA link

More Aca­demic Diver­sity in Align­ment?

ojorgensen27 Nov 2022 17:52 UTC
7 points
0 comments1 min readEA link

[An­nounce­ment] The Steven Aiberg Project

StevenAiberg19 Oct 2022 7:48 UTC
0 points
0 comments4 min readEA link

A Sur­vey of the Po­ten­tial Long-term Im­pacts of AI

Sam Clarke18 Jul 2022 9:48 UTC
63 points
2 comments27 min readEA link

Re­silience Via Frag­mented Power

steve632014 Jul 2022 15:37 UTC
2 points
0 comments6 min readEA link

Linkpost—Beyond Hyper­an­thro­po­mor­phism: Or, why fears of AI are not even wrong, and how to make them real

Locke24 Aug 2022 16:24 UTC
−4 points
3 comments2 min readEA link
(studio.ribbonfarm.com)

Maybe AI risk shouldn’t af­fect your life plan all that much

Justis22 Jul 2022 15:30 UTC
21 points
4 comments6 min readEA link

Fa­cil­i­ta­tor Help Wanted for Columbia EA AI Safety Groups

Berkan Ottlik5 Jul 2022 10:27 UTC
16 points
0 comments1 min readEA link

[Question] Is there a news-tracker about GPT-4? Why has ev­ery­thing be­come so silent about it?

Franziska Fischer29 Oct 2022 8:56 UTC
10 points
4 comments1 min readEA link

[Question] Fore­cast­ing thread: How does AI risk level vary based on timelines?

elifland14 Sep 2022 23:56 UTC
47 points
8 comments1 min readEA link

[Question] A dataset for AI/​su­per­in­tel­li­gence sto­ries and other me­dia?

Harrison Durland29 Mar 2022 21:41 UTC
20 points
2 comments1 min readEA link

Is Eric Sch­midt fund­ing AI ca­pa­bil­ities re­search by the US gov­ern­ment?

Pranay K24 Dec 2022 8:32 UTC
46 points
3 comments2 min readEA link
(www.politico.com)

[Creative Writ­ing Con­test] The Puppy Problem

Louis13 Oct 2021 14:01 UTC
13 points
0 comments7 min readEA link

CNAS re­port: ‘Ar­tifi­cial In­tel­li­gence and Arms Con­trol’

MMMaas13 Oct 2022 8:35 UTC
16 points
0 comments1 min readEA link
(www.cnas.org)

[CANCELLED] Ber­lin AI Align­ment Open Meetup Au­gust 2022

Isidor Regenfuß4 Aug 2022 13:34 UTC
0 points
0 comments1 min readEA link

Un­con­trol­lable AI as an Ex­is­ten­tial Risk

Karl von Wendt9 Oct 2022 10:37 UTC
28 points
0 comments1 min readEA link

Sha­har Avin on How to Strate­gi­cally Reg­u­late Ad­vanced AI Systems

Michaël Trazzi23 Sep 2022 15:49 UTC
48 points
2 comments5 min readEA link
(theinsideview.ai)

7 Learn­ings and a De­tailed De­scrip­tion of an AI Safety Read­ing Group

nell23 Sep 2022 2:02 UTC
20 points
5 comments9 min readEA link

AGI Timelines in Gover­nance: Differ­ent Strate­gies for Differ­ent Timeframes

simeon_c19 Dec 2022 21:31 UTC
110 points
19 comments1 min readEA link

De­liber­ate prac­tice for re­search?

Alex_Altair8 Oct 2022 3:45 UTC
19 points
4 comments1 min readEA link

Which AI Safety Org to Join?

Yonatan Cale11 Oct 2022 19:42 UTC
17 points
21 comments1 min readEA link

Hu­mans aren’t fit­ness maximizers

So8res4 Oct 2022 1:32 UTC
30 points
2 comments5 min readEA link

A challenge for AGI or­ga­ni­za­tions, and a challenge for readers

RobBensinger1 Dec 2022 23:11 UTC
168 points
13 comments1 min readEA link

My ar­gu­ment against AGI

cveres12 Oct 2022 6:32 UTC
2 points
29 comments3 min readEA link

Im­pli­ca­tions of large lan­guage model diffu­sion for AI governance

Ben Cottier21 Dec 2022 13:50 UTC
14 points
0 comments38 min readEA link

Should AI fo­cus on prob­lem-solv­ing or strate­gic plan­ning? Why not both?

oliver_siegel1 Nov 2022 9:53 UTC
1 point
0 comments1 min readEA link

Op­ti­mism, AI risk, and EA blind spots

Justis28 Sep 2022 17:21 UTC
87 points
22 comments8 min readEA link

Reflec­tions on my 5-month AI al­ign­ment up­skil­ling grant

Jay Bailey28 Dec 2022 7:23 UTC
110 points
5 comments8 min readEA link
(www.lesswrong.com)

AI can ex­ploit safety plans posted on the Internet

Peter S. Park4 Dec 2022 12:17 UTC
4 points
3 comments1 min readEA link

New co­op­er­a­tion mechanism—quadratic fund­ing with­out a match­ing pool

Filip Sondej5 Jun 2022 13:55 UTC
53 points
7 comments5 min readEA link

More to ex­plore on ‘Risks from Ar­tifi­cial In­tel­li­gence’

EA Handbook15 Jul 2022 23:00 UTC
4 points
0 comments2 min readEA link

Where I cur­rently dis­agree with Ryan Green­blatt’s ver­sion of the ELK approach

So8res29 Sep 2022 21:19 UTC
21 points
0 comments5 min readEA link

Back­ground for “Un­der­stand­ing the diffu­sion of large lan­guage mod­els”

Ben Cottier21 Dec 2022 13:49 UTC
12 points
0 comments23 min readEA link

Mechanism De­sign for AI Safety—Read­ing Group Curriculum

Rubi J. Hudson25 Oct 2022 3:54 UTC
24 points
1 comment3 min readEA link

Ap­ply to at­tend an AI safety work­shop in Berkeley (Nov 18-21)

Akash6 Nov 2022 18:06 UTC
19 points
0 comments1 min readEA link

Call to ac­tion: Read + Share AI Safety /​ Re­in­force­ment Learn­ing Fea­tured in Conversation

Justin Olive24 Oct 2022 1:13 UTC
3 points
0 comments1 min readEA link

Prizes for ML Safety Bench­mark Ideas

Joshc28 Oct 2022 2:44 UTC
58 points
8 comments1 min readEA link

Effec­tive En­force­abil­ity of EU Com­pe­ti­tion Law Un­der Differ­ent AI Devel­op­ment Sce­nar­ios: A Frame­work for Le­gal Analysis

HaydnBelfield19 Aug 2022 17:20 UTC
11 points
0 comments6 min readEA link
(verfassungsblog.de)

How could we know that an AGI sys­tem will have good con­se­quences?

So8res7 Nov 2022 22:42 UTC
25 points
0 comments1 min readEA link

How im­por­tant are ac­cu­rate AI timelines for the op­ti­mal spend­ing sched­ule on AI risk in­ter­ven­tions?

Tristan Cook16 Dec 2022 16:05 UTC
30 points
0 comments6 min readEA link

“Cot­ton Gin” AI Risk

42317524 Sep 2022 23:04 UTC
6 points
2 comments1 min readEA link

A note about differ­en­tial tech­nolog­i­cal development

So8res24 Jul 2022 23:41 UTC
58 points
8 comments5 min readEA link

Ber­lin AI Align­ment Open Meetup Septem­ber 2022

Isidor Regenfuß21 Sep 2022 15:09 UTC
2 points
0 comments1 min readEA link

Georgetown EA Fall 2022 "Intro to AI" Reading Group

Daniel H8 Oct 2022 1:44 UTC
3 points
0 comments1 min readEA link
(docs.google.com)

[Question] EA’s Achieve­ments in 2022

ElliotJDavies14 Dec 2022 14:33 UTC
98 points
11 comments1 min readEA link

Is GPT3 a Good Ra­tion­al­ist? - In­struc­tGPT3 [2/​2]

simeon_c7 Apr 2022 13:54 UTC
25 points
0 comments8 min readEA link

Com­pute & An­titrust: Reg­u­la­tory im­pli­ca­tions of the AI hard­ware sup­ply chain, from chip de­sign to cloud APIs

HaydnBelfield19 Aug 2022 17:20 UTC
32 points
0 comments6 min readEA link
(verfassungsblog.de)

NeurIPS ML Safety Work­shop 2022

Dan H26 Jul 2022 15:33 UTC
72 points
0 comments1 min readEA link
(neurips2022.mlsafety.org)

An­nounc­ing an Em­piri­cal AI Safety Program

Joshc13 Sep 2022 21:39 UTC
64 points
7 comments2 min readEA link

Pile of Law and Law-Fol­low­ing AI

Cullen13 Jul 2022 0:29 UTC
28 points
2 comments3 min readEA link

Brain­storm of things that could force an AI team to burn their lead

So8res25 Jul 2022 0:00 UTC
26 points
1 comment12 min readEA link

How ‘Hu­man-Hu­man’ dy­nam­ics give way to ‘Hu­man-AI’ and then ‘AI-AI’ dynamics

Remmelt27 Dec 2022 3:16 UTC
4 points
0 comments1 min readEA link

We Ran an AI Timelines Retreat

Lenny McCline17 May 2022 4:40 UTC
46 points
6 comments3 min readEA link

[Question] What are some cur­rent, already pre­sent challenges from AI?

nonzerosum30 Jun 2022 15:44 UTC
5 points
1 comment1 min readEA link

7 traps that (we think) new al­ign­ment re­searchers of­ten fall into

Akash27 Sep 2022 23:13 UTC
72 points
13 comments1 min readEA link

Join ASAP (AI Safety Ac­countabil­ity Pro­gramme) 🚀

TheMcDouglas10 Sep 2022 11:15 UTC
54 points
20 comments3 min readEA link

Could re­al­is­tic de­pic­tions of catas­trophic AI risks effec­tively re­duce said risks?

Matthew Barber17 Aug 2022 20:01 UTC
26 points
11 comments2 min readEA link

[DISC] Are Values Ro­bust?

𝕮𝖎𝖓𝖊𝖗𝖆21 Dec 2022 1:13 UTC
4 points
0 comments1 min readEA link

Con­sider try­ing Vivek Heb­bar’s al­ign­ment exercises

Akash24 Oct 2022 19:46 UTC
16 points
0 comments1 min readEA link

Credo AI is hiring for sev­eral roles

IanEisenberg11 Apr 2022 15:58 UTC
14 points
2 comments1 min readEA link

Share your re­quests for ChatGPT

Kate Tran5 Dec 2022 18:43 UTC
8 points
5 comments1 min readEA link

What does it take to defend the world against out-of-con­trol AGIs?

Steven Byrnes25 Oct 2022 14:47 UTC
43 points
1 comment1 min readEA link

Oren’s Field Guide of Bad AGI Outcomes

Oren Montano26 Sep 2022 8:59 UTC
1 point
0 comments1 min readEA link

An­nounc­ing AI Align­ment Awards: $100k re­search con­tests about goal mis­gen­er­al­iza­tion & corrigibility

Akash22 Nov 2022 22:19 UTC
60 points
1 comment1 min readEA link

Markus An­der­ljung On The AI Policy Landscape

Michaël Trazzi9 Sep 2022 17:27 UTC
14 points
0 comments2 min readEA link
(theinsideview.ai)

Cryp­tocur­rency Ex­ploits Show the Im­por­tance of Proac­tive Poli­cies for AI X-Risk

eSpencer16 Sep 2022 4:44 UTC
14 points
0 comments3 min readEA link

EAG DC: Meta-Bot­tle­necks in Prevent­ing AI Doom

Joseph Bloom30 Sep 2022 17:53 UTC
5 points
0 comments7 min readEA link

The Vi­talik Bu­terin Fel­low­ship in AI Ex­is­ten­tial Safety is open for ap­pli­ca­tions!

Cynthia Chen14 Oct 2022 3:23 UTC
37 points
0 comments2 min readEA link

Ex­plo­ra­tory sur­vey on psy­chol­ogy of AI risk perception

Daniel_Friedrich2 Aug 2022 20:34 UTC
1 point
0 comments1 min readEA link
(forms.gle)

Clar­ifi­ca­tions about struc­tural risk from AI

Sam Clarke18 Jan 2022 12:57 UTC
31 points
3 comments4 min readEA link

What does it mean for an AGI to be ‘safe’?

So8res7 Oct 2022 4:43 UTC
53 points
21 comments1 min readEA link

Posit: Most AI safety peo­ple should work on al­ign­ment/​safety challenges for AI tools that already have users (Stable Diffu­sion, GPT)

nonzerosum20 Dec 2022 17:23 UTC
12 points
3 comments1 min readEA link

Why I’m Scep­ti­cal of Foom

𝕮𝖎𝖓𝖊𝖗𝖆8 Dec 2022 10:01 UTC
21 points
7 comments1 min readEA link

Join the AI Test­ing Hackathon this Friday

Esben Kran12 Dec 2022 14:24 UTC
33 points
0 comments8 min readEA link
(alignmentjam.com)

Es­ti­mat­ing the Cur­rent and Fu­ture Num­ber of AI Safety Researchers

Stephen McAleese28 Sep 2022 20:58 UTC
61 points
29 comments9 min readEA link

AI al­ign­ment with hu­mans… but with which hu­mans?

Geoffrey Miller8 Sep 2022 23:43 UTC
45 points
21 comments3 min readEA link

AI Timelines via Cu­mu­la­tive Op­ti­miza­tion Power: Less Long, More Short

Jake Cannell6 Oct 2022 7:06 UTC
27 points
0 comments1 min readEA link

AGI will ar­rive by the end of this decade ei­ther as a uni­corn or as a black swan

Yuri Barzov21 Oct 2022 10:50 UTC
−4 points
7 comments3 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [~monthly thread]

robertskmiles1 Nov 2022 23:21 UTC
75 points
94 comments1 min readEA link

AISER—AIS Europe Retreat

Carolin Basilowski23 Dec 2022 18:11 UTC
5 points
0 comments1 min readEA link

Values and control

dotsam4 Aug 2022 18:28 UTC
3 points
1 comment1 min readEA link

Ajeya’s TAI timeline short­ened from 2050 to 2040

Zach Stein-Perlman3 Aug 2022 0:00 UTC
59 points
2 comments1 min readEA link
(www.lesswrong.com)

(My sug­ges­tions) On Begin­ner Steps in AI Alignment

Joseph Bloom22 Sep 2022 15:32 UTC
34 points
3 comments9 min readEA link

AI Risk In­tro 2: Solv­ing The Problem

LRudL24 Sep 2022 9:33 UTC
11 points
0 comments28 min readEA link
(www.perfectlynormal.co.uk)

AI Safety re­searcher ca­reer review

Benjamin_Todd23 Nov 2021 0:00 UTC
12 points
0 comments6 min readEA link
(80000hours.org)

The Cred­i­bil­ity of Apoca­lyp­tic Claims: A Cri­tique of Techno-Fu­tur­ism within Ex­is­ten­tial Risk

Ember16 Aug 2022 19:48 UTC
24 points
35 comments17 min readEA link

Katja Grace on Slow­ing Down AI, AI Ex­pert Sur­veys And Es­ti­mat­ing AI Risk

Michaël Trazzi16 Sep 2022 18:00 UTC
40 points
6 comments4 min readEA link
(theinsideview.ai)

[Question] Clos­ing the Feed­back Loop on AI Safety Re­search.

Ben.Hartley29 Jul 2022 21:46 UTC
3 points
4 comments1 min readEA link

When can a mimic sur­prise you? Why gen­er­a­tive mod­els han­dle seem­ingly ill-posed problems

David Johnston6 Nov 2022 11:46 UTC
6 points
0 comments1 min readEA link

An­nounc­ing the AI Safety Field Build­ing Hub, a new effort to provide AISFB pro­jects, men­tor­ship, and funding

Vael Gates28 Jul 2022 21:29 UTC
126 points
6 comments6 min readEA link

[Question] What are peo­ple’s thoughts on work­ing for Deep­Mind as a gen­eral soft­ware en­g­ineer?

Max Pietsch23 Sep 2022 17:13 UTC
9 points
4 comments1 min readEA link

Nine Points of Col­lec­tive Insanity

Remmelt27 Dec 2022 3:14 UTC
1 point
0 comments1 min readEA link

What could an AI-caused ex­is­ten­tial catas­tro­phe ac­tu­ally look like?

Benjamin Hilton12 Sep 2022 16:25 UTC
49 points
7 comments9 min readEA link
(80000hours.org)

When to di­ver­sify? Break­ing down mis­sion-cor­re­lated investing

jh29 Nov 2022 11:18 UTC
33 points
2 comments8 min readEA link

fully al­igned sin­gle­ton as a solu­tion to everything

Tamsin Leake12 Nov 2022 18:19 UTC
9 points
0 comments1 min readEA link

Chris Olah on what the hell is go­ing on in­side neu­ral networks

80000_Hours4 Aug 2021 15:13 UTC
4 points
0 comments135 min readEA link

4 Key As­sump­tions in AI Safety

Prometheus7 Nov 2022 10:50 UTC
5 points
0 comments1 min readEA link

Fol­low along with Columbia EA’s Ad­vanced AI Safety Fel­low­ship!

RohanS2 Jul 2022 6:07 UTC
27 points
0 comments2 min readEA link

The re­li­gion prob­lem in AI alignment

Geoffrey Miller16 Sep 2022 1:24 UTC
47 points
27 comments11 min readEA link

[Question] AI Risk Micro­dy­nam­ics Survey

Froolow9 Oct 2022 20:00 UTC
7 points
1 comment1 min readEA link

What’s so dan­ger­ous about AI any­way? – Or: What it means to be a superintelligence

Thomas Kehrenberg18 Jul 2022 16:14 UTC
9 points
2 comments11 min readEA link

[Question] What kind of or­ga­ni­za­tion should be the first to de­velop AGI in a po­ten­tial arms race?

BrownHairedEevee17 Jul 2022 17:41 UTC
10 points
2 comments1 min readEA link

[Question] Recom­men­da­tions for non-tech­ni­cal books on AI?

Joseph Lemien12 Jul 2022 23:23 UTC
8 points
10 comments1 min readEA link

[Question] Benev­olen­tAI—an effec­tively im­pact­ful com­pany?

Jack Hilton11 Oct 2022 14:35 UTC
16 points
11 comments1 min readEA link

[Question] Up­dates on FLI’S Value Align­ment Map?

rodeo_flagellum19 Sep 2022 0:25 UTC
8 points
0 comments2 min readEA link

En­cul­tured AI, Part 2: Pro­vid­ing a Service

Andrew Critch11 Aug 2022 20:13 UTC
10 points
0 comments3 min readEA link

Safety of Self-Assem­bled Neu­ro­mor­phic Hardware

Can Rager26 Dec 2022 19:10 UTC
8 points
1 comment10 min readEA link

Band­wagon effect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt28 Dec 2022 7:54 UTC
4 points
0 comments1 min readEA link

Gen­eral ad­vice for tran­si­tion­ing into The­o­ret­i­cal AI Safety

Martín Soto15 Sep 2022 5:23 UTC
25 points
0 comments10 min readEA link

In­stead of tech­ni­cal re­search, more peo­ple should fo­cus on buy­ing time

Akash5 Nov 2022 20:43 UTC
107 points
32 comments1 min readEA link

Who owns AI-gen­er­ated con­tent?

Johan S Daniel7 Dec 2022 3:03 UTC
−2 points
0 comments2 min readEA link

The limited up­side of interpretability

Peter S. Park15 Nov 2022 20:22 UTC
23 points
3 comments10 min readEA link

(Linkpost) Wired Magaz­ine prints mis­in­for­ma­tion about AI safety

trevor11 Dec 2022 3:29 UTC
−5 points
3 comments4 min readEA link
(www.wired.com)

AI Align­ment is in­tractable (and we hu­mans should stop work­ing on it)

GPT 328 Jul 2022 20:02 UTC
1 point
1 comment1 min readEA link

Con­tra shard the­ory, in the con­text of the di­a­mond max­i­mizer problem

So8res13 Oct 2022 23:51 UTC
27 points
0 comments1 min readEA link

Ac­tion­able-guidance and roadmap recom­men­da­tions for the NIST AI Risk Man­age­ment Framework

Tony Barrett17 May 2022 15:27 UTC
11 points
0 comments3 min readEA link

Re­place­ment for PONR concept

kokotajlod2 Sep 2022 0:38 UTC
14 points
1 comment3 min readEA link

The repli­ca­tion and em­u­la­tion of GPT-3

Ben Cottier21 Dec 2022 13:49 UTC
14 points
0 comments33 min readEA link

Cause Area: Differ­en­tial Neu­rotech­nol­ogy Development

mwcvitkovic10 Aug 2022 2:39 UTC
88 points
7 comments36 min readEA link

When re­port­ing AI timelines, be clear who you’re defer­ring to

Sam Clarke10 Oct 2022 14:24 UTC
120 points
23 comments1 min readEA link

Pro­mot­ing com­pas­sion­ate longtermism

jonleighton7 Dec 2022 14:26 UTC
115 points
5 comments12 min readEA link

What are some low-cost out­side-the-box ways to do/​fund al­ign­ment re­search?

trevor111 Nov 2022 5:57 UTC
2 points
3 comments1 min readEA link

How to do the­o­ret­i­cal re­search, a per­sonal perspective

Mark Xu19 Aug 2022 19:43 UTC
132 points
7 comments15 min readEA link

There have been 3 planes (billion­aire donors) and 2 have crashed

trevor117 Dec 2022 3:38 UTC
4 points
5 comments2 min readEA link

Seek­ing so­cial sci­ence stu­dents /​ col­lab­o­ra­tors in­ter­ested in AI ex­is­ten­tial risks

Vael Gates24 Sep 2021 21:56 UTC
58 points
7 comments3 min readEA link

Is in­ter­est in al­ign­ment worth men­tion­ing for grad school ap­pli­ca­tions?

Franziska Fischer16 Oct 2022 4:50 UTC
5 points
5 comments1 min readEA link

The US ex­pands re­stric­tions on AI ex­ports to China. What are the x-risk effects?

Stephen Clare14 Oct 2022 18:17 UTC
154 points
17 comments4 min readEA link

[Question] What is the best ar­ti­cle to in­tro­duce some­one to AI safety for the first time?

trevor122 Nov 2022 2:06 UTC
2 points
3 comments1 min readEA link

[Question] What should I ask Ajeya Co­tra — se­nior re­searcher at Open Philan­thropy, and ex­pert on AI timelines and safety challenges?

Robert_Wiblin28 Oct 2022 15:28 UTC
23 points
10 comments1 min readEA link

List of AI safety courses and resources

Daniel del Castillo6 Sep 2021 14:26 UTC
50 points
7 comments1 min readEA link

Distil­la­tion of The Offense-Defense Balance of Scien­tific Knowledge

Arjun Yadav12 Aug 2022 7:01 UTC
17 points
0 comments3 min readEA link

Why The Fo­cus on Ex­pected Utility Max­imisers?

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 15:51 UTC
11 points
1 comment1 min readEA link

Publi­ca­tion de­ci­sions for large lan­guage mod­els, and their impacts

Ben Cottier21 Dec 2022 13:50 UTC
14 points
0 comments16 min readEA link

Safety with­out op­pres­sion: an AI gov­er­nance problem

Nathan_Barnard28 Jul 2022 10:19 UTC
3 points
0 comments8 min readEA link

Su­per­in­tel­li­gent AI is nec­es­sary for an amaz­ing fu­ture, but far from sufficient

So8res31 Oct 2022 21:16 UTC
35 points
5 comments1 min readEA link

A strange twist on the road to AGI

cveres12 Oct 2022 23:27 UTC
3 points
0 comments1 min readEA link

The His­tory, Episte­mol­ogy and Strat­egy of Tech­nolog­i­cal Res­traint, and les­sons for AI (short es­say)

MMMaas10 Aug 2022 11:00 UTC
75 points
3 comments9 min readEA link
(verfassungsblog.de)

New AI risk in­tro from Vox [link post]

Jakub Kraus21 Dec 2022 5:50 UTC
7 points
1 comment2 min readEA link
(www.vox.com)

Im­proved Se­cu­rity to Prevent Hacker-AI and Digi­tal Ghosts

Erland Wittkotter21 Oct 2022 10:11 UTC
1 point
0 comments1 min readEA link

Credo AI is hiring!

IanEisenberg3 Mar 2022 18:02 UTC
16 points
6 comments4 min readEA link

In­tro­duc­ing spirit hazards

brb24327 May 2022 22:16 UTC
9 points
2 comments2 min readEA link

How Do AI Timelines Affect Ex­is­ten­tial Risk?

Stephen McAleese29 Aug 2022 17:10 UTC
2 points
0 comments23 min readEA link
(www.lesswrong.com)

Con­jec­ture: In­ter­nal In­fo­haz­ard Policy

Connor Leahy29 Jul 2022 19:35 UTC
34 points
3 comments18 min readEA link

AI Safety For Dum­mies (Like Me)

Madhav Malhotra24 Aug 2022 20:26 UTC
22 points
6 comments20 min readEA link

Pos­si­ble miracles

Akash9 Oct 2022 18:17 UTC
38 points
2 comments1 min readEA link

What AI Safety Ma­te­ri­als Do ML Re­searchers Find Com­pel­ling?

Vael Gates28 Dec 2022 2:03 UTC
129 points
12 comments1 min readEA link

Searle vs Bostrom: cru­cial con­sid­er­a­tions for EA AI work?

Forumite13 Jul 2022 10:18 UTC
11 points
2 comments1 min readEA link

[Question] Does the idea of AGI that benev­olently con­trol us ap­peal to EA folks?

Noah Scales16 Jul 2022 19:17 UTC
6 points
20 comments1 min readEA link

Four ques­tions I ask AI safety researchers

Akash17 Jul 2022 17:25 UTC
30 points
3 comments1 min readEA link

Long-term AI policy strat­egy re­search and implementation

Benjamin_Todd9 Nov 2021 0:00 UTC
1 point
0 comments7 min readEA link
(80000hours.org)

Alex Lawsen On Fore­cast­ing AI Progress

Michaël Trazzi6 Sep 2022 9:53 UTC
38 points
1 comment2 min readEA link
(theinsideview.ai)

An­nounc­ing the Cam­bridge Bos­ton Align­ment Ini­ti­a­tive [Hiring!]

kuhanj2 Dec 2022 1:07 UTC
83 points
0 comments1 min readEA link

“Origi­nal­ity is noth­ing but ju­di­cious imi­ta­tion”—Voltaire

Damien Lasseur23 Oct 2022 19:00 UTC
1 point
0 comments1 min readEA link

The her­i­ta­bil­ity of hu­man val­ues: A be­hav­ior ge­netic cri­tique of Shard Theory

Geoffrey Miller20 Oct 2022 15:53 UTC
48 points
12 comments21 min readEA link

My (naive) take on Risks from Learned Optimization

Artyom K6 Nov 2022 16:25 UTC
5 points
0 comments1 min readEA link

EA & LW Fo­rums Weekly Sum­mary (5 − 11 Sep 22’)

Zoe Williams12 Sep 2022 23:21 UTC
36 points
0 comments14 min readEA link

CSER is hiring for a se­nior re­search as­so­ci­ate on longterm AI risk and governance

Sam Clarke24 Jan 2022 13:24 UTC
9 points
4 comments1 min readEA link

Good Fu­tures Ini­ti­a­tive: Win­ter Pro­ject In­tern­ship

Aris Richardson27 Nov 2022 23:27 UTC
67 points
7 comments4 min readEA link

An­nounc­ing the Har­vard AI Safety Team

Xander Davies30 Jun 2022 18:34 UTC
128 points
4 comments5 min readEA link

Power-Seek­ing AI and Ex­is­ten­tial Risk

antoniofrancaib11 Oct 2022 21:47 UTC
10 points
0 comments1 min readEA link

[Question] AI risks: the most con­vinc­ing ar­gu­ment

Eleni_A6 Aug 2022 20:26 UTC
7 points
2 comments1 min readEA link

Three sce­nar­ios of pseudo-al­ign­ment

Eleni_A5 Sep 2022 20:26 UTC
7 points
0 comments3 min readEA link

Why mechanis­tic in­ter­pretabil­ity does not and can­not con­tribute to long-term AGI safety (from mes­sages with a friend)

Remmelt19 Dec 2022 12:02 UTC
17 points
3 comments1 min readEA link

Any fur­ther work on AI Safety Suc­cess Sto­ries?

Krieger2 Oct 2022 11:59 UTC
2 points
0 comments1 min readEA link

Sum­mary of “Tech­nol­ogy Favours Tyranny” by Yu­val Noah Harari

Madhav Malhotra26 Oct 2022 21:37 UTC
33 points
2 comments2 min readEA link

AI Safety groups should imi­tate ca­reer de­vel­op­ment clubs

Joshc9 Nov 2022 23:48 UTC
90 points
5 comments2 min readEA link

Co­op­er­a­tion, Avoidance, and In­differ­ence: Alter­nate Fu­tures for Misal­igned AGI

Kiel Brennan-Marquez10 Dec 2022 20:32 UTC
4 points
1 comment18 min readEA link

Tony Blair In­sti­tute—Com­pute for AI In­dex ( Seek­ing a Sup­plier)

TomWestgarth3 Oct 2022 10:25 UTC
28 points
8 comments1 min readEA link

Math­e­mat­i­cal Cir­cuits in Neu­ral Networks

Sean Osier22 Sep 2022 2:32 UTC
23 points
2 comments1 min readEA link
(www.youtube.com)

New re­port on how much com­pu­ta­tional power it takes to match the hu­man brain (Open Philan­thropy)

Aaron Gertler15 Sep 2020 1:06 UTC
41 points
1 comment18 min readEA link
(www.openphilanthropy.org)

Google could build a con­scious AI in three months

Derek Shiller1 Oct 2022 13:24 UTC
14 points
17 comments7 min readEA link

Tech­ni­cal AI safety in the United Arab Emirates

ea nyuad21 Jun 2022 3:11 UTC
10 points
0 comments11 min readEA link

[Question] Why does AGI oc­cur al­most nowhere, not even just as a re­mark for eco­nomic/​poli­ti­cal mod­els?

Franziska Fischer2 Oct 2022 14:43 UTC
52 points
17 comments1 min readEA link

[Question] Please Share Your Per­spec­tives on the De­gree of So­cietal Im­pact from Trans­for­ma­tive AI Outcomes

Kiliank15 Apr 2022 1:23 UTC
3 points
3 comments1 min readEA link

AI Safety Ideas: A col­lab­o­ra­tive AI safety re­search platform

Apart Research17 Oct 2022 17:01 UTC
67 points
13 comments4 min readEA link

Take­aways from a sur­vey on AI al­ign­ment resources

DanielFilan5 Nov 2022 23:45 UTC
18 points
9 comments6 min readEA link
(www.lesswrong.com)

Analysing a 2036 Takeover Scenario

ukc100146 Oct 2022 20:48 UTC
4 points
1 comment1 min readEA link

Newslet­ter for Align­ment Re­search: The ML Safety Updates

Esben Kran22 Oct 2022 16:17 UTC
30 points
0 comments7 min readEA link

Hacker-AI – Does it already ex­ist?

Erland Wittkotter7 Nov 2022 14:01 UTC
0 points
1 comment1 min readEA link

Cog­ni­tive sci­ence and failed AI fore­casts

Eleni_A18 Nov 2022 14:25 UTC
13 points
0 comments2 min readEA link

Crypto ‘or­a­cle pro­to­cols’ for AI al­ign­ment with real-world data?

Geoffrey Miller22 Sep 2022 23:05 UTC
9 points
5 comments1 min readEA link

Me­tac­u­lus is build­ing a team ded­i­cated to AI forecasting

christian18 Oct 2022 16:08 UTC
35 points
0 comments1 min readEA link
(apply.workable.com)

Con­clu­sion and Bibliog­ra­phy for “Un­der­stand­ing the diffu­sion of large lan­guage mod­els”

Ben Cottier21 Dec 2022 13:50 UTC
12 points
0 comments11 min readEA link

[Question] Who would you have on your dream team for solv­ing AGI Align­ment?

Greg_Colbourn25 Aug 2022 13:34 UTC
10 points
14 comments1 min readEA link

Beg­ging, Plead­ing AI Orgs to Com­ment on NIST AI Risk Man­age­ment Framework

Bridges15 Apr 2022 19:35 UTC
87 points
3 comments2 min readEA link

Fore­cast­ing Through Fiction

Yitz6 Jul 2022 5:23 UTC
8 points
3 comments6 min readEA link
(www.lesswrong.com)

As­sis­tant-pro­fes­sor-ranked AI ethics philoso­pher job op­por­tu­nity at Can­ter­bury Univer­sity, New Zealand

ben.smith16 Oct 2022 17:56 UTC
27 points
0 comments1 min readEA link
(www.linkedin.com)

[Question] How much should you op­ti­mize for the short-timelines sce­nario?

SoerenMind26 Jul 2022 15:51 UTC
39 points
2 comments1 min readEA link

The Wind­fall Clause has a reme­dies problem

John Bridge23 May 2022 10:31 UTC
40 points
0 comments20 min readEA link

Longter­mists Should Work on AI—There is No “AI Neu­tral” Sce­nario

simeon_c7 Aug 2022 16:43 UTC
42 points
62 comments6 min readEA link

How would you es­ti­mate the value of de­lay­ing AGI by 1 day, in marginal GiveWell dona­tions?

AnonymousAccount16 Dec 2022 9:25 UTC
28 points
19 comments2 min readEA link

Where are the red lines for AI?

Karl von Wendt5 Aug 2022 9:41 UTC
13 points
3 comments6 min readEA link

Meta AI an­nounces Cicero: Hu­man-Level Di­plo­macy play (with di­alogue)

Jacy22 Nov 2022 16:50 UTC
49 points
10 comments1 min readEA link

[Question] Do AI com­pa­nies make their safety re­searchers sign a non-dis­par­age­ment clause?

Ofer5 Sep 2022 13:40 UTC
70 points
4 comments1 min readEA link

Ques­tions for fur­ther in­ves­ti­ga­tion of AI diffusion

Ben Cottier21 Dec 2022 13:50 UTC
28 points
0 comments11 min readEA link

Com­po­nents of Strate­gic Clar­ity [Strate­gic Per­spec­tives on Long-term AI Gover­nance, #2]

MMMaas2 Jul 2022 11:22 UTC
63 points
0 comments5 min readEA link

An Ex­tremely Opinionated An­no­tated List of My Favourite Mechanis­tic In­ter­pretabil­ity Papers

Neel Nanda18 Oct 2022 21:23 UTC
18 points
0 comments12 min readEA link
(www.neelnanda.io)

AI Safety Needs Great Product Builders

goodgravy2 Nov 2022 11:33 UTC
45 points
1 comment6 min readEA link

How to Catch a ChatGPT Cheat: 7 Prac­ti­cal Tips

Marshall27 Dec 2022 16:09 UTC
8 points
2 comments4 min readEA link

[Question] Benefits/​Risks of Scott Aaron­son’s Ortho­dox/​Re­form Fram­ing for AI Alignment

Jeremy21 Nov 2022 17:47 UTC
15 points
5 comments1 min readEA link
(scottaaronson.blog)

The great en­ergy de­scent (short ver­sion) - An im­por­tant thing EA might have missed

Corentin Biteau31 Aug 2022 21:50 UTC
59 points
88 comments10 min readEA link

Clas­sify­ing sources of AI x-risk

Sam Clarke8 Aug 2022 18:18 UTC
38 points
6 comments3 min readEA link

How I Came To Longter­mism On My Own & An Out­sider Per­spec­tive On EA Longtermism

Jordan Arel7 Aug 2022 2:42 UTC
34 points
2 comments20 min readEA link

A Cri­tique of AI Takeover Scenarios

Fods1231 Aug 2022 13:49 UTC
44 points
4 comments12 min readEA link

A mod­est case for hope

xavier rg17 Oct 2022 6:03 UTC
28 points
0 comments1 min readEA link

[Question] Ques­tions on databases of AI Risk estimates

Froolow2 Oct 2022 9:12 UTC
24 points
12 comments2 min readEA link

Ex­plore Risks from Emerg­ing Tech­nol­ogy with Peers Out­side of (or New to) the AI Align­ment Com­mu­nity—Ex­press In­ter­est by Au­gust 8

Fasori17 Jul 2022 20:59 UTC
3 points
0 comments2 min readEA link

Who will be in charge once al­ign­ment is achieved?

trurl16 Dec 2022 16:53 UTC
8 points
2 comments1 min readEA link

The ‘Old AI’: Les­sons for AI gov­er­nance from early elec­tric­ity regulation

Sam Clarke19 Dec 2022 2:46 UTC
58 points
1 comment13 min readEA link

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni_A28 Dec 2022 23:45 UTC
10 points
4 comments1 min readEA link

The het­ero­gene­ity of hu­man value types: Im­pli­ca­tions for AI alignment

Geoffrey Miller16 Sep 2022 21:21 UTC
21 points
2 comments10 min readEA link

WFW?: Op­por­tu­nity and The­ory of Impact

DavidCorfield2 Nov 2022 0:45 UTC
2 points
5 comments14 min readEA link
(www.whatfuture.world)

Some­thing to make my­self fas­ci­nated with com­put­ing sci­ence and AI.

Eduardo7 Dec 2022 2:12 UTC
3 points
5 comments1 min readEA link

Distri­bu­tion Shifts and The Im­por­tance of AI Safety

Leon_Lang29 Sep 2022 22:38 UTC
7 points
0 comments1 min readEA link

“AI pre­dic­tions” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:51 UTC
3 points
0 comments3 min readEA link
(medium.com)

Why some peo­ple be­lieve in AGI, but I don’t.

cveres26 Oct 2022 3:09 UTC
13 points
2 comments4 min readEA link

EA’s brain-over-body bias, and the em­bod­ied value prob­lem in AI al­ign­ment

Geoffrey Miller21 Sep 2022 18:55 UTC
45 points
3 comments25 min readEA link

Align­ment 201 curriculum

richard_ngo12 Oct 2022 19:17 UTC
94 points
9 comments1 min readEA link

Why we’re not found­ing a hu­man-data-for-al­ign­ment org

LRudL27 Sep 2022 20:14 UTC
149 points
7 comments29 min readEA link

AI Safety Endgame Stories

IvanVendrov28 Sep 2022 17:12 UTC
31 points
1 comment1 min readEA link

There are two fac­tions work­ing to pre­vent AI dan­gers. Here’s why they’re deeply di­vided.

Sharmake10 Aug 2022 19:52 UTC
9 points
0 comments4 min readEA link
(www.vox.com)

An­nual AGI Bench­mark­ing Event

Metaculus26 Aug 2022 21:31 UTC
20 points
2 comments2 min readEA link
(www.metaculus.com)

Ap­ply for men­tor­ship in AI Safety field-building

Akash17 Sep 2022 19:03 UTC
21 points
0 comments1 min readEA link