AI safety

Last edit: 17 Jun 2022 9:38 UTC by Leo

AI safety is the study of ways to reduce risks posed by artificial intelligence.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research a “priority path”—among the most promising career opportunities the organization has identified so far.[1][2]

Further reading

Gates, Vael (2022) Resources I send to AI researchers about AI safety, Effective Altruism Forum, June 13.

Krakovna, Victoria (2017) Introductory resources on AI safety research, Victoria Krakovna’s Blog, October 19.
A list of readings on AI safety.

Ngo, Richard (2019) Disentangling arguments for the importance of AI safety, Effective Altruism Forum, January 21.

Related entries

AI alignment | AI interpretability | AI risk | cooperative AI

  1. Todd, Benjamin (2018) The highest impact career paths our research has identified so far, 80,000 Hours, August 12.

  2. Todd, Benjamin (2021) AI safety technical research, 80,000 Hours, October.

2021 AI Alignment Literature Review and Charity Comparison

Larks · 23 Dec 2021 14:06 UTC
160 points
18 comments · 73 min read · EA link

Sentience Institute 2021 End of Year Summary

Ali · 26 Nov 2021 14:40 UTC
65 points
5 comments · 6 min read · EA link

A Viral License for AI Safety

IvanVendrov · 5 Jun 2021 2:00 UTC
24 points
6 comments · 5 min read · EA link

Ask AI companies about what they are doing for AI safety?

mic · 8 Mar 2022 21:54 UTC
39 points
1 comment · 2 min read · EA link

Help us find pain points in AI safety

Esben Kran · 12 Apr 2022 18:43 UTC
21 points
4 comments · 9 min read · EA link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew Critch · 19 Apr 2022 20:24 UTC
63 points
10 comments · 7 min read · EA link

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

ThomasWoodside · 9 May 2022 17:02 UTC
67 points
0 comments · 6 min read · EA link

Digital people could make AI safer

GMcGowan · 10 Jun 2022 15:29 UTC
21 points
12 comments · 4 min read · EA link

[Question] What harm could AI safety do?

SeanEngelhart · 15 May 2021 1:11 UTC
12 points
9 comments · 1 min read · EA link

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan · 21 Sep 2021 0:41 UTC
62 points
0 comments · 1 min read · EA link

Saying ‘AI safety research is a Pascal’s Mugging’ isn’t a strong response

Robert_Wiblin · 15 Dec 2015 13:48 UTC
14 points
16 comments · EA link

13 Very Different Stances on AGI

Ozzie Gooen · 27 Dec 2021 23:30 UTC
69 points
27 comments · 3 min read · EA link

Emerging Technologies: More to explore

EA Handbook · 1 Jan 2021 11:06 UTC
4 points
0 comments · 2 min read · EA link

Action: Help expand funding for AI Safety by coordinating on NSF response

Evan R. Murphy · 20 Jan 2022 20:48 UTC
20 points
7 comments · 3 min read · EA link

How I Formed My Own Views About AI Safety

Neel Nanda · 27 Feb 2022 18:52 UTC
117 points
12 comments · 14 min read · EA link

Disentangling arguments for the importance of AI safety

richard_ngo · 23 Jan 2019 14:58 UTC
63 points
14 comments · 8 min read · EA link

AI safety starter pack

mariushobbhahn · 28 Mar 2022 16:05 UTC
92 points
7 comments · 5 min read · EA link

How I failed to form views on AI safety

Ada-Maaria Hyvärinen · 17 Apr 2022 11:05 UTC
184 points
72 comments · 40 min read · EA link

Chaining Retroactive Funders to Borrow Against Unlikely Utopias

Denis Drescher · 19 Apr 2022 18:25 UTC
24 points
4 comments · 7 min read · EA link

Calling for Student Submissions: AI Safety Distillation Contest

Aris Richardson · 23 Apr 2022 20:24 UTC
101 points
30 comments · 3 min read · EA link

EA, Psychology & AI Safety Research

Altruist · 26 May 2022 23:46 UTC
16 points
2 comments · 6 min read · EA link

Is the time crunch for AI Safety Movement Building now?

Chris Leong · 8 Jun 2022 12:19 UTC
12 points
10 comments · 3 min read · EA link

20 Critiques of AI Safety That I Found on Twitter

Daniel Kirmani · 23 Jun 2022 15:11 UTC
13 points
13 comments · 1 min read · EA link

AI Safety Career Bottlenecks Survey Responses

Linda Linsefors · 28 May 2021 10:41 UTC
34 points
1 comment · 5 min read · EA link

Making of #IAN

kirchner.jan · 29 Aug 2021 16:24 UTC
9 points
0 comments · 1 min read · EA link

Chris Olah on what the hell is going on inside neural networks

80000_Hours · 4 Aug 2021 15:13 UTC
5 points
0 comments · 133 min read · EA link

List of AI safety courses and resources

Daniel del Castillo · 6 Sep 2021 14:26 UTC
47 points
3 comments · 1 min read · EA link

Contribute by facilitating the AGI Safety Fundamentals Programme

Jamie Bernardi · 6 Dec 2021 11:50 UTC
27 points
0 comments · 2 min read · EA link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson.ho · 11 Dec 2021 1:16 UTC
12 points
0 comments · 42 min read · EA link

What is the role of Bayesian ML for AI alignment/safety?

mariushobbhahn · 11 Jan 2022 8:07 UTC
37 points
6 comments · 3 min read · EA link

AI acceleration from a safety perspective: Trade-offs and considerations

mariushobbhahn · 19 Jan 2022 9:44 UTC
12 points
1 comment · 7 min read · EA link

My personal cruxes for working on AI safety

Buck · 13 Feb 2020 7:11 UTC
133 points
35 comments · 44 min read · EA link

University community building seems like the wrong model for AI safety

George Stiffman · 26 Feb 2022 6:23 UTC
24 points
8 comments · 2 min read · EA link

[Question] AI Ethical Committee

eaaicommittee · 1 Mar 2022 23:35 UTC
8 points
0 comments · 1 min read · EA link

AI Safety Overview: CERI Summer Research Fellowship

Jamie Bernardi · 24 Mar 2022 15:12 UTC
29 points
0 comments · 2 min read · EA link

Data Publication for the 2021 Artificial Intelligence, Morality, and Sentience (AIMS) Survey

Janet Pauketat · 24 Mar 2022 15:43 UTC
21 points
0 comments · 3 min read · EA link

Scenario Mapping Advanced AI Risk: Request for Participation with Data Collection

Kiliank · 27 Mar 2022 11:44 UTC
14 points
1 comment · 5 min read · EA link

Community Building for Graduate Students: A Targeted Approach

Neil Crawford · 29 Mar 2022 19:47 UTC
12 points
0 comments · 3 min read · EA link

How Josiah became an AI safety researcher

Neil Crawford · 29 Mar 2022 19:47 UTC
9 points
0 comments · 1 min read · EA link

[Question] A dataset for AI/superintelligence stories and other media?

Harrison Durland · 29 Mar 2022 21:41 UTC
18 points
2 comments · 1 min read · EA link

Pitching AI Safety in 3 sentences

PabloAMC · 30 Mar 2022 18:50 UTC
7 points
0 comments · 1 min read · EA link

[Question] Why does (any particular) AI safety work reduce s-risks more than it increases them?

MichaelStJules · 3 Oct 2021 16:55 UTC
33 points
18 comments · 1 min read · EA link

[Question] Is it valuable to the field of AI Safety to have a neuroscience background?

Samuel Nellessen · 3 Apr 2022 19:44 UTC
17 points
3 comments · 1 min read · EA link

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_c · 7 Apr 2022 13:54 UTC
22 points
0 comments · 7 min read · EA link

A tough career decision

PabloAMC · 9 Apr 2022 0:46 UTC
65 points
13 comments · 4 min read · EA link

A visualization of some orgs in the AI Safety Pipeline

Aaron_Scher · 10 Apr 2022 16:52 UTC
11 points
8 comments · 1 min read · EA link

AI Ethics non-profit is looking for an investor

sergia · 12 Apr 2022 8:47 UTC
−4 points
0 comments · 1 min read · EA link

How to become an AI safety researcher

peterbarnett · 12 Apr 2022 11:33 UTC
98 points
14 comments · 14 min read · EA link

[Question] Please Share Your Perspectives on the Degree of Societal Impact from Transformative AI Outcomes

Kiliank · 15 Apr 2022 1:23 UTC
3 points
3 comments · 1 min read · EA link

Begging, Pleading AI Orgs to Comment on NIST AI Risk Management Framework

Bridges · 15 Apr 2022 19:35 UTC
86 points
4 comments · 2 min read · EA link

Information security considerations for AI and the long term future

Jeffrey Ladish · 2 May 2022 20:53 UTC
105 points
6 comments · 11 min read · EA link

When is AI safety research harmful?

Nathan_Barnard · 9 May 2022 10:36 UTC
13 points
6 comments · 9 min read · EA link

New series of posts answering one of Holden’s “Important, actionable research questions”

Evan R. Murphy · 12 May 2022 21:22 UTC
9 points
0 comments · 1 min read · EA link

Fermi estimation of the impact you might have working on AI safety

frib · 13 May 2022 13:30 UTC
22 points
13 comments · 1 min read · EA link

[Link post] Promising Paths to Alignment—Connor Leahy | Talk

frances_lorenz · 14 May 2022 15:58 UTC
16 points
0 comments · 1 min read · EA link

[Question] What does the Project Management role look like in AI safety?

Gaurav Sett · 14 May 2022 19:29 UTC
8 points
1 comment · 1 min read · EA link

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

Tony Barrett · 17 May 2022 15:27 UTC
7 points
0 comments · 3 min read · EA link

“Intro to brain-like-AGI safety” series—just finished!

Steven Byrnes · 17 May 2022 15:35 UTC
13 points
0 comments · 1 min read · EA link

It’s not obvious to me that according to the EA framework, AI Safety is helpful

oh54321 · 17 May 2022 21:34 UTC
8 points
8 comments · 1 min read · EA link

Complex Systems for AI Safety [Pragmatic AI Safety #3]

ThomasWoodside · 24 May 2022 0:04 UTC
30 points
4 comments · 21 min read · EA link

Introducing spirit hazards

brb243 · 27 May 2022 22:16 UTC
9 points
2 comments · 2 min read · EA link

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

ThomasWoodside · 30 May 2022 20:37 UTC
27 points
0 comments · 25 min read · EA link

Advice on Pursuing Technical AI Safety Research

frances_lorenz · 31 May 2022 17:48 UTC
22 points
2 comments · 4 min read · EA link

New 80k career review—Data collection for AI alignment

Benjamin Hilton · 3 Jun 2022 11:44 UTC
34 points
1 comment · 5 min read · EA link

Grokking “Forecasting TAI with biological anchors”

anson.ho · 6 Jun 2022 18:56 UTC
41 points
0 comments · 14 min read · EA link

Open Problems in AI X-Risk [PAIS #5]

ThomasWoodside · 10 Jun 2022 2:22 UTC
29 points
0 comments · 35 min read · EA link

Resources I send to AI researchers about AI safety

Vael Gates · 14 Jun 2022 2:23 UTC
44 points
0 comments · 9 min read · EA link

Expected ethical value of a career in AI safety

Jordan Taylor · 14 Jun 2022 14:25 UTC
33 points
16 comments · 13 min read · EA link

Refer the Cooperative AI Foundation’s New COO, Receive $5000

Lewis Hammond · 16 Jun 2022 13:27 UTC
41 points
0 comments · 3 min read · EA link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee · 17 Jun 2022 11:52 UTC
24 points
0 comments · 2 min read · EA link

Pivotal outcomes and pivotal processes

Andrew Critch · 17 Jun 2022 23:43 UTC
40 points
1 comment · 4 min read · EA link

Technical AI safety in the United Arab Emirates

ea nyuad · 21 Jun 2022 3:11 UTC
10 points
0 comments · 11 min read · EA link

A Quick List of Some Problems in AI Alignment As A Field

NicholasKross · 21 Jun 2022 17:09 UTC
15 points
10 comments · 6 min read · EA link

Half-baked ideas thread (EA / AI Safety)

Aryeh Englander · 23 Jun 2022 16:05 UTC
18 points
8 comments · 1 min read · EA link