AI safety

AI safety is the study of ways to reduce risks posed by artificial intelligence.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research a “priority path”—among the most promising career opportunities the organization has identified so far.[1][2]

Arguments against AI safety

Work on AI safety is sometimes dismissed as a Pascal’s Mugging,[3] the implication being that the probability of an AI catastrophe is vanishingly small, and that advocates compensate by positing ever-larger payoffs so that the cause still comes out as a top priority. One response is that the probabilities involved are not, in fact, tiny: in a survey of roughly 700 ML researchers, the median estimate of the probability that the long-run effect of advanced AI on humanity will be “extremely bad (e.g., human extinction)” was 5%, and 48% of respondents gave an estimate of 10% or higher.[4] These probabilities are too high, by at least five orders of magnitude, to be considered Pascalian.
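
As a rough illustration (assuming, hypothetically, that only risks with probability below roughly $10^{-7}$ could be dismissed as Pascalian), the surveyed median of 5% clears that bar by more than five orders of magnitude:

$$\frac{0.05}{10^{-7}} = 5 \times 10^{5}$$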

Further reading

Gates, Vael (2022) Resources I send to AI researchers about AI safety, Effective Altruism Forum, June 13.

Krakovna, Victoria (2017) Introductory resources on AI safety research, Victoria Krakovna’s Blog, October 19.

Ngo, Richard (2019) Disentangling arguments for the importance of AI safety, Effective Altruism Forum, January 21.

Related entries

AI alignment | AI interpretability | AI risk | cooperative AI | building the field of AI safety

  1. ^

    Todd, Benjamin (2018) The highest impact career paths our research has identified so far, 80,000 Hours, August 12.

  2. ^

    Todd, Benjamin (2021) AI safety technical research, 80,000 Hours, October.

  3. ^

    https://twitter.com/amasad/status/1632121317146361856 (a tweet by the CEO of Replit, a company that builds ML-powered coding tools).

  4. ^

    Grace, Katja et al. (2022) 2022 Expert Survey on Progress in AI, AI Impacts.

Katja Grace: Let’s think about slow­ing down AI

peterhartree23 Dec 2022 0:57 UTC
80 points
7 comments2 min readEA link
(worldspiritsockpuppet.substack.com)

Dona­tion recom­men­da­tions for xrisk + ai safety

vincentweisser6 Feb 2023 21:25 UTC
10 points
11 comments1 min readEA link

Prevent­ing an AI-re­lated catas­tro­phe—Prob­lem profile

Benjamin Hilton29 Aug 2022 18:49 UTC
129 points
17 comments4 min readEA link
(80000hours.org)

Please don’t crit­i­cize EAs who “sell out” to OpenAI and Anthropic

BrownHairedEevee5 Mar 2023 21:17 UTC
−7 points
21 comments2 min readEA link

Some Things I Heard about AI Gover­nance at EAG

utilistrutil28 Feb 2023 21:27 UTC
32 points
5 comments6 min readEA link

AI safety starter pack

mariushobbhahn28 Mar 2022 16:05 UTC
119 points
11 comments6 min readEA link

Large Lan­guage Models as Fi­du­cia­ries to Humans

johnjnay24 Jan 2023 19:53 UTC
25 points
0 comments34 min readEA link
(papers.ssrn.com)

Cy­borg Pe­ri­ods: There will be mul­ti­ple AI transitions

Jan_Kulveit22 Feb 2023 16:09 UTC
55 points
1 comment1 min readEA link

NYT: Google will ‘re­cal­ibrate’ the risk of re­leas­ing AI due to com­pe­ti­tion with OpenAI

Michael Huang22 Jan 2023 2:13 UTC
166 points
8 comments1 min readEA link
(www.nytimes.com)

Suc­cess with­out dig­nity: a nearcast­ing story of avoid­ing catas­tro­phe by luck

Holden Karnofsky15 Mar 2023 20:17 UTC
72 points
3 comments1 min readEA link

2021 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks23 Dec 2021 14:06 UTC
166 points
18 comments75 min readEA link

[MLSN #8]: Mechanis­tic in­ter­pretabil­ity, us­ing law to in­form AI al­ign­ment, scal­ing laws for proxy gaming

ThomasW20 Feb 2023 16:06 UTC
25 points
0 comments4 min readEA link
(newsletter.mlsafety.org)

Two im­por­tant re­cent AI Talks- Ge­bru and Lazar

Gideon Futerman6 Mar 2023 1:30 UTC
−14 points
5 comments1 min readEA link

Sen­tience In­sti­tute 2021 End of Year Summary

Ali26 Nov 2021 14:40 UTC
66 points
5 comments6 min readEA link
(www.sentienceinstitute.org)

Oper­a­tional­iz­ing timelines

Zach Stein-Perlman10 Mar 2023 17:30 UTC
28 points
2 comments1 min readEA link

Po­ten­tial em­ploy­ees have a unique lever to in­fluence the be­hav­iors of AI labs

oxalis18 Mar 2023 20:58 UTC
107 points
1 comment5 min readEA link

Pros and Cons of boy­cotting paid Chat GPT

NickLaing18 Mar 2023 8:50 UTC
14 points
11 comments2 min readEA link

“Pivotal Act” In­ten­tions: Nega­tive Con­se­quences and Fal­la­cious Arguments

Andrew Critch19 Apr 2022 20:24 UTC
74 points
10 comments7 min readEA link

A Roundtable for Safe AI (RSAI)?

Lara_TH9 Mar 2023 12:11 UTC
7 points
0 comments4 min readEA link

AGI Safety Needs Peo­ple With All Skil­lsets!

Severin25 Jul 2022 13:30 UTC
33 points
7 comments2 min readEA link

[Question] Should peo­ple get neu­ro­science phD to work in AI safety field?

jackchang1107 Mar 2023 16:21 UTC
9 points
11 comments1 min readEA link

Liter­a­ture re­view of Trans­for­ma­tive Ar­tifi­cial In­tel­li­gence timelines

Jaime Sevilla27 Jan 2023 20:36 UTC
147 points
10 comments1 min readEA link

Paper Sum­mary: The Effec­tive­ness of AI Ex­is­ten­tial Risk Com­mu­ni­ca­tion to the Amer­i­can and Dutch Public

Otto9 Mar 2023 10:40 UTC
90 points
6 comments4 min readEA link

Align­ing the Align­ers: En­sur­ing Aligned AI acts for the com­mon good of all mankind

timunderwood16 Jan 2023 11:13 UTC
33 points
2 comments4 min readEA link

How ma­jor gov­ern­ments can help with the most im­por­tant century

Holden Karnofsky24 Feb 2023 19:37 UTC
46 points
3 comments4 min readEA link
(www.cold-takes.com)

New Ar­tifi­cial In­tel­li­gence quiz: can you beat ChatGPT?

AndreFerretti3 Mar 2023 15:46 UTC
28 points
2 comments1 min readEA link

There are no co­her­ence theorems

EJT20 Feb 2023 21:52 UTC
82 points
48 comments19 min readEA link

Ask AI com­pa­nies about what they are do­ing for AI safety?

mic8 Mar 2022 21:54 UTC
44 points
1 comment2 min readEA link

Misal­ign­ment Mu­seum opens in San Fran­cisco: ‘Sorry for kil­ling most of hu­man­ity’

Michael Huang4 Mar 2023 7:09 UTC
92 points
6 comments1 min readEA link
(www.misalignmentmuseum.com)

Sym­bio­sis, not al­ign­ment, as the goal for liberal democ­ra­cies in the tran­si­tion to ar­tifi­cial gen­eral intelligence

simonfriederich17 Mar 2023 13:04 UTC
16 points
2 comments24 min readEA link
(rdcu.be)

Help us find pain points in AI safety

Esben Kran12 Apr 2022 18:43 UTC
31 points
4 comments8 min readEA link

An­nounc­ing: Mechanism De­sign for AI Safety—Read­ing Group

Rubi J. Hudson9 Aug 2022 4:25 UTC
35 points
1 comment4 min readEA link

De­cep­tive Align­ment is <1% Likely by Default

DavidW21 Feb 2023 15:07 UTC
30 points
24 comments10 min readEA link

AGI mis­al­ign­ment x-risk may be lower due to an over­looked goal speci­fi­ca­tion technology

johnjnay21 Oct 2022 2:03 UTC
20 points
1 comment1 min readEA link

A Viral Li­cense for AI Safety

IvanVendrov5 Jun 2021 2:00 UTC
29 points
6 comments5 min readEA link

AI safety and con­scious­ness re­search: A brainstorm

Daniel_Friedrich15 Mar 2023 14:33 UTC
9 points
0 comments9 min readEA link

(Even) More Early-Ca­reer EAs Should Try AI Safety Tech­ni­cal Research

levin30 Jun 2022 21:14 UTC
86 points
37 comments10 min readEA link

[Linkpost] Scott Alexan­der re­acts to OpenAI’s lat­est post

Akash11 Mar 2023 22:24 UTC
104 points
3 comments1 min readEA link

What does Bing Chat tell us about AI risk?

Holden Karnofsky28 Feb 2023 18:47 UTC
90 points
8 comments2 min readEA link
(www.cold-takes.com)

In­tro­duc­tion to Prag­matic AI Safety [Prag­matic AI Safety #1]

ThomasW9 May 2022 17:02 UTC
68 points
0 comments6 min readEA link

Digi­tal peo­ple could make AI safer

GMcGowan10 Jun 2022 15:29 UTC
22 points
15 comments4 min readEA link
(www.mindlessalgorithm.com)

Ap­ply to the Cam­bridge ML for Align­ment Boot­camp (CaMLAB) [26 March − 8 April]

hannah9 Feb 2023 16:32 UTC
62 points
1 comment5 min readEA link

Poster Ses­sion on AI Safety

Neil Crawford12 Nov 2022 3:50 UTC
8 points
0 comments4 min readEA link

What to think when a lan­guage model tells you it’s sentient

rgb20 Feb 2023 2:59 UTC
105 points
18 comments6 min readEA link

[Question] Do you worry about to­tal­i­tar­ian regimes us­ing AI Align­ment tech­nol­ogy to cre­ate AGI that sub­scribe to their val­ues?

diodio_yang28 Feb 2023 18:12 UTC
25 points
11 comments2 min readEA link

What AI com­pa­nies can do to­day to help with the most im­por­tant century

Holden Karnofsky20 Feb 2023 17:40 UTC
103 points
8 comments11 min readEA link
(www.cold-takes.com)

How evals might (or might not) pre­vent catas­trophic risks from AI

Akash7 Feb 2023 20:16 UTC
28 points
0 comments1 min readEA link

Speedrun: AI Align­ment Prizes

joe9 Feb 2023 11:55 UTC
22 points
0 comments18 min readEA link

ChatGPT not so clever or not so ar­tifi­cial as hyped to be?

Haris Shekeris2 Mar 2023 6:16 UTC
−7 points
2 comments1 min readEA link

Available Ta­lent af­ter Ma­jor Lay­offs at Tech Giants

nnn21 Jan 2023 2:50 UTC
137 points
3 comments1 min readEA link

Tech­nol­ogy is Power: Rais­ing Aware­ness Of Tech­nolog­i­cal Risks

Marc Wong9 Feb 2023 15:13 UTC
3 points
0 comments2 min readEA link

Com­ments on OpenAI’s “Plan­ning for AGI and be­yond”

So8res3 Mar 2023 23:01 UTC
110 points
6 comments1 min readEA link

Say­ing ‘AI safety re­search is a Pas­cal’s Mug­ging’ isn’t a strong response

Robert_Wiblin15 Dec 2015 13:48 UTC
14 points
16 comments2 min readEA link

Ex­cerpts from “Do­ing EA Bet­ter” on x-risk methodology

BrownHairedEevee26 Jan 2023 1:04 UTC
21 points
5 comments6 min readEA link
(forum.effectivealtruism.org)

Ways to buy time

Akash12 Nov 2022 19:31 UTC
46 points
1 comment1 min readEA link

Col­lin Burns on Align­ment Re­search And Dis­cov­er­ing La­tent Knowl­edge Without Supervision

Michaël Trazzi17 Jan 2023 17:21 UTC
21 points
3 comments1 min readEA link

Vic­to­ria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi12 Jan 2023 17:09 UTC
16 points
0 comments1 min readEA link

[TIME mag­a­z­ine] Deep­Mind’s CEO Helped Take AI Main­stream. Now He’s Urg­ing Cau­tion (Per­rigo, 2023)

Will Aldred20 Jan 2023 20:37 UTC
93 points
0 comments1 min readEA link
(time.com)

How I Formed My Own Views About AI Safety

Neel Nanda27 Feb 2022 18:52 UTC
130 points
12 comments13 min readEA link
(www.neelnanda.io)

The Wizard of Oz Prob­lem: How in­cen­tives and nar­ra­tives can skew our per­cep­tion of AI developments

Akash20 Mar 2023 22:36 UTC
16 points
0 comments1 min readEA link

[Question] Will AI Wor­ld­view Prize Fund­ing Be Re­placed?

Jordan Arel13 Nov 2022 17:10 UTC
26 points
4 comments1 min readEA link

My highly per­sonal skep­ti­cism brain­dump on ex­is­ten­tial risk from ar­tifi­cial in­tel­li­gence.

NunoSempere23 Jan 2023 20:08 UTC
413 points
115 comments14 min readEA link
(nunosempere.com)

[Question] Do AI com­pa­nies make their safety re­searchers sign a non-dis­par­age­ment clause?

Ofer5 Sep 2022 13:40 UTC
70 points
4 comments1 min readEA link

13 Very Differ­ent Stances on AGI

Ozzie Gooen27 Dec 2021 23:30 UTC
84 points
27 comments3 min readEA link

AI Risk Man­age­ment Frame­work | NIST

𝕮𝖎𝖓𝖊𝖗𝖆26 Jan 2023 15:27 UTC
50 points
0 comments1 min readEA link

Com­pendium of prob­lems with RLHF

Raphaël S30 Jan 2023 8:48 UTC
16 points
0 comments1 min readEA link

Ap­pli­ca­tions are now open for In­tro to ML Safety Spring 2023

Joshc4 Nov 2022 22:45 UTC
49 points
1 comment2 min readEA link

Questions about Conjecture’s CoEm proposal

Akash9 Mar 2023 19:32 UTC
19 points
0 comments1 min readEA link

Re­cur­sive Mid­dle Man­ager Hell

Raemon17 Jan 2023 19:02 UTC
78 points
2 comments1 min readEA link

Fight­ing with­out hope

Akash1 Mar 2023 18:15 UTC
34 points
9 comments1 min readEA link

Quick sur­vey on AI al­ign­ment resources

frances_lorenz30 Jun 2022 19:08 UTC
14 points
0 comments1 min readEA link

Trends in the dol­lar train­ing cost of ma­chine learn­ing systems

Ben Cottier1 Feb 2023 14:48 UTC
58 points
3 comments1 min readEA link

Google in­vests $300mn in ar­tifi­cial in­tel­li­gence start-up An­thropic | FT

𝕮𝖎𝖓𝖊𝖗𝖆3 Feb 2023 19:43 UTC
155 points
6 comments1 min readEA link
(www.ft.com)

AXRP: Store, Pa­treon, Video

DanielFilan7 Feb 2023 5:12 UTC
7 points
0 comments1 min readEA link

Two con­trast­ing mod­els of “in­tel­li­gence” and fu­ture growth

Magnus Vinding24 Nov 2022 11:54 UTC
55 points
19 comments26 min readEA link

EA, Psy­chol­ogy & AI Safety Research

Sam Ellis26 May 2022 23:46 UTC
26 points
3 comments7 min readEA link

What is it like do­ing AI safety work?

Kat Woods21 Feb 2023 19:24 UTC
95 points
2 comments10 min readEA link

Ac­tion: Help ex­pand fund­ing for AI Safety by co­or­di­nat­ing on NSF response

Evan R. Murphy20 Jan 2022 20:48 UTC
20 points
7 comments3 min readEA link

Jobs that can help with the most im­por­tant century

Holden Karnofsky12 Feb 2023 18:19 UTC
52 points
2 comments32 min readEA link
(www.cold-takes.com)

[Question] What harm could AI safety do?

SeanEngelhart15 May 2021 1:11 UTC
12 points
7 comments1 min readEA link

[Question] How good/​bad is the new Bing AI for the world?

Nathan Young17 Feb 2023 16:31 UTC
21 points
13 comments1 min readEA link

AI al­ign­ment re­searchers may have a com­par­a­tive ad­van­tage in re­duc­ing s-risks

Lukas_Gloor15 Feb 2023 13:01 UTC
73 points
5 comments13 min readEA link

An­nounc­ing the In­tro­duc­tion to ML Safety Course

ThomasW6 Aug 2022 2:50 UTC
135 points
5 comments7 min readEA link

Fram­ing AI strategy

Zach Stein-Perlman7 Feb 2023 20:03 UTC
16 points
0 comments1 min readEA link
(www.lesswrong.com)

The Parable of the Boy Who Cried 5% Chance of Wolf

Kat Woods15 Aug 2022 14:22 UTC
75 points
8 comments2 min readEA link

An­nounc­ing the Vi­talik Bu­terin Fel­low­ships in AI Ex­is­ten­tial Safety!

DanielFilan21 Sep 2021 0:41 UTC
62 points
0 comments1 min readEA link
(grants.futureoflife.org)

Coun­ter­ar­gu­ments to the ba­sic AI risk case

Katja_Grace14 Oct 2022 20:30 UTC
266 points
22 comments34 min readEA link

Order Mat­ters for De­cep­tive Alignment

DavidW15 Feb 2023 20:12 UTC
20 points
1 comment1 min readEA link
(www.lesswrong.com)

A tale of 2.5 or­thog­o­nal­ity theses

Arepo1 May 2022 13:53 UTC
137 points
31 comments15 min readEA link

12 ca­reer ad­vis­ing ques­tions that may (or may not) be helpful for peo­ple in­ter­ested in al­ign­ment research

Akash12 Dec 2022 22:36 UTC
14 points
0 comments1 min readEA link

Law-Fol­low­ing AI 4: Don’t Rely on Vi­car­i­ous Liability

Cullen2 Aug 2022 23:23 UTC
13 points
0 comments3 min readEA link

Chain­ing Retroac­tive Fun­ders to Bor­row Against Un­likely Utopias

Dawn Drescher19 Apr 2022 18:25 UTC
24 points
4 comments9 min readEA link
(impactmarkets.substack.com)

Tak­ing a leave of ab­sence from Open Philan­thropy to work on AI safety

Holden Karnofsky23 Feb 2023 19:05 UTC
417 points
27 comments2 min readEA link

Don’t Call It AI Alignment

RedStateBlueState20 Feb 2023 5:27 UTC
15 points
7 comments2 min readEA link

11 heuris­tics for choos­ing (al­ign­ment) re­search projects

Akash27 Jan 2023 0:36 UTC
29 points
1 comment1 min readEA link

Next steps af­ter AGISF at UMich

Jakub Kraus25 Jan 2023 20:57 UTC
18 points
1 comment1 min readEA link

Spread­ing mes­sages to help with the most im­por­tant century

Holden Karnofsky25 Jan 2023 20:35 UTC
122 points
20 comments18 min readEA link
(www.cold-takes.com)

Emerg­ing Tech­nolo­gies: More to explore

EA Handbook1 Jan 2021 11:06 UTC
4 points
0 comments1 min readEA link

4 ways to think about de­moc­ra­tiz­ing AI [GovAI Linkpost]

Akash13 Feb 2023 18:06 UTC
33 points
0 comments1 min readEA link

Ret­ro­spec­tive on the 2022 Con­jec­ture AI Discussions

Andrea_Miotti24 Feb 2023 22:41 UTC
12 points
1 comment1 min readEA link

Sur­vey on in­ter­me­di­ate goals in AI governance

MichaelA17 Mar 2023 12:44 UTC
129 points
2 comments1 min readEA link

Just Pivot to AI: The se­cret is out

sapphire15 Mar 2023 6:25 UTC
0 points
4 comments2 min readEA link

GPT-4 is out: thread (& links)

Lizka14 Mar 2023 20:02 UTC
84 points
18 comments1 min readEA link

Be­ware safety-washing

Lizka13 Jan 2023 10:39 UTC
124 points
6 comments4 min readEA link

An overview of some promis­ing work by ju­nior al­ign­ment researchers

Akash26 Dec 2022 17:23 UTC
10 points
0 comments1 min readEA link

Chris­ti­ano (ARC) and GA (Con­jec­ture) Dis­cuss Align­ment Cruxes

Andrea_Miotti24 Feb 2023 23:03 UTC
16 points
1 comment1 min readEA link

Com­mu­nity Build­ing for Grad­u­ate Stu­dents: A Tar­geted Approach

Neil Crawford29 Mar 2022 19:47 UTC
13 points
0 comments3 min readEA link

Ap­ply for men­tor­ship in AI Safety field-building

Akash17 Sep 2022 19:03 UTC
21 points
0 comments1 min readEA link

[Question] Game the­ory work on AI al­ign­ment with di­verse AI sys­tems, hu­man in­di­vi­d­u­als, & hu­man groups?

Geoffrey Miller2 Mar 2023 16:50 UTC
22 points
2 comments1 min readEA link

List #1: Why stop­ping the de­vel­op­ment of AGI is hard but doable

Remmelt24 Dec 2022 9:52 UTC
24 points
2 comments1 min readEA link

AI Gover­nance & Strat­egy: Pri­ori­ties, tal­ent gaps, & opportunities

Akash3 Mar 2023 18:09 UTC
19 points
0 comments1 min readEA link

Who Aligns the Align­ment Re­searchers?

ben.smith5 Mar 2023 23:22 UTC
22 points
4 comments1 min readEA link

Anony­mous ad­vice: If you want to re­duce AI risk, should you take roles that ad­vance AI ca­pa­bil­ities?

Benjamin Hilton11 Oct 2022 14:15 UTC
72 points
9 comments17 min readEA link
(80000hours.org)

AI Safety in a World of Vuln­er­a­ble Ma­chine Learn­ing Systems

AdamGleave8 Mar 2023 2:40 UTC
18 points
0 comments1 min readEA link

“Can We Sur­vive Tech­nol­ogy?” by John von Neumann

Eli Rose13 Mar 2023 2:26 UTC
47 points
0 comments1 min readEA link
(geosci.uchicago.edu)

How I failed to form views on AI safety

Ada-Maaria Hyvärinen17 Apr 2022 11:05 UTC
204 points
71 comments40 min readEA link

An­nounc­ing the Open Philan­thropy AI Wor­ld­views Contest

Jason Schukraft10 Mar 2023 2:33 UTC
132 points
16 comments3 min readEA link
(www.openphilanthropy.org)

List #3: Why not to as­sume on prior that AGI-al­ign­ment workarounds are available

Remmelt24 Dec 2022 9:54 UTC
6 points
0 comments1 min readEA link

Is the time crunch for AI Safety Move­ment Build­ing now?

Chris Leong8 Jun 2022 12:19 UTC
14 points
10 comments2 min readEA link

CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram Rachum29 Dec 2022 16:09 UTC
4 points
0 comments1 min readEA link

Four ques­tions I ask AI safety researchers

Akash17 Jul 2022 17:25 UTC
30 points
3 comments1 min readEA link

Trans­for­ma­tive AI is­sues (not just mis­al­ign­ment): an overview

Holden Karnofsky6 Jan 2023 2:19 UTC
31 points
0 comments22 min readEA link
(www.cold-takes.com)

Lev­el­ling Up in AI Safety Re­search Engineering

Gabriel Mukobi2 Sep 2022 4:59 UTC
139 points
17 comments15 min readEA link

Is this com­mu­nity over-em­pha­siz­ing AI al­ign­ment?

Lixiang8 Jan 2023 6:23 UTC
2 points
5 comments1 min readEA link

Yud­kowsky on AGI risk on the Ban­kless podcast

RobBensinger13 Mar 2023 0:42 UTC
44 points
2 comments75 min readEA link

Fu­ture Mat­ters #8: Bing Chat, AI labs on safety, and paus­ing Fu­ture Matters

Pablo21 Mar 2023 14:50 UTC
26 points
0 comments24 min readEA link

Where I’m at with AI risk: con­vinced of dan­ger but not (yet) of doom

Amber Dawn21 Mar 2023 13:23 UTC
38 points
1 comment6 min readEA link

New ‘South Park’ epi­sode on AI & Chat GPT

Geoffrey Miller21 Mar 2023 20:06 UTC
7 points
1 comment1 min readEA link

20 Cri­tiques of AI Safety That I Found on Twitter

Daniel Kirmani23 Jun 2022 15:11 UTC
14 points
13 comments1 min readEA link

Are al­ign­ment re­searchers de­vot­ing enough time to im­prov­ing their re­search ca­pac­ity?

Carson Jones4 Nov 2022 0:58 UTC
11 points
1 comment1 min readEA link

Why peo­ple want to work on AI safety (but don’t)

Emily Grundy24 Jan 2023 6:41 UTC
67 points
10 comments7 min readEA link

[Linkpost] Jan Leike on three kinds of al­ign­ment taxes

Akash6 Jan 2023 23:57 UTC
29 points
0 comments1 min readEA link

AI Safety Camp, Vir­tual Edi­tion 2023

Linda Linsefors6 Jan 2023 0:55 UTC
30 points
0 comments1 min readEA link

List #2: Why co­or­di­nat­ing to al­ign as hu­mans to not de­velop AGI is a lot eas­ier than, well… co­or­di­nat­ing as hu­mans with AGI co­or­di­nat­ing to be al­igned with humans

Remmelt24 Dec 2022 9:53 UTC
3 points
0 comments1 min readEA link

How to pur­sue a ca­reer in tech­ni­cal AI alignment

CharlieRS4 Jun 2022 21:36 UTC
230 points
7 comments39 min readEA link

Peter Eck­er­sley (1979-2022)

Gavin3 Sep 2022 10:45 UTC
497 points
9 comments1 min readEA link

“Aligned with who?” Re­sults of sur­vey­ing 1,000 US par­ti­ci­pants on AI values

Holly Morgan21 Mar 2023 22:07 UTC
6 points
0 comments2 min readEA link
(www.lesswrong.com)

Planes are still decades away from dis­plac­ing most bird jobs

guzey25 Nov 2022 16:49 UTC
27 points
2 comments1 min readEA link

A Wind­fall Clause for CEO could worsen AI race dynamics

Larks9 Mar 2023 18:02 UTC
69 points
11 comments7 min readEA link

Align­ment is mostly about mak­ing cog­ni­tion aimable at all

So8res30 Jan 2023 15:22 UTC
57 points
3 comments1 min readEA link

A Bare­bones Guide to Mechanis­tic In­ter­pretabil­ity Prerequisites

Neel Nanda29 Nov 2022 18:43 UTC
50 points
1 comment3 min readEA link
(neelnanda.io)

Race to the Top: Bench­marks for AI Safety

isaduan4 Dec 2022 22:50 UTC
51 points
8 comments1 min readEA link

In­tent al­ign­ment should not be the goal for AGI x-risk reduction

johnjnay26 Oct 2022 1:24 UTC
7 points
1 comment1 min readEA link

In­tro­duc­ing the new Ries­gos Catas­trófi­cos Globales team

Jaime Sevilla3 Mar 2023 23:04 UTC
73 points
2 comments5 min readEA link
(riesgoscatastroficosglobales.com)

Is AI fore­cast­ing a waste of effort on the mar­gin?

Emrik5 Nov 2022 0:41 UTC
9 points
6 comments3 min readEA link

A new­comer’s guide to the tech­ni­cal AI safety field

zeshen4 Nov 2022 14:29 UTC
12 points
0 comments1 min readEA link

What suc­cess looks like

mariushobbhahn28 Jun 2022 14:30 UTC
106 points
20 comments19 min readEA link

Cal­ling for Stu­dent Sub­mis­sions: AI Safety Distil­la­tion Contest

Aris Richardson23 Apr 2022 20:24 UTC
101 points
28 comments3 min readEA link

AI Safety Seems Hard to Measure

Holden Karnofsky11 Dec 2022 1:31 UTC
88 points
2 comments14 min readEA link

Many AI gov­er­nance pro­pos­als have a trade­off be­tween use­ful­ness and feasibility

Akash3 Feb 2023 18:49 UTC
21 points
0 comments1 min readEA link

Disen­tan­gling ar­gu­ments for the im­por­tance of AI safety

richard_ngo23 Jan 2019 14:58 UTC
63 points
14 comments8 min readEA link

What does it mean for an AGI to be ‘safe’?

So8res7 Oct 2022 4:43 UTC
53 points
21 comments1 min readEA link

Dona­tion offsets for ChatGPT Plus subscriptions

Jeffrey Ladish16 Mar 2023 23:11 UTC
64 points
10 comments3 min readEA link

We are fight­ing a shared bat­tle (a call for a differ­ent ap­proach to AI Strat­egy)

Gideon Futerman16 Mar 2023 14:37 UTC
55 points
9 comments15 min readEA link

On the cor­re­spon­dence be­tween AI-mis­al­ign­ment and cog­ni­tive dis­so­nance us­ing a be­hav­ioral eco­nomics model

Stijn1 Nov 2022 9:15 UTC
11 points
0 comments6 min readEA link

AGI in sight: our look at the game board

Andrea_Miotti18 Feb 2023 22:17 UTC
30 points
18 comments1 min readEA link

Which ML skills are use­ful for find­ing a new AIS re­search agenda?

Yonatan Cale9 Feb 2023 13:09 UTC
7 points
3 comments1 min readEA link

Tech­nolog­i­cal de­vel­op­ments that could in­crease risks from nu­clear weapons: A shal­low review

MichaelA9 Feb 2023 15:41 UTC
78 points
3 comments5 min readEA link
(bit.ly)

Un­jour­nal: Eval­u­a­tions of “Ar­tifi­cial In­tel­li­gence and Eco­nomic Growth”, and new host­ing space

david_reinstein17 Mar 2023 20:20 UTC
46 points
0 comments2 min readEA link
(unjournal.pubpub.org)

The Im­por­tance of AI Align­ment, ex­plained in 5 points

Daniel_Eth11 Feb 2023 2:56 UTC
40 points
4 comments13 min readEA link

We Did AGISF’s 8-week Course in 3 Days. Here’s How it Went

ag400024 Jul 2022 16:46 UTC
26 points
7 comments5 min readEA link

Qual­ities that al­ign­ment men­tors value in ju­nior researchers

Akash14 Feb 2023 23:27 UTC
30 points
1 comment1 min readEA link

Shut­ting Down the Light­cone Offices

Habryka15 Mar 2023 1:46 UTC
225 points
60 comments17 min readEA link
(www.lesswrong.com)

[Question] Good de­pic­tions of speed mis­matches be­tween ad­vanced AI sys­tems and hu­mans?

Geoffrey Miller15 Mar 2023 16:40 UTC
18 points
9 comments1 min readEA link

In­sights from an ex­pert sur­vey about in­ter­me­di­ate goals in AI governance

Sebastian Schwiecker17 Mar 2023 14:59 UTC
9 points
2 comments1 min readEA link

VIRTUA: a novel about AI alignment

Karl von Wendt12 Jan 2023 9:37 UTC
21 points
0 comments1 min readEA link

How bad a fu­ture do ML re­searchers ex­pect?

Katja_Grace13 Mar 2023 5:47 UTC
159 points
19 comments1 min readEA link

EA In­fosec: skill up in or make a tran­si­tion to in­fosec via this book club

Jason Clinton5 Mar 2023 21:02 UTC
159 points
10 comments2 min readEA link

[Question] How to hedge in­vest­ment port­fo­lio against AI risk?

Timothy_Liptrot31 Jan 2023 8:04 UTC
8 points
0 comments1 min readEA link

How we could stum­ble into AI catastrophe

Holden Karnofsky16 Jan 2023 14:52 UTC
78 points
0 comments31 min readEA link
(www.cold-takes.com)

ML Sum­mer Boot­camp Reflec­tion: Aalto EA Finland

Aayush Kucheria12 Jan 2023 8:24 UTC
15 points
2 comments9 min readEA link

Why I’m Scep­ti­cal of Foom

𝕮𝖎𝖓𝖊𝖗𝖆8 Dec 2022 10:01 UTC
21 points
7 comments1 min readEA link

[Link] EAF Re­search agenda: “Co­op­er­a­tion, Con­flict, and Trans­for­ma­tive Ar­tifi­cial In­tel­li­gence”

stefan.torges17 Jan 2020 13:28 UTC
64 points
0 comments1 min readEA link

[Ru­mour] Microsoft to in­vest $10B in OpenAI, will re­ceive 75% of prof­its un­til they re­coup in­vest­ment: GPT would be in­te­grated with Office

𝕮𝖎𝖓𝖊𝖗𝖆10 Jan 2023 23:43 UTC
25 points
2 comments1 min readEA link

Data Publi­ca­tion for the 2021 Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey

Janet Pauketat24 Mar 2022 15:43 UTC
21 points
0 comments3 min readEA link
(www.sentienceinstitute.org)

Re­search + Real­ity Graph­ing to Sup­port AI Policy (and more): Sum­mary of a Frozen Project

Harrison Durland2 Jul 2022 20:58 UTC
32 points
2 comments8 min readEA link

Big list of AI safety videos

Jakub Kraus9 Jan 2023 6:09 UTC
9 points
0 comments1 min readEA link
(docs.google.com)

Is any­one else also get­ting more wor­ried about hard take­off AGI sce­nar­ios?

JonCefalu9 Jan 2023 6:04 UTC
19 points
11 comments3 min readEA link

Tech­ni­cal AI safety in the United Arab Emirates

ea nyuad21 Jun 2022 3:11 UTC
10 points
0 comments11 min readEA link

Learn­ing as much Deep Learn­ing math as I could in 24 hours

Phosphorous8 Jan 2023 2:19 UTC
57 points
5 comments7 min readEA link

Trans­for­ma­tive AI and Com­pute [Sum­mary]

lennart23 Sep 2021 13:53 UTC
53 points
5 comments9 min readEA link

AI ac­cel­er­a­tion from a safety per­spec­tive: Trade-offs and con­sid­er­a­tions

mariushobbhahn19 Jan 2022 9:44 UTC
12 points
1 comment7 min readEA link

ML Safety Schol­ars Sum­mer 2022 Retrospective

ThomasW1 Nov 2022 3:09 UTC
55 points
2 comments21 min readEA link

Ques­tions about AI that bother me

Eleni_A31 Jan 2023 6:50 UTC
33 points
6 comments2 min readEA link

Which AI Safety Org to Join?

Yonatan Cale11 Oct 2022 19:42 UTC
15 points
21 comments1 min readEA link

“AI” is an indexical

ThomasW3 Jan 2023 22:00 UTC
23 points
2 comments1 min readEA link

So­cial sci­en­tists in­ter­ested in AI safety should con­sider do­ing di­rect tech­ni­cal AI safety re­search, (pos­si­bly meta-re­search), or gov­er­nance, sup­port roles, or com­mu­nity build­ing instead

Vael Gates20 Jul 2022 23:01 UTC
64 points
8 comments17 min readEA link

How to be­come an AI safety researcher

peterbarnett12 Apr 2022 11:33 UTC
106 points
15 comments14 min readEA link

Self-Limit­ing AI in AI Alignment

The_Lord's_Servant_28031 Dec 2022 19:07 UTC
2 points
1 comment1 min readEA link

In­for­ma­tion se­cu­rity con­sid­er­a­tions for AI and the long term future

Jeffrey Ladish2 May 2022 20:53 UTC
118 points
7 comments11 min readEA link

Win­ners of the AI Safety Nudge Competition

Marc Carauleanu15 Nov 2022 1:06 UTC
22 points
0 comments1 min readEA link

Rac­ing through a minefield: the AI de­ploy­ment problem

Holden Karnofsky31 Dec 2022 21:44 UTC
72 points
1 comment13 min readEA link
(www.cold-takes.com)

[Question] What does the Pro­ject Man­age­ment role look like in AI safety?

Gaurav Sett14 May 2022 19:29 UTC
8 points
1 comment1 min readEA link

AI Safety Ex­ec­u­tive Summary

Sean Osier6 Sep 2022 8:26 UTC
20 points
2 comments5 min readEA link
(seanosier.notion.site)

Let’s think about slow­ing down AI

Katja_Grace23 Dec 2022 19:56 UTC
320 points
7 comments1 min readEA link

A vi­su­al­iza­tion of some orgs in the AI Safety Pipeline

Aaron_Scher10 Apr 2022 16:52 UTC
11 points
8 comments1 min readEA link

Take­aways from a sur­vey on AI al­ign­ment resources

DanielFilan5 Nov 2022 23:45 UTC
18 points
9 comments6 min readEA link
(www.lesswrong.com)

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony Barrett14 Oct 2022 20:18 UTC
13 points
0 comments2 min readEA link

Re­silience Via Frag­mented Power

steve632014 Jul 2022 15:37 UTC
2 points
0 comments6 min readEA link

a ca­sual in­tro to AI doom and alignment

Tamsin Leake2 Nov 2022 9:42 UTC
7 points
2 comments1 min readEA link

Half-baked ideas thread (EA /​ AI Safety)

Aryeh Englander23 Jun 2022 16:05 UTC
21 points
8 comments1 min readEA link

Four rea­sons I find AI safety emo­tion­ally compelling

Kat Woods28 Jun 2022 14:01 UTC
30 points
4 comments4 min readEA link

An­nounc­ing: What Fu­ture World? - Grow­ing the AI Gover­nance Community

DavidCorfield2 Nov 2022 0:31 UTC
4 points
0 comments1 min readEA link

5th IEEE In­ter­na­tional Con­fer­ence on Ar­tifi­cial In­tel­li­gence Test­ing (AITEST 2023)

surabhi gupta12 Mar 2023 9:06 UTC
−5 points
0 comments1 min readEA link

What is the role of Bayesian ML for AI al­ign­ment/​safety?

mariushobbhahn11 Jan 2022 8:07 UTC
39 points
6 comments3 min readEA link

My Most Likely Rea­son to Die Young is AI X-Risk

AISafetyIsNotLongtermist4 Jul 2022 15:34 UTC
231 points
62 comments4 min readEA link
(www.lesswrong.com)

ARIA is look­ing for top­ics for roundtables

Nathan_Barnard26 Aug 2022 19:14 UTC
34 points
11 comments1 min readEA link

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni_A28 Dec 2022 23:45 UTC
10 points
4 comments1 min readEA link

“In­tro to brain-like-AGI safety” se­ries—just finished!

Steven Byrnes17 May 2022 15:35 UTC
15 points
0 comments1 min readEA link

UK policy and poli­tics careers

weeatquince28 Sep 2019 16:18 UTC
28 points
10 comments7 min readEA link

FLI launches Wor­ld­build­ing Con­test with $100,000 in prizes

ggilgallon17 Jan 2022 13:54 UTC
87 points
55 comments6 min readEA link

AGI will ar­rive by the end of this decade ei­ther as a uni­corn or as a black swan

Yuri Barzov21 Oct 2022 10:50 UTC
−4 points
7 comments3 min readEA link

The Wind­fall Clause has a reme­dies problem

John Bridge23 May 2022 10:31 UTC
40 points
0 comments20 min readEA link

How to make the best of the most im­por­tant cen­tury?

Holden Karnofsky14 Sep 2021 21:05 UTC
49 points
5 comments12 min readEA link

[Question] Best in­tro­duc­tory overviews of AGI safety?

Jakub Kraus13 Dec 2022 19:04 UTC
13 points
8 comments2 min readEA link
(www.lesswrong.com)

[Question] Is it valuable to the field of AI Safety to have a neu­ro­science back­ground?

Samuel Nellessen3 Apr 2022 19:44 UTC
18 points
3 comments1 min readEA link

NeurIPS ML Safety Work­shop 2022

Dan H26 Jul 2022 15:33 UTC
72 points
0 comments1 min readEA link
(neurips2022.mlsafety.org)

On Ar­tifi­cial Gen­eral In­tel­li­gence: Ask­ing the Right Questions

Heather Douglas2 Oct 2022 5:00 UTC
−1 points
7 comments3 min readEA link

My thoughts on OpenAI’s al­ign­ment plan

Akash30 Dec 2022 19:34 UTC
16 points
0 comments1 min readEA link

[Question] Do­ing Global Pri­ori­ties or AI Policy re­search from re­mote lo­ca­tion?

With Love from Israel29 Oct 2019 9:34 UTC
30 points
4 comments1 min readEA link

Pro­jec­tLawful.com gives you policy experience

trevor130 Aug 2022 23:07 UTC
−8 points
2 comments4 min readEA link

Re­sults from the AI test­ing hackathon

Esben Kran2 Jan 2023 15:46 UTC
35 points
4 comments5 min readEA link
(alignmentjam.com)

Why Would AI “Aim” To Defeat Hu­man­ity?

Holden Karnofsky29 Nov 2022 18:59 UTC
19 points
0 comments32 min readEA link
(www.cold-takes.com)

How tech­ni­cal safety stan­dards could pro­mote TAI safety

Cullen8 Aug 2022 16:57 UTC
126 points
15 comments7 min readEA link

Ma­chine Learn­ing for Scien­tific Dis­cov­ery—AI Safety Camp

Eleni_A6 Jan 2023 3:06 UTC
9 points
0 comments1 min readEA link

[Question] Will AGI cause mass tech­nolog­i­cal un­em­ploy­ment?

BrownHairedEevee22 Jun 2020 20:55 UTC
3 points
2 comments2 min readEA link

AI Gover­nance Course—Cur­ricu­lum and Application

Mauricio29 Nov 2021 13:29 UTC
94 points
11 comments7 min readEA link

[Question] What do we do if AI doesn’t take over the world, but still causes a sig­nifi­cant global prob­lem?

James_Banks2 Aug 2020 3:35 UTC
16 points
5 comments1 min readEA link

Spicy takes about AI policy (Clark, 2022)

Will Aldred9 Aug 2022 13:49 UTC
43 points
0 comments3 min readEA link
(twitter.com)

In­stead of tech­ni­cal re­search, more peo­ple should fo­cus on buy­ing time

Akash5 Nov 2022 20:43 UTC
106 points
32 comments1 min readEA link

Brian Tse: Sino-Western co­op­er­a­tion in AI safety

EA Global30 Jan 2020 22:02 UTC
11 points
0 comments14 min readEA link
(www.youtube.com)

Credo AI is hiring for sev­eral roles

IanEisenberg11 Apr 2022 15:58 UTC
14 points
2 comments1 min readEA link

Mauhn Re­leases AI Safety Documentation

Berg Severens2 Jul 2021 12:19 UTC
4 points
2 comments1 min readEA link

David Krueger on AI Align­ment in Academia and Coordination

Michaël Trazzi7 Jan 2023 21:14 UTC
32 points
1 comment3 min readEA link
(theinsideview.ai)

AI & Policy 1/​3: On know­ing the effect of to­day’s poli­cies on Trans­for­ma­tive AI risks, and the case for in­sti­tu­tional im­prove­ments.

weeatquince27 Aug 2019 11:04 UTC
27 points
3 comments10 min readEA link

Re­sources I send to AI re­searchers about AI safety

Vael Gates11 Jan 2023 1:24 UTC
28 points
0 comments1 min readEA link

Perform Tractable Re­search While Avoid­ing Ca­pa­bil­ities Ex­ter­nal­ities [Prag­matic AI Safety #4]

ThomasW30 May 2022 20:37 UTC
33 points
1 comment26 min readEA link

The Power of In­tel­li­gence—The Animation

Writer11 Mar 2023 16:15 UTC
56 points
0 comments1 min readEA link

AI safety uni­ver­sity groups: a promis­ing op­por­tu­nity to re­duce ex­is­ten­tial risk

mic30 Jun 2022 18:37 UTC
50 points
1 comment11 min readEA link

Fermi es­ti­ma­tion of the im­pact you might have work­ing on AI safety

frib13 May 2022 13:30 UTC
24 points
13 comments1 min readEA link

AI Safety Overview: CERI Sum­mer Re­search Fellowship

Jamie Bernardi24 Mar 2022 15:12 UTC
29 points
0 comments2 min readEA link

[Link post] Promis­ing Paths to Align­ment—Con­nor Leahy | Talk

frances_lorenz14 May 2022 15:58 UTC
16 points
0 comments1 min readEA link

AI Safety in a Vuln­er­a­ble World: Re­quest­ing Feed­back on Pre­limi­nary Thoughts

Jordan Arel6 Dec 2022 22:36 UTC
5 points
4 comments3 min readEA link

Safety timelines: How long will it take to solve al­ign­ment?

Esben Kran19 Sep 2022 12:51 UTC
39 points
9 comments6 min readEA link

In­tro to Safety Engineering

Madhav Malhotra19 Oct 2022 23:44 UTC
4 points
0 comments1 min readEA link

How could we know that an AGI sys­tem will have good con­se­quences?

So8res7 Nov 2022 22:42 UTC
25 points
0 comments1 min readEA link

[Question] I’m in­ter­view­ing pro­lific AI safety re­searcher Richard Ngo (now at OpenAI and pre­vi­ously Deep­Mind). What should I ask him?

Robert_Wiblin29 Sep 2022 0:00 UTC
45 points
11 comments1 min readEA link

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yud­kowsky and the Dangers of AI (Please RSVP)

David N8 Mar 2023 20:40 UTC
11 points
2 comments1 min readEA link

[Cross­post] Why Un­con­trol­lable AI Looks More Likely Than Ever

Otto8 Mar 2023 15:33 UTC
49 points
6 comments4 min readEA link
(time.com)

Grokking “Fore­cast­ing TAI with biolog­i­cal an­chors”

anson6 Jun 2022 18:56 UTC
43 points
0 comments12 min readEA link

A mod­est case for hope

xavier rg17 Oct 2022 6:03 UTC
28 points
0 comments1 min readEA link

aisafety.com­mu­nity—A liv­ing doc­u­ment of AI safety communities

zeshen20 Oct 2022 22:08 UTC
24 points
13 comments1 min readEA link

Pod­cast: Tam­era Lan­ham on AI risk, threat mod­els, al­ign­ment pro­pos­als, ex­ter­nal­ized rea­son­ing over­sight, and work­ing at Anthropic

Akash20 Dec 2022 21:39 UTC
14 points
1 comment1 min readEA link

Dis­cov­er­ing Lan­guage Model Be­hav­iors with Model-Writ­ten Evaluations

evhub20 Dec 2022 20:09 UTC
25 points
0 comments1 min readEA link

Pivotal out­comes and pivotal processes

Andrew Critch17 Jun 2022 23:43 UTC
42 points
1 comment5 min readEA link

Part 1: The AI Safety com­mu­nity has four main work groups, Strat­egy, Gover­nance, Tech­ni­cal and Move­ment Building

PeterSlattery25 Nov 2022 3:45 UTC
72 points
7 comments6 min readEA link

I’m In­ter­view­ing Kat Woods, EA Pow­er­house. What Should I Ask?

SereneDesiree20 Sep 2022 9:49 UTC
4 points
2 comments1 min readEA link

Univer­sity com­mu­nity build­ing seems like the wrong model for AI safety

George Stiffman26 Feb 2022 6:23 UTC
24 points
8 comments1 min readEA link

Prob­lems of peo­ple new to AI safety and my pro­ject ideas to miti­gate them

Igor Ivanov3 Mar 2023 17:35 UTC
14 points
0 comments7 min readEA link

Align­ing AI with Hu­mans by Lev­er­ag­ing Le­gal Informatics

johnjnay18 Sep 2022 7:43 UTC
20 points
11 comments3 min readEA link

A con­cern­ing ob­ser­va­tion from me­dia cov­er­age of AI in­dus­try dynamics

Justin Olive2 Mar 2023 23:56 UTC
45 points
5 comments3 min readEA link

Prov­ably Hon­est—A First Step

Srijanak De5 Nov 2022 21:49 UTC
1 point
0 comments1 min readEA link

How Josiah be­came an AI safety researcher

Neil Crawford29 Mar 2022 19:47 UTC
10 points
0 comments1 min readEA link

Data col­lec­tion for AI al­ign­ment—Ca­reer review

Benjamin Hilton3 Jun 2022 11:44 UTC
34 points
1 comment5 min readEA link
(80000hours.org)

Pre­sump­tive Listen­ing: stick­ing to fa­mil­iar con­cepts and miss­ing the outer rea­son­ing paths

Remmelt27 Dec 2022 15:40 UTC
3 points
0 comments1 min readEA link

Distil­la­tion of The Offense-Defense Balance of Scien­tific Knowledge

Arjun Yadav12 Aug 2022 7:01 UTC
17 points
0 comments3 min readEA link

[Question] What are some sources re­lated to big-pic­ture AI strat­egy?

Jacob_Watts2 Mar 2023 5:04 UTC
9 points
4 comments1 min readEA link

“Write a crit­i­cal post about Effec­tive Altru­ism, and offer sug­ges­tions on how to im­prove the move­ment.”

davidvanbeveren6 Dec 2022 20:58 UTC
30 points
6 comments2 min readEA link

Scor­ing fore­casts from the 2016 “Ex­pert Sur­vey on Progress in AI”

PatrickL1 Mar 2023 14:39 UTC
185 points
21 comments9 min readEA link

How truth­ful can LLMs be: a the­o­ret­i­cal per­spec­tive with a re­quest for help from ex­perts on The­o­ret­i­cal CS

sergia1 Mar 2023 15:43 UTC
8 points
1 comment3 min readEA link
(www.lesswrong.com)

What I’m doing

Chris Leong19 Jul 2022 11:31 UTC
28 points
0 comments5 min readEA link

Es­ti­mat­ing the Cur­rent and Fu­ture Num­ber of AI Safety Researchers

Stephen McAleese28 Sep 2022 20:58 UTC
58 points
29 comments9 min readEA link

[Question] An eco­nomics of AI gov—best re­sources for

Liv26 Feb 2023 11:11 UTC
9 points
3 comments1 min readEA link

Con­crete ac­tions to im­prove AI gov­er­nance: the be­havi­our sci­ence approach

AlexanderSaeri1 Dec 2022 21:34 UTC
31 points
0 comments11 min readEA link

Con­sider try­ing Vivek Heb­bar’s al­ign­ment exercises

Akash24 Oct 2022 19:46 UTC
16 points
0 comments1 min readEA link

Should AI fo­cus on prob­lem-solv­ing or strate­gic plan­ning? Why not both?

oliver_siegel1 Nov 2022 9:53 UTC
1 point
0 comments1 min readEA link

Why some peo­ple be­lieve in AGI, but I don’t.

cveres26 Oct 2022 3:09 UTC
13 points
2 comments4 min readEA link

Ber­lin AI Safety Open Meetup July 2022

Isidor Regenfuß22 Jul 2022 16:26 UTC
1 point
0 comments1 min readEA link

[Question] Do EA folks think that a path to zero AGI de­vel­op­ment is fea­si­ble or worth­while for safety from AI?

Noah Scales17 Jul 2022 8:47 UTC
8 points
3 comments1 min readEA link

Why The Fo­cus on Ex­pected Utility Max­imisers?

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 15:51 UTC
11 points
1 comment1 min readEA link

Im­proved Se­cu­rity to Prevent Hacker-AI and Digi­tal Ghosts

Erland Wittkotter21 Oct 2022 10:11 UTC
1 point
0 comments1 min readEA link

UK AI Policy Re­port: Con­tent, Sum­mary, and its Im­pact on EA Cause Areas

Algo_Law21 Jul 2022 17:32 UTC
9 points
1 comment9 min readEA link

(Im­por­tant) Cause Pri­or­ity: stop the NSA & Pen­tagon from likely build­ing an S-Risk caus­ing AGI

JonCefalu27 Feb 2023 12:00 UTC
−21 points
3 comments1 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

Aris Richardson20 Sep 2022 18:13 UTC
70 points
19 comments1 min readEA link

Ad­vice on Pur­su­ing Tech­ni­cal AI Safety Research

frances_lorenz31 May 2022 17:48 UTC
22 points
2 comments4 min readEA link

A Cri­tique of AI Takeover Scenarios

Fods1231 Aug 2022 13:49 UTC
44 points
4 comments12 min readEA link

2023 Stan­ford Ex­is­ten­tial Risks Conference

elizabethcooper24 Feb 2023 17:49 UTC
29 points
5 comments1 min readEA link

AI Safety Endgame Stories

IvanVendrov28 Sep 2022 17:12 UTC
31 points
1 comment1 min readEA link

*New* Canada AI Safety & Gover­nance community

Wyatt Tessari L'Allié29 Aug 2022 15:58 UTC
31 points
2 comments1 min readEA link

Sce­nario Map­ping Ad­vanced AI Risk: Re­quest for Par­ti­ci­pa­tion with Data Collection

Kiliank27 Mar 2022 11:44 UTC
14 points
0 comments5 min readEA link

Good Fu­tures Ini­ti­a­tive: Win­ter Pro­ject In­tern­ship

Aris Richardson27 Nov 2022 23:27 UTC
67 points
7 comments4 min readEA link

An­nounc­ing AI Align­ment Awards: $100k re­search con­tests about goal mis­gen­er­al­iza­tion & corrigibility

Akash22 Nov 2022 22:19 UTC
60 points
1 comment1 min readEA link

[Question] Benefits/​Risks of Scott Aaron­son’s Ortho­dox/​Re­form Fram­ing for AI Alignment

Jeremy21 Nov 2022 17:47 UTC
15 points
5 comments1 min readEA link
(scottaaronson.blog)

AI al­ign­ment re­searchers don’t (seem to) stack

So8res21 Feb 2023 0:48 UTC
47 points
3 comments1 min readEA link

[Question] AI Eth­i­cal Committee

eaaicommittee1 Mar 2022 23:35 UTC
8 points
0 comments1 min readEA link

Distil­la­tion of “How Likely is De­cep­tive Align­ment?”

NickGabs1 Dec 2022 20:22 UTC
10 points
1 comment10 min readEA link

[Question] Please Share Your Per­spec­tives on the De­gree of So­cietal Im­pact from Trans­for­ma­tive AI Outcomes

Kiliank15 Apr 2022 1:23 UTC
3 points
3 comments1 min readEA link

Beg­ging, Plead­ing AI Orgs to Com­ment on NIST AI Risk Man­age­ment Framework

Bridges15 Apr 2022 19:35 UTC
87 points
4 comments2 min readEA link

A Quick List of Some Prob­lems in AI Align­ment As A Field

NicholasKross21 Jun 2022 17:09 UTC
16 points
10 comments6 min readEA link
(www.thinkingmuchbetter.com)

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [~monthly thread]

robertskmiles1 Nov 2022 23:21 UTC
75 points
94 comments1 min readEA link

AI Safety Info Distil­la­tion Fellowship

robertskmiles17 Feb 2023 16:16 UTC
80 points
1 comment1 min readEA link

Math­e­mat­i­cal Cir­cuits in Neu­ral Networks

Sean Osier22 Sep 2022 2:32 UTC
23 points
2 comments1 min readEA link
(www.youtube.com)

Ap­ply for the ML Win­ter Camp in Cam­bridge, UK [2-10 Jan]

Nathan_Barnard2 Dec 2022 19:33 UTC
50 points
11 comments2 min readEA link

4 Key As­sump­tions in AI Safety

Prometheus7 Nov 2022 10:50 UTC
5 points
0 comments1 min readEA link

High im­pact job op­por­tu­nity at ARIA (UK)

Rasool12 Feb 2023 10:35 UTC
80 points
0 comments1 min readEA link

Newslet­ter for Align­ment Re­search: The ML Safety Updates

Esben Kran22 Oct 2022 16:17 UTC
30 points
0 comments7 min readEA link

Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey: 2021

Janet Pauketat1 Jul 2022 7:47 UTC
36 points
0 comments2 min readEA link
(www.sentienceinstitute.org)

Overview | An Eval­u­a­tive Evolu­tion

Matt Keene10 Feb 2023 18:15 UTC
−9 points
0 comments5 min readEA link
(www.creatingafuturewewant.com)

Con­crete Ad­vice for Form­ing In­side Views on AI Safety

Neel Nanda17 Aug 2022 23:26 UTC
57 points
4 comments9 min readEA link
(www.alignmentforum.org)

Re­sources that (I think) new al­ign­ment re­searchers should know about

Akash28 Oct 2022 22:13 UTC
20 points
2 comments1 min readEA link

[Question] Does China have AI al­ign­ment re­sources/​in­sti­tu­tions? How can we pri­ori­tize cre­at­ing more?

Jakub Kraus4 Aug 2022 19:23 UTC
17 points
9 comments1 min readEA link

Refer the Co­op­er­a­tive AI Foun­da­tion’s New COO, Re­ceive $5000

Lewis Hammond16 Jun 2022 13:27 UTC
42 points
0 comments2 min readEA link

A challenge for AGI or­ga­ni­za­tions, and a challenge for readers

RobBensinger1 Dec 2022 23:11 UTC
168 points
13 comments1 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

MichaelStJules3 Oct 2021 16:55 UTC
42 points
19 comments1 min readEA link

Mechanism De­sign for AI Safety—Agenda Creation Retreat

Rubi J. Hudson10 Feb 2023 3:05 UTC
21 points
1 comment1 min readEA link

AI Safety Ideas: A col­lab­o­ra­tive AI safety re­search platform

Apart Research17 Oct 2022 17:01 UTC
67 points
13 comments4 min readEA link

Su­per­in­tel­li­gent AI is nec­es­sary for an amaz­ing fu­ture, but far from sufficient

So8res31 Oct 2022 21:16 UTC
35 points
5 comments1 min readEA link

Don’t ac­cel­er­ate prob­lems you’re try­ing to solve

Andrea_Miotti15 Feb 2023 18:11 UTC
25 points
2 comments1 min readEA link

Ex­pected eth­i­cal value of a ca­reer in AI safety

Jordan Taylor14 Jun 2022 14:25 UTC
36 points
16 comments11 min readEA link

Chris Olah on what the hell is go­ing on in­side neu­ral networks

80000_Hours4 Aug 2021 15:13 UTC
4 points
0 comments135 min readEA link

[Our World in Data] AI timelines: What do ex­perts in ar­tifi­cial in­tel­li­gence ex­pect for the fu­ture? (Roser, 2023)

Will Aldred7 Feb 2023 14:52 UTC
84 points
1 comment1 min readEA link
(ourworldindata.org)

Stress Ex­ter­nal­ities More in AI Safety Pitches

NickGabs26 Sep 2022 20:31 UTC
31 points
13 comments2 min readEA link

Dear An­thropic peo­ple, please don’t re­lease Claude

No drama8 Feb 2023 2:44 UTC
22 points
5 comments1 min readEA link

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKee17 Jun 2022 11:52 UTC
32 points
1 comment2 min readEA link

[Question] Huh. Bing thing got me real anx­ious about AI. Re­sources to help with that please?

Arvin15 Feb 2023 16:55 UTC
2 points
7 comments1 min readEA link

Reflec­tions on the PIBBSS Fel­low­ship 2022

nora11 Dec 2022 22:03 UTC
69 points
4 comments18 min readEA link

(My sug­ges­tions) On Begin­ner Steps in AI Alignment

Joseph Bloom22 Sep 2022 15:32 UTC
24 points
3 comments9 min readEA link

Launch­ing The Col­lec­tive In­tel­li­gence Pro­ject: Whitepa­per and Pilots

jasmine_wang6 Feb 2023 17:00 UTC
37 points
8 comments2 min readEA link
(cip.org)

In­ter­view with Ro­man Yam­polskiy about AGI on The Real­ity Check

Darren McKee18 Feb 2023 23:29 UTC
27 points
0 comments1 min readEA link
(www.trcpodcast.com)

AI Safety For Dum­mies (Like Me)

Madhav Malhotra24 Aug 2022 20:26 UTC
16 points
7 comments20 min readEA link

A BOTEC es­ti­mat­ing the effects of an AI ca­pa­bil­ities pro­ject on AI timelines, un­al­igned AI, and hu­man extinction

WilliamKiely5 Feb 2023 11:26 UTC
15 points
4 comments1 min readEA link

Call for sub­mis­sions: AI Safety Spe­cial Ses­sion at the Con­fer­ence on Ar­tifi­cial Life (ALIFE 2023)

Rory Greig5 Feb 2023 16:37 UTC
16 points
0 comments2 min readEA link
(humanvaluesandartificialagency.com)

Chain­ing the evil ge­nie: why “outer” AI safety is prob­a­bly easy

titotal30 Aug 2022 13:55 UTC
18 points
10 comments10 min readEA link

Does most of your im­pact come from what you do soon?

Joshc21 Feb 2023 5:12 UTC
32 points
1 comment5 min readEA link

AGI Timelines in Gover­nance: Differ­ent Strate­gies for Differ­ent Timeframes

simeon_c19 Dec 2022 21:31 UTC
110 points
19 comments1 min readEA link

[Question] Clos­ing the Feed­back Loop on AI Safety Re­search.

Ben.Hartley29 Jul 2022 21:46 UTC
3 points
4 comments1 min readEA link

What is Differ­en­tial Tech­nolog­i­cal Devel­op­ment?

aj22 Feb 2023 21:50 UTC
14 points
0 comments13 min readEA link
(www.ajkourabi.com)

New se­ries of posts an­swer­ing one of Holden’s “Im­por­tant, ac­tion­able re­search ques­tions”

Evan R. Murphy12 May 2022 21:22 UTC
9 points
0 comments1 min readEA link

Se­cond call: CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram Rachum5 Feb 2023 12:19 UTC
2 points
0 comments2 min readEA link

A dis­cus­sion with ChatGPT on value-based mod­els vs. large lan­guage mod­els, etc..

Miguel4 Feb 2023 16:49 UTC
4 points
0 comments12 min readEA link
(www.whitehatstoic.com)

Crit­i­cism Thread: What things should OpenPhil im­prove on?

anonymousEA204 Feb 2023 8:16 UTC
69 points
7 comments2 min readEA link

How much should gov­ern­ments pay to pre­vent catas­tro­phes? Longter­mism’s limited role

EJT19 Mar 2023 16:50 UTC
172 points
4 comments35 min readEA link
(philpapers.org)

High-level hopes for AI alignment

Holden Karnofsky20 Dec 2022 2:11 UTC
118 points
14 comments19 min readEA link
(www.cold-takes.com)

The Limit of Lan­guage Models

𝕮𝖎𝖓𝖊𝖗𝖆26 Dec 2022 11:17 UTC
10 points
1 comment1 min readEA link

[Question] Which is more im­por­tant for re­duc­ing s-risks, re­search­ing on AI sen­tience or an­i­mal welfare?

jackchang11025 Feb 2023 2:20 UTC
9 points
0 comments1 min readEA link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Tech­nol­ogy Policy, USA (2022)

rodeo_flagellum5 Oct 2022 16:48 UTC
15 points
0 comments1 min readEA link

An au­dio ver­sion of the al­ign­ment prob­lem from a deep learn­ing per­spec­tive by Richard Ngo Et Al

Miguel3 Feb 2023 19:32 UTC
18 points
0 comments1 min readEA link
(www.whitehatstoic.com)

Assess­ing China’s im­por­tance as an AI superpower

JulianHazell3 Feb 2023 11:08 UTC
88 points
7 comments1 min readEA link
(muddyclothes.substack.com)

How to ‘troll for good’: Lev­er­ag­ing IP for AI governance

Michael Huang26 Feb 2023 6:34 UTC
26 points
2 comments1 min readEA link
(www.science.org)

[Question] Is there any re­search or fore­casts of how likely AI Align­ment is go­ing to be a hard vs. easy prob­lem rel­a­tive to ca­pa­bil­ities?

Jordan Arel14 Aug 2022 15:58 UTC
8 points
1 comment1 min readEA link

Seek­ing in­put on a list of AI books for broader audience

Darren McKee27 Feb 2023 22:40 UTC
33 points
9 comments5 min readEA link

Why I think it’s im­por­tant to work on AI forecasting

Matthew_Barnett27 Feb 2023 21:24 UTC
175 points
10 comments10 min readEA link

Very Briefly: The CHIPS Act

Yadav26 Feb 2023 13:53 UTC
33 points
3 comments1 min readEA link
(www.y1d2.com)

[Question] How long does it take to understand AI X-Risk from scratch so that I have a confident, clear mental model of it from first principles?

Jordan Arel27 Jul 2022 16:58 UTC
29 points
6 comments1 min readEA link

I asked ChatGPT about pat­tern recog­ni­tion, al­ign­ment and fair­ness con­straints.

Miguel3 Feb 2023 3:36 UTC
−2 points
0 comments1 min readEA link
(www.whitehatstoic.com)

40,000 rea­sons to worry about AI safety

Michael Huang2 Feb 2023 7:48 UTC
9 points
2 comments2 min readEA link
(www.theverge.com)

Fo­cus on the places where you feel shocked ev­ery­one’s drop­ping the ball

So8res2 Feb 2023 0:27 UTC
90 points
6 comments1 min readEA link

“AGI timelines: ig­nore the so­cial fac­tor at their peril” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:45 UTC
10 points
0 comments12 min readEA link
(trevorklee.substack.com)

Alexan­der and Yud­kowsky on AGI goals

Scott Alexander31 Jan 2023 23:36 UTC
25 points
1 comment1 min readEA link

In­tro­duc­ing Leap Labs, an AI in­ter­pretabil­ity startup

Jessica Rumbelow6 Mar 2023 17:37 UTC
9 points
0 comments1 min readEA link
(www.lesswrong.com)

Talk to me about your sum­mer/​ca­reer plans

Akash31 Jan 2023 18:29 UTC
30 points
0 comments1 min readEA link

Con­tribute by fa­cil­i­tat­ing the AGI Safety Fun­da­men­tals Programme

Jamie Bernardi6 Dec 2021 11:50 UTC
27 points
0 comments2 min readEA link

[Linkpost] Hu­man-nar­rated au­dio ver­sion of “Is Power-Seek­ing AI an Ex­is­ten­tial Risk?”

Joe_Carlsmith31 Jan 2023 19:19 UTC
7 points
0 comments1 min readEA link

Eli Lifland on Nav­i­gat­ing the AI Align­ment Landscape

Ozzie Gooen1 Feb 2023 0:07 UTC
48 points
9 comments31 min readEA link
(quri.substack.com)

A tough ca­reer decision

PabloAMC9 Apr 2022 0:46 UTC
68 points
13 comments4 min readEA link

Brain­storm of things that could force an AI team to burn their lead

So8res25 Jul 2022 0:00 UTC
26 points
1 comment12 min readEA link

Joscha Bach on Syn­thetic In­tel­li­gence [an­no­tated]

Roman Leventov2 Mar 2023 11:21 UTC
2 points
0 comments9 min readEA link
(www.jimruttshow.com)

Is in­ter­est in al­ign­ment worth men­tion­ing for grad school ap­pli­ca­tions?

Franziska Fischer16 Oct 2022 4:50 UTC
5 points
5 comments1 min readEA link

A Brief Overview of AI Safety/​Align­ment Orgs, Fields, Re­searchers, and Re­sources for ML Researchers

Austin Witte2 Feb 2023 6:19 UTC
18 points
5 comments2 min readEA link

Pre­dict­ing re­searcher in­ter­est in AI alignment

Vael Gates2 Feb 2023 0:58 UTC
30 points
0 comments21 min readEA link
(docs.google.com)

Call to de­mand an­swers from An­thropic about join­ing the AI race

sergia2 Mar 2023 17:26 UTC
14 points
71 comments1 min readEA link
(forum.effectivealtruism.org)

“AI Risk Dis­cus­sions” web­site: Ex­plor­ing in­ter­views from 97 AI Researchers

Vael Gates2 Feb 2023 1:00 UTC
41 points
1 comment1 min readEA link

AI Safety Ar­gu­ments: An In­ter­ac­tive Guide

Lukas Trötzmüller1 Feb 2023 19:21 UTC
32 points
5 comments3 min readEA link

An­nounc­ing the Cam­bridge Bos­ton Align­ment Ini­ti­a­tive [Hiring!]

kuhanj2 Dec 2022 1:07 UTC
83 points
0 comments1 min readEA link

Ret­ro­spec­tive on the AI Safety Field Build­ing Hub

Vael Gates2 Feb 2023 2:06 UTC
52 points
2 comments9 min readEA link

On value in hu­mans, other an­i­mals, and AI

Michele Campolo31 Jan 2023 23:48 UTC
7 points
6 comments5 min readEA link

Fol­low-Up Sur­vey: Ma­jor Lay­offs at Tech Gi­ants [re­cruit­ing sup­port]

nnn31 Jan 2023 1:49 UTC
36 points
4 comments1 min readEA link

Acausal normalcy

Andrew Critch3 Mar 2023 23:35 UTC
18 points
4 comments8 min readEA link

My per­sonal cruxes for work­ing on AI safety

Buck13 Feb 2020 7:11 UTC
135 points
35 comments45 min readEA link

The Benefits of Distil­la­tion in Research

Jonas Hallgren4 Mar 2023 19:19 UTC
42 points
2 comments5 min readEA link

[Question] How to nav­i­gate po­ten­tial infohazards

more better 4 Mar 2023 21:28 UTC
16 points
7 comments1 min readEA link

AGI Bat­tle Royale: Why “slow takeover” sce­nar­ios de­volve into a chaotic multi-AGI fight to the death

titotal22 Sep 2022 15:00 UTC
36 points
9 comments15 min readEA link

Time-stamp­ing: An ur­gent, ne­glected AI safety measure

Axel Svensson30 Jan 2023 11:21 UTC
57 points
27 comments3 min readEA link

Les­sons from Three Mile Is­land for AI Warn­ing Shots

NickGabs26 Sep 2022 2:47 UTC
42 points
0 comments12 min readEA link

[Question] Up­dates on FLI’S Value Align­ment Map?

rodeo_flagellum19 Sep 2022 0:25 UTC
8 points
0 comments2 min readEA link

Ap­ply to HAIST/​MAIA’s AI Gover­nance Work­shop in DC (Feb 17-20)

Phosphorous28 Jan 2023 0:45 UTC
15 points
0 comments1 min readEA link
(www.lesswrong.com)

In­ter­views with 97 AI Re­searchers: Quan­ti­ta­tive Analysis

Maheen Shermohammed2 Feb 2023 4:50 UTC
73 points
4 comments7 min readEA link

When is AI safety re­search harm­ful?

Nathan_Barnard9 May 2022 10:36 UTC
13 points
6 comments9 min readEA link

An­thropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster9 Mar 2023 17:30 UTC
100 points
6 comments22 min readEA link
(www.anthropic.com)

Com­plex Sys­tems for AI Safety [Prag­matic AI Safety #3]

ThomasW24 May 2022 0:04 UTC
48 points
6 comments21 min readEA link

Every­thing’s nor­mal un­til it’s not

Eleni_A10 Mar 2023 1:42 UTC
6 points
0 comments3 min readEA link

Open Prob­lems in AI X-Risk [PAIS #5]

ThomasW10 Jun 2022 2:22 UTC
44 points
1 comment36 min readEA link

Ja­pan AI Align­ment Conference

ChrisScammell10 Mar 2023 9:23 UTC
17 points
2 comments1 min readEA link
(www.conjecture.dev)

Thoughts on the OpenAI al­ign­ment plan: will AI re­search as­sis­tants be net-pos­i­tive for AI ex­is­ten­tial risk?

Jeffrey Ladish10 Mar 2023 8:20 UTC
10 points
0 comments9 min readEA link

You won’t solve al­ign­ment with­out agent foundations

Samin6 Nov 2022 8:07 UTC
12 points
0 comments1 min readEA link

[Question] AI Safety Pitches post ChatGPT

ojorgensen5 Dec 2022 22:48 UTC
6 points
2 comments1 min readEA link

Loss of con­trol of AI is not a likely source of AI x-risk

squek9 Nov 2022 5:48 UTC
8 points
0 comments1 min readEA link

On tak­ing AI risk se­ri­ously

Eleni_A13 Mar 2023 5:44 UTC
51 points
4 comments1 min readEA link
(www.nytimes.com)

“How to Es­cape from the Si­mu­la­tion”—Seeds of Science call for reviewers

rogersbacon126 Jan 2023 15:12 UTC
7 points
0 comments1 min readEA link

[Question] De­sign­ing user au­then­ti­ca­tion pro­to­cols

Kinoshita Yoshikazu (pseudonym)13 Mar 2023 15:56 UTC
−1 points
2 comments1 min readEA link

Three sce­nar­ios of pseudo-al­ign­ment

Eleni_A5 Sep 2022 20:26 UTC
7 points
0 comments3 min readEA link

AGI Safety Fun­da­men­tals cur­ricu­lum and application

richard_ngo20 Oct 2021 21:45 UTC
123 points
20 comments8 min readEA link
(docs.google.com)

AI Risk in Africa

Claude Formanek12 Oct 2021 2:28 UTC
16 points
0 comments10 min readEA link

Pitch­ing AI Safety in 3 sentences

PabloAMC30 Mar 2022 18:50 UTC
7 points
0 comments1 min readEA link

Sha­har Avin on How to Strate­gi­cally Reg­u­late Ad­vanced AI Systems

Michaël Trazzi23 Sep 2022 15:49 UTC
46 points
1 comment5 min readEA link
(theinsideview.ai)

Idea: an AI gov­er­nance group colo­cated with ev­ery AI re­search group!

capybaralet7 Dec 2020 23:41 UTC
8 points
1 comment1 min readEA link

Nu­clear Es­pi­onage and AI Governance

GAA4 Oct 2021 18:21 UTC
32 points
3 comments24 min readEA link

CFP for the Largest An­nual Meet­ing of Poli­ti­cal Science: Get Help With Your Re­search Submission

Mahendra Prasad22 Dec 2020 23:39 UTC
13 points
0 comments2 min readEA link

Si­mu­la­tors and Mindcrime

𝕮𝖎𝖓𝖊𝖗𝖆9 Dec 2022 15:20 UTC
1 point
0 comments1 min readEA link

Ques­tions for fur­ther in­ves­ti­ga­tion of AI diffusion

Ben Cottier21 Dec 2022 13:50 UTC
28 points
0 comments11 min readEA link

Estab­lish­ing Oxford’s AI Safety Stu­dent Group: Les­sons Learnt and Our Model

Wilkin123421 Sep 2022 7:57 UTC
71 points
3 comments1 min readEA link

AI Gover­nance Read­ing Group Guide

Alex HT25 Jun 2020 10:16 UTC
25 points
2 comments3 min readEA link

Sup­ple­ment to “The Brus­sels Effect and AI: How EU AI reg­u­la­tion will im­pact the global AI mar­ket”

MarkusAnderljung16 Aug 2022 20:55 UTC
107 points
7 comments8 min readEA link

Longter­mist rea­sons to work for in­no­va­tive governments

Alexis Carlier13 Oct 2020 16:32 UTC
74 points
8 comments1 min readEA link

Slightly against al­ign­ing with neo-luddites

Matthew_Barnett26 Dec 2022 23:27 UTC
70 points
17 comments4 min readEA link

A Map to Nav­i­gate AI Governance

hanadulset14 Feb 2022 22:41 UTC
64 points
11 comments24 min readEA link

Google could build a con­scious AI in three months

Derek Shiller1 Oct 2022 13:24 UTC
14 points
17 comments7 min readEA link

AI Gover­nance Needs Tech­ni­cal Work

Mauricio5 Sep 2022 22:25 UTC
91 points
3 comments7 min readEA link

Ex­is­ten­tial Risk of Misal­igned In­tel­li­gence Aug­men­ta­tion (Par­tic­u­larly Us­ing High-Band­width BCI Im­plants)

Damian Gorski24 Jan 2023 17:02 UTC
1 point
0 comments9 min readEA link

What if AI de­vel­op­ment goes well?

RoryG3 Aug 2022 8:57 UTC
25 points
7 comments12 min readEA link

AI Safety Un­con­fer­ence NeurIPS 2022

Orpheus_Lummis7 Nov 2022 15:39 UTC
13 points
5 comments1 min readEA link
(aisafetyevents.org)

Mas­sive Scal­ing Should be Frowned Upon

harsimony17 Nov 2022 17:44 UTC
9 points
0 comments5 min readEA link

An­nounc­ing the SPT Model Web App for AI Governance

Paolo Bova4 Aug 2022 10:45 UTC
36 points
0 comments3 min readEA link

[Question] What are the challenges and prob­lems with pro­gram­ming law-break­ing con­straints into AGI?

MichaelStJules2 Feb 2020 20:53 UTC
12 points
34 comments1 min readEA link

AI Gover­nance Read­ing Group [Toronto+re­mote]

Liav.Koren24 Jan 2023 22:05 UTC
2 points
0 comments1 min readEA link

Cryp­tocur­rency Ex­ploits Show the Im­por­tance of Proac­tive Poli­cies for AI X-Risk

eSpencer16 Sep 2022 4:44 UTC
14 points
0 comments3 min readEA link

AGI safety field build­ing pro­jects I’d like to see

Severin24 Jan 2023 23:30 UTC
25 points
2 comments1 min readEA link

AI Alter­na­tive Fu­tures: Ex­plo­ra­tory Sce­nario Map­ping for Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion [Linkpost]

Kiliank9 May 2022 19:53 UTC
17 points
2 comments8 min readEA link

AGI al­ign­ment re­sults from a se­ries of al­igned ac­tions

hanadulset27 Dec 2021 19:33 UTC
15 points
1 comment6 min readEA link

EU AI Act now has a sec­tion on gen­eral pur­pose AI systems

MathiasKB9 Dec 2021 12:40 UTC
64 points
10 comments1 min readEA link

Im­pli­ca­tions of large lan­guage model diffu­sion for AI governance

Ben Cottier21 Dec 2022 13:50 UTC
14 points
0 comments38 min readEA link

Two Strange Things About AI Safety Policy

Jay_Shooster28 Sep 2016 16:23 UTC
26 points
28 comments4 min readEA link

My sum­mary of “Prag­matic AI Safety”

Eleni_A5 Nov 2022 14:47 UTC
14 points
0 comments5 min readEA link

CNAS re­port: ‘Ar­tifi­cial In­tel­li­gence and Arms Con­trol’

MMMaas13 Oct 2022 8:35 UTC
14 points
0 comments1 min readEA link
(www.cnas.org)

Assess­ing the state of AI R&D in the US, China, and Europe – Part 1: Out­put indicators

stefan.torges1 Nov 2019 14:41 UTC
21 points
0 comments14 min readEA link

[Question] What are the most press­ing is­sues in short-term AI policy?

BrownHairedEevee14 Jan 2020 22:05 UTC
9 points
0 comments1 min readEA link

Who owns AI-gen­er­ated con­tent?

Johan S Daniel7 Dec 2022 3:03 UTC
−2 points
0 comments2 min readEA link

Against GDP as a met­ric for timelines and take­off speeds

kokotajlod29 Dec 2020 17:50 UTC
47 points
6 comments14 min readEA link

“Nor­mal ac­ci­dents” and AI sys­tems

Eleni_A8 Aug 2022 18:43 UTC
4 points
1 comment1 min readEA link
(www.achan.ca)

AI safety mile­stones?

Zach Stein-Perlman23 Jan 2023 21:30 UTC
6 points
0 comments1 min readEA link

Up­date to Samotsvety AGI timelines

Misha_Yagudin24 Jan 2023 4:27 UTC
110 points
9 comments4 min readEA link

Prizes for ML Safety Bench­mark Ideas

Joshc28 Oct 2022 2:44 UTC
57 points
3 comments1 min readEA link

[Question] Track­ing Com­pute Stocks and Flows: Case Stud­ies?

Cullen5 Oct 2022 17:54 UTC
34 points
1 comment1 min readEA link

AI Safety Ca­reer Bot­tle­necks Sur­vey Re­sponses Responses

Linda Linsefors28 May 2021 10:41 UTC
34 points
1 comment5 min readEA link

Work­ing in Congress (Part #1): Back­ground and some EA cause area analysis

US Policy Careers11 Apr 2021 18:24 UTC
98 points
3 comments28 min readEA link

What role should evolu­tion­ary analo­gies play in un­der­stand­ing AI take­off speeds?

anson11 Dec 2021 1:16 UTC
12 points
0 comments42 min readEA link

AMA: Fu­ture of Life In­sti­tute’s EU Team

Risto Uuk31 Jan 2022 17:14 UTC
44 points
15 comments2 min readEA link

[Question] How to Im­prove China-Western Co­or­di­na­tion on EA Is­sues?

Michael Kehoe3 Nov 2021 7:28 UTC
15 points
2 comments1 min readEA link

[Question] Has pri­vate AGI re­search made in­de­pen­dent safety re­search in­effec­tive already? What should we do about this?

Roman Leventov23 Jan 2023 16:23 UTC
15 points
0 comments5 min readEA link

[Job ad] Re­search im­por­tant longter­mist top­ics at Re­think Pri­ori­ties!

Linch6 Oct 2021 19:09 UTC
65 points
46 comments1 min readEA link

There should be a pub­lic ad­ver­sar­ial col­lab­o­ra­tion on AI x-risk

pradyuprasad23 Jan 2023 4:09 UTC
56 points
5 comments2 min readEA link

Jade Le­ung: Why com­pa­nies should be lead­ing on AI governance

EA Global15 May 2019 23:37 UTC
28 points
7 comments18 min readEA link
(www.youtube.com)

Com­pute & An­titrust: Reg­u­la­tory im­pli­ca­tions of the AI hard­ware sup­ply chain, from chip de­sign to cloud APIs

HaydnBelfield19 Aug 2022 17:20 UTC
32 points
0 comments6 min readEA link
(verfassungsblog.de)

Where are the red lines for AI?

Karl von Wendt5 Aug 2022 9:41 UTC
13 points
3 comments6 min readEA link

My (naive) take on Risks from Learned Optimization

Artyom K6 Nov 2022 16:25 UTC
5 points
0 comments1 min readEA link

The AI rev­olu­tion and in­ter­na­tional poli­tics (Allan Dafoe)

EA Global2 Jun 2017 8:48 UTC
8 points
0 comments19 min readEA link
(www.youtube.com)

What does (and doesn’t) AI mean for effec­tive al­tru­ism?

EA Global12 Aug 2017 7:00 UTC
8 points
0 comments12 min readEA link

Ngo and Yud­kowsky on AI ca­pa­bil­ity gains

richard_ngo19 Nov 2021 1:54 UTC
23 points
4 comments39 min readEA link

Fol­low along with Columbia EA’s Ad­vanced AI Safety Fel­low­ship!

RohanS2 Jul 2022 6:07 UTC
27 points
0 comments2 min readEA link

What a com­pute-cen­tric frame­work says about AI take­off speeds—draft report

Tom_Davidson23 Jan 2023 4:09 UTC
185 points
5 comments16 min readEA link
(www.lesswrong.com)

Does gen­er­al­ity pay? GPT-3 can provide pre­limi­nary ev­i­dence.

BrownHairedEevee12 Jul 2020 18:53 UTC
21 points
4 comments2 min readEA link

Main paths to im­pact in EU AI Policy

JOMG_Monnet8 Dec 2022 16:17 UTC
67 points
2 comments8 min readEA link

Daniel Dewey: The Open Philan­thropy Pro­ject’s work on po­ten­tial risks from ad­vanced AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments18 min readEA link
(www.youtube.com)

Un­der­stand­ing the diffu­sion of large lan­guage mod­els: summary

Ben Cottier21 Dec 2022 13:49 UTC
124 points
18 comments22 min readEA link

Book re­view: Ar­chi­tects of In­tel­li­gence by Martin Ford (2018)

Ofer11 Aug 2020 17:24 UTC
11 points
1 comment2 min readEA link

[Link and com­men­tary] Beyond Near- and Long-Term: Towards a Clearer Ac­count of Re­search Pri­ori­ties in AI Ethics and Society

MichaelA14 Mar 2020 9:04 UTC
18 points
0 comments7 min readEA link

[Link post] Co­or­di­na­tion challenges for pre­vent­ing AI conflict

stefan.torges9 Mar 2021 9:39 UTC
52 points
0 comments1 min readEA link
(longtermrisk.org)

The ‘Old AI’: Les­sons for AI gov­er­nance from early elec­tric­ity regulation

Sam Clarke19 Dec 2022 2:46 UTC
58 points
1 comment13 min readEA link

FLI is hiring a new Direc­tor of US Policy

aaguirre27 Jul 2022 0:07 UTC
14 points
0 comments1 min readEA link

Here are the fi­nal­ists from FLI’s $100K Wor­ld­build­ing Contest

Jackson Wagner6 Jun 2022 18:42 UTC
42 points
5 comments2 min readEA link

AI Benefits Post 4: Out­stand­ing Ques­tions on Select­ing Benefits

Cullen14 Jul 2020 17:24 UTC
6 points
0 comments5 min readEA link

Why we need a new agency to reg­u­late ad­vanced ar­tifi­cial intelligence

Michael Huang4 Aug 2022 13:38 UTC
25 points
0 comments1 min readEA link
(www.brookings.edu)

The Slip­pery Slope from DALLE-2 to Deep­fake Anarchy

stecas5 Nov 2022 14:47 UTC
54 points
11 comments17 min readEA link

[Pod­cast] Ajeya Co­tra on wor­ld­view di­ver­sifi­ca­tion and how big the fu­ture could be

BrownHairedEevee22 Jan 2021 23:57 UTC
57 points
20 comments1 min readEA link
(80000hours.org)

Good policy ideas that won’t hap­pen (yet)

Niel_Bowerman11 Sep 2014 12:29 UTC
28 points
8 comments14 min readEA link

Chris­ti­ano, Co­tra, and Yud­kowsky on AI progress

Ajeya25 Nov 2021 16:30 UTC
18 points
6 comments69 min readEA link

An­titrust-Com­pli­ant AI In­dus­try Self-Regulation

Cullen7 Jul 2020 20:52 UTC
26 points
1 comment1 min readEA link
(cullenokeefe.com)

Will the EU reg­u­la­tions on AI mat­ter to the rest of the world?

hanadulset1 Jan 2022 21:56 UTC
33 points
5 comments5 min readEA link

[Question] If FTX is liqui­dated, who ends up con­trol­ling An­thropic?

Ofer15 Nov 2022 15:04 UTC
63 points
8 comments1 min readEA link

In­sti­tu­tions Can­not Res­train Dark-Triad AI Exploitation

Remmelt27 Dec 2022 10:34 UTC
8 points
0 comments1 min readEA link

Slow­ing down AI progress is an un­der­ex­plored al­ign­ment strategy

Michael Huang13 Jul 2022 3:22 UTC
89 points
11 comments3 min readEA link
(www.lesswrong.com)

Three pillars for avoid­ing AGI catas­tro­phe: Tech­ni­cal al­ign­ment, de­ploy­ment de­ci­sions, and co­or­di­na­tion

alexlintz3 Aug 2022 21:24 UTC
83 points
4 comments11 min readEA link

[Fic­tion] Im­proved Gover­nance on the Crit­i­cal Path to AI Align­ment by 2045.

Jackson Wagner18 May 2022 15:50 UTC
20 points
1 comment12 min readEA link

An in­ter­ven­tion to shape policy di­alogue, com­mu­ni­ca­tion, and AI re­search norms for AI safety

Lee_Sharkey1 Oct 2017 18:29 UTC
9 points
28 comments10 min readEA link

Com­po­nents of Strate­gic Clar­ity [Strate­gic Per­spec­tives on Long-term AI Gover­nance, #2]

MMMaas2 Jul 2022 11:22 UTC
62 points
0 comments5 min readEA link

[Question] How strong is the ev­i­dence of un­al­igned AI sys­tems caus­ing harm?

BrownHairedEevee21 Jul 2020 4:08 UTC
31 points
1 comment1 min readEA link

Sur­vey on AI ex­is­ten­tial risk scenarios

Sam Clarke8 Jun 2021 17:12 UTC
148 points
6 comments6 min readEA link

How might we al­ign trans­for­ma­tive AI if it’s de­vel­oped very soon?

Holden Karnofsky29 Aug 2022 15:48 UTC
155 points
17 comments44 min readEA link

What does it mean to be­come an ex­pert in AI Hard­ware?

Christopher_Phenicie9 Jan 2021 4:15 UTC
86 points
10 comments11 min readEA link

[link] Cen­tre for the Gover­nance of AI 2020 An­nual Report

MarkusAnderljung14 Jan 2021 10:23 UTC
11 points
5 comments1 min readEA link

Truth­ful AI

Owen Cotton-Barratt20 Oct 2021 15:11 UTC
55 points
14 comments10 min readEA link

‘Ar­tifi­cial In­tel­li­gence Gover­nance un­der Change’ (PhD dis­ser­ta­tion)

MMMaas15 Sep 2022 12:10 UTC
52 points
1 comment2 min readEA link
(drive.google.com)

FHI Re­port: Stable Agree­ments in Tur­bu­lent Times

Cullen21 Feb 2019 17:12 UTC
25 points
2 comments4 min readEA link
(www.fhi.ox.ac.uk)

Birds, Brains, Planes, and AI: Against Ap­peals to the Com­plex­ity/​Mys­te­ri­ous­ness/​Effi­ciency of the Brain

kokotajlod18 Jan 2021 12:39 UTC
27 points
2 comments1 min readEA link

The Pug­wash Con­fer­ences and the Anti-Bal­lis­tic Mis­sile Treaty as a case study of Track II diplomacy

rani_martin16 Sep 2022 10:42 UTC
82 points
5 comments27 min readEA link

Quick Thoughts on A.I. Governance

NicholasKross30 Apr 2022 14:49 UTC
43 points
0 comments2 min readEA link
(www.thinkingmuchbetter.com)

Ideal gov­er­nance (for com­pa­nies, coun­tries and more)

Holden Karnofsky7 Apr 2022 16:54 UTC
80 points
19 comments14 min readEA link

The Im­por­tance of Ar­tifi­cial Sentience

Jamie_Harris3 Mar 2021 17:17 UTC
64 points
10 comments12 min readEA link
(www.sentienceinstitute.org)

[Link] Cen­ter for the Gover­nance of AI (GovAI) An­nual Re­port 2018

MarkusAnderljung21 Dec 2018 16:17 UTC
24 points
0 comments1 min readEA link

WFW?: Op­por­tu­nity and The­ory of Impact

DavidCorfield2 Nov 2022 0:45 UTC
2 points
5 comments14 min readEA link
(www.whatfuture.world)

New Se­quence—Towards a wor­ld­wide, wa­ter­tight Wind­fall Clause

John Bridge7 Apr 2022 15:02 UTC
25 points
4 comments8 min readEA link

NIST AI Risk Man­age­ment Frame­work re­quest for in­for­ma­tion (RFI)

Aryeh Englander31 Aug 2021 22:24 UTC
7 points
0 comments2 min readEA link

How Europe might mat­ter for AI governance

stefan.torges12 Jul 2019 23:42 UTC
52 points
13 comments8 min readEA link

Went­worth and Larsen on buy­ing time

Akash9 Jan 2023 21:31 UTC
48 points
0 comments1 min readEA link

[Question] Should AI writ­ers be pro­hibited in ed­u­ca­tion?

Eleni_A16 Jan 2023 22:29 UTC
3 points
2 comments1 min readEA link

Model-Based Policy Anal­y­sis un­der Deep Uncertainty

Max Reddel6 Mar 2023 14:24 UTC
80 points
31 comments19 min readEA link

Shul­man and Yud­kowsky on AI progress

CarlShulman4 Dec 2021 11:37 UTC
46 points
0 comments20 min readEA link

MIRI Con­ver­sa­tions: Tech­nol­ogy Fore­cast­ing & Grad­u­al­ism (Distil­la­tion)

TheMcDouglas13 Jul 2022 10:45 UTC
27 points
9 comments19 min readEA link

An­nounc­ing Epoch: A re­search or­ga­ni­za­tion in­ves­ti­gat­ing the road to Trans­for­ma­tive AI

Jaime Sevilla27 Jun 2022 13:39 UTC
183 points
11 comments2 min readEA link
(epochai.org)

“Slower tech de­vel­op­ment” can be about or­der­ing, grad­u­al­ness, or dis­tance from now

MichaelA14 Nov 2021 20:58 UTC
37 points
3 comments4 min readEA link

AI im­pacts and Paul Chris­ti­ano on take­off speeds

Crosspost2 Mar 2018 11:16 UTC
4 points
0 comments1 min readEA link

Con­ti­nu­ity Assumptions

Jan_Kulveit13 Jun 2022 21:36 UTC
42 points
4 comments4 min readEA link
(www.alignmentforum.org)

Vignettes Work­shop (AI Im­pacts)

kokotajlod15 Jun 2021 11:02 UTC
43 points
5 comments1 min readEA link

Epoch is hiring a Re­search Data Analyst

merilalama22 Nov 2022 17:34 UTC
21 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

Hereti­cal Thoughts on AI | Eli Dourado

𝕮𝖎𝖓𝖊𝖗𝖆19 Jan 2023 16:11 UTC
137 points
15 comments1 min readEA link

An­i­mal Rights, The Sin­gu­lar­ity, and Astro­nom­i­cal Suffering

sapphire20 Aug 2020 20:23 UTC
50 points
0 comments3 min readEA link

Paul Chris­ti­ano on how OpenAI is de­vel­op­ing real solu­tions to the ‘AI al­ign­ment prob­lem’, and his vi­sion of how hu­man­ity will pro­gres­sively hand over de­ci­sion-mak­ing to AI systems

80000_Hours2 Oct 2018 11:49 UTC
6 points
0 comments188 min readEA link

Op­por­tu­ni­ties for in­di­vi­d­ual donors in AI safety

alexflint12 Mar 2018 2:10 UTC
13 points
11 comments10 min readEA link

AGI risk: analo­gies & arguments

Gavin23 Mar 2021 13:18 UTC
31 points
3 comments8 min readEA link
(www.gleech.org)

Ap­ply to the ML for Align­ment Boot­camp (MLAB) in Berkeley [Jan 3 - Jan 22]

Habryka3 Nov 2021 18:20 UTC
140 points
6 comments1 min readEA link

SERI ML Align­ment The­ory Schol­ars Pro­gram 2022

Ryan Kidd27 Apr 2022 16:33 UTC
57 points
2 comments3 min readEA link

Michael Page, Dario Amodei, He­len Toner, Tasha McCauley, Jan Leike, & Owen Cot­ton-Bar­ratt: Mus­ings on AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

In­for­mat­ica: Spe­cial Is­sue on Superintelligence

RyanCarey3 May 2017 5:05 UTC
7 points
0 comments2 min readEA link

Long-Term Fu­ture Fund: Ask Us Any­thing!

AdamGleave3 Dec 2020 13:44 UTC
89 points
153 comments1 min readEA link

AI Fore­cast­ing Dic­tionary (Fore­cast­ing in­fras­truc­ture, part 1)

jacobjacob8 Aug 2019 13:16 UTC
18 points
0 comments5 min readEA link

AI Im­pacts: His­toric trends in tech­nolog­i­cal progress

Aaron Gertler12 Feb 2020 0:08 UTC
55 points
5 comments3 min readEA link

Ap­ply to the sec­ond ML for Align­ment Boot­camp (MLAB 2) in Berkeley [Aug 15 - Fri Sept 2]

Buck6 May 2022 0:19 UTC
109 points
7 comments5 min readEA link

Asya Ber­gal: Rea­sons you might think hu­man-level AI is un­likely to hap­pen soon

EA Global26 Aug 2020 16:01 UTC
23 points
2 comments17 min readEA link
(www.youtube.com)

Why the Orthog­o­nal­ity Th­e­sis’s ve­rac­ity is not the point:

Antoine de Scorraille23 Jul 2020 15:40 UTC
3 points
0 comments3 min readEA link

Pod­cast: Krister Bykvist on moral un­cer­tainty, ra­tio­nal­ity, metaethics, AI and fu­ture pop­u­la­tions

Gus Docker21 Oct 2021 15:17 UTC
8 points
0 comments1 min readEA link
(www.utilitarianpodcast.com)

[Question] What are the top pri­ori­ties in a slow-take­off, mul­ti­po­lar world?

JP Addison25 Aug 2021 8:47 UTC
26 points
9 comments1 min readEA link

Ought: why it mat­ters and ways to help

Paul_Christiano26 Jul 2019 1:56 UTC
52 points
5 comments5 min readEA link

An­nounc­ing the Har­vard AI Safety Team

Xander Davies30 Jun 2022 18:34 UTC
128 points
4 comments5 min readEA link

AI Re­search Con­sid­er­a­tions for Hu­man Ex­is­ten­tial Safety (ARCHES)

Andrew Critch21 May 2020 6:55 UTC
29 points
0 comments3 min readEA link
(acritch.com)

Two rea­sons we might be closer to solv­ing al­ign­ment than it seems

Kat Woods24 Sep 2022 17:38 UTC
38 points
18 comments4 min readEA link

In­ter­view with Tom Chivers: “AI is a plau­si­ble ex­is­ten­tial risk, but it feels as if I’m in Pas­cal’s mug­ging”

felix.h21 Feb 2021 13:41 UTC
16 points
1 comment7 min readEA link

Three kinds of competitiveness

AI Impacts2 Apr 2020 3:46 UTC
10 points
0 comments5 min readEA link
(aiimpacts.org)

Cri­tique of Su­per­in­tel­li­gence Part 5

Fods1213 Dec 2018 5:19 UTC
12 points
2 comments6 min readEA link

[Question] Who would you have on your dream team for solv­ing AGI Align­ment?

Greg_Colbourn25 Aug 2022 13:34 UTC
8 points
14 comments1 min readEA link

$500 bounty for al­ign­ment con­test ideas

Akash30 Jun 2022 1:55 UTC
18 points
1 comment2 min readEA link

BERI is hiring an ML Soft­ware Engineer

sawyer10 Nov 2021 19:36 UTC
17 points
2 comments1 min readEA link

AI views and dis­agree­ments AMA: Chris­ti­ano, Ngo, Shah, Soares, Yudkowsky

RobBensinger1 Mar 2022 1:13 UTC
30 points
5 comments1 min readEA link
(www.lesswrong.com)

The al­ign­ment prob­lem from a deep learn­ing perspective

richard_ngo11 Aug 2022 3:18 UTC
58 points
0 comments21 min readEA link

An­nounc­ing AI Safety Support

Linda Linsefors19 Nov 2020 20:19 UTC
55 points
0 comments4 min readEA link

Ngo and Yud­kowsky on al­ign­ment difficulty

richard_ngo15 Nov 2021 22:47 UTC
71 points
13 comments94 min readEA link

Long-Term Fu­ture Fund: April 2019 grant recommendations

Habryka23 Apr 2019 7:00 UTC
142 points
242 comments47 min readEA link

Con­jec­ture: In­ter­nal In­fo­haz­ard Policy

Connor Leahy29 Jul 2022 19:35 UTC
34 points
3 comments18 min readEA link

Con­sider try­ing the ELK con­test (I am)

Holden Karnofsky5 Jan 2022 19:42 UTC
110 points
17 comments16 min readEA link

Ex­plain­ing AI Misal­ign­ment with Stable Diffusion

Yulia4 Sep 2022 20:10 UTC
6 points
1 comment4 min readEA link
(yuliaverse.substack.com)

Cog­ni­tive Science/​Psy­chol­ogy As a Ne­glected Ap­proach to AI Safety

Kaj_Sotala5 Jun 2017 13:46 UTC
37 points
37 comments4 min readEA link

PIBBSS Fel­low­ship: Bounty for Refer­rals & Dead­line Extension

Anna_Gajdova17 Jan 2022 16:23 UTC
17 points
7 comments1 min readEA link

[Question] Should I force my­self to work on AGI al­ign­ment?

Isaac Benson24 Aug 2022 17:25 UTC
19 points
17 comments1 min readEA link

The Vi­talik Bu­terin Fel­low­ship in AI Ex­is­ten­tial Safety is open for ap­pli­ca­tions!

Cynthia Chen14 Oct 2022 3:23 UTC
37 points
0 comments2 min readEA link

It’s (not) how you use it

Eleni_A7 Sep 2022 13:28 UTC
6 points
3 comments2 min readEA link

[Question] Brief sum­mary of key dis­agree­ments in AI Risk

Aryeh Englander26 Dec 2019 19:40 UTC
31 points
3 comments1 min readEA link

AGI in a vuln­er­a­ble world

AI Impacts2 Apr 2020 3:43 UTC
17 points
0 comments1 min readEA link
(aiimpacts.org)

[Question] How can we se­cure more re­search po­si­tions at our uni­ver­si­ties for x-risk re­searchers?

Neil Crawford6 Sep 2022 14:41 UTC
3 points
2 comments1 min readEA link

De­cep­tion as the op­ti­mal: mesa-op­ti­miz­ers and in­ner al­ign­ment

Eleni_A16 Aug 2022 3:45 UTC
19 points
0 comments5 min readEA link

Twit­ter-length re­sponses to 24 AI al­ign­ment arguments

RobBensinger14 Mar 2022 19:34 UTC
67 points
17 comments8 min readEA link

Short-Term AI Align­ment as a Pri­or­ity Cause

len.hoang.lnh11 Feb 2020 16:22 UTC
17 points
11 comments7 min readEA link

Align­ment Newslet­ter One Year Retrospective

Rohin Shah10 Apr 2019 7:00 UTC
62 points
22 comments21 min readEA link

AMA: Ajeya Co­tra, re­searcher at Open Phil

Ajeya28 Jan 2021 17:38 UTC
84 points
105 comments1 min readEA link

We Ran an AI Timelines Retreat

Lenny McCline17 May 2022 4:40 UTC
46 points
6 comments3 min readEA link

The Tree of Life: Stan­ford AI Align­ment The­ory of Change

Gabriel Mukobi2 Jul 2022 18:32 UTC
68 points
5 comments14 min readEA link

A list of good heuris­tics that the case for AI X-risk fails

Aaron Gertler16 Jul 2020 9:56 UTC
23 points
9 comments2 min readEA link
(www.alignmentforum.org)

Is GPT-3 the death of the pa­per­clip max­i­mizer?

matthias_samwald3 Aug 2020 11:34 UTC
4 points
1 comment1 min readEA link

Paul Chris­ti­ano: Cur­rent work in AI alignment

EA Global3 Apr 2020 7:06 UTC
74 points
1 comment22 min readEA link
(www.youtube.com)

New re­port on how much com­pu­ta­tional power it takes to match the hu­man brain (Open Philan­thropy)

Aaron Gertler15 Sep 2020 1:06 UTC
41 points
1 comment18 min readEA link
(www.openphilanthropy.org)

Align­ment is hard. Com­mu­ni­cat­ing that, might be harder

Eleni_A1 Sep 2022 11:45 UTC
17 points
1 comment3 min readEA link

[Question] Why AGI's utility can't outweigh humans' utility?

Alex P20 Sep 2022 5:16 UTC
6 points
26 comments1 min readEA link

AI Safety Needs Great Engineers

Andy Jones23 Nov 2021 21:03 UTC
95 points
13 comments4 min readEA link

An ML safety in­surance com­pany—shower thoughts

EdoArad18 Oct 2021 7:45 UTC
15 points
4 comments1 min readEA link

LessWrong is now a book, available for pre-or­der!

jacobjacob4 Dec 2020 20:42 UTC
48 points
1 comment10 min readEA link

In­creased Availa­bil­ity and Willing­ness for De­ploy­ment of Re­sources for Effec­tive Altru­ism and Long-Termism

Evan_Gaensbauer29 Dec 2021 20:20 UTC
46 points
1 comment2 min readEA link

“Ex­is­ten­tial risk from AI” sur­vey results

RobBensinger1 Jun 2021 20:19 UTC
80 points
35 comments11 min readEA link

Align­ment’s phlo­gis­ton

Eleni_A18 Aug 2022 1:41 UTC
18 points
1 comment2 min readEA link

7 es­says on Build­ing a Bet­ter Future

Jamie_Harris24 Jun 2022 14:28 UTC
21 points
0 comments2 min readEA link

Some promis­ing ca­reer ideas be­yond 80,000 Hours’ pri­or­ity paths

Ardenlk26 Jun 2020 10:34 UTC
140 points
28 comments15 min readEA link

Tech­ni­cal AGI safety re­search out­side AI

richard_ngo18 Oct 2019 15:02 UTC
86 points
5 comments4 min readEA link

A cen­tral AI al­ign­ment prob­lem: ca­pa­bil­ities gen­er­al­iza­tion, and the sharp left turn

So8res15 Jun 2022 14:19 UTC
51 points
2 comments7 min readEA link

SERI ML ap­pli­ca­tion dead­line is ex­tended un­til May 22.

Viktoria Malyasova22 May 2022 0:13 UTC
13 points
3 comments1 min readEA link

Messy per­sonal stuff that af­fected my cause pri­ori­ti­za­tion (or: how I started to care about AI safety)

Julia_Wise5 May 2022 17:59 UTC
262 points
14 comments2 min readEA link

Be­ing an in­di­vi­d­ual al­ign­ment grantmaker

A_donor28 Feb 2022 16:39 UTC
34 points
20 comments2 min readEA link

[Question] How do you talk about AI safety?

BrownHairedEevee19 Apr 2020 16:15 UTC
10 points
5 comments1 min readEA link

Rele­vant pre-AGI possibilities

kokotajlod20 Jun 2020 13:15 UTC
22 points
0 comments1 min readEA link
(aiimpacts.org)

[Question] What is most con­fus­ing to you about AI stuff?

Sam Clarke23 Nov 2021 16:00 UTC
25 points
15 comments1 min readEA link

How do take­off speeds af­fect the prob­a­bil­ity of bad out­comes from AGI?

KR7 Jul 2020 17:53 UTC
18 points
0 comments8 min readEA link

[Question] Is work­ing on AI safety as dan­ger­ous as ig­nor­ing it?

jkmh20 Sep 2021 23:06 UTC
10 points
5 comments1 min readEA link

[Question] Is it crunch time yet? If so, who can help?

NicholasKross13 Oct 2021 4:11 UTC
29 points
9 comments1 min readEA link

Syd­ney AI Safety Fellowship

Chris Leong2 Dec 2021 7:35 UTC
16 points
0 comments2 min readEA link

Jan Leike, He­len Toner, Malo Bour­gon, and Miles Brundage: Work­ing in AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Hiring en­g­ineers and re­searchers to help al­ign GPT-3

Paul_Christiano1 Oct 2020 18:52 UTC
107 points
19 comments3 min readEA link

The aca­demic con­tri­bu­tion to AI safety seems large

Gavin30 Jul 2020 10:30 UTC
116 points
28 comments9 min readEA link

“In­tro to brain-like-AGI safety” se­ries—halfway point!

Steven Byrnes9 Mar 2022 15:21 UTC
8 points
0 comments2 min readEA link

7 traps that (we think) new al­ign­ment re­searchers of­ten fall into

Akash27 Sep 2022 23:13 UTC
72 points
13 comments1 min readEA link

[Linkpost] How To Get Into In­de­pen­dent Re­search On Align­ment/​Agency

Jackson Wagner14 Feb 2022 21:40 UTC
10 points
0 comments1 min readEA link

The first AI Safety Camp & onwards

Remmelt7 Jun 2018 18:49 UTC
25 points
2 comments8 min readEA link

Con­nor Leahy on Con­jec­ture and Dy­ing with Dignity

Michaël Trazzi22 Jul 2022 19:30 UTC
34 points
0 comments10 min readEA link
(theinsideview.ai)

2020 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks21 Dec 2020 15:25 UTC
154 points
16 comments70 min readEA link

2018 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks18 Dec 2018 4:48 UTC
118 points
28 comments64 min readEA link

[AN #80]: Why AI risk might be solved with­out ad­di­tional in­ter­ven­tion from longtermists

Rohin Shah3 Jan 2020 7:52 UTC
58 points
12 comments10 min readEA link
(www.alignmentforum.org)

Con­sider pay­ing me to do AI safety re­search work

Rupert5 Nov 2020 8:09 UTC
11 points
3 comments2 min readEA link

The het­ero­gene­ity of hu­man value types: Im­pli­ca­tions for AI alignment

Geoffrey Miller16 Sep 2022 21:21 UTC
16 points
2 comments10 min readEA link

Shar­ing the World with Digi­tal Minds

Aaron Gertler1 Dec 2020 8:00 UTC
12 points
1 comment1 min readEA link
(www.nickbostrom.com)

[Question] What is an ex­am­ple of re­cent, tan­gible progress in AI safety re­search?

Aaron Gertler14 Jun 2021 5:29 UTC
35 points
4 comments1 min readEA link

Chris­ti­ano and Yud­kowsky on AI pre­dic­tions and hu­man intelligence

EliezerYudkowsky23 Feb 2022 16:51 UTC
31 points
0 comments42 min readEA link

[Creative Writ­ing Con­test] Me­tal or Mortal

Louis16 Oct 2021 16:24 UTC
7 points
0 comments7 min readEA link

De­fus­ing AGI Danger

Mark Xu24 Dec 2020 23:08 UTC
23 points
0 comments2 min readEA link
(www.alignmentforum.org)

Quan­tify­ing the Far Fu­ture Effects of Interventions

MichaelDickens18 May 2016 2:15 UTC
8 points
0 comments11 min readEA link

2019 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks19 Dec 2019 2:58 UTC
147 points
28 comments64 min readEA link

A mesa-op­ti­miza­tion per­spec­tive on AI valence and moral patienthood

jacobpfau9 Sep 2021 22:23 UTC
10 points
18 comments17 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 2

Fods1213 Dec 2018 5:12 UTC
9 points
12 comments7 min readEA link

EA megapro­jects continued

mariushobbhahn3 Dec 2021 10:33 UTC
177 points
49 comments7 min readEA link

In­ter­pret­ing Neu­ral Net­works through the Poly­tope Lens

Sid Black23 Sep 2022 18:03 UTC
35 points
0 comments1 min readEA link

Skil­ling-up in ML Eng­ineer­ing for Align­ment: re­quest for comments

TheMcDouglas24 Apr 2022 6:40 UTC
8 points
0 comments1 min readEA link

Visi­ble Thoughts Pro­ject and Bounty Announcement

So8res30 Nov 2021 0:35 UTC
35 points
2 comments12 min readEA link

[Creative Writ­ing Con­test] The Puppy Problem

Louis13 Oct 2021 14:01 UTC
13 points
0 comments7 min readEA link

Public-fac­ing Cen­sor­ship Is Safety Theater, Caus­ing Rep­u­ta­tional Da­m­age

Yitz23 Sep 2022 5:08 UTC
49 points
7 comments1 min readEA link

A con­ver­sa­tion with Ro­hin Shah

AI Impacts12 Nov 2019 1:31 UTC
27 points
8 comments33 min readEA link
(aiimpacts.org)

Euro­pean Master’s Pro­grams in Ma­chine Learn­ing, Ar­tifi­cial In­tel­li­gence, and re­lated fields

Master Programs ML/AI17 Jan 2021 20:09 UTC
17 points
4 comments1 min readEA link

[Ex­tended Dead­line: Jan 23rd] An­nounc­ing the PIBBSS Sum­mer Re­search Fellowship

nora18 Dec 2021 16:54 UTC
36 points
1 comment1 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 4

Fods1213 Dec 2018 5:14 UTC
4 points
2 comments4 min readEA link

How to do the­o­ret­i­cal re­search, a per­sonal perspective

Mark Xu19 Aug 2022 19:43 UTC
132 points
7 comments15 min readEA link

[Question] How would a lan­guage model be­come goal-di­rected?

David Mears16 Jul 2022 14:50 UTC
109 points
19 comments1 min readEA link

Shah and Yud­kowsky on al­ign­ment failures

EliezerYudkowsky28 Feb 2022 19:25 UTC
38 points
7 comments92 min readEA link

Deep­Mind is hiring for the Scal­able Align­ment and Align­ment Teams

Rohin Shah13 May 2022 12:19 UTC
102 points
0 comments9 min readEA link

Changes in fund­ing in the AI safety field

Sebastian_Farquhar3 Feb 2017 13:09 UTC
34 points
10 comments7 min readEA link

AGI safety from first principles

richard_ngo21 Oct 2020 17:42 UTC
77 points
10 comments3 min readEA link
(www.alignmentforum.org)

“Tak­ing AI Risk Se­ri­ously” – Thoughts by An­drew Critch

Raemon19 Nov 2018 2:21 UTC
26 points
9 comments1 min readEA link
(www.lesswrong.com)

[Question] Why should we *not* put effort into AI safety re­search?

Ben Thompson16 May 2021 5:11 UTC
15 points
5 comments1 min readEA link

Sum­mary of Stu­art Rus­sell’s new book, “Hu­man Com­pat­i­ble”

Rohin Shah19 Oct 2019 19:56 UTC
33 points
1 comment16 min readEA link
(www.alignmentforum.org)

An­drew Critch: Log­i­cal in­duc­tion — progress in AI alignment

EA Global6 Aug 2016 0:40 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

On how var­i­ous plans miss the hard bits of the al­ign­ment challenge

So8res12 Jul 2022 5:35 UTC
125 points
13 comments27 min readEA link

[Question] Do EA folks want AGI at all?

Noah Scales16 Jul 2022 5:44 UTC
8 points
10 comments1 min readEA link

In­tel­lec­tual Diver­sity in AI Safety

KR22 Jul 2020 19:07 UTC
21 points
8 comments3 min readEA link

You Un­der­stand AI Align­ment and How to Make Soup

Leen Armoush28 May 2022 6:22 UTC
0 points
2 comments5 min readEA link

A re­sponse to Matthews on AI Risk

RyanCarey11 Aug 2015 12:58 UTC
11 points
16 comments6 min readEA link

[Question] Are so­cial me­dia al­gorithms an ex­is­ten­tial risk?

Barry Grimes15 Sep 2020 8:52 UTC
24 points
13 comments1 min readEA link

The role of academia in AI Safety.

PabloAMC28 Mar 2022 0:04 UTC
71 points
19 comments3 min readEA link

An­nual AGI Bench­mark­ing Event

Metaculus26 Aug 2022 21:31 UTC
20 points
2 comments2 min readEA link
(www.metaculus.com)

AMA or dis­cuss my 80K pod­cast epi­sode: Ben Garfinkel, FHI researcher

bgarfinkel13 Jul 2020 16:17 UTC
87 points
140 comments1 min readEA link

Pre­dict re­sponses to the “ex­is­ten­tial risk from AI” survey

RobBensinger28 May 2021 1:38 UTC
36 points
8 comments2 min readEA link

Anal­y­sis of AI Safety sur­veys for field-build­ing insights

Ash Jafari5 Dec 2022 17:37 UTC
24 points
7 comments5 min readEA link

[Question] Why not offer a multi-mil­lion /​ billion dol­lar prize for solv­ing the Align­ment Prob­lem?

Aryeh Englander17 Apr 2022 16:08 UTC
15 points
9 comments1 min readEA link

Steer­ing AI to care for an­i­mals, and soon

Andrew Critch14 Jun 2022 1:13 UTC
206 points
38 comments1 min readEA link

On Defer­ence and Yud­kowsky’s AI Risk Estimates

bgarfinkel19 Jun 2022 14:35 UTC
261 points
188 comments17 min readEA link

Owain Evans and Vic­to­ria Krakovna: Ca­reers in tech­ni­cal AI safety

EA Global3 Nov 2017 7:43 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

[Question] Is a ca­reer in mak­ing AI sys­tems more se­cure a mean­ingful way to miti­gate the X-risk posed by AGI?

Kyle O’Brien13 Feb 2022 7:05 UTC
14 points
4 comments1 min readEA link

[Question] Why not to solve al­ign­ment by mak­ing su­per­in­tel­li­gent hu­mans?

Pato16 Oct 2022 21:26 UTC
9 points
12 comments1 min readEA link

[Link] How un­der­stand­ing valence could help make fu­ture AIs safer

Milan_Griffes8 Oct 2020 18:53 UTC
22 points
2 comments3 min readEA link

My cur­rent thoughts on MIRI’s “highly re­li­able agent de­sign” work

Daniel_Dewey7 Jul 2017 1:17 UTC
51 points
59 comments19 min readEA link

En­abling more feedback

JJ Hepburn10 Dec 2021 6:52 UTC
41 points
3 comments3 min readEA link

Why I ex­pect suc­cess­ful (nar­row) alignment

Tobias_Baumann29 Dec 2018 15:46 UTC
18 points
10 comments1 min readEA link
(s-risks.org)

Red­wood Re­search is hiring for sev­eral roles

Jack R29 Nov 2021 0:18 UTC
75 points
0 comments1 min readEA link

‘Force mul­ti­pli­ers’ for EA research

Craig Drayton18 Jun 2022 13:39 UTC
18 points
7 comments4 min readEA link

Feed­back Re­quest on EA Philip­pines’ Ca­reer Ad­vice Re­search for Tech­ni­cal AI Safety

BrianTan3 Oct 2020 10:39 UTC
18 points
5 comments4 min readEA link

The case for be­com­ing a black-box in­ves­ti­ga­tor of lan­guage models

Buck6 May 2022 14:37 UTC
89 points
7 comments3 min readEA link

[Question] Anal­ogy of AI Align­ment as Rais­ing a Child?

Aaron_Scher19 Feb 2022 21:40 UTC
4 points
2 comments1 min readEA link

Draft re­port on AI timelines

Ajeya15 Dec 2020 12:10 UTC
35 points
0 comments1 min readEA link
(alignmentforum.org)

Align­ing Recom­mender Sys­tems as Cause Area

IvanVendrov8 May 2019 8:56 UTC
151 points
45 comments13 min readEA link

Fore­cast­ing Trans­for­ma­tive AI: What Kind of AI?

Holden Karnofsky10 Aug 2021 21:38 UTC
61 points
2 comments10 min readEA link

What Should the Aver­age EA Do About AI Align­ment?

Raemon25 Feb 2017 20:07 UTC
42 points
39 comments7 min readEA link

2017 AI Safety Liter­a­ture Re­view and Char­ity Comparison

Larks20 Dec 2017 21:54 UTC
43 points
17 comments23 min readEA link

Disagree­ments about Align­ment: Why, and how, we should try to solve them

ojorgensen8 Aug 2022 22:32 UTC
16 points
6 comments16 min readEA link

New Speaker Series on AI Align­ment Start­ing March 3

Zechen Zhang26 Feb 2022 10:58 UTC
5 points
0 comments1 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 1

Fods1213 Dec 2018 5:10 UTC
22 points
13 comments8 min readEA link

Prize and fast track to al­ign­ment re­search at ALTER

Vanessa18 Sep 2022 9:15 UTC
38 points
0 comments3 min readEA link

AI Align­ment 2018-2019 Review

Habryka28 Jan 2020 21:14 UTC
28 points
0 comments6 min readEA link
(www.lesswrong.com)

From lan­guage to ethics by au­to­mated reasoning

Michele Campolo21 Nov 2021 15:16 UTC
8 points
0 comments6 min readEA link

Ma­hen­dra Prasad: Ra­tional group de­ci­sion-making

EA Global8 Jul 2020 15:06 UTC
14 points
0 comments14 min readEA link
(www.youtube.com)

We should ex­pect to worry more about spec­u­la­tive risks

bgarfinkel29 May 2022 21:08 UTC
120 points
14 comments3 min readEA link

Buck Sh­legeris: How I think stu­dents should ori­ent to AI safety

EA Global25 Oct 2020 5:48 UTC
9 points
0 comments1 min readEA link
(www.youtube.com)

Pro­mot­ing com­pas­sion­ate longtermism

jonleighton7 Dec 2022 14:26 UTC
115 points
5 comments12 min readEA link

Med­i­ta­tions on ca­reers in AI Safety

PabloAMC23 Mar 2022 22:00 UTC
88 points
34 comments2 min readEA link

Con­ver­sa­tion on AI risk with Adam Gleave

AI Impacts27 Dec 2019 21:43 UTC
18 points
3 comments3 min readEA link
(aiimpacts.org)

There are two fac­tions work­ing to pre­vent AI dan­gers. Here’s why they’re deeply di­vided.

Sharmake10 Aug 2022 19:52 UTC
9 points
0 comments4 min readEA link
(www.vox.com)

Defend­ing against Ad­ver­sar­ial Poli­cies in Re­in­force­ment Learn­ing with Alter­nat­ing Training

sergia12 Feb 2022 15:59 UTC
1 point
1 comment14 min readEA link

Atari early

AI Impacts2 Apr 2020 23:28 UTC
34 points
2 comments1 min readEA link
(aiimpacts.org)

I’m Buck Sh­legeris, I do re­search and out­reach at MIRI, AMA

Buck15 Nov 2019 22:44 UTC
122 points
229 comments2 min readEA link

[Question] Why aren’t you freak­ing out about OpenAI? At what point would you start?

AppliedDivinityStudies10 Oct 2021 13:06 UTC
77 points
22 comments2 min readEA link

My Overview of the AI Align­ment Land­scape: A Bird’s Eye View

Neel Nanda15 Dec 2021 23:46 UTC
43 points
15 comments16 min readEA link
(www.alignmentforum.org)

AI Safety: Ap­ply­ing to Grad­u­ate Studies

frances_lorenz15 Dec 2021 22:56 UTC
21 points
0 comments12 min readEA link

FLI AI Align­ment pod­cast: Evan Hub­inger on In­ner Align­ment, Outer Align­ment, and Pro­pos­als for Build­ing Safe Ad­vanced AI

evhub1 Jul 2020 20:59 UTC
13 points
2 comments1 min readEA link
(futureoflife.org)

How to build a safe ad­vanced AI (Evan Hub­inger) | What’s up in AI safety? (Asya Ber­gal)

EA Global25 Oct 2020 5:48 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Can we simu­late hu­man evolu­tion to cre­ate a some­what al­igned AGI?

Thomas Kwa29 Mar 2022 1:23 UTC
19 points
0 comments7 min readEA link

Work­ing at EA or­ga­ni­za­tions se­ries: Ma­chine In­tel­li­gence Re­search Institute

SoerenMind1 Nov 2015 12:49 UTC
8 points
0 comments4 min readEA link

AI al­ign­ment prize win­ners and next round [link]

RyanCarey20 Jan 2018 12:07 UTC
7 points
1 comment1 min readEA link

Ben Garfinkel: How sure are we about this AI stuff?

bgarfinkel9 Feb 2019 19:17 UTC
120 points
17 comments18 min readEA link

Stu­dent pro­ject for en­gag­ing with AI alignment

Per Ivar Friborg9 May 2022 10:44 UTC
35 points
1 comment1 min readEA link

A Sim­ple Model of AGI De­ploy­ment Risk

djbinder9 Jul 2021 9:44 UTC
16 points
0 comments5 min readEA link

[Question] Is trans­for­ma­tive AI the biggest ex­is­ten­tial risk? Why or why not?

BrownHairedEevee5 Mar 2022 3:54 UTC
9 points
11 comments1 min readEA link

Ngo and Yud­kowsky on sci­en­tific rea­son­ing and pivotal acts

EliezerYudkowsky21 Feb 2022 17:00 UTC
33 points
1 comment35 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 3

Fods1213 Dec 2018 5:13 UTC
3 points
5 comments7 min readEA link

Me­tac­u­lus is build­ing a team ded­i­cated to AI forecasting

christian18 Oct 2022 16:08 UTC
35 points
0 comments1 min readEA link
(apply.workable.com)

E.A. Me­gapro­ject Ideas

Tomer_Goloboy21 Mar 2022 1:23 UTC
15 points
4 comments3 min readEA link

LW4EA: Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

Jeremy17 May 2022 3:05 UTC
11 points
1 comment1 min readEA link
(www.lesswrong.com)

Sum­maries: Align­ment Fun­da­men­tals Curriculum

Leon_Lang19 Sep 2022 15:43 UTC
25 points
1 comment1 min readEA link
(docs.google.com)

Why AI is Harder Than We Think—Me­lanie Mitchell

BrownHairedEevee28 Apr 2021 8:19 UTC
49 points
7 comments2 min readEA link
(arxiv.org)

Ought’s the­ory of change

stuhlmueller12 Apr 2022 0:09 UTC
43 points
4 comments3 min readEA link

AI safety schol­ar­ships look worth-fund­ing (if other fund­ing is sane)

anon-a19 Nov 2019 0:59 UTC
22 points
6 comments2 min readEA link

[Question] What con­sid­er­a­tions in­fluence whether I have more in­fluence over short or long timelines?

kokotajlod5 Nov 2020 19:57 UTC
18 points
0 comments1 min readEA link

AI Safety field-build­ing pro­jects I’d like to see

Akash11 Sep 2022 23:45 UTC
19 points
4 comments7 min readEA link
(www.lesswrong.com)

[Question] Donat­ing against Short Term AI risks

Jan-Willem16 Nov 2020 12:23 UTC
6 points
10 comments1 min readEA link

How Do AI Timelines Affect Giv­ing Now vs. Later?

MichaelDickens3 Aug 2021 3:36 UTC
36 points
8 comments8 min readEA link

Pre­serv­ing and con­tin­u­ing al­ign­ment re­search through a se­vere global catastrophe

A_donor6 Mar 2022 18:43 UTC
38 points
14 comments4 min readEA link

Database of ex­is­ten­tial risk estimates

MichaelA15 Apr 2020 12:43 UTC
120 points
37 comments5 min readEA link

Long-Term Fu­ture Fund: May 2021 grant recommendations

abergal27 May 2021 6:44 UTC
110 points
17 comments58 min readEA link

AGI Safety Com­mu­ni­ca­tions Initiative

Ines11 Jun 2022 16:30 UTC
33 points
5 comments1 min readEA link

What Should We Op­ti­mize—A Conversation

Johannes C. Mayer7 Apr 2022 14:48 UTC
1 point
0 comments15 min readEA link

What can the prin­ci­pal-agent liter­a­ture tell us about AI risk?

Alexis Carlier10 Feb 2020 10:10 UTC
26 points
1 comment12 min readEA link

[Question] I'm interviewing Max Tegmark about AI safety and more. What should I ask him?

Robert_Wiblin13 May 2022 15:32 UTC
18 points
2 comments1 min readEA link

Amanda Askell: AI safety needs so­cial scientists

EA Global4 Mar 2019 15:50 UTC
26 points
0 comments18 min readEA link
(www.youtube.com)

On pre­sent­ing the case for AI risk

Aryeh Englander8 Mar 2022 21:37 UTC
114 points
12 comments4 min readEA link

AGI Predictions

Pablo21 Nov 2020 12:02 UTC
36 points
0 comments1 min readEA link
(www.lesswrong.com)

Why AI al­ign­ment could be hard with mod­ern deep learning

Ajeya21 Sep 2021 15:35 UTC
134 points
16 comments14 min readEA link
(www.cold-takes.com)

Get­ting started in­de­pen­dently in AI Safety

JJ Hepburn6 Jul 2021 15:20 UTC
40 points
10 comments3 min readEA link

[Question] Can we con­vince peo­ple to work on AI safety with­out con­vinc­ing them about AGI hap­pen­ing this cen­tury?

BrianTan26 Nov 2020 14:46 UTC
8 points
3 comments2 min readEA link

Jesse Clif­ton: Open-source learn­ing — a bar­gain­ing approach

EA Global18 Oct 2019 18:05 UTC
9 points
0 comments1 min readEA link
(www.youtube.com)

Tan Zhi Xuan: AI al­ign­ment, philo­soph­i­cal plu­ral­ism, and the rele­vance of non-Western philosophy

EA Global21 Nov 2020 8:12 UTC
12 points
1 comment1 min readEA link
(www.youtube.com)

Cortés, Pizarro, and Afonso as Prece­dents for Takeover

AI Impacts2 Mar 2020 12:25 UTC
27 points
17 comments11 min readEA link
(aiimpacts.org)

My plan for a “Most Im­por­tant Cen­tury” read­ing group

Jack O'Brien19 Jan 2022 9:32 UTC
12 points
1 comment2 min readEA link

Red­wood Re­search is hiring for sev­eral roles (Oper­a­tions and Tech­ni­cal)

JJXWang14 Apr 2022 15:23 UTC
45 points
0 comments1 min readEA link

Katja Grace: AI safety

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Some global catas­trophic risk estimates

Tamay10 Feb 2021 19:32 UTC
106 points
14 comments1 min readEA link

Key Papers in Lan­guage Model Safety

aogara20 Jun 2022 14:59 UTC
14 points
0 comments22 min readEA link

Sin­ga­pore’s Tech­ni­cal AI Align­ment Re­search Ca­reer Guide

Yi-Yang26 Aug 2020 8:09 UTC
34 points
7 comments8 min readEA link

[Closed] Hiring a math­e­mat­i­cian to work on the learn­ing-the­o­retic AI al­ign­ment agenda

Vanessa19 Apr 2022 6:49 UTC
53 points
4 comments2 min readEA link

[linkpost] Shar­ing pow­er­ful AI mod­els: the emerg­ing paradigm of struc­tured access

ts20 Jan 2022 21:10 UTC
11 points
3 comments1 min readEA link

Ar­tifi­cial in­tel­li­gence ca­reer stories

EA Global25 Oct 2020 6:56 UTC
11 points
0 comments1 min readEA link
(www.youtube.com)

[Question] Ca­reer Ad­vice: Philos­o­phy + Pro­gram­ming → AI Safety

tcelferact18 Mar 2022 15:09 UTC
29 points
11 comments2 min readEA link

We Are Con­jec­ture, A New Align­ment Re­search Startup

Connor Leahy9 Apr 2022 15:07 UTC
31 points
0 comments1 min readEA link

Co­her­ence ar­gu­ments im­ply a force for goal-di­rected behavior

Katja_Grace6 Apr 2021 21:44 UTC
19 points
1 comment11 min readEA link
(worldspiritsockpuppet.com)

Soares, Tal­linn, and Yud­kowsky dis­cuss AGI cognition

EliezerYudkowsky29 Nov 2021 17:28 UTC
15 points
0 comments40 min readEA link

Draft re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_Carlsmith28 Apr 2021 21:41 UTC
87 points
34 comments1 min readEA link

2016 AI Risk Liter­a­ture Re­view and Char­ity Comparison

Larks13 Dec 2016 4:36 UTC
57 points
12 comments28 min readEA link

AI and im­pact opportunities

brb24331 Mar 2022 20:23 UTC
−2 points
6 comments1 min readEA link

[Cause Ex­plo­ra­tion Prizes] Ex­pand­ing com­mu­ni­ca­tion about AGI risks

Ines22 Sep 2022 5:30 UTC
12 points
0 comments11 min readEA link

In­tro­duc­ing the Prin­ci­ples of In­tel­li­gent Be­havi­our in Biolog­i­cal and So­cial Sys­tems (PIBBSS) Fellowship

adamShimi18 Dec 2021 15:25 UTC
33 points
5 comments10 min readEA link

We’re Aligned AI, we’re aiming to al­ign AI

Stuart Armstrong21 Feb 2022 10:43 UTC
64 points
8 comments3 min readEA link

Three Bi­ases That Made Me Believe in AI Risk

beth​13 Feb 2019 23:22 UTC
41 points
20 comments3 min readEA link

AI al­ign­ment with hu­mans… but with which hu­mans?

Geoffrey Miller8 Sep 2022 23:43 UTC
36 points
17 comments3 min readEA link

Scru­ti­niz­ing AI Risk (80K, #81) - v. quick summary

Ben23 Jul 2020 19:02 UTC
10 points
1 comment3 min readEA link

Re: Some thoughts on veg­e­tar­i­anism and veganism

Fai25 Feb 2022 20:43 UTC
50 points
3 comments8 min readEA link

Are Hu­mans ‘Hu­man Com­pat­i­ble’?

Matt Boyd6 Dec 2019 5:49 UTC
23 points
8 comments4 min readEA link

Crit­i­cism of the main frame­work in AI alignment

Michele Campolo31 Aug 2022 21:44 UTC
35 points
4 comments7 min readEA link

AI Risk: In­creas­ing Per­sua­sion Power

kewlcats3 Aug 2020 20:25 UTC
4 points
0 comments1 min readEA link

EA’s brain-over-body bias, and the em­bod­ied value prob­lem in AI al­ign­ment

Geoffrey Miller21 Sep 2022 18:55 UTC
44 points
2 comments25 min readEA link

[Question] Should the EA com­mu­nity have a DL en­g­ineer­ing fel­low­ship?

PabloAMC24 Dec 2021 13:43 UTC
26 points
6 comments1 min readEA link

[Question] Is this a good way to bet on short timelines?

kokotajlod28 Nov 2020 14:31 UTC
17 points
16 comments1 min readEA link

In­tro­duc­ing The Non­lin­ear Fund: AI Safety re­search, in­cu­ba­tion, and funding

Kat Woods18 Mar 2021 14:07 UTC
71 points
32 comments5 min readEA link

There should be an AI safety pro­ject board

mariushobbhahn14 Mar 2022 16:08 UTC
24 points
3 comments1 min readEA link

AI Align­ment YouTube Playlists

jacquesthibs9 May 2022 21:31 UTC
16 points
2 comments1 min readEA link

He­len Toner: The Open Philan­thropy Pro­ject’s work on AI risk

EA Global3 Nov 2017 7:43 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

AI Fore­cast­ing Re­s­olu­tion Coun­cil (Fore­cast­ing in­fras­truc­ture, part 2)

jacobjacob29 Aug 2019 17:43 UTC
28 points
0 comments3 min readEA link

My Un­der­stand­ing of Paul Chris­ti­ano’s Iter­ated Am­plifi­ca­tion AI Safety Re­search Agenda

Chi15 Aug 2020 19:59 UTC
38 points
3 comments40 min readEA link

Les­sons learned from talk­ing to >100 aca­demics about AI safety

mariushobbhahn10 Oct 2022 13:16 UTC
138 points
21 comments1 min readEA link

[Question] What kind of event, tar­geted to un­der­grad­u­ate CS ma­jors, would be most effec­tive at get­ting peo­ple to work on AI safety?

CBiddulph19 Sep 2021 16:19 UTC
9 points
1 comment1 min readEA link

[Creative Writ­ing Con­test] An AI Safety Limerick

Ben_West18 Oct 2021 19:11 UTC
21 points
5 comments1 min readEA link

Crypto ‘or­a­cle pro­to­cols’ for AI al­ign­ment with real-world data?

Geoffrey Miller22 Sep 2022 23:05 UTC
9 points
5 comments1 min readEA link

Key ques­tions about ar­tifi­cial sen­tience: an opinionated guide

rgb25 Apr 2022 13:42 UTC
94 points
2 comments1 min readEA link

The Me­taethics and Nor­ma­tive Ethics of AGI Value Align­ment: Many Ques­tions, Some Implications

Dario Citrini15 Sep 2021 19:05 UTC
23 points
0 comments8 min readEA link

The re­li­gion prob­lem in AI alignment

Geoffrey Miller16 Sep 2022 1:24 UTC
39 points
27 comments11 min readEA link

How to Diver­sify Con­cep­tual AI Align­ment: the Model Be­hind Refine

adamShimi20 Jul 2022 10:44 UTC
43 points
0 comments8 min readEA link
(www.alignmentforum.org)

[Question] Does the idea of AGI that benev­olently con­trol us ap­peal to EA folks?

Noah Scales16 Jul 2022 19:17 UTC
6 points
20 comments1 min readEA link

[Question] How can I bet on short timelines?

kokotajlod7 Nov 2020 12:45 UTC
33 points
12 comments2 min readEA link

[Dis­cus­sion] Best in­tu­ition pumps for AI safety

mariushobbhahn6 Nov 2021 8:11 UTC
10 points
8 comments1 min readEA link

Our Cur­rent Direc­tions in Mechanis­tic In­ter­pretabil­ity Re­search (AI Align­ment Speaker Series)

Group Organizer8 Apr 2022 17:08 UTC
3 points
0 comments1 min readEA link

[Question] What are the coolest top­ics in AI safety, to a hope­lessly pure math­e­mat­i­cian?

Jenny K E7 May 2022 7:18 UTC
87 points
29 comments1 min readEA link

AGI x-risk timelines: 10% chance (by year X) es­ti­mates should be the head­line, not 50%.

Greg_Colbourn1 Mar 2022 12:02 UTC
67 points
22 comments1 min readEA link

Dis­con­tin­u­ous progress in his­tory: an update

AI Impacts17 Apr 2020 16:28 UTC
69 points
3 comments24 min readEA link

Anti-squat­ted AI x-risk do­mains index

plex12 Aug 2022 12:00 UTC
52 points
9 comments1 min readEA link

[Question] 1h-vol­un­teers needed for a small AI Safety-re­lated re­search pro­ject

PabloAMC16 Aug 2021 17:51 UTC
4 points
0 comments1 min readEA link

Align­ment 201 curriculum

richard_ngo12 Oct 2022 19:17 UTC
94 points
9 comments1 min readEA link

Max Teg­mark: Risks and benefits of ad­vanced ar­tifi­cial intelligence

EA Global5 Aug 2016 9:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

In­tro­duc­ing the Fund for Align­ment Re­search (We’re Hiring!)

AdamGleave6 Jul 2022 2:00 UTC
74 points
3 comments4 min readEA link

On Solv­ing Prob­lems Be­fore They Ap­pear: The Weird Episte­molo­gies of Alignment

adamShimi11 Oct 2021 8:21 UTC
28 points
0 comments15 min readEA link

fic­tion about AI risk

Ann Garth12 Nov 2020 22:36 UTC
8 points
1 comment1 min readEA link

Ro­hin Shah: What’s been hap­pen­ing in AI al­ign­ment?

EA Global29 Jul 2020 20:15 UTC
17 points
0 comments14 min readEA link
(www.youtube.com)

Who or­dered al­ign­ment’s ap­ple?

Eleni_A28 Aug 2022 14:24 UTC
5 points
0 comments3 min readEA link

[Question] How much EA anal­y­sis of AI safety as a cause area ex­ists?

richard_ngo6 Sep 2019 11:15 UTC
94 points
20 comments2 min readEA link

[3-hour pod­cast]: Joseph Car­l­smith on longter­mism, utopia, the com­pu­ta­tional power of the brain, meta-ethics, illu­sion­ism and meditation

Gus Docker27 Jul 2021 13:18 UTC
34 points
2 comments1 min readEA link

EA Berkeley Pre­sents: Univer­sal Own­er­ship: Is In­dex In­vest­ing the New So­cially Re­spon­si­ble In­vest­ing?

Mahendra Prasad10 Mar 2022 6:58 UTC
7 points
0 comments1 min readEA link

AI Fore­cast­ing Ques­tion Database (Fore­cast­ing in­fras­truc­ture, part 3)

jacobjacob3 Sep 2019 14:57 UTC
23 points
2 comments5 min readEA link

[Question] What are your recom­men­da­tions for tech­ni­cal AI al­ign­ment pod­casts?

Evan_Gaensbauer11 May 2022 21:52 UTC
13 points
4 comments1 min readEA link

An­nounc­ing AXRP, the AI X-risk Re­search Podcast

DanielFilan23 Dec 2020 20:10 UTC
32 points
1 comment1 min readEA link

On AI and Compute

johncrox3 Apr 2019 21:26 UTC
39 points
12 comments5 min readEA link

Eric Drexler: Pare­to­topian goal alignment

EA Global15 Mar 2019 14:51 UTC
10 points
0 comments10 min readEA link
(www.youtube.com)

[Question] How should we in­vest in “long-term short-ter­mism” given the like­li­hood of trans­for­ma­tive AI?

James_Banks12 Jan 2021 23:54 UTC
7 points
0 comments1 min readEA link

In­tro to car­ing about AI al­ign­ment as an EA cause

So8res14 Apr 2017 0:42 UTC
28 points
10 comments25 min readEA link

[Link] Thiel on GCRs

Milan_Griffes22 Jul 2019 20:47 UTC
28 points
11 comments1 min readEA link

Crit­i­cal Re­view of ‘The Precipice’: A Re­assess­ment of the Risks of AI and Pandemics

Fods1211 May 2020 11:11 UTC
91 points
32 comments26 min readEA link

AI Value Align­ment Speaker Series Pre­sented By EA Berkeley

Mahendra Prasad1 Mar 2022 6:17 UTC
2 points
0 comments1 min readEA link

Gen­eral ad­vice for tran­si­tion­ing into The­o­ret­i­cal AI Safety

Martín Soto15 Sep 2022 5:23 UTC
25 points
0 comments10 min readEA link

13 back­ground claims about EA

Akash7 Sep 2022 3:54 UTC
69 points
16 comments3 min readEA link

De­sir­able? AI qualities

brb24321 Mar 2022 22:05 UTC
5 points
0 comments2 min readEA link

Emer­gent Ven­tures AI

Gavin8 Apr 2022 22:08 UTC
22 points
0 comments1 min readEA link
(marginalrevolution.com)

Con­fused about AI re­search as a means of ad­dress­ing AI risk

Eli Rose21 Feb 2019 0:07 UTC
31 points
15 comments1 min readEA link

Nat­u­ral­ism and AI alignment

Michele Campolo24 Apr 2021 16:20 UTC
17 points
3 comments8 min readEA link

Take­aways from safety by de­fault interviews

AI Impacts7 Apr 2020 2:01 UTC
25 points
2 comments1 min readEA link
(aiimpacts.org)

A stub­born un­be­liever fi­nally gets the depth of the AI al­ign­ment problem

aelwood13 Oct 2022 15:16 UTC
32 points
7 comments1 min readEA link

But ex­actly how com­plex and frag­ile?

Katja_Grace13 Dec 2019 7:05 UTC
36 points
3 comments3 min readEA link
(meteuphoric.com)

[Question] How to get more aca­demics en­thu­si­as­tic about do­ing AI Safety re­search?

PabloAMC4 Sep 2021 14:10 UTC
25 points
19 comments1 min readEA link

Why I pri­ori­tize moral cir­cle ex­pan­sion over re­duc­ing ex­tinc­tion risk through ar­tifi­cial in­tel­li­gence alignment

Jacy20 Feb 2018 18:29 UTC
98 points
72 comments36 min readEA link
(www.sentienceinstitute.org)

Im­pli­ca­tions of Quan­tum Com­put­ing for Ar­tifi­cial In­tel­li­gence al­ign­ment re­search (ABRIDGED)

Jaime Sevilla5 Sep 2019 14:56 UTC
25 points
4 comments2 min readEA link

New book: The Tango of Ethics: In­tu­ition, Ra­tion­al­ity and the Preven­tion of Suffering

jonleighton2 Jan 2023 8:45 UTC
111 points
3 comments5 min readEA link

AI alignment research links

Holden Karnofsky6 Jan 2022 5:52 UTC
16 points
0 comments6 min readEA link
(www.cold-takes.com)

Preparing for AI-assisted alignment research: we need data!

CBiddulph17 Jan 2023 3:28 UTC
11 points
0 comments11 min readEA link

[Link post] AI could fuel factory farming—or end it

BrianK18 Oct 2022 11:16 UTC
34 points
0 comments1 min readEA link
(www.fastcompany.com)

[Question] What’s the best machine learning newsletter? How do you keep up to date?

Mathieu Putz25 Mar 2022 14:36 UTC
13 points
12 comments1 min readEA link

[Question] What work has been done on the post-AGI distribution of wealth?

levin6 Jul 2022 18:59 UTC
16 points
3 comments1 min readEA link

Philanthropists Probably Shouldn’t Mission-Hedge AI Progress

MichaelDickens23 Aug 2022 23:03 UTC
27 points
9 comments36 min readEA link

EA AI/Emerging Tech Orgs Should Be Involved with Patent Office Partnership

Bridges12 Jun 2022 22:32 UTC
10 points
0 comments1 min readEA link

[Question] How might a herd of interns help with AI or biosecurity research tasks/questions?

Harrison Durland20 Mar 2022 22:49 UTC
30 points
8 comments2 min readEA link

Crucial considerations in the field of Wild Animal Welfare (WAW)

Holly_Elmore10 Apr 2022 19:43 UTC
63 points
10 comments3 min readEA link

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_c13 Mar 2022 10:45 UTC
49 points
4 comments8 min readEA link

The probability that Artificial General Intelligence will be developed by 2043 is extremely low.

cveres6 Oct 2022 11:26 UTC
2 points
12 comments13 min readEA link

[Question] What EAG sessions would you like on AI?

Nathan Young20 Mar 2022 17:05 UTC
7 points
10 comments1 min readEA link

Prioritizing the Arts in response to AI automation

Casey25 Sep 2022 7:49 UTC
6 points
1 comment1 min readEA link

Retrospective on the Summer 2021 AGI Safety Fundamentals

Dewi6 Dec 2021 20:10 UTC
66 points
3 comments32 min readEA link

Risks from Autonomous Weapon Systems and Military AI

christian.r19 May 2022 10:45 UTC
71 points
10 comments37 min readEA link

On Scaling Academia

kirchner.jan20 Sep 2021 14:54 UTC
18 points
3 comments13 min readEA link
(universalprior.substack.com)

Free Guy, a rom-com on the moral patienthood of digital sentience

mic23 Dec 2021 7:47 UTC
20 points
2 comments2 min readEA link

We Can’t Do Long Term Utilitarian Calculations Until We Know if AIs Can Be Conscious or Not

Mike207312 Sep 2022 8:37 UTC
4 points
0 comments12 min readEA link

“The Physicists”: A play about extinction and the responsibility of scientists

Lara_TH29 Nov 2022 16:53 UTC
28 points
1 comment8 min readEA link

When 2/3rds of the world goes against you

Jeffrey Kursonis2 Jul 2022 20:34 UTC
2 points
2 comments9 min readEA link

Apply to be a Stanford HAI Junior Fellow (Assistant Professor- Research) by Nov. 15, 2021

Vael Gates31 Oct 2021 2:21 UTC
15 points
0 comments1 min readEA link

Fanaticism in AI: SERI Project

Jake Arft-Guatelli24 Sep 2021 4:39 UTC
7 points
2 comments5 min readEA link

Stackelberg Games and Cooperative Commitment: My Thoughts and Reflections on a 2-Month Research Project

Ben Bucknall13 Dec 2021 10:49 UTC
18 points
1 comment9 min readEA link

Strong AI. From theory to practice.

GaHHuKoB19 Aug 2022 11:33 UTC
−2 points
0 comments11 min readEA link
(www.reddit.com)

Make a neural network in ~10 minutes

Arjun Yadav25 Apr 2022 18:36 UTC
3 points
0 comments4 min readEA link
(arjunyadav.net)

Project Idea: The cost of Coccidiosis on Chicken farming and if AI can help

Max Harris26 Sep 2022 16:30 UTC
25 points
8 comments2 min readEA link

An Exercise in Speed-Reading: The National Security Commission on AI (NSCAI) Final Report

abiolvera17 Aug 2022 16:55 UTC
47 points
4 comments11 min readEA link

Sixty years after the Cuban Missile Crisis, a new era of global catastrophic risks

christian.r13 Oct 2022 11:25 UTC
31 points
0 comments1 min readEA link
(thebulletin.org)

GPT-2 as step toward general intelligence (Alexander, 2019)

Will Aldred18 Jul 2022 16:14 UTC
42 points
0 comments2 min readEA link
(slatestarcodex.com)

Introducing the ML Safety Scholars Program

ThomasW4 May 2022 13:14 UTC
142 points
38 comments3 min readEA link

[Question] Request for Assistance—Research on Scenario Development for Advanced AI Risk

Kiliank30 Mar 2022 3:01 UTC
2 points
1 comment1 min readEA link

Slides: Potential Risks From Advanced AI

Aryeh Englander28 Apr 2022 2:18 UTC
9 points
0 comments1 min readEA link

[Question] What will be some of the most impactful applications of advanced AI in the near term?

IanDavidMoss3 Mar 2022 15:26 UTC
16 points
7 comments1 min readEA link

Love and AI: Relational Brain/Mind Dynamics in AI Development

Jeffrey Kursonis21 Jun 2022 7:09 UTC
2 points
2 comments3 min readEA link

Rabbits, robots and resurrection

Patrick Wilson10 May 2022 15:00 UTC
9 points
0 comments15 min readEA link

On Generality

Oren Montano26 Sep 2022 8:59 UTC
2 points
0 comments1 min readEA link

A strange twist on the road to AGI

cveres12 Oct 2022 23:27 UTC
3 points
0 comments1 min readEA link

[Question] I have recently been interested in robotics, particularly in for-profit startups. I think they can help increase food production and help improve healthcare. Would this fall under AI for social good? How impactful will robotics be to society? How large is the counterfactual?

Isaac Benson2 Jan 2022 5:38 UTC
4 points
3 comments1 min readEA link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya18 Jul 2022 19:07 UTC
215 points
12 comments75 min readEA link
(www.lesswrong.com)

[Question] Training a GPT model on EA texts: what data?

JoyOptimizer4 Jun 2022 5:59 UTC
23 points
16 comments1 min readEA link

AI timelines and theoretical understanding of deep learning

Venky102412 Sep 2021 16:26 UTC
4 points
8 comments2 min readEA link

My argument against AGI

cveres12 Oct 2022 6:32 UTC
2 points
29 comments3 min readEA link

Chris Olah on working at top AI labs without an undergrad degree

80000_Hours10 Sep 2021 20:46 UTC
15 points
0 comments75 min readEA link

A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]

ThomasW9 May 2022 17:15 UTC
92 points
2 comments36 min readEA link

6 Year Decrease of Metaculus AGI Prediction

Chris Leong12 Apr 2022 5:36 UTC
40 points
6 comments1 min readEA link

How to become more agentic, by GPT-EA-Forum-v1

JoyOptimizer20 Jun 2022 6:50 UTC
24 points
8 comments4 min readEA link

Why I think strong general AI is coming soon

porby28 Sep 2022 6:55 UTC
14 points
1 comment1 min readEA link

Exploratory survey on psychology of AI risk perception

Daniel_Friedrich2 Aug 2022 20:34 UTC
1 point
0 comments1 min readEA link
(forms.gle)

Voting Theory has a HOLE

Anthony Repetto4 Dec 2021 4:20 UTC
2 points
2 comments2 min readEA link

First call for EA Data Science/ML/AI

astrastefania23 Aug 2022 19:37 UTC
25 points
0 comments1 min readEA link

Catholic theologians and priests on artificial intelligence

anonymous614 Jun 2022 18:53 UTC
15 points
3 comments1 min readEA link

Carnegie Council MisUnderstands Longtermism

Jeff A30 Sep 2022 2:57 UTC
6 points
8 comments1 min readEA link
(www.carnegiecouncil.org)

Results from the language model hackathon

Esben Kran10 Oct 2022 8:29 UTC
23 points
2 comments1 min readEA link

Seeking Survey Responses—Attitudes Towards AI risks

anson28 Mar 2022 17:47 UTC
23 points
2 comments1 min readEA link
(forms.gle)

General vs specific arguments for the longtermist importance of shaping AI development

Sam Clarke15 Oct 2021 14:43 UTC
44 points
7 comments2 min readEA link

Don’t expect AGI anytime soon

cveres10 Oct 2022 22:38 UTC
0 points
19 comments1 min readEA link

Values lock-in is already happening (without AGI)

LB1 Sep 2022 22:21 UTC
1 point
1 comment12 min readEA link

EA for dumb people?

Olivia Addy11 Jul 2022 10:46 UTC
442 points
160 comments2 min readEA link

It takes 5 layers and 1000 artificial neurons to simulate a single biological neuron [Link]

MichaelStJules7 Sep 2021 21:53 UTC
44 points
17 comments2 min readEA link

TIO: A mental health chatbot

Sanjay12 Oct 2020 20:52 UTC
24 points
6 comments28 min readEA link

The History of AI Rights Research

Jamie_Harris27 Aug 2022 8:14 UTC
43 points
1 comment14 min readEA link
(www.sentienceinstitute.org)

High Impact Careers in Formal Verification: Artificial Intelligence

quinn5 Jun 2021 14:45 UTC
28 points
6 comments16 min readEA link

Paul Christiano – Machine intelligence and capital accumulation

Tessa15 May 2014 0:10 UTC
21 points
0 comments6 min readEA link
(rationalaltruist.com)

DeepMind: Generally capable agents emerge from open-ended play

kokotajlod27 Jul 2021 19:35 UTC
48 points
10 comments2 min readEA link
(deepmind.com)

There’s No Fire Alarm for Artificial General Intelligence

EA Forum Archives14 Oct 2017 2:41 UTC
30 points
1 comment26 min readEA link
(www.lesswrong.com)

A Survey of the Potential Long-term Impacts of AI

Sam Clarke18 Jul 2022 9:48 UTC
63 points
2 comments27 min readEA link

[Question] What are some resources (articles, videos) that show off what the current state of the art in AI is? (for a layperson who doesn’t know much about AI)

james6 Dec 2021 21:06 UTC
10 points
6 comments1 min readEA link

What’s so dangerous about AI anyway? – Or: What it means to be a superintelligence

Thomas Kehrenberg18 Jul 2022 16:14 UTC
9 points
2 comments11 min readEA link

[Cross-post] Change my mind: we should define and measure the effectiveness of advanced AI

David Johnston6 Apr 2022 0:20 UTC
4 points
0 comments7 min readEA link

“Technological unemployment” AI vs. “most important century” AI: how far apart?

Holden Karnofsky11 Oct 2022 4:50 UTC
15 points
1 comment3 min readEA link
(www.cold-takes.com)

Announcing Insights for Impact

Christian Pearson4 Jan 2023 7:00 UTC
79 points
4 comments1 min readEA link

How to use AI speech transcription and analysis to accelerate social science research

AlexanderSaeri31 Jan 2023 4:01 UTC
37 points
5 comments11 min readEA link

Should ChatGPT make us downweight our belief in the consciousness of non-human animals?

splinter18 Feb 2023 23:29 UTC
11 points
15 comments2 min readEA link

Results for a survey of tool use and workflows in alignment research

jacquesthibs19 Dec 2022 15:19 UTC
29 points
0 comments1 min readEA link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates14 Jun 2022 0:49 UTC
45 points
5 comments30 min readEA link

Why I think that teaching philosophy is high impact

Eleni_A19 Dec 2022 23:00 UTC
17 points
2 comments2 min readEA link

Air-gapping evaluation and support

Ryan Kidd26 Dec 2022 22:52 UTC
18 points
12 comments1 min readEA link

Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding

Vael Gates28 Jul 2022 21:29 UTC
126 points
6 comments6 min readEA link

AGI Safety Fundamentals programme is contracting a low-code engineer

Jamie Bernardi26 Aug 2022 15:43 UTC
39 points
4 comments5 min readEA link

We all teach: here’s how to do it better

Michael Noetel30 Sep 2022 2:06 UTC
158 points
12 comments24 min readEA link

Transcripts of interviews with AI researchers

Vael Gates9 May 2022 6:03 UTC
140 points
14 comments2 min readEA link

Concrete Steps to Get Started in Transformer Mechanistic Interpretability

Neel Nanda26 Dec 2022 13:00 UTC
18 points
0 comments12 min readEA link

Announcing an Empirical AI Safety Program

Joshc13 Sep 2022 21:39 UTC
64 points
7 comments2 min readEA link

What are some low-cost outside-the-box ways to do/fund alignment research?

trevor111 Nov 2022 5:57 UTC
2 points
3 comments1 min readEA link

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

Akash25 Nov 2022 20:47 UTC
14 points
0 comments1 min readEA link

Update on Harvard AI Safety Team and MIT AI Alignment

Xander Davies2 Dec 2022 6:09 UTC
69 points
3 comments1 min readEA link

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

Samin27 Dec 2022 11:07 UTC
35 points
10 comments1 min readEA link

AI Safety Microgrant Round

Chris Leong14 Nov 2022 4:25 UTC
81 points
1 comment3 min readEA link

Announcing aisafety.training

JJ Hepburn17 Jan 2023 1:55 UTC
107 points
4 comments1 min readEA link

How many people are working (directly) on reducing existential risk from AI?

Benjamin Hilton17 Jan 2023 14:03 UTC
117 points
3 comments4 min readEA link
(80000hours.org)

Governments pose larger risks than corporations: a brief response to Grace

David Johnston19 Oct 2022 11:54 UTC
11 points
3 comments2 min readEA link

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi15 Apr 2022 8:59 UTC
47 points
0 comments4 min readEA link

Recruiting Skilled Volunteers

The BOOM3 Nov 2022 14:36 UTC
−9 points
14 comments1 min readEA link

Part 2: AI Safety Movement Builders should help the community to optimise three factors: contributors, contributions and coordination

PeterSlattery15 Dec 2022 22:48 UTC
34 points
0 comments6 min readEA link

Why do we post our AI safety plans on the Internet?

Peter S. Park31 Oct 2022 16:27 UTC
13 points
22 comments11 min readEA link

Why EAs are skeptical about AI Safety

Lukas Trötzmüller18 Jul 2022 19:01 UTC
278 points
31 comments30 min readEA link

[Link post] How plausible are AI Takeover scenarios?

SammyDMartin27 Sep 2021 13:03 UTC
26 points
0 comments1 min readEA link

I am a Memoryless System

NicholasKross23 Oct 2022 17:36 UTC
4 points
0 comments9 min readEA link
(www.thinkingmuchbetter.com)

Roodman’s Thoughts on Biological Anchors

lukeprog14 Sep 2022 12:23 UTC
72 points
8 comments1 min readEA link
(docs.google.com)

SERI MATS Program—Winter 2022 Cohort

Ryan Kidd8 Oct 2022 19:09 UTC
50 points
5 comments1 min readEA link

Disagreement with bio anchors that lead to shorter timelines

mariushobbhahn16 Nov 2022 14:40 UTC
80 points
1 comment1 min readEA link

The next decades might be wild

mariushobbhahn15 Dec 2022 16:10 UTC
123 points
31 comments1 min readEA link

Which of these arguments for x-risk do you think we should test?

Wim9 Aug 2022 13:43 UTC
3 points
2 comments1 min readEA link

[Question] Mutual Assured Destruction used against AGI

L3opard8 Oct 2022 9:35 UTC
4 points
5 comments1 min readEA link

Pre-Announcing the 2023 Open Philanthropy AI Worldviews Contest

Jason Schukraft21 Nov 2022 21:45 UTC
290 points
26 comments1 min readEA link

Join the interpretability research hackathon

Esben Kran28 Oct 2022 16:26 UTC
48 points
0 comments5 min readEA link

Don’t worry, be happy (literally)

Yuri Zavorotny5 Oct 2022 1:55 UTC
0 points
1 comment2 min readEA link

Software engineering—Career review

Benjamin Hilton8 Feb 2022 6:11 UTC
92 points
19 comments8 min readEA link
(80000hours.org)

A tentative dialogue with a Friendly-boxed-super-AGI on brain uploads

Ramiro12 May 2022 21:55 UTC
5 points
0 comments4 min readEA link

The problem of artificial suffering

Martin Trouilloud24 Sep 2021 14:43 UTC
49 points
3 comments9 min readEA link

Intro to AI Safety

Madhav Malhotra19 Oct 2022 23:45 UTC
4 points