AI alignment

Last edit: 12 Apr 2022 14:14 UTC by Leo

The AI alignment tag is used for posts that discuss aligning AI systems with human interests, and for meta-discussion about whether that goal is worthwhile or achievable.

Evaluation

80,000 Hours rates AI alignment a “highest priority area”: a problem at the top of their ranking of global issues assessed by importance, tractability and neglectedness.[1]

Further reading

Christiano, Paul (2020) Current work in AI alignment, Effective Altruism Forum, April 3.

Shah, Rohin (2020) What’s been happening in AI alignment?, Effective Altruism Forum, July 29.

External links

AI Alignment Forum.

Related entries

AI forecasting | alignment tax | Center for Human-Compatible Artificial Intelligence | governance of artificial intelligence | Machine Intelligence Research Institute | rationality community

  1. ^

2019 AI Alignment Literature Review and Charity Comparison

Larks · 19 Dec 2019 2:58 UTC
147 points
28 comments · 62 min read · EA link

2018 AI Alignment Literature Review and Charity Comparison

Larks · 18 Dec 2018 4:48 UTC
115 points
28 comments · 63 min read · EA link

Ben Garfinkel: How sure are we about this AI stuff?

Ben Garfinkel · 9 Feb 2019 19:17 UTC
113 points
17 comments · 18 min read · EA link

AI Research Considerations for Human Existential Safety (ARCHES)

Andrew Critch · 21 May 2020 6:55 UTC
29 points
0 comments · 3 min read · EA link
(acritch.com)

AGI Safety Fundamentals curriculum and application

richard_ngo · 20 Oct 2021 21:45 UTC
121 points
20 comments · 8 min read · EA link
(docs.google.com)

Disentangling arguments for the importance of AI safety

richard_ngo · 23 Jan 2019 14:58 UTC
63 points
14 comments · 8 min read · EA link

Why I prioritize moral circle expansion over reducing extinction risk through artificial intelligence alignment

Jacy · 20 Feb 2018 18:29 UTC
97 points
72 comments · 35 min read · EA link

Delegated agents in practice: How companies might end up selling AI services that act on behalf of consumers and coalitions, and what this implies for safety research

Remmelt · 26 Nov 2020 16:39 UTC
11 points
0 comments · 4 min read · EA link

My current thoughts on MIRI’s “highly reliable agent design” work

Daniel_Dewey · 7 Jul 2017 1:17 UTC
51 points
65 comments · EA link

Why AI alignment could be hard with modern deep learning

Ajeya · 21 Sep 2021 15:35 UTC
123 points
16 comments · 14 min read · EA link
(www.cold-takes.com)

DeepMind is hiring for the Scalable Alignment and Alignment Teams

Rohin Shah · 13 May 2022 12:19 UTC
102 points
0 comments · 9 min read · EA link

2016 AI Risk Literature Review and Charity Comparison

Larks · 13 Dec 2016 4:36 UTC
53 points
22 comments · EA link

2017 AI Safety Literature Review and Charity Comparison

Larks · 20 Dec 2017 21:54 UTC
43 points
17 comments · EA link

The academic contribution to AI safety seems large

Gavin · 30 Jul 2020 10:30 UTC
115 points
28 comments · 9 min read · EA link

Hiring engineers and researchers to help align GPT-3

Paul_Christiano · 1 Oct 2020 18:52 UTC
107 points
19 comments · 3 min read · EA link

My personal cruxes for working on AI safety

Buck · 13 Feb 2020 7:11 UTC
133 points
35 comments · 44 min read · EA link

How do takeoff speeds affect the probability of bad outcomes from AGI?

KR · 7 Jul 2020 17:53 UTC
18 points
0 comments · 8 min read · EA link

Scrutinizing AI Risk (80K, #81) - v. quick summary

Louis_Dixon · 23 Jul 2020 19:02 UTC
10 points
1 comment · 3 min read · EA link

Cognitive Science/Psychology As a Neglected Approach to AI Safety

Kaj_Sotala · 5 Jun 2017 13:46 UTC
34 points
37 comments · EA link

AGI safety from first principles

richard_ngo · 21 Oct 2020 17:42 UTC
77 points
10 comments · 3 min read · EA link
(www.alignmentforum.org)

Announcing AI Safety Support

Linda Linsefors · 19 Nov 2020 20:19 UTC
53 points
0 comments · 4 min read · EA link

TAI Safety Bibliographic Database

Jess_Riedel · 22 Dec 2020 16:03 UTC
61 points
9 comments · 17 min read · EA link

AMA: Ajeya Cotra, researcher at Open Phil

Ajeya · 28 Jan 2021 17:38 UTC
84 points
107 comments · 1 min read · EA link

[Link post] Coordination challenges for preventing AI conflict

stefan.torges · 9 Mar 2021 9:39 UTC
48 points
0 comments · 1 min read · EA link
(longtermrisk.org)

Draft report on existential risk from power-seeking AI

Joe_Carlsmith · 28 Apr 2021 21:41 UTC
81 points
33 comments · 1 min read · EA link

Crazy ideas sometimes do work

Aryeh Englander · 4 Sep 2021 3:27 UTC
70 points
8 comments · 1 min read · EA link

Ngo and Yudkowsky on alignment difficulty

richard_ngo · 15 Nov 2021 22:47 UTC
71 points
13 comments · 94 min read · EA link

From language to ethics by automated reasoning

Michele Campolo · 21 Nov 2021 15:16 UTC
2 points
0 comments · 6 min read · EA link

[Question] What is most confusing to you about AI stuff?

Sam Clarke · 23 Nov 2021 16:00 UTC
25 points
16 comments · 1 min read · EA link

A tale of 2.75 orthogonality theses

Arepo · 1 May 2022 13:53 UTC
124 points
30 comments · 14 min read · EA link

[Question] What are the coolest topics in AI safety, to a hopelessly pure mathematician?

Jenny K E · 7 May 2022 7:18 UTC
81 points
29 comments · 1 min read · EA link

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res · 15 Jun 2022 14:19 UTC
39 points
2 comments · 10 min read · EA link

On Deference and Yudkowsky’s AI Risk Estimates

Ben Garfinkel · 19 Jun 2022 14:35 UTC
255 points
164 comments · 17 min read · EA link

[Question] How much EA analysis of AI safety as a cause area exists?

richard_ngo · 6 Sep 2019 11:15 UTC
90 points
20 comments · 2 min read · EA link

Relevant pre-AGI possibilities

kokotajlod · 20 Jun 2020 13:15 UTC
22 points
0 comments · 1 min read · EA link
(aiimpacts.org)

Parallels Between AI Safety by Debate and Evidence Law

Cullen_OKeefe · 20 Jul 2020 22:52 UTC
30 points
2 comments · 2 min read · EA link
(cullenokeefe.com)

[Question] How strong is the evidence of unaligned AI systems causing harm?

evelynciara · 21 Jul 2020 4:08 UTC
31 points
1 comment · 1 min read · EA link

Intellectual Diversity in AI Safety

KR · 22 Jul 2020 19:07 UTC
21 points
8 comments · 3 min read · EA link

Why the Orthogonality Thesis’s veracity is not the point:

Antoine de Scorraille · 23 Jul 2020 15:40 UTC
2 points
0 comments · 3 min read · EA link

AI Risk: Increasing Persuasion Power

kewlcats · 3 Aug 2020 20:25 UTC
4 points
0 comments · 1 min read · EA link

My Understanding of Paul Christiano’s Iterated Amplification AI Safety Research Agenda

Chi · 15 Aug 2020 19:59 UTC
34 points
3 comments · 39 min read · EA link

[Link] How understanding valence could help make future AIs safer

Milan_Griffes · 8 Oct 2020 18:53 UTC
22 points
2 comments · 3 min read · EA link

2020 AI Alignment Literature Review and Charity Comparison

Larks · 21 Dec 2020 15:25 UTC
150 points
16 comments · 68 min read · EA link

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan · 23 Dec 2020 20:10 UTC
32 points
1 comment · 1 min read · EA link

Buck Shlegeris: How I think students should orient to AI safety

EA Global · 25 Oct 2020 5:48 UTC
9 points
0 comments · 1 min read · EA link
(www.youtube.com)

Introducing The Nonlinear Fund: AI Safety research, incubation, and funding

Kat Woods · 18 Mar 2021 14:07 UTC
70 points
32 comments · 5 min read · EA link

Paul Christiano: Current work in AI alignment

EA Global · 3 Apr 2020 7:06 UTC
52 points
0 comments · 23 min read · EA link
(www.youtube.com)

Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22]

Habryka · 3 Nov 2021 18:20 UTC
140 points
7 comments · 1 min read · EA link

There should be an AI safety project board

mariushobbhahn · 14 Mar 2022 16:08 UTC
24 points
3 comments · 1 min read · EA link

We Are Conjecture, A New Alignment Research Startup

Connor Leahy · 9 Apr 2022 15:07 UTC
28 points
0 comments · 1 min read · EA link

Apply to the second ML for Alignment Bootcamp (MLAB 2) in Berkeley [Aug 15 - Fri Sept 2]

Buck · 6 May 2022 0:19 UTC
103 points
6 comments · 6 min read · EA link

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

ThomasWoodside · 9 May 2022 17:02 UTC
67 points
0 comments · 6 min read · EA link

EA, Psychology & AI Safety Research

Altruist · 26 May 2022 23:46 UTC
16 points
2 comments · 6 min read · EA link

(Even) More Early-Career EAs Should Try AI Safety Technical Research

levin · 30 Jun 2022 21:14 UTC
77 points
26 comments · 11 min read · EA link

Long-Term Future Fund: April 2019 grant recommendations

Habryka · 23 Apr 2019 7:00 UTC
142 points
242 comments · 46 min read · EA link

I’m Buck Shlegeris, I do research and outreach at MIRI, AMA

Buck · 15 Nov 2019 22:44 UTC
122 points
231 comments · 2 min read · EA link

Tech­ni­cal AGI safety re­search out­side AI

richard_ngo18 Oct 2019 15:02 UTC
86 points
5 comments3 min readEA link

I’m Cul­len O’Keefe, a Policy Re­searcher at OpenAI, AMA

Cullen_OKeefe11 Jan 2020 4:13 UTC
45 points
68 comments1 min readEA link

Align­ment Newslet­ter One Year Retrospective

Rohin Shah10 Apr 2019 7:00 UTC
62 points
22 comments21 min readEA link

[Link] EAF Re­search agenda: “Co­op­er­a­tion, Con­flict, and Trans­for­ma­tive Ar­tifi­cial In­tel­li­gence”

stefan.torges17 Jan 2020 13:28 UTC
64 points
0 comments1 min readEA link

[AN #80]: Why AI risk might be solved with­out ad­di­tional in­ter­ven­tion from longtermists

Rohin Shah3 Jan 2020 7:52 UTC
58 points
12 comments10 min readEA link
(www.alignmentforum.org)

AI Im­pacts: His­toric trends in tech­nolog­i­cal progress

Aaron Gertler12 Feb 2020 0:08 UTC
55 points
5 comments3 min readEA link

Ought: why it mat­ters and ways to help

Paul_Christiano26 Jul 2019 1:56 UTC
52 points
5 comments5 min readEA link

Crit­i­cal Re­view of ‘The Precipice’: A Re­assess­ment of the Risks of AI and Pandemics

Fods1211 May 2020 11:11 UTC
87 points
32 comments26 min readEA link

[Question] What are the challenges and prob­lems with pro­gram­ming law-break­ing con­straints into AGI?

MichaelStJules2 Feb 2020 20:53 UTC
7 points
34 comments1 min readEA link

FLI AI Align­ment pod­cast: Evan Hub­inger on In­ner Align­ment, Outer Align­ment, and Pro­pos­als for Build­ing Safe Ad­vanced AI

evhub1 Jul 2020 20:59 UTC
13 points
2 comments1 min readEA link
(futureoflife.org)

AMA or dis­cuss my 80K pod­cast epi­sode: Ben Garfinkel, FHI researcher

Ben Garfinkel13 Jul 2020 16:17 UTC
87 points
140 comments1 min readEA link

A list of good heuris­tics that the case for AI X-risk fails

Aaron Gertler16 Jul 2020 9:56 UTC
23 points
9 comments2 min readEA link
(www.alignmentforum.org)

Ro­hin Shah: What’s been hap­pen­ing in AI al­ign­ment?

EA Global29 Jul 2020 20:15 UTC
17 points
0 comments14 min readEA link
(www.youtube.com)

Is GPT-3 the death of the pa­per­clip max­i­mizer?

matthias_samwald3 Aug 2020 11:34 UTC
4 points
1 comment1 min readEA link

Con­ver­sa­tion on AI risk with Adam Gleave

AI Impacts27 Dec 2019 21:43 UTC
18 points
3 comments4 min readEA link
(aiimpacts.org)

Dis­con­tin­u­ous progress in his­tory: an update

AI Impacts17 Apr 2020 16:28 UTC
68 points
3 comments24 min readEA link

Thoughts on short timelines

Tobias_Baumann23 Oct 2018 15:59 UTC
22 points
15 commentsEA link

New re­port on how much com­pu­ta­tional power it takes to match the hu­man brain (Open Philan­thropy)

Aaron Gertler15 Sep 2020 1:06 UTC
39 points
1 comment17 min readEA link
(www.openphilanthropy.org)

Align­ing Recom­mender Sys­tems as Cause Area

IvanVendrov8 May 2019 8:56 UTC
145 points
44 comments13 min readEA link

Con­sider pay­ing me to do AI safety re­search work

Rupert5 Nov 2020 8:09 UTC
11 points
3 comments2 min readEA link

AGI Predictions

Pablo21 Nov 2020 12:02 UTC
36 points
0 comments1 min readEA link
(www.lesswrong.com)

[Question] How can I bet on short timelines?

kokotajlod7 Nov 2020 12:45 UTC
33 points
12 comments2 min readEA link

[Question] Is this a good way to bet on short timelines?

kokotajlod28 Nov 2020 14:31 UTC
17 points
16 comments1 min readEA link

Draft re­port on AI timelines

Ajeya15 Dec 2020 12:10 UTC
35 points
0 comments1 min readEA link
(alignmentforum.org)

Some AI re­search ar­eas and their rele­vance to ex­is­ten­tial safety

Andrew Critch15 Dec 2020 12:15 UTC
11 points
0 comments56 min readEA link
(alignmentforum.org)

Open Philan­thropy’s AI gov­er­nance grant­mak­ing (so far)

Aaron Gertler17 Dec 2020 12:00 UTC
63 points
0 comments6 min readEA link
(www.openphilanthropy.org)

Take­aways from safety by de­fault interviews

AI Impacts7 Apr 2020 2:01 UTC
25 points
2 comments13 min readEA link
(aiimpacts.org)

What does it mean to be­come an ex­pert in AI Hard­ware?

Christopher_Phenicie9 Jan 2021 4:15 UTC
71 points
11 comments11 min readEA link

[Question] How should we in­vest in “long-term short-ter­mism” given the like­li­hood of trans­for­ma­tive AI?

James_Banks12 Jan 2021 23:54 UTC
7 points
0 comments1 min readEA link

Shar­ing the World with Digi­tal Minds

Aaron Gertler1 Dec 2020 8:00 UTC
12 points
1 comment1 min readEA link
(www.nickbostrom.com)

What Should the Aver­age EA Do About AI Align­ment?

Raemon25 Feb 2017 20:07 UTC
31 points
41 commentsEA link

Op­por­tu­ni­ties for in­di­vi­d­ual donors in AI safety

alexflint12 Mar 2018 2:10 UTC
13 points
13 commentsEA link

Quan­tify­ing the Far Fu­ture Effects of Interventions

MichaelDickens18 May 2016 2:15 UTC
8 points
1 commentEA link

Po­ten­tial Risks from Ad­vanced AI

EA Global13 Aug 2017 7:00 UTC
8 points
0 comments17 min readEA link

What does (and doesn’t) AI mean for effec­tive al­tru­ism?

EA Global12 Aug 2017 7:00 UTC
8 points
0 comments12 min readEA link

Some global catas­trophic risk estimates

Tamay10 Feb 2021 19:32 UTC
105 points
14 comments1 min readEA link

Three Im­pacts of Ma­chine Intelligence

Paul_Christiano23 Aug 2013 10:10 UTC
33 points
4 comments8 min readEA link
(rationalaltruist.com)

Some promis­ing ca­reer ideas be­yond 80,000 Hours’ pri­or­ity paths

Ardenlk26 Jun 2020 10:34 UTC
140 points
28 comments15 min readEA link

Jesse Clif­ton: Open-source learn­ing — a bar­gain­ing approach

EA Global18 Oct 2019 18:05 UTC
9 points
0 comments1 min readEA link
(www.youtube.com)

How to build a safe ad­vanced AI (Evan Hub­inger) | What’s up in AI safety? (Asya Ber­gal)

EA Global25 Oct 2020 5:48 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Ar­tifi­cial in­tel­li­gence ca­reer stories

EA Global25 Oct 2020 6:56 UTC
11 points
0 comments1 min readEA link
(www.youtube.com)

Tan Zhi Xuan: AI al­ign­ment, philo­soph­i­cal plu­ral­ism, and the rele­vance of non-Western philosophy

EA Global21 Nov 2020 8:12 UTC
12 points
1 comment1 min readEA link
(www.youtube.com)

AGI risk: analo­gies & arguments

Gavin23 Mar 2021 13:18 UTC
31 points
3 comments8 min readEA link
(www.gleech.org)

He­len Toner: The Open Philan­thropy Pro­ject’s work on AI risk

EA Global3 Nov 2017 7:43 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Owain Evans and Vic­to­ria Krakovna: Ca­reers in tech­ni­cal AI safety

EA Global3 Nov 2017 7:43 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Owen Cot­ton-Bar­ratt: What does (and doesn’t) AI mean for effec­tive al­tru­ism?

EA Global11 Aug 2017 8:19 UTC
8 points
0 comments12 min readEA link
(www.youtube.com)

Katja Grace: AI safety

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Daniel Dewey: The Open Philan­thropy Pro­ject’s work on po­ten­tial risks from ad­vanced AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments17 min readEA link
(www.youtube.com)

Jan Leike, He­len Toner, Malo Bour­gon, and Miles Brundage: Work­ing in AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Michael Page, Dario Amodei, He­len Toner, Tasha McCauley, Jan Leike, & Owen Cot­ton-Bar­ratt: Mus­ings on AI

EA Global11 Aug 2017 8:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Max Teg­mark: Risks and benefits of ad­vanced ar­tifi­cial intelligence

EA Global5 Aug 2016 9:19 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

An­drew Critch: Log­i­cal in­duc­tion — progress in AI alignment

EA Global6 Aug 2016 0:40 UTC
6 points
0 comments1 min readEA link
(www.youtube.com)

Co­her­ence ar­gu­ments im­ply a force for goal-di­rected behavior

Katja_Grace6 Apr 2021 21:44 UTC
19 points
1 comment14 min readEA link
(worldspiritsockpuppet.com)

On AI and Compute

johncrox3 Apr 2019 21:26 UTC
39 points
12 comments8 min readEA link

AI al­ign­ment prize win­ners and next round [link]

RyanCarey20 Jan 2018 12:07 UTC
7 points
1 commentEA link

Atari early

AI Impacts2 Apr 2020 23:28 UTC
34 points
2 comments5 min readEA link
(aiimpacts.org)

[Question] Is there ev­i­dence that recom­mender sys­tems are chang­ing users’ prefer­ences?

zdgroff12 Apr 2021 19:11 UTC
61 points
15 comments1 min readEA link

Why I ex­pect suc­cess­ful (nar­row) alignment

Tobias_Baumann29 Dec 2018 15:46 UTC
18 points
10 commentsEA link
(s-risks.org)

Ma­hen­dra Prasad: Ra­tional group de­ci­sion-making

EA Global8 Jul 2020 15:06 UTC
14 points
0 comments16 min readEA link
(www.youtube.com)

[Question] Brief sum­mary of key dis­agree­ments in AI Risk

Aryeh Englander26 Dec 2019 19:40 UTC
31 points
3 comments1 min readEA link

[Question] What con­sid­er­a­tions in­fluence whether I have more in­fluence over short or long timelines?

kokotajlod5 Nov 2020 19:57 UTC
18 points
0 comments1 min readEA link

AGI in a vuln­er­a­ble world

AI Impacts2 Apr 2020 3:43 UTC
17 points
0 comments1 min readEA link
(aiimpacts.org)

Three kinds of competitiveness

AI Impacts2 Apr 2020 3:46 UTC
10 points
0 comments5 min readEA link
(aiimpacts.org)

In­for­ma­tion se­cu­rity ca­reers for GCR reduction

ClaireZabel20 Jun 2019 23:56 UTC
185 points
34 comments8 min readEA link

Nat­u­ral­ism and AI alignment

Michele Campolo24 Apr 2021 16:20 UTC
12 points
3 comments7 min readEA link

Why AI is Harder Than We Think—Me­lanie Mitchell

evelynciara28 Apr 2021 8:19 UTC
43 points
6 comments2 min readEA link
(arxiv.org)

In­for­mat­ica: Spe­cial Is­sue on Superintelligence

RyanCarey3 May 2017 5:05 UTC
7 points
0 commentsEA link

Eric Drexler: Pare­to­topian goal alignment

EA Global15 Mar 2019 14:51 UTC
6 points
0 comments10 min readEA link
(www.youtube.com)

[Question] What harm could AI safety do?

SeanEngelhart15 May 2021 1:11 UTC
12 points
9 comments1 min readEA link

[Question] Why should we *not* put effort into AI safety re­search?

Ben Thompson16 May 2021 5:11 UTC
15 points
5 comments1 min readEA link

Pre­dict re­sponses to the “ex­is­ten­tial risk from AI” survey

RobBensinger28 May 2021 1:38 UTC
36 points
8 comments2 min readEA link

Long-Term Fu­ture Fund: May 2021 grant recommendations

abergal27 May 2021 6:44 UTC
110 points
17 comments57 min readEA link

Fi­nal Re­port of the Na­tional Se­cu­rity Com­mis­sion on Ar­tifi­cial In­tel­li­gence (NSCAI, 2021)

MichaelA1 Jun 2021 8:19 UTC
51 points
3 comments4 min readEA link
(www.nscai.gov)

“Ex­is­ten­tial risk from AI” sur­vey results

RobBensinger1 Jun 2021 20:19 UTC
78 points
36 comments11 min readEA link

Sur­vey on AI ex­is­ten­tial risk scenarios

Sam Clarke8 Jun 2021 17:12 UTC
148 points
5 comments7 min readEA link

Some AI Gover­nance Re­search Ideas

MarkusAnderljung3 Jun 2021 10:51 UTC
90 points
5 comments2 min readEA link

[Question] What is an ex­am­ple of re­cent, tan­gible progress in AI safety re­search?

Aaron Gertler14 Jun 2021 5:29 UTC
35 points
4 comments1 min readEA link

[Question] The pos­i­tive case for a fo­cus on achiev­ing safe AI?

vipulnaik25 Jun 2021 4:01 UTC
41 points
1 comment1 min readEA link

Mauhn Re­leases AI Safety Documentation

Berg Severens2 Jul 2021 12:19 UTC
4 points
2 comments1 min readEA link

Get­ting started in­de­pen­dently in AI Safety

JJ Hepburn6 Jul 2021 15:20 UTC
39 points
10 comments2 min readEA link

A Sim­ple Model of AGI De­ploy­ment Risk

djbinder9 Jul 2021 9:44 UTC
16 points
0 comments4 min readEA link

Paul Chris­ti­ano on how OpenAI is de­vel­op­ing real solu­tions to the ‘AI al­ign­ment prob­lem’, and his vi­sion of how hu­man­ity will pro­gres­sively hand over de­ci­sion-mak­ing to AI systems

80000_Hours2 Oct 2018 11:49 UTC
6 points
0 comments185 min readEA link

How Do AI Timelines Affect Giv­ing Now vs. Later?

MichaelDickens3 Aug 2021 3:36 UTC
33 points
8 comments7 min readEA link

Fore­cast­ing Trans­for­ma­tive AI: What Kind of AI?

Holden Karnofsky10 Aug 2021 21:38 UTC
60 points
2 comments10 min readEA link

[Question] What are the top pri­ori­ties in a slow-take­off, mul­ti­po­lar world?

JP Addison25 Aug 2021 8:47 UTC
26 points
9 comments1 min readEA link

A mesa-op­ti­miza­tion per­spec­tive on AI valence and moral patienthood

jacobpfau9 Sep 2021 22:23 UTC
10 points
18 comments33 min readEA link

The Me­taethics and Nor­ma­tive Ethics of AGI Value Align­ment: Many Ques­tions, Some Implications

Dario Citrini15 Sep 2021 19:05 UTC
21 points
0 comments8 min readEA link

[Question] What kind of event, tar­geted to un­der­grad­u­ate CS ma­jors, would be most effec­tive at get­ting peo­ple to work on AI safety?

Caleb Biddulph19 Sep 2021 16:19 UTC
9 points
1 comment1 min readEA link

An­nounc­ing the Vi­talik Bu­terin Fel­low­ships in AI Ex­is­ten­tial Safety!

DanielFilan21 Sep 2021 0:41 UTC
62 points
0 comments1 min readEA link
(grants.futureoflife.org)

[Question] Why aren’t you freak­ing out about OpenAI? At what point would you start?

AppliedDivinityStudies10 Oct 2021 13:06 UTC
71 points
22 comments2 min readEA link

[Question] Is it crunch time yet? If so, who can help?

NicholasKross13 Oct 2021 4:11 UTC
29 points
9 comments1 min readEA link

An ML safety in­surance com­pany—shower thoughts

EdoArad18 Oct 2021 7:45 UTC
15 points
4 comments1 min readEA link

[Creative Writ­ing Con­test] An AI Safety Limerick

Ben_West18 Oct 2021 19:11 UTC
21 points
5 comments1 min readEA link

Truth­ful AI

Owen Cotton-Barratt20 Oct 2021 15:11 UTC
55 points
15 comments10 min readEA link

BERI is hiring an ML Soft­ware Engineer

sawyer10 Nov 2021 19:36 UTC
17 points
2 comments1 min readEA link

Dis­cus­sion with Eliezer Yud­kowsky on AGI interventions

RobBensinger11 Nov 2021 3:21 UTC
60 points
35 comments34 min readEA link

[Question] What would you do if you had a lot of money/​power/​in­fluence and you thought that AI timelines were very short?

Greg_Colbourn12 Nov 2021 21:59 UTC
29 points
10 comments1 min readEA link

“Slower tech de­vel­op­ment” can be about or­der­ing, grad­u­al­ness, or dis­tance from now

MichaelA14 Nov 2021 20:58 UTC
31 points
3 comments5 min readEA link

How to get tech­nolog­i­cal knowl­edge on AI/​ML (for non-tech peo­ple)

Leonie Koessler30 Jun 2021 7:53 UTC
61 points
6 comments5 min readEA link

Ngo and Yud­kowsky on AI ca­pa­bil­ity gains

richard_ngo19 Nov 2021 1:54 UTC
23 points
4 comments39 min readEA link

Yud­kowsky and Chris­ti­ano dis­cuss “Take­off Speeds”

EliezerYudkowsky22 Nov 2021 19:42 UTC
42 points
0 comments60 min readEA link

AI Safety Needs Great Engineers

Andy Jones23 Nov 2021 21:03 UTC
89 points
12 comments4 min readEA link

Chris­ti­ano, Co­tra, and Yud­kowsky on AI progress

Ajeya25 Nov 2021 16:30 UTC
18 points
6 comments68 min readEA link

Red­wood Re­search is hiring for sev­eral roles

Jack R29 Nov 2021 0:18 UTC
75 points
0 comments1 min readEA link

Soares, Tal­linn, and Yud­kowsky dis­cuss AGI cognition

EliezerYudkowsky29 Nov 2021 17:28 UTC
15 points
0 comments40 min readEA link

Syd­ney AI Safety Fellowship

Chris Leong2 Dec 2021 7:35 UTC
16 points
0 comments2 min readEA link

EA megapro­jects continued

mariushobbhahn3 Dec 2021 10:33 UTC
172 points
47 comments7 min readEA link

AI Safety: Ap­ply­ing to Grad­u­ate Studies

frances_lorenz15 Dec 2021 22:56 UTC
21 points
0 comments12 min readEA link

In­tro­duc­ing the Prin­ci­ples of In­tel­li­gent Be­havi­our in Biolog­i­cal and So­cial Sys­tems (PIBBSS) Fellowship

adamShimi18 Dec 2021 15:25 UTC
28 points
4 comments10 min readEA link

[Question] Should the EA com­mu­nity have a DL en­g­ineer­ing fel­low­ship?

PabloAMC24 Dec 2021 13:43 UTC
26 points
6 comments1 min readEA link

13 Very Differ­ent Stances on AGI

Ozzie Gooen27 Dec 2021 23:30 UTC
69 points
27 comments3 min readEA link

In­creased Availa­bil­ity and Willing­ness for De­ploy­ment of Re­sources for Effec­tive Altru­ism and Long-Termism

Evan_Gaensbauer29 Dec 2021 20:20 UTC
45 points
1 comment2 min readEA link

Con­sider try­ing the ELK con­test (I am)

Holden Karnofsky5 Jan 2022 19:42 UTC
110 points
18 comments16 min readEA link

Ac­tion: Help ex­pand fund­ing for AI Safety by co­or­di­nat­ing on NSF response

Evan R. Murphy20 Jan 2022 20:48 UTC
20 points
7 comments3 min readEA link

[linkpost] Shar­ing pow­er­ful AI mod­els: the emerg­ing paradigm of struc­tured access

tobyshevlane20 Jan 2022 21:10 UTC
10 points
3 comments1 min readEA link

[Question] Is a ca­reer in mak­ing AI sys­tems more se­cure a mean­ingful way to miti­gate the X-risk posed by AGI?

Kyle O’Brien13 Feb 2022 7:05 UTC
14 points
4 comments1 min readEA link

[Linkpost] How To Get Into In­de­pen­dent Re­search On Align­ment/​Agency

Jackson Wagner14 Feb 2022 21:40 UTC
10 points
0 comments1 min readEA link

Ngo and Yud­kowsky on sci­en­tific rea­son­ing and pivotal acts

EliezerYudkowsky21 Feb 2022 17:00 UTC
33 points
1 comment35 min readEA link

We’re Aligned AI, we’re aiming to al­ign AI

Stuart Armstrong21 Feb 2022 10:43 UTC
63 points
8 comments3 min readEA link

Chris­ti­ano and Yud­kowsky on AI pre­dic­tions and hu­man intelligence

EliezerYudkowsky23 Feb 2022 16:51 UTC
31 points
0 comments42 min readEA link

Im­por­tant, ac­tion­able re­search ques­tions for the most im­por­tant century

Holden Karnofsky24 Feb 2022 16:34 UTC
250 points
14 comments19 min readEA link

How I Formed My Own Views About AI Safety

Neel Nanda27 Feb 2022 18:52 UTC
117 points
12 comments14 min readEA link
(www.neelnanda.io)

AI views and dis­agree­ments AMA: Chris­ti­ano, Ngo, Shah, Soares, Yudkowsky

RobBensinger1 Mar 2022 1:13 UTC
30 points
5 comments1 min readEA link
(www.lesswrong.com)

Shah and Yud­kowsky on al­ign­ment failures

EliezerYudkowsky28 Feb 2022 19:25 UTC
20 points
7 comments92 min readEA link

AGI x-risk timelines: 10% chance (by year X) es­ti­mates should be the head­line, not 50%.

Greg_Colbourn1 Mar 2022 12:02 UTC
63 points
22 comments2 min readEA link

[Question] Is trans­for­ma­tive AI the biggest ex­is­ten­tial risk? Why or why not?

evelynciara5 Mar 2022 3:54 UTC
9 points
11 comments1 min readEA link

On pre­sent­ing the case for AI risk

Aryeh Englander8 Mar 2022 21:37 UTC
113 points
12 comments4 min readEA link

Twit­ter-length re­sponses to 24 AI al­ign­ment arguments

RobBensinger14 Mar 2022 19:34 UTC
67 points
17 comments8 min readEA link

[Question] Ca­reer Ad­vice: Philos­o­phy + Pro­gram­ming → AI Safety

tcelferact18 Mar 2022 15:09 UTC
29 points
11 comments2 min readEA link

Med­i­ta­tions on ca­reers in AI Safety

PabloAMC23 Mar 2022 22:00 UTC
88 points
35 comments2 min readEA link

Emer­gent Ven­tures AI

Gavin8 Apr 2022 22:08 UTC
23 points
0 comments1 min readEA link
(marginalrevolution.com)

SERI ML Align­ment The­ory Schol­ars Pro­gram 2022

Ryan Kidd27 Apr 2022 16:33 UTC
57 points
2 comments3 min readEA link

Law-Fol­low­ing AI 1: Se­quence In­tro­duc­tion and Structure

Cullen_OKeefe27 Apr 2022 17:16 UTC
23 points
0 comments9 min readEA link

Law-Fol­low­ing AI 2: In­tent Align­ment + Su­per­in­tel­li­gence → Lawless AI (By De­fault)

Cullen_OKeefe27 Apr 2022 17:18 UTC
15 points
0 comments6 min readEA link

Law-Fol­low­ing AI 3: Lawless AI Agents Un­der­mine Sta­bi­liz­ing Agreements

Cullen_OKeefe27 Apr 2022 17:20 UTC
21 points
1 comment3 min readEA link

Messy per­sonal stuff that af­fected my cause pri­ori­ti­za­tion (or: how I started to care about AI safety)

Julia_Wise5 May 2022 17:59 UTC
257 points
14 comments2 min readEA link

The case for be­com­ing a black-box in­ves­ti­ga­tor of lan­guage models

Buck6 May 2022 14:37 UTC
86 points
7 comments3 min readEA link

[Question] What are your recom­men­da­tions for tech­ni­cal AI al­ign­ment pod­casts?

Evan_Gaensbauer11 May 2022 21:52 UTC
13 points
4 comments1 min readEA link

[Question] I’m in­ter­view­ing Max Teg­mark about AI safety and more. What shouId I ask him?

Robert_Wiblin13 May 2022 15:32 UTC
18 points
2 comments1 min readEA link

SERI ML ap­pli­ca­tion dead­line is ex­tended un­til May 22.

Viktoria Malyasova22 May 2022 0:13 UTC
13 points
3 comments1 min readEA link

We should ex­pect to worry more about spec­u­la­tive risks

Ben Garfinkel29 May 2022 21:08 UTC
117 points
15 comments3 min readEA link

How to pur­sue a ca­reer in tech­ni­cal AI alignment

CharlieRS4 Jun 2022 21:36 UTC
186 points
6 comments40 min readEA link

Steer­ing AI to care for an­i­mals, and soon

Andrew Critch14 Jun 2022 1:13 UTC
171 points
37 comments1 min readEA link

Key Papers in Lan­guage Model Safety

aogara20 Jun 2022 14:59 UTC
12 points
0 comments22 min readEA link

7 es­says on Build­ing a Bet­ter Future

Jamie_Harris24 Jun 2022 14:28 UTC
19 points
0 comments2 min readEA link

What suc­cess looks like

mariushobbhahn28 Jun 2022 14:30 UTC
74 points
16 comments19 min readEA link

$500 bounty for al­ign­ment con­test ideas

Akash30 Jun 2022 1:55 UTC
18 points
1 comment2 min readEA link

An­nounc­ing the Har­vard AI Safety Team

Alexander Davies30 Jun 2022 18:34 UTC
103 points
2 comments5 min readEA link

AI safety uni­ver­sity groups: a promis­ing op­por­tu­nity to re­duce ex­is­ten­tial risk

mic30 Jun 2022 18:37 UTC
32 points
1 comment14 min readEA link

Quick sur­vey on AI al­ign­ment resources

frances_lorenz30 Jun 2022 19:08 UTC
14 points
0 comments1 min readEA link

AI Gover­nance Ca­reer Paths for Europeans

careersthrowaway16 May 2020 6:40 UTC
72 points
1 comment12 min readEA link

[Link and com­men­tary] Beyond Near- and Long-Term: Towards a Clearer Ac­count of Re­search Pri­ori­ties in AI Ethics and Society

MichaelA14 Mar 2020 9:04 UTC
17 points
0 comments6 min readEA link

[Question] How do you talk about AI safety?

evelynciara19 Apr 2020 16:15 UTC
10 points
5 comments1 min readEA link

Database of ex­is­ten­tial risk estimates

MichaelA15 Apr 2020 12:43 UTC
106 points
36 comments5 min readEA link

AI Benefits Post 1: In­tro­duc­ing “AI Benefits”

Cullen_OKeefe22 Jun 2020 16:58 UTC
10 points
2 comments3 min readEA link

AI Benefits Post 2: How AI Benefits Differs from AI Align­ment & AI for Good

Cullen_OKeefe29 Jun 2020 16:59 UTC
9 points
0 comments2 min readEA link

Book re­view: Ar­chi­tects of In­tel­li­gence by Martin Ford (2018)

ofer11 Aug 2020 17:24 UTC
11 points
1 comment2 min readEA link

An­i­mal Rights, The Sin­gu­lar­ity, and Astro­nom­i­cal Suffering

sapphire20 Aug 2020 20:23 UTC
46 points
0 comments3 min readEA link

Sin­ga­pore’s Tech­ni­cal AI Align­ment Re­search Ca­reer Guide

Yi-Yang26 Aug 2020 8:09 UTC
32 points
7 comments8 min readEA link

Asya Ber­gal: Rea­sons you might think hu­man-level AI is un­likely to hap­pen soon

EA Global26 Aug 2020 16:01 UTC
23 points
2 comments18 min readEA link
(www.youtube.com)

A course for the gen­eral pub­lic on AI

LeandroD31 Aug 2020 1:29 UTC
1 point
0 comments1 min readEA link

Does gen­er­al­ity pay? GPT-3 can provide pre­limi­nary ev­i­dence.

evelynciara12 Jul 2020 18:53 UTC
21 points
4 comments2 min readEA link

Short-Term AI Align­ment as a Pri­or­ity Cause

len.hoang.lnh11 Feb 2020 16:22 UTC
17 points
11 comments7 min readEA link

[Question] Are so­cial me­dia al­gorithms an ex­is­ten­tial risk?

BarryGrimes15 Sep 2020 8:52 UTC
24 points
13 comments1 min readEA link

Sum­mary of Stu­art Rus­sell’s new book, “Hu­man Com­pat­i­ble”

Rohin Shah19 Oct 2019 19:56 UTC
31 points
1 comment15 min readEA link
(www.alignmentforum.org)

Feed­back Re­quest on EA Philip­pines’ Ca­reer Ad­vice Re­search for Tech­ni­cal AI Safety

BrianTan3 Oct 2020 10:39 UTC
18 points
5 comments4 min readEA link

AI risk hub in Sin­ga­pore?

kokotajlod29 Oct 2020 11:51 UTC
23 points
3 comments4 min readEA link

fic­tion about AI risk

Ann Garth12 Nov 2020 22:36 UTC
8 points
1 comment1 min readEA link

[Question] Donat­ing against Short Term AI risks

Jan-WillemvanPutten16 Nov 2020 12:23 UTC
6 points
10 comments1 min readEA link

How Rood­man’s GWP model trans­lates to TAI timelines

kokotajlod16 Nov 2020 14:11 UTC
22 points
0 comments2 min readEA link

Com­pet­i­tive Ethics

mwcvitkovic24 Nov 2020 1:00 UTC
20 points
7 comments4 min readEA link

Defend­ing against Ad­ver­sar­ial Poli­cies in Re­in­force­ment Learn­ing with Alter­nat­ing Training

sergia12 Feb 2022 15:59 UTC
4 points
0 comments13 min readEA link

Long-Term Fu­ture Fund: Ask Us Any­thing!

AdamGleave3 Dec 2020 13:44 UTC
89 points
154 comments1 min readEA link

[Question] Can we con­vince peo­ple to work on AI safety with­out con­vinc­ing them about AGI hap­pen­ing this cen­tury?

BrianTan26 Nov 2020 14:46 UTC
8 points
3 comments2 min readEA link

Cen­tre for the Study of Ex­is­ten­tial Risk Four Month Re­port June—Septem­ber 2020

HaydnBelfield2 Dec 2020 18:33 UTC
24 points
0 comments18 min readEA link

LessWrong is now a book, available for pre-or­der!

jacobjacob4 Dec 2020 20:42 UTC
48 points
1 comment10 min readEA link

De­fus­ing AGI Danger

Mark Xu24 Dec 2020 23:08 UTC
22 points
0 comments9 min readEA link
(www.alignmentforum.org)

Against GDP as a met­ric for timelines and take­off speeds

kokotajlod29 Dec 2020 17:50 UTC
41 points
6 comments14 min readEA link

Euro­pean Master’s Pro­grams in Ma­chine Learn­ing, Ar­tifi­cial In­tel­li­gence, and re­lated fields

Master Programs ML/AI17 Jan 2021 20:09 UTC
17 points
4 comments1 min readEA link

Birds, Brains, Planes, and AI: Against Ap­peals to the Com­plex­ity/​Mys­te­ri­ous­ness/​Effi­ciency of the Brain

kokotajlod18 Jan 2021 12:39 UTC
27 points
2 comments1 min readEA link

13 Re­cent Publi­ca­tions on Ex­is­ten­tial Risk (Jan 2021 up­date)

HaydnBelfield8 Feb 2021 12:42 UTC
7 points
2 comments10 min readEA link

Stu­art Rus­sell Hu­man Com­pat­i­ble AI Roundtable with Allan Dafoe, Rob Re­ich, & Ma­ri­etje Schaake

Mahendra Prasad11 Feb 2021 7:43 UTC
16 points
0 comments1 min readEA link

In­ter­view with Tom Chivers: “AI is a plau­si­ble ex­is­ten­tial risk, but it feels as if I’m in Pas­cal’s mug­ging”

felix.h21 Feb 2021 13:41 UTC
16 points
1 comment7 min readEA link

Work­ing at EA or­ga­ni­za­tions se­ries: Ma­chine In­tel­li­gence Re­search Institute

SoerenMind1 Nov 2015 12:49 UTC
8 points
0 commentsEA link

AI Fore­cast­ing Dic­tionary (Fore­cast­ing in­fras­truc­ture, part 1)

jacobjacob8 Aug 2019 13:16 UTC
18 points
0 comments5 min readEA link

AI Fore­cast­ing Re­s­olu­tion Coun­cil (Fore­cast­ing in­fras­truc­ture, part 2)

jacobjacob29 Aug 2019 17:43 UTC
28 points
0 comments3 min readEA link

AI Fore­cast­ing Ques­tion Database (Fore­cast­ing in­fras­truc­ture, part 3)

jacobjacob3 Sep 2019 14:57 UTC
23 points
1 comment4 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 1

Fods1213 Dec 2018 5:10 UTC
22 points
13 commentsEA link

Cri­tique of Su­per­in­tel­li­gence Part 2

Fods1213 Dec 2018 5:12 UTC
9 points
12 commentsEA link

Cri­tique of Su­per­in­tel­li­gence Part 3

Fods1213 Dec 2018 5:13 UTC
3 points
5 commentsEA link

Cri­tique of Su­per­in­tel­li­gence Part 4

Fods1213 Dec 2018 5:14 UTC
4 points
2 commentsEA link

Cri­tique of Su­per­in­tel­li­gence Part 5

Fods1213 Dec 2018 5:19 UTC
12 points
2 commentsEA link

Re­port on Semi-in­for­ma­tive Pri­ors for AI timelines (Open Philan­thropy)

Tom_Davidson26 Mar 2021 17:46 UTC
62 points
6 comments2 min readEA link

The flaws that make to­day’s AI ar­chi­tec­ture un­safe and a new ap­proach that could fix it

80000_Hours22 Jun 2020 22:15 UTC
3 points
0 comments86 min readEA link
(80000hours.org)

Three Bi­ases That Made Me Believe in AI Risk

beth​13 Feb 2019 23:22 UTC
39 points
20 comments3 min readEA link

But ex­actly how com­plex and frag­ile?

Katja_Grace13 Dec 2019 7:05 UTC
36 points
3 comments3 min readEA link
(meteuphoric.com)

Con­fused about AI re­search as a means of ad­dress­ing AI risk

Eli Rose21 Feb 2019 0:07 UTC
31 points
15 comments1 min readEA link

Changes in fund­ing in the AI safety field

Sebastian_Farquhar3 Feb 2017 13:09 UTC
34 points
10 commentsEA link

AI Align­ment 2018-2019 Review

Habryka28 Jan 2020 21:14 UTC
28 points
0 comments6 min readEA link
(www.lesswrong.com)

A con­ver­sa­tion with Ro­hin Shah

AI Impacts12 Nov 2019 1:31 UTC
27 points
8 comments33 min readEA link
(aiimpacts.org)

“Tak­ing AI Risk Se­ri­ously” – Thoughts by An­drew Critch

Raemon19 Nov 2018 2:21 UTC
26 points
9 commentsEA link
(www.lesswrong.com)

Amanda Askell: AI safety needs so­cial scientists

EA Global4 Mar 2019 15:50 UTC
26 points
0 comments18 min readEA link
(www.youtube.com)

[Link] Thiel on GCRs

Milan_Griffes22 Jul 2019 20:47 UTC
28 points
11 comments1 min readEA link

Cortés, Pizarro, and Afonso as Prece­dents for Takeover

AI Impacts2 Mar 2020 12:25 UTC
27 points
17 comments11 min readEA link
(aiimpacts.org)

The first AI Safety Camp & onwards

Remmelt7 Jun 2018 18:49 UTC
25 points
2 commentsEA link

In­tro to car­ing about AI al­ign­ment as an EA cause

So8res14 Apr 2017 0:42 UTC
24 points
12 commentsEA link

Im­pli­ca­tions of Quan­tum Com­put­ing for Ar­tifi­cial In­tel­li­gence al­ign­ment re­search (ABRIDGED)

Jaime Sevilla5 Sep 2019 14:56 UTC
25 points
4 comments2 min readEA link

Are Hu­mans ‘Hu­man Com­pat­i­ble’?

Matt Boyd6 Dec 2019 5:49 UTC
23 points
8 comments4 min readEA link

A re­sponse to Matthews on AI Risk

RyanCarey11 Aug 2015 12:58 UTC
11 points
17 commentsEA link

What can the prin­ci­pal-agent liter­a­ture tell us about AI risk?

Alexis Carlier10 Feb 2020 10:10 UTC
26 points
1 comment16 min readEA link

Sup­port­ing global co­or­di­na­tion in AI de­vel­op­ment: Why and how to con­tribute to in­ter­na­tional AI standards

pcihon17 Apr 2019 22:17 UTC
21 points
4 comments1 min readEA link

AI safety schol­ar­ships look worth-fund­ing (if other fund­ing is sane)

anon-a19 Nov 2019 0:59 UTC
22 points
6 comments2 min readEA link

AI Safety Ca­reer Bot­tle­necks Sur­vey Re­sponses Responses

Linda Linsefors28 May 2021 10:41 UTC
34 points
1 comment5 min readEA link

[3-hour pod­cast]: Joseph Car­l­smith on longter­mism, utopia, the com­pu­ta­tional power of the brain, meta-ethics, illu­sion­ism and meditation

Gus Docker27 Jul 2021 13:18 UTC
34 points
2 comments1 min readEA link

[Question] 1h-vol­un­teers needed for a small AI Safety-re­lated re­search pro­ject

PabloAMC16 Aug 2021 17:51 UTC
4 points
0 comments1 min readEA link

[Question] How to get more aca­demics en­thu­si­as­tic about do­ing AI Safety re­search?

PabloAMC4 Sep 2021 14:10 UTC
25 points
19 comments1 min readEA link

List of AI safety courses and resources

Daniel del Castillo6 Sep 2021 14:26 UTC
47 points
3 comments1 min readEA link

[Question] Is work­ing on AI safety as dan­ger­ous as ig­nor­ing it?

jkmh20 Sep 2021 23:06 UTC
8 points
5 comments1 min readEA link

On Solv­ing Prob­lems Be­fore They Ap­pear: The Weird Episte­molo­gies of Alignment

adamShimi11 Oct 2021 8:21 UTC
28 points
0 comments15 min readEA link

AI Risk in Africa

Claude Formanek12 Oct 2021 2:28 UTC
16 points
0 comments10 min readEA link

[Creative Writ­ing Con­test] The Puppy Problem

Louis13 Oct 2021 14:01 UTC
11 points
0 comments7 min readEA link

[Creative Writ­ing Con­test] Me­tal or Mortal

Louis16 Oct 2021 16:24 UTC
7 points
0 comments7 min readEA link

Pod­cast: Krister Bykvist on moral un­cer­tainty, ra­tio­nal­ity, metaethics, AI and fu­ture pop­u­la­tions

Gus Docker21 Oct 2021 15:17 UTC
8 points
0 comments1 min readEA link
(www.utilitarianpodcast.com)

[Dis­cus­sion] Best in­tu­ition pumps for AI safety

mariushobbhahn6 Nov 2021 8:11 UTC
10 points
8 comments1 min readEA link

Visi­ble Thoughts Pro­ject and Bounty Announcement

So8res30 Nov 2021 0:35 UTC
35 points
2 comments13 min readEA link

Con­tribute by fa­cil­i­tat­ing the AGI Safety Fun­da­men­tals Programme

Jamie Bernardi6 Dec 2021 11:50 UTC
27 points
0 comments2 min readEA link

[Question] What “defense lay­ers” should gov­ern­ments, AI labs, and busi­nesses use to pre­vent catas­trophic AI failures?

alexlintz3 Dec 2021 14:24 UTC
36 points
3 comments1 min readEA link

En­abling more feedback

JJ Hepburn10 Dec 2021 6:52 UTC
40 points
3 comments3 min readEA link

My Overview of the AI Align­ment Land­scape: A Bird’s Eye View

Neel Nanda15 Dec 2021 23:46 UTC
43 points
15 comments16 min readEA link
(www.alignmentforum.org)

[Ex­tended Dead­line: Jan 23rd] An­nounc­ing the PIBBSS Sum­mer Re­search Fellowship

nora18 Dec 2021 16:54 UTC
36 points
1 comment1 min readEA link

In­tro­duc­ing a New Course on the Eco­nomics of AI

akorinek21 Dec 2021 4:55 UTC
83 points
6 comments2 min readEA link

AGI al­ign­ment re­sults from a se­ries of al­igned ac­tions

Nicolas27 Dec 2021 19:33 UTC
15 points
1 comment6 min readEA link

What is the role of Bayesian ML for AI al­ign­ment/​safety?

mariushobbhahn11 Jan 2022 8:07 UTC
37 points
6 comments3 min readEA link

PIBBSS Fel­low­ship: Bounty for Refer­rals & Dead­line Extension

Anna_Gajdova17 Jan 2022 16:23 UTC
17 points
7 comments1 min readEA link

AI ac­cel­er­a­tion from a safety per­spec­tive: Trade-offs and con­sid­er­a­tions

mariushobbhahn19 Jan 2022 9:44 UTC
12 points
1 comment7 min readEA link

[Question] Anal­ogy of AI Align­ment as Rais­ing a Child?

Aaron_Scher19 Feb 2022 21:40 UTC
4 points
2 comments1 min readEA link

Re: Some thoughts on veg­e­tar­i­anism and veganism

Fai25 Feb 2022 20:43 UTC
47 points
3 comments8 min readEA link

New Speaker Series on AI Align­ment Start­ing March 3

Zechen Zhang26 Feb 2022 10:58 UTC
5 points
0 comments1 min readEA link

Be­ing an in­di­vi­d­ual al­ign­ment grantmaker

A_donor28 Feb 2022 16:39 UTC
33 points
20 comments2 min readEA link

AI Value Align­ment Speaker Series Pre­sented By EA Berkeley

Mahendra Prasad1 Mar 2022 6:17 UTC
2 points
0 comments1 min readEA link

Pre­serv­ing and con­tin­u­ing al­ign­ment re­search through a se­vere global catastrophe

A_donor6 Mar 2022 18:43 UTC
38 points
12 comments5 min readEA link

“In­tro to brain-like-AGI safety” se­ries—halfway point!

Steven Byrnes9 Mar 2022 15:21 UTC
8 points
0 comments2 min readEA link

EA Berkeley Pre­sents: Univer­sal Own­er­ship: Is In­dex In­vest­ing the New So­cially Re­spon­si­ble In­vest­ing?

Mahendra Prasad10 Mar 2022 6:58 UTC
7 points
0 comments1 min readEA link

E.A. Me­gapro­ject Ideas

Tomer_Goloboy21 Mar 2022 1:23 UTC
15 points
3 comments4 min readEA link

De­sir­able? AI qualities

brb24321 Mar 2022 22:05 UTC
5 points
0 comments3 min readEA link

AI Safety Overview: CERI Sum­mer Re­search Fellowship

Jamie Bernardi24 Mar 2022 15:12 UTC
29 points
0 comments2 min readEA link

The role of academia in AI Safety.

PabloAMC28 Mar 2022 0:04 UTC
71 points
19 comments3 min readEA link

AI safety starter pack

mariushobbhahn28 Mar 2022 16:05 UTC
93 points
9 comments5 min readEA link

Can we simu­late hu­man evolu­tion to cre­ate a some­what al­igned AGI?

Thomas Kwa29 Mar 2022 1:23 UTC
19 points
0 comments7 min readEA link

Com­mu­nity Build­ing for Grad­u­ate Stu­dents: A Tar­geted Approach

Neil Crawford29 Mar 2022 19:47 UTC
12 points
0 comments3 min readEA link

How Josiah be­came an AI safety researcher

Neil Crawford29 Mar 2022 19:47 UTC
9 points
0 comments1 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

MichaelStJules3 Oct 2021 16:55 UTC
33 points
18 comments1 min readEA link

[Question] Is it valuable to the field of AI Safety to have a neu­ro­science back­ground?

Samuel Nellessen3 Apr 2022 19:44 UTC
17 points
3 comments1 min readEA link

What Should We Op­ti­mize—A Conversation

Johannes C. Mayer7 Apr 2022 14:48 UTC
1 point
0 comments14 min readEA link

A tough ca­reer decision

PabloAMC9 Apr 2022 0:46 UTC
65 points
13 comments4 min readEA link

Our Cur­rent Direc­tions in Mechanis­tic In­ter­pretabil­ity Re­search (AI Align­ment Speaker Series)

Group Organizer8 Apr 2022 17:08 UTC
3 points
0 comments1 min readEA link

Ought’s the­ory of change

stuhlmueller12 Apr 2022 0:09 UTC
40 points
4 comments3 min readEA link

Red­wood Re­search is hiring for sev­eral roles (Oper­a­tions and Tech­ni­cal)

JJXWang14 Apr 2022 15:23 UTC
45 points
0 comments1 min readEA link

Beg­ging, Plead­ing AI Orgs to Com­ment on NIST AI Risk Man­age­ment Framework

Bridges15 Apr 2022 19:35 UTC
86 points
4 comments2 min readEA link

[Question] Why not offer a multi-mil­lion /​ billion dol­lar prize for solv­ing the Align­ment Prob­lem?

Aryeh Englander17 Apr 2022 16:08 UTC
15 points
9 comments1 min readEA link

[Closed] Hiring a math­e­mat­i­cian to work on the learn­ing-the­o­retic AI al­ign­ment agenda

Vanessa19 Apr 2022 6:49 UTC
52 points
4 comments2 min readEA link

Skil­ling-up in ML Eng­ineer­ing for Align­ment: re­quest for comments

Callum McDougall24 Apr 2022 6:40 UTC
8 points
0 comments1 min readEA link

Key ques­tions about ar­tifi­cial sen­tience: an opinionated guide

rgb25 Apr 2022 13:42 UTC
89 points
2 comments18 min readEA link

Stu­dent pro­ject for en­gag­ing with AI alignment

Per Ivar Friborg9 May 2022 10:44 UTC
30 points
1 comment1 min readEA link

AI Align­ment YouTube Playlists

jacquesthibs9 May 2022 21:31 UTC
16 points
2 comments1 min readEA link

New se­ries of posts an­swer­ing one of Holden’s “Im­por­tant, ac­tion­able re­search ques­tions”

Evan R. Murphy12 May 2022 21:22 UTC
9 points
0 comments1 min readEA link

[Link post] Promis­ing Paths to Align­ment—Con­nor Leahy | Talk

frances_lorenz14 May 2022 15:58 UTC
16 points
0 comments1 min readEA link

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

frances_lorenz16 May 2022 21:19 UTC
127 points
13 comments6 min readEA link

LW4EA: Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

Jeremy17 May 2022 3:05 UTC
11 points
1 comment1 min readEA link
(www.lesswrong.com)

We Ran an AI Timelines Retreat

Lenny McCline17 May 2022 4:40 UTC
46 points
6 comments3 min readEA link

Ap­pendix to Bridg­ing Demonstration

MakoYass1 Jun 2022 20:30 UTC
13 points
0 comments30 min readEA link

You Un­der­stand AI Align­ment and How to Make Soup

Leen Armoush28 May 2022 6:22 UTC
0 points
2 comments5 min readEA link

Ca­reer re­view—Data col­lec­tion for AI alignment

Benjamin Hilton3 Jun 2022 11:44 UTC
34 points
1 comment5 min readEA link
(80000hours.org)

AGI Safety Com­mu­ni­ca­tions Initiative

Ines11 Jun 2022 16:30 UTC
27 points
4 comments1 min readEA link

Ex­pected eth­i­cal value of a ca­reer in AI safety

Jordan Taylor14 Jun 2022 14:25 UTC
33 points
16 comments13 min readEA link

Align Hu­mans to Ra­tion­al­ity?

Scytale15 Jun 2022 10:34 UTC
1 point
0 comments10 min readEA link

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKee17 Jun 2022 11:52 UTC
24 points
0 comments2 min readEA link

‘Force mul­ti­pli­ers’ for EA research

Craig Drayton18 Jun 2022 13:39 UTC
17 points
4 comments4 min readEA link

A Quick List of Some Prob­lems in AI Align­ment As A Field

NicholasKross21 Jun 2022 17:09 UTC
15 points
10 comments6 min readEA link
(www.thinkingmuchbetter.com)

Fol­low along with Columbia EA’s Ad­vanced AI Safety Fel­low­ship!

RohanS2 Jul 2022 6:07 UTC
20 points
0 comments2 min readEA link

The Tree of Life: Stan­ford AI Align­ment The­ory of Change

Gabriel Mukobi2 Jul 2022 18:32 UTC
39 points
4 comments14 min readEA link