
AI safety


AI safety is the study of ways to reduce risks posed by artificial intelligence.

Interventions that aim to reduce these risks can be split into technical AI safety research (see AI alignment) and AI governance.

Reading on why AI might be an existential risk

Hilton, Benjamin (2023) Preventing an AI-related catastrophe, 80,000 Hours, March.

Cotra, Ajeya (2022) Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover, Effective Altruism Forum, July 18.

Carlsmith, Joseph (2022) Is Power-Seeking AI an Existential Risk?, arXiv, June 16.

Yudkowsky, Eliezer (2022) AGI Ruin: A List of Lethalities, LessWrong, June 5.

Ngo, Richard et al. (2023) The alignment problem from a deep learning perspective, arXiv, February 23.

Arguments against AI safety

AI safety and AI risk are sometimes dismissed as a Pascal’s Mugging [1], the implication being that the risks are tiny and that, for any stated level of ignorable risk, the payoff could be exaggerated until mitigation still looked like a top priority. One response is that in a survey of 700 ML researchers, the median answer to the question of “the probability that the long-run effect of advanced AI on humanity will be ‘extremely bad (e.g., human extinction)’” was 5%, with 48% of respondents giving 10% or higher.[2] These probabilities are too high to be considered Pascalian: a genuinely Pascalian probability would be at least five orders of magnitude smaller, on the order of 5 × 10⁻⁷ or below.

Further reading on arguments against AI safety

Grace, Katja (2022) Counterarguments to the basic AI x-risk case, EA Forum, October 14.

Garfinkel, Ben (2020) Scrutinising classic AI risk arguments, 80,000 Hours Podcast, July 9.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research as a “priority path”, among the most promising career opportunities the organization has identified so far.[3][4] Richard Ngo and Holden Karnofsky also offer advice for those interested in working on AI safety.[5][6]

Further reading

Gates, Vael (2022) Resources I send to AI researchers about AI safety, Effective Altruism Forum, June 13.

Krakovna, Victoria (2017) Introductory resources on AI safety research, Victoria Krakovna’s Blog, October 19.

Ngo, Richard (2019) Disentangling arguments for the importance of AI safety, Effective Altruism Forum, January 21.

Rice, Issa & Vipul Naik (2024) Timeline of AI safety, Timelines Wiki.

Related entries

AI alignment | AI governance | AI forecasting | AI takeoff | AI race | Economics of artificial intelligence | AI interpretability | AI risk | cooperative AI | building the field of AI safety

  1. ^

    https://twitter.com/amasad/status/1632121317146361856 A tweet by the CEO of Replit, a coding organisation involved in ML tools.

  2. ^

    Grace, Katja et al. (2022) 2022 Expert Survey on Progress in AI, AI Impacts.

  3. ^

    Todd, Benjamin (2023) The highest impact career paths our research has identified so far, 80,000 Hours, May 12.

  4. ^

    Hilton, Benjamin (2023) AI safety technical research, 80,000 Hours, June 19.

  5. ^

    Ngo, Richard (2023) AGI safety career advice, EA Forum, May 2.

  6. ^

    Karnofsky, Holden (2023) Jobs that can help with the most important century, EA Forum, February 12.

An­nounc­ing the Win­ners of the 2023 Open Philan­thropy AI Wor­ld­views Contest

Jason Schukraft30 Sep 2023 3:51 UTC
74 points
30 comments2 min readEA link

High-level hopes for AI alignment

Holden Karnofsky20 Dec 2022 2:11 UTC
123 points
14 comments19 min readEA link
(www.cold-takes.com)

Re­sources I send to AI re­searchers about AI safety

Vael Gates11 Jan 2023 1:24 UTC
43 points
0 comments1 min readEA link

AI safety needs to scale, and here’s how you can do it

Esben Kran2 Feb 2024 7:17 UTC
33 points
2 comments5 min readEA link
(apartresearch.com)

Chilean AIS Hackathon Retrospective

Agustín Covarrubias 🔸9 May 2023 1:34 UTC
67 points
0 comments5 min readEA link

FLI open let­ter: Pause gi­ant AI experiments

Zach Stein-Perlman29 Mar 2023 4:04 UTC
220 points
38 comments2 min readEA link
(futureoflife.org)

Katja Grace: Let’s think about slow­ing down AI

peterhartree23 Dec 2022 0:57 UTC
84 points
6 comments2 min readEA link
(worldspiritsockpuppet.substack.com)

Fill out this cen­sus of ev­ery­one in­ter­ested in re­duc­ing catas­trophic AI risks

AHT18 May 2024 15:53 UTC
105 points
1 comment1 min readEA link

An­nounc­ing AI Safety Bulgaria

Aleksandar Angelov3 Mar 2024 17:53 UTC
16 points
0 comments1 min readEA link

Launch­ing ap­pli­ca­tions for AI Safety Ca­reers Course In­dia 2024

varun_agr1 May 2024 5:30 UTC
23 points
1 comment1 min readEA link

In­ter­ested in work­ing from a new Bos­ton AI Safety Hub?

Topaz17 Mar 2025 13:32 UTC
25 points
0 comments2 min readEA link

Me­tac­u­lus Launches Fu­ture of AI Series, Based on Re­search Ques­tions by Arb

christian13 Mar 2024 21:14 UTC
34 points
0 comments1 min readEA link
(www.metaculus.com)

AI Safety Europe Re­treat 2023 Retrospective

Magdalena Wache14 Apr 2023 9:05 UTC
41 points
10 comments2 min readEA link

An­nounc­ing the Euro­pean Net­work for AI Safety (ENAIS)

Esben Kran22 Mar 2023 17:57 UTC
124 points
3 comments3 min readEA link

Digi­tal sen­tience fund­ing op­por­tu­ni­ties: Sup­port for ap­plied work and research

zdgroff28 May 2025 17:35 UTC
115 points
0 comments4 min readEA link

The Shut­down Prob­lem: In­com­plete Prefer­ences as a Solution

EJT23 Feb 2024 16:01 UTC
26 points
0 comments42 min readEA link

Pre­dictable up­dat­ing about AI risk

Joe_Carlsmith8 May 2023 22:05 UTC
135 points
12 comments36 min readEA link

A Qual­i­ta­tive Case for LTFF: Filling Crit­i­cal Ecosys­tem Gaps

Linch3 Dec 2024 21:57 UTC
89 points
26 comments9 min readEA link

How Stu­art Rus­sels’s IASEAI con­fer­ence failed to live up to its potential

gergo7 Aug 2025 13:15 UTC
7 points
4 comments2 min readEA link

How AI Takeover Might Hap­pen in Two Years

Joshc7 Feb 2025 23:51 UTC
35 points
7 comments29 min readEA link
(x.com)

Wet­ware’s De­fault: A Di­ag­no­sis of Sys­temic My­opia un­der AI-Driven Autonomy

Ihor Ivliev3 Jul 2025 23:21 UTC
1 point
0 comments7 min readEA link

MIRI’s 2024 End-of-Year Update

RobBensinger3 Dec 2024 4:33 UTC
32 points
7 comments4 min readEA link

Be­ware Epistemic Collapse

Ben Norman18 Aug 2025 10:44 UTC
29 points
2 comments8 min readEA link
(futuresonder.substack.com)

Why Si­mu­la­tor AIs want to be Ac­tive In­fer­ence AIs

Jan_Kulveit11 Apr 2023 9:06 UTC
22 points
0 comments8 min readEA link
(www.lesswrong.com)

Long list of AI ques­tions

NunoSempere6 Dec 2023 11:12 UTC
124 points
16 comments86 min readEA link

My cover story in Ja­cobin on AI cap­i­tal­ism and the x-risk debates

Garrison12 Feb 2024 23:34 UTC
154 points
10 comments6 min readEA link
(jacobin.com)

‘GiveWell for AI Safety’: Les­sons learned in a week

Lydia Nottingham30 May 2025 16:10 UTC
45 points
1 comment6 min readEA link

[Linkpost] State­ment from Scar­lett Jo­hans­son on OpenAI’s use of the “Sky” voice, that was shock­ingly similar to her own voice.

Linch20 May 2024 23:50 UTC
46 points
8 comments1 min readEA link
(variety.com)

Fund­ing case: AI Safety Camp 10

Remmelt12 Dec 2023 9:05 UTC
45 points
13 comments5 min readEA link
(manifund.org)

We are not alone: many com­mu­ni­ties want to stop Big Tech from scal­ing un­safe AI

Remmelt22 Sep 2023 17:38 UTC
28 points
30 comments4 min readEA link

Win­ners of the Es­say com­pe­ti­tion on the Au­toma­tion of Wis­dom and Philosophy

Owen Cotton-Barratt29 Oct 2024 0:02 UTC
37 points
2 comments30 min readEA link
(blog.aiimpacts.org)

Where I Am Donat­ing in 2024

MichaelDickens19 Nov 2024 0:09 UTC
180 points
73 comments46 min readEA link

From Ther­apy Tool to Align­ment Puz­zle-Piece: In­tro­duc­ing the VSPE Framework

Astelle Kay18 Jun 2025 14:47 UTC
6 points
1 comment2 min readEA link

“Near Mid­night in Suicide City”

Greg_Colbourn ⏸️ 6 Dec 2024 19:54 UTC
5 points
0 comments1 min readEA link
(www.youtube.com)

AISN #45: Cen­ter for AI Safety 2024 Year in Review

Center for AI Safety19 Dec 2024 18:14 UTC
11 points
0 comments4 min readEA link
(newsletter.safe.ai)

AI for An­i­mals 2025 Con­fer­ence—Get Early Bird Tick­ets Now

Constance Li20 Nov 2024 0:53 UTC
47 points
0 comments1 min readEA link

Con­sider grant­ing AIs freedom

Matthew_Barnett6 Dec 2024 0:55 UTC
100 points
38 comments5 min readEA link

A case for donat­ing to AI risk re­duc­tion (in­clud­ing if you work in AI)

tlevin2 Dec 2024 19:05 UTC
118 points
5 comments3 min readEA link

Trans­for­ma­tive AI and An­i­mals: An­i­mal Ad­vo­cacy Un­der A Post-Work Society

Kevin Xia 🔸25 May 2025 18:32 UTC
64 points
1 comment8 min readEA link

An­nounc­ing the Q1 2025 Long-Term Fu­ture Fund grant round

Linch20 Dec 2024 2:17 UTC
53 points
12 comments2 min readEA link

Dona­tion recom­men­da­tions for xrisk + ai safety

vincentweisser6 Feb 2023 21:25 UTC
17 points
11 comments1 min readEA link

Please vote for PauseAI US in the Dona­tion Elec­tion!

Holly Elmore ⏸️ 🔸22 Nov 2024 4:12 UTC
21 points
3 comments2 min readEA link

Evolu­tion pro­vides no ev­i­dence for the sharp left turn

Quintin Pope11 Apr 2023 18:48 UTC
43 points
2 comments15 min readEA link

Re­boot­ing the Singularity

cdkg16 Jul 2025 18:27 UTC
44 points
5 comments1 min readEA link
(philpapers.org)

Sym­bio­sis, not al­ign­ment, as the goal for liberal democ­ra­cies in the tran­si­tion to ar­tifi­cial gen­eral intelligence

simonfriederich17 Mar 2023 13:04 UTC
18 points
2 comments24 min readEA link
(rdcu.be)

Im­pact of Quan­ti­za­tion on Small Lan­guage Models (SLMs) for Mul­tilin­gual Math­e­mat­i­cal Rea­son­ing Tasks

Angie Paola Giraldo7 May 2025 21:48 UTC
11 points
0 comments14 min readEA link

[Question] Seek­ing sug­gested read­ings & videos for a new course on ‘AI and Psy­chol­ogy’

Geoffrey Miller20 May 2024 17:45 UTC
32 points
8 comments1 min readEA link

Four mind­set dis­agree­ments be­hind ex­is­ten­tial risk dis­agree­ments in ML

RobBensinger11 Apr 2023 4:53 UTC
61 points
2 comments9 min readEA link

Is AI sen­tience already a re­al­ity?

S1 Jun 2025 2:23 UTC
4 points
2 comments1 min readEA link

Prevent­ing an AI-re­lated catas­tro­phe—Prob­lem profile

Benjamin Hilton29 Aug 2022 18:49 UTC
138 points
18 comments4 min readEA link
(80000hours.org)

Sam Alt­man and the Cross­roads of AI Power: Can We Trust the Fu­ture We’re Build­ing?

Kayode Adekoya23 May 2025 15:39 UTC
0 points
0 comments1 min readEA link

De­cep­tive Align­ment is <1% Likely by Default

DavidW21 Feb 2023 15:07 UTC
54 points
26 comments14 min readEA link

Nav­i­gat­ing the New Real­ity in DC: An EIP Primer

IanDavidMoss20 Dec 2024 16:59 UTC
26 points
1 comment13 min readEA link
(effectiveinstitutionsproject.substack.com)

Cos­mic AI safety

Magnus Vinding6 Dec 2024 22:32 UTC
24 points
5 comments6 min readEA link

Against Aschen­bren­ner: How ‘Si­tu­a­tional Aware­ness’ con­structs a nar­ra­tive that un­der­mines safety and threat­ens humanity

GideonF15 Jul 2024 16:21 UTC
238 points
22 comments21 min readEA link

The Choice Transition

Owen Cotton-Barratt18 Nov 2024 12:32 UTC
49 points
1 comment15 min readEA link
(strangecities.substack.com)

AI al­ign­ment re­searchers may have a com­par­a­tive ad­van­tage in re­duc­ing s-risks

Lukas_Gloor15 Feb 2023 13:01 UTC
79 points
5 comments13 min readEA link

Vael Gates: Risks from Highly-Ca­pable AI (March 2023)

Vael Gates1 Apr 2023 20:54 UTC
31 points
4 comments1 min readEA link
(docs.google.com)

AISafety.info “How can I help?” FAQ

StevenKaas5 Jun 2023 22:09 UTC
48 points
1 comment2 min readEA link

Pro­ject ideas: Epistemics

Lukas Finnveden4 Jan 2024 7:26 UTC
43 points
1 comment17 min readEA link
(www.forethought.org)

New Busi­ness Wars pod­cast sea­son on Sam Alt­man and OpenAI

Eevee🔹2 Apr 2024 6:22 UTC
10 points
0 comments1 min readEA link
(wondery.com)

1-year up­date on im­pactRIO, the first AI Safety group in Brazil

João Lucas Duim28 Jun 2024 10:59 UTC
56 points
2 comments10 min readEA link

Re­quest to AGI or­ga­ni­za­tions: Share your views on paus­ing AI progress

Akash11 Apr 2023 17:30 UTC
85 points
1 comment1 min readEA link

Some Things I Heard about AI Gover­nance at EAG

utilistrutil28 Feb 2023 21:27 UTC
35 points
5 comments6 min readEA link

AI Risk & Policy Fore­casts from Me­tac­u­lus & FLI’s AI Path­ways Workshop

Will Aldred16 May 2023 8:53 UTC
41 points
0 comments8 min readEA link

INTELLECT-1 Re­lease: The First Globally Trained 10B Pa­ram­e­ter Model

Matrice Jacobine29 Nov 2024 23:03 UTC
2 points
1 comment1 min readEA link
(www.primeintellect.ai)

Anti-‘FOOM’ (stop try­ing to make your cute pet name the thing)

david_reinstein14 Apr 2023 16:05 UTC
41 points
17 comments2 min readEA link

[Question] Why hasn’t there been any sig­nifi­cant AI protest

sammyboiz🔸17 May 2024 2:59 UTC
21 points
14 comments1 min readEA link

Two im­por­tant re­cent AI Talks- Ge­bru and Lazar

GideonF6 Mar 2023 1:30 UTC
−7 points
5 comments1 min readEA link

Sam Alt­man re­turn­ing as OpenAI CEO “in prin­ci­ple”

Fermi–Dirac Distribution22 Nov 2023 6:15 UTC
55 points
37 comments1 min readEA link

Which in­cen­tives should be used to en­courage com­pli­ance with UK AI leg­is­la­tion?

jcw18 Nov 2024 18:13 UTC
12 points
0 comments12 min readEA link

A short con­ver­sa­tion I had with Google Gem­ini on the dan­gers of un­reg­u­lated LLM API use, while mildly drunk in an air­port.

EvanMcCormick17 Dec 2024 12:25 UTC
1 point
0 comments8 min readEA link

To the Bat Mo­bile!! My Mid-Ca­reer Tran­si­tion into AI Safety

Moneer7 Nov 2024 15:59 UTC
17 points
0 comments3 min readEA link

OpenAI in­tro­duces func­tion call­ing for GPT-4

mic20 Jun 2023 1:58 UTC
26 points
0 comments4 min readEA link
(openai.com)

AI do­ing philos­o­phy = AI gen­er­at­ing hands?

Wei Dai15 Jan 2024 9:04 UTC
68 points
7 comments3 min readEA link

AI Safety Ac­tion Plan—A re­port com­mis­sioned by the US State Department

Agustín Covarrubias 🔸11 Mar 2024 22:13 UTC
25 points
1 comment1 min readEA link
(www.gladstone.ai)

AI-nu­clear in­te­gra­tion: ev­i­dence of au­toma­tion bias from hu­mans and LLMs [re­search sum­mary]

Tao27 Apr 2024 21:59 UTC
17 points
2 comments12 min readEA link

An­nounc­ing Fore­castBench, a new bench­mark for AI and hu­man fore­cast­ing abilities

Forecasting Research Institute1 Oct 2024 12:31 UTC
20 points
1 comment3 min readEA link
(arxiv.org)

Join­ing the Carnegie En­dow­ment for In­ter­na­tional Peace

Holden Karnofsky29 Apr 2024 15:45 UTC
228 points
14 comments2 min readEA link

Par­tial value takeover with­out world takeover

Katja_Grace18 Apr 2024 3:00 UTC
24 points
2 comments3 min readEA link

NYT: Google will ‘re­cal­ibrate’ the risk of re­leas­ing AI due to com­pe­ti­tion with OpenAI

Michael Huang22 Jan 2023 2:13 UTC
173 points
8 comments1 min readEA link
(www.nytimes.com)

Data Tax­a­tion: A Pro­posal for Slow­ing Down AGI Progress

Per Ivar Friborg11 Apr 2023 17:27 UTC
42 points
6 comments12 min readEA link

[Linkpost] 538 Poli­tics Pod­cast on AI risk & politics

jackva11 Apr 2023 17:03 UTC
64 points
5 comments1 min readEA link
(fivethirtyeight.com)

Cor­po­rate cam­paigns work: a key learn­ing for AI Safety

Jamie_Harris17 Aug 2023 21:35 UTC
72 points
12 comments6 min readEA link

Hooray for step­ping out of the limelight

So8res1 Apr 2023 2:45 UTC
103 points
0 comments1 min readEA link

In favour of ex­plor­ing nag­ging doubts about x-risk

Owen Cotton-Barratt25 Jun 2024 23:52 UTC
90 points
15 comments2 min readEA link

The “low-hang­ing fruits” of AI safety

Julian Nalenz19 Dec 2024 13:38 UTC
−1 points
0 comments6 min readEA link
(blog.hermesloom.org)

Ten ar­gu­ments that AI is an ex­is­ten­tial risk

Katja_Grace14 Aug 2024 21:51 UTC
30 points
0 comments7 min readEA link

2021 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks23 Dec 2021 14:06 UTC
176 points
18 comments73 min readEA link

Anal­ogy Bank for AI Safety

utilistrutil29 Jan 2024 2:35 UTC
14 points
5 comments8 min readEA link

NIST Seeks Com­ments On “Safety Con­sid­er­a­tions for Chem­i­cal and/​or Biolog­i­cal AI Models”

Dylan Richardson26 Oct 2024 18:28 UTC
15 points
0 comments1 min readEA link
(www.federalregister.gov)

Did Ben­gio and Teg­mark lose a de­bate about AI x-risk against LeCun and Mitchell?

Karl von Wendt25 Jun 2023 16:59 UTC
80 points
24 comments7 min readEA link

Fund­ing AI Safety poli­ti­cal ad­vo­cacy in the US: In­di­vi­d­ual donors and small dona­tions may be es­pe­cially helpful

Holly Elmore ⏸️ 🔸14 Nov 2023 23:14 UTC
64 points
8 comments1 min readEA link

Timelines are short, p(doom) is high: a global stop to fron­tier AI de­vel­op­ment un­til x-safety con­sen­sus is our only rea­son­able hope

Greg_Colbourn ⏸️ 12 Oct 2023 11:24 UTC
76 points
85 comments9 min readEA link

Merger of Deep­Mind and Google Brain

Greg_Colbourn ⏸️ 20 Apr 2023 20:16 UTC
11 points
12 comments1 min readEA link
(blog.google)

An­nounc­ing the AI Fables Writ­ing Con­test!

Daystar Eld12 Jul 2023 3:04 UTC
76 points
52 comments3 min readEA link

Slim overview of work one could do to make AI go bet­ter (and a grab-bag of other ca­reer con­sid­er­a­tions)

Chi20 Mar 2024 23:17 UTC
34 points
1 comment3 min readEA link

AI Safety Im­pact Mar­kets: Your Char­ity Eval­u­a­tor for AI Safety

Dawn Drescher1 Oct 2023 10:47 UTC
28 points
4 comments6 min readEA link
(impactmarkets.substack.com)

Shut­ting down all com­pet­ing AI pro­jects might not buy a lot of time due to In­ter­nal Time Pressure

ThomasCederborg3 Oct 2024 0:05 UTC
6 points
1 comment12 min readEA link

2/​3 Aussie & NZ AI Safety folk of­ten or some­times feel lonely or dis­con­nected (and 16 other bar­ri­ers to im­pact)

yanni kyriacos1 Aug 2024 1:14 UTC
19 points
11 comments8 min readEA link

Among the A.I. Doom­say­ers—The New Yorker

Agustín Covarrubias 🔸11 Mar 2024 21:12 UTC
66 points
0 comments1 min readEA link
(www.newyorker.com)

AI al­ign­ment, hu­man al­ign­ment, oh my

MilesW31 Oct 2024 3:23 UTC
−12 points
0 comments2 min readEA link

Claude Doesn’t Want to Die

Garrison5 Mar 2024 6:00 UTC
22 points
14 comments10 min readEA link
(garrisonlovely.substack.com)

The Tech In­dus­try is the Biggest Blocker to Mean­ingful AI Safety Regulations

Garrison16 Aug 2024 19:37 UTC
140 points
8 comments8 min readEA link
(garrisonlovely.substack.com)

Cy­borg Pe­ri­ods: There will be mul­ti­ple AI transitions

Jan_Kulveit22 Feb 2023 16:09 UTC
68 points
1 comment6 min readEA link

An AI crash is our best bet for re­strict­ing AI

Remmelt11 Oct 2024 2:12 UTC
20 points
3 comments1 min readEA link

But why would the AI kill us?

So8res17 Apr 2023 19:38 UTC
45 points
3 comments3 min readEA link

Fu­ture Mat­ters #8: Bing Chat, AI labs on safety, and paus­ing Fu­ture Matters

Pablo21 Mar 2023 14:50 UTC
81 points
5 comments24 min readEA link

Whether you should do a PhD doesn’t de­pend much on timelines.

alex lawsen22 Mar 2023 12:25 UTC
67 points
7 comments4 min readEA link

Try to solve the hard parts of the al­ign­ment problem

MikhailSamin11 Jul 2023 17:02 UTC
8 points
0 comments5 min readEA link

[Question] What’s the best way to get a sense of the day-to-day ac­tivi­ties of differ­ent re­searchers/​re­search di­rec­tions? (AI Gover­nance)

Luise27 May 2024 12:48 UTC
15 points
1 comment1 min readEA link

EU poli­cy­mak­ers reach an agree­ment on the AI Act

tlevin15 Dec 2023 6:03 UTC
109 points
13 comments7 min readEA link

Prob­lem-solv­ing tasks in Graph The­ory for lan­guage mod­els

Bruno López Orozco1 Oct 2024 12:36 UTC
21 points
1 comment9 min readEA link

Dario Amodei — Machines of Lov­ing Grace

Matrice Jacobine11 Oct 2024 21:39 UTC
66 points
0 comments1 min readEA link
(darioamodei.com)

AI strat­egy given the need for good reflection

Owen Cotton-Barratt18 Mar 2024 0:48 UTC
40 points
1 comment5 min readEA link

[Question] What is the cur­rent most rep­re­sen­ta­tive EA AI x-risk ar­gu­ment?

Matthew_Barnett15 Dec 2023 22:04 UTC
117 points
50 comments3 min readEA link

[Question] Can we train AI so that fu­ture philan­thropy is more effec­tive?

Ricardo Pimentel3 Nov 2024 15:08 UTC
3 points
0 comments1 min readEA link

My lab’s small AI safety agenda

Jobst Heitzig (vodle.it)18 Jun 2023 12:29 UTC
59 points
26 comments3 min readEA link

Train for in­cor­rigi­bil­ity, then re­verse it (Shut­down Prob­lem Con­test Sub­mis­sion)

Daniel_Eth18 Jul 2023 8:26 UTC
16 points
0 comments2 min readEA link

Propos­ing the Con­di­tional AI Safety Treaty (linkpost TIME)

Otto15 Nov 2024 13:56 UTC
12 points
6 comments3 min readEA link
(time.com)

De­con­fus­ing Pauses: Long Term Mo­ra­to­rium vs Slow­ing AI

GideonF4 Aug 2024 11:32 UTC
17 points
3 comments5 min readEA link

“Aligned with who?” Re­sults of sur­vey­ing 1,000 US par­ti­ci­pants on AI values

Holly Morgan21 Mar 2023 22:07 UTC
41 points
0 comments2 min readEA link
(www.lesswrong.com)

[Linkpost] Given Ex­tinc­tion Wor­ries, Why Don’t AI Re­searchers Quit? Well, Sev­eral Reasons

Daniel_Eth6 Jun 2023 7:31 UTC
25 points
6 comments1 min readEA link
(medium.com)

AISC 2024 - Pro­ject Summaries

Nicky Pochinkov27 Nov 2023 22:35 UTC
13 points
1 comment18 min readEA link

[Question] Con­crete, ex­ist­ing ex­am­ples of high-im­pact risks from AI?

freedomandutility15 Apr 2023 22:19 UTC
9 points
1 comment1 min readEA link

FLI re­port: Poli­cy­mak­ing in the Pause

Zach Stein-Perlman15 Apr 2023 17:01 UTC
29 points
4 comments1 min readEA link
(futureoflife.org)

Co­or­di­na­tion by com­mon knowl­edge to pre­vent un­con­trol­lable AI

Karl von Wendt14 May 2023 13:37 UTC
14 points
0 comments9 min readEA link

AI Safety Camp 10

Robert Kralisch26 Oct 2024 11:36 UTC
15 points
0 comments18 min readEA link
(www.lesswrong.com)

Why some peo­ple dis­agree with the CAIS state­ment on AI

David_Moss15 Aug 2023 13:39 UTC
144 points
15 comments16 min readEA link

A fresh­man year dur­ing the AI midgame: my ap­proach to the next year

Buck14 Apr 2023 0:38 UTC
179 points
30 comments7 min readEA link

Sen­tience In­sti­tute 2021 End of Year Summary

Ali26 Nov 2021 14:40 UTC
66 points
5 comments6 min readEA link
(www.sentienceinstitute.org)

Cur­rent UK gov­ern­ment lev­ers on AI development

rosehadshar10 Apr 2023 13:16 UTC
82 points
3 comments4 min readEA link

Jan Leike: “I’m ex­cited to join @An­throp­icAI to con­tinue the su­per­al­ign­ment mis­sion!”

defun 🔸28 May 2024 18:08 UTC
35 points
11 comments1 min readEA link
(x.com)

Videos on the world’s most press­ing prob­lems, by 80,000 Hours

Bella21 Mar 2024 20:18 UTC
63 points
5 comments2 min readEA link

Bounty for Ev­i­dence on Some of Pal­isade Re­search’s Beliefs

bwr23 Sep 2024 20:05 UTC
5 points
0 comments1 min readEA link

Break­through in AI agents? (On Devin—The Zvi, linkpost)

SiebeRozendal20 Mar 2024 9:43 UTC
16 points
9 comments1 min readEA link
(thezvi.substack.com)

Oper­a­tional­iz­ing timelines

Zach Stein-Perlman10 Mar 2023 17:30 UTC
30 points
2 comments3 min readEA link

When “hu­man-level” is the wrong thresh­old for AI

Ben Millwood🔸22 Jun 2024 14:34 UTC
38 points
3 comments7 min readEA link

Pro­ject ideas: Backup plans & Co­op­er­a­tive AI

Lukas Finnveden4 Jan 2024 7:26 UTC
25 points
2 comments13 min readEA link
(www.forethought.org)

The mar­ket plau­si­bly ex­pects AI soft­ware to cre­ate trillions of dol­lars of value by 2027

Benjamin_Todd6 May 2024 5:16 UTC
88 points
19 comments1 min readEA link
(benjamintodd.substack.com)

Public Weights?

Jeff Kaufman 🔸2 Nov 2023 2:51 UTC
20 points
7 comments3 min readEA link

Shap­ing Poli­cies for Eth­i­cal AI Devel­op­ment in Africa

Kuiyaki16 May 2024 14:15 UTC
3 points
0 comments1 min readEA link

AI Can Help An­i­mal Ad­vo­cacy More Than It Can Help In­dus­trial Farming

Wladimir J. Alonso26 Nov 2024 9:55 UTC
23 points
10 comments4 min readEA link

AGI Catas­tro­phe and Takeover: Some Refer­ence Class-Based Priors

zdgroff24 May 2023 19:14 UTC
95 points
10 comments6 min readEA link

AI Win­ter Sea­son at EA Hotel

CEEALAR25 Sep 2024 13:36 UTC
57 points
2 comments1 min readEA link

AGI safety ca­reer advice

richard_ngo2 May 2023 7:36 UTC
211 points
20 comments13 min readEA link

Non-al­ign­ment pro­ject ideas for mak­ing trans­for­ma­tive AI go well

Lukas Finnveden4 Jan 2024 7:23 UTC
66 points
1 comment3 min readEA link
(www.forethought.org)

How to help cru­cial AI safety leg­is­la­tion pass with 10 min­utes of effort

ThomasW11 Sep 2024 19:14 UTC
258 points
33 comments3 min readEA link

Trendlines in AIxBio evals

ljusten31 Oct 2024 0:09 UTC
40 points
2 comments11 min readEA link
(www.lennijusten.com)

AI safety starter pack

mariushobbhahn28 Mar 2022 16:05 UTC
128 points
13 comments6 min readEA link

How I failed to form views on AI safety

Ada-Maaria Hyvärinen17 Apr 2022 11:05 UTC
213 points
72 comments40 min readEA link

What new x- or s-risk field­build­ing or­gani­sa­tions would you like to see? An EOI form. (FBB #3)

gergo17 Feb 2025 12:37 UTC
32 points
3 comments2 min readEA link

Filling the Void: A Com­pre­hen­sive Database for AI Risks Materials

J.A.M.28 May 2024 16:03 UTC
10 points
1 comment4 min readEA link

Should AI X-Risk Wor­ri­ers Short the Mar­ket?

postlibertarian4 Nov 2024 16:16 UTC
14 points
1 comment6 min readEA link

My fa­vorite AI gov­er­nance re­search this year so far

Zach Stein-Perlman23 Jul 2023 22:00 UTC
81 points
4 comments7 min readEA link
(blog.aiimpacts.org)

Ap­ply to the Cavendish Labs Fel­low­ship (by 4/​15)

Derik K3 Apr 2023 23:06 UTC
35 points
2 comments1 min readEA link

Ex­ec­u­tive Direc­tor for AIS Brus­sels—Ex­pres­sion of interest

gergo19 Dec 2024 9:15 UTC
29 points
0 comments4 min readEA link

An­nounc­ing the CLR Foun­da­tions Course and CLR S-Risk Seminars

James Faville19 Nov 2024 1:18 UTC
52 points
2 comments3 min readEA link

Man­i­fund x AI Worldviews

Austin31 Mar 2023 15:32 UTC
32 points
2 comments2 min readEA link
(manifund.org)

Men­tor­ship in AGI Safety (MAGIS)

Joe Rogero23 May 2024 18:34 UTC
11 points
1 comment2 min readEA link

Large Lan­guage Models as Fi­du­cia­ries to Humans

johnjnay24 Jan 2023 19:53 UTC
25 points
0 comments34 min readEA link
(papers.ssrn.com)

[SEE NEW EDITS] No, *You* Need to Write Clearer

Nicholas Kross29 Apr 2023 5:04 UTC
71 points
8 comments5 min readEA link
(www.thinkingmuchbetter.com)

The Com­pendium, A full ar­gu­ment about ex­tinc­tion risk from AGI

adamShimi31 Oct 2024 12:02 UTC
9 points
1 comment2 min readEA link
(www.thecompendium.ai)

Par­tial Tran­script of Re­cent Se­nate Hear­ing Dis­cussing AI X-Risk

Daniel_Eth27 Jul 2023 9:16 UTC
150 points
2 comments22 min readEA link
(medium.com)

The Leeroy Jenk­ins prin­ci­ple: How faulty AI could guaran­tee “warn­ing shots”

titotal14 Jan 2024 15:03 UTC
56 points
2 comments21 min readEA link
(titotal.substack.com)

Please don’t crit­i­cize EAs who “sell out” to OpenAI and Anthropic

Eevee🔹5 Mar 2023 21:17 UTC
−4 points
21 comments2 min readEA link

In­ter­ac­tive AI Gover­nance Map

Hamish McDoodles12 Mar 2024 10:02 UTC
67 points
8 comments1 min readEA link

AIS Hun­gary is hiring a part-time Tech­ni­cal Lead! (Dead­line: Dec 31st)

gergo17 Dec 2024 14:08 UTC
9 points
0 comments2 min readEA link

Why AGI sys­tems will not be fa­nat­i­cal max­imisers (un­less trained by fa­nat­i­cal hu­mans)

titotal17 May 2023 11:58 UTC
43 points
3 comments15 min readEA link

AI stocks could crash. And that could have im­pli­ca­tions for AI safety

Benjamin_Todd9 May 2024 7:23 UTC
173 points
41 comments4 min readEA link
(benjamintodd.substack.com)

Solv­ing ad­ver­sar­ial at­tacks in com­puter vi­sion as a baby ver­sion of gen­eral AI alignment

Stanislav Fort31 Aug 2024 16:15 UTC
3 points
1 comment7 min readEA link

Brain-com­puter in­ter­faces and brain organoids in AI al­ign­ment?

freedomandutility15 Apr 2023 22:28 UTC
8 points
2 comments1 min readEA link

Dis­rupt­ing mal­i­cious uses of AI by state-af­fili­ated threat actors

Agustín Covarrubias 🔸14 Feb 2024 21:28 UTC
22 points
1 comment1 min readEA link
(openai.com)

The Cruel Trade-Off Between AI Mi­suse and AI X-risk Concerns

simeon_c22 Apr 2023 13:49 UTC
27 points
17 comments2 min readEA link

Sleeper Agents: Train­ing De­cep­tive LLMs that Per­sist Through Safety Training

evhub12 Jan 2024 19:51 UTC
65 points
0 comments3 min readEA link
(arxiv.org)

2024: a year of con­soli­da­tion for ORCG

JorgeTorresC18 Dec 2024 17:47 UTC
33 points
0 comments7 min readEA link
(www.orcg.info)

Agen­tic Align­ment: Nav­i­gat­ing be­tween Harm and Illegitimacy

LennardZ26 Nov 2024 21:27 UTC
2 points
1 comment9 min readEA link

The ‘Ne­glected Ap­proaches’ Ap­proach: AE Stu­dio’s Align­ment Agenda

Marc Carauleanu18 Dec 2023 21:13 UTC
21 points
0 comments12 min readEA link

[MLSN #8]: Mechanis­tic in­ter­pretabil­ity, us­ing law to in­form AI al­ign­ment, scal­ing laws for proxy gaming

TW12320 Feb 2023 16:06 UTC
25 points
0 comments4 min readEA link
(newsletter.mlsafety.org)

AI Risk US Pres­i­den­tal Candidate

Simon Berens11 Apr 2023 20:18 UTC
12 points
8 comments1 min readEA link

How to Give Com­ing AGI’s the Best Chance of Figur­ing Out Ethics for Us

Sean Sweeney23 May 2024 19:44 UTC
1 point
1 comment10 min readEA link

How to Ad­dress EA Dilem­mas – What is Miss­ing from EA Values?

alexis schoenlaub13 Oct 2024 9:33 UTC
7 points
4 comments6 min readEA link

[Linkpost] AI Align­ment, Ex­plained in 5 Points (up­dated)

Daniel_Eth18 Apr 2023 8:09 UTC
31 points
2 comments1 min readEA link
(medium.com)

Cri­tiques of promi­nent AI safety labs: Red­wood Research

Omega31 Mar 2023 8:58 UTC
339 points
91 comments20 min readEA link

De­tails on how an IAEA-style AI reg­u­la­tor would func­tion?

freedomandutility3 Jun 2023 12:03 UTC
12 points
5 comments1 min readEA link

Arkose: Or­ga­ni­za­tional Up­dates & Ways to Get Involved

Arkose1 Aug 2024 13:03 UTC
28 points
1 comment1 min readEA link

Suc­cess with­out dig­nity: a nearcast­ing story of avoid­ing catas­tro­phe by luck

Holden Karnofsky15 Mar 2023 20:17 UTC
113 points
3 comments15 min readEA link

Count­ing ar­gu­ments provide no ev­i­dence for AI doom

Nora Belrose27 Feb 2024 23:03 UTC
84 points
15 comments14 min readEA link

Prevent­ing AI Mi­suse: State of the Art Re­search and its Flaws

Madhav Malhotra23 Apr 2023 10:50 UTC
24 points
2 comments11 min readEA link

Cur­rent paths to im­pact in EU AI Policy (Feb ’24)

JOMG_Monnet12 Feb 2024 15:57 UTC
47 points
0 comments5 min readEA link

Is effec­tive al­tru­ism re­ally to blame for the OpenAI de­ba­cle?

Garrison23 Nov 2023 0:44 UTC
13 points
0 comments1 min readEA link
(garrisonlovely.substack.com)

Open-Source AI: A Reg­u­la­tory Review

Elliot Mckernon29 Apr 2024 10:10 UTC
14 points
1 comment8 min readEA link

[Question] How does AI progress af­fect other EA cause ar­eas?

Luis Mota Freitas9 Jun 2023 12:43 UTC
96 points
13 comments1 min readEA link

Stan­dard policy frame­works for AI governance

Nathan_Barnard30 Jan 2024 18:14 UTC
26 points
2 comments3 min readEA link

(How) Is tech­ni­cal AI Safety re­search be­ing eval­u­ated?

JohnSnow11 Jul 2023 9:37 UTC
27 points
1 comment1 min readEA link

Deep­Mind: Fron­tier Safety Framework

Zach Stein-Perlman17 May 2024 17:30 UTC
23 points
0 comments3 min readEA link
(deepmind.google)

[Question] Why haven’t we been de­stroyed by a power-seek­ing AGI from el­se­where in the uni­verse?

Jadon Schmitt22 Jul 2023 7:21 UTC
35 points
14 comments1 min readEA link

A great talk for AI noobs (ac­cord­ing to an AI noob)

Dov23 Apr 2023 5:32 UTC
8 points
0 comments1 min readEA link
(www.youtube.com)

[linkpost] “What Are Rea­son­able AI Fears?” by Robin Han­son, 2023-04-23

Arjun Panickssery14 Apr 2023 23:26 UTC
41 points
3 comments4 min readEA link
(quillette.com)

In DC, a new wave of AI lob­by­ists gains the up­per hand

Chris Leong13 May 2024 7:31 UTC
97 points
7 comments1 min readEA link
(www.politico.com)

Bring­ing about an­i­mal-in­clu­sive AI

Max Taylor18 Dec 2023 11:49 UTC
135 points
9 comments16 min readEA link

Rais­ing the voices that ac­tu­ally count

Kim Holder13 Jun 2023 19:21 UTC
2 points
3 comments2 min readEA link

Tech­nol­ogy is Power: Rais­ing Aware­ness Of Tech­nolog­i­cal Risks

Marc Wong9 Feb 2023 15:13 UTC
3 points
0 comments2 min readEA link

If you are too stressed, walk away from the front lines

Neil Warren12 Jun 2023 21:01 UTC
7 points
2 comments4 min readEA link

An­nounc­ing Hu­man-al­igned AI Sum­mer School

Jan_Kulveit22 May 2024 8:55 UTC
33 points
0 comments1 min readEA link
(humanaligned.ai)

Pos­si­ble OpenAI’s Q* break­through and Deep­Mind’s AlphaGo-type sys­tems plus LLMs

Burnydelic23 Nov 2023 7:02 UTC
13 points
4 comments2 min readEA link

Model­ling large-scale cy­ber at­tacks from ad­vanced AI sys­tems with Ad­vanced Per­sis­tent Threats

Iyngkarran Kumar2 Oct 2023 9:54 UTC
28 points
2 comments30 min readEA link

Help the UN de­sign global gov­er­nance struc­tures for AI

Joanna (Asia) Wiaterek12 Jan 2024 8:44 UTC
72 points
2 comments1 min readEA link

AI Safety Newslet­ter #2: ChaosGPT, Nat­u­ral Selec­tion, and AI Safety in the Media

Oliver Z18 Apr 2023 18:36 UTC
56 points
1 comment4 min readEA link
(newsletter.safe.ai)

AI-Rele­vant Reg­u­la­tion: IAEA

SWK15 Jul 2023 18:20 UTC
10 points
0 comments5 min readEA link

De­sign­ing Ar­tifi­cial Wis­dom: De­ci­sion Fore­cast­ing AI & Futarchy

Jordan Arel14 Jul 2024 5:10 UTC
5 points
1 comment6 min readEA link

This might be the last AI Safety Camp

Remmelt24 Jan 2024 9:29 UTC
87 points
32 comments1 min readEA link

Deep Deceptiveness

So8res21 Mar 2023 2:51 UTC
40 points
1 comment14 min readEA link

AISN#15: China and the US take ac­tion to reg­u­late AI, re­sults from a tour­na­ment fore­cast­ing AI risk, up­dates on xAI’s plan, and Meta re­leases its open-source and com­mer­cially available Llama 2

Center for AI Safety19 Jul 2023 1:40 UTC
5 points
0 comments6 min readEA link
(newsletter.safe.ai)

Rea­sons to have hope

Jordan Pieters 🔸20 Apr 2023 10:19 UTC
53 points
4 comments1 min readEA link

[Question] How in­de­pen­dent is the re­search com­ing out of OpenAI’s pre­pared­ness team?

Earthling10 Feb 2024 16:59 UTC
18 points
0 comments1 min readEA link

AISN #35: Lob­by­ing on AI Reg­u­la­tion Plus, New Models from OpenAI and Google, and Le­gal Regimes for Train­ing on Copy­righted Data

Center for AI Safety16 May 2024 14:26 UTC
14 points
0 comments6 min readEA link
(newsletter.safe.ai)

ChatGPT: to­wards AI subjectivity

KrisDAmato1 May 2024 10:13 UTC
3 points
0 comments1 min readEA link
(link.springer.com)

Ex­plor­ers in a vir­tual coun­try: Nav­i­gat­ing the knowl­edge land­scape of large lan­guage models

Alexander Saeri28 Mar 2023 21:32 UTC
17 points
1 comment6 min readEA link

Now THIS is fore­cast­ing: un­der­stand­ing Epoch’s Direct Approach

Elliot Mckernon4 May 2024 12:06 UTC
52 points
2 comments19 min readEA link

Paradigms and The­ory Choice in AI: Adap­tivity, Econ­omy and Control

particlemania28 Aug 2023 22:44 UTC
3 points
0 comments16 min readEA link

A fic­tional AI law laced w/​ al­ign­ment theory

Miguel17 Jul 2023 3:26 UTC
3 points
0 comments2 min readEA link

OpenAI’s Su­per­al­ign­ment team has opened Fast Grants

Yadav16 Dec 2023 15:41 UTC
31 points
2 comments1 min readEA link
(openai.com)

Co­op­er­a­tive AI: Three things that con­fused me as a be­gin­ner (and my cur­rent un­der­stand­ing)

C Tilli16 Apr 2024 7:06 UTC
58 points
10 comments6 min readEA link

A Viral Li­cense for AI Safety

IvanVendrov5 Jun 2021 2:00 UTC
30 points
6 comments5 min readEA link

UK Foun­da­tion Model Task Force—Ex­pres­sion of Interest

ojorgensen18 Jun 2023 9:40 UTC
111 points
3 comments1 min readEA link
(twitter.com)

AGI de­vel­op­ment role-play­ing game

rekahalasz11 Dec 2023 10:22 UTC
4 points
0 comments1 min readEA link

AI-Rele­vant Reg­u­la­tion: In­surance in Safety-Crit­i­cal Industries

SWK22 Jul 2023 17:52 UTC
5 points
0 comments6 min readEA link

[Linkpost] Longter­mists Are Push­ing a New Cold War With China

Radical Empath Ismam27 May 2023 6:53 UTC
38 points
16 comments1 min readEA link
(jacobin.com)

Catas­trophic Risks from Un­safe AI: Nav­i­gat­ing a Tightrope Sce­nario (Ben Garfinkel, EAG Lon­don 2023)

Alexander Saeri2 Jun 2023 9:59 UTC
19 points
1 comment10 min readEA link

[Linkpost] OpenAI lead­ers call for reg­u­la­tion of “su­per­in­tel­li­gence” to re­duce ex­is­ten­tial risk.

Lowe Lundin25 May 2023 14:14 UTC
5 points
0 comments1 min readEA link

Should you work at a lead­ing AI lab? (in­clud­ing in non-safety roles)

Benjamin Hilton25 Jul 2023 16:28 UTC
38 points
13 comments12 min readEA link

Draghi’s re­port sig­nal a less safety-fo­cused Euro­pean Union on AI

t6aguirre9 Sep 2024 18:39 UTC
17 points
3 comments1 min readEA link

Epi­sode: Austin vs Linch on OpenAI

Austin25 May 2024 16:15 UTC
21 points
2 comments44 min readEA link
(manifund.substack.com)

[Question] Why is learn­ing eco­nomics, psy­chol­ogy, so­ciol­ogy im­por­tant for pre­vent­ing AI risks?

jackchang1103 Nov 2023 21:48 UTC
3 points
0 comments1 min readEA link

AI-Rele­vant Reg­u­la­tion: CERN

SWK15 Jul 2023 18:40 UTC
12 points
0 comments6 min readEA link

What can we do now to pre­pare for AI sen­tience, in or­der to pro­tect them from the global scale of hu­man sadism?

rime18 Apr 2023 9:58 UTC
44 points
0 comments2 min readEA link

How to pur­sue a ca­reer in tech­ni­cal AI alignment

Charlie Rogers-Smith4 Jun 2022 21:36 UTC
268 points
9 comments39 min readEA link

Align­ment, Goals, & The Gut-Head Gap: A Re­view of Ngo. et al

Violet Hour11 May 2023 17:16 UTC
26 points
0 comments13 min readEA link

What does Bing Chat tell us about AI risk?

Holden Karnofsky28 Feb 2023 18:47 UTC
99 points
8 comments2 min readEA link
(www.cold-takes.com)

What’s new at FAR AI

AdamGleave4 Dec 2023 21:18 UTC
68 points
0 comments5 min readEA link
(far.ai)

What AI com­pa­nies can do to­day to help with the most im­por­tant century

Holden Karnofsky20 Feb 2023 17:40 UTC
104 points
8 comments11 min readEA link
(www.cold-takes.com)

Map­ping How Alli­ances, Ac­qui­si­tions, and An­titrust are Shap­ing the Fron­tier AI Industry

t6aguirre3 Jun 2024 9:43 UTC
24 points
1 comment2 min readEA link

LLMs won’t lead to AGI—Fran­cois Chollet

tobycrisford 🔸11 Jun 2024 20:19 UTC
38 points
23 comments1 min readEA link
(www.youtube.com)

AI In­ci­dent Re­port­ing: A Reg­u­la­tory Review

Deric Cheng11 Mar 2024 21:02 UTC
10 points
1 comment6 min readEA link

Dis­cussing AI-Hu­man Col­lab­o­ra­tion Through Fic­tion: The Story of Laika and GPT-∞

Laika27 Jul 2023 6:04 UTC
1 point
0 comments1 min readEA link

Misal­ign­ment Mu­seum opens in San Fran­cisco: ‘Sorry for kil­ling most of hu­man­ity’

Michael Huang4 Mar 2023 7:09 UTC
99 points
6 comments1 min readEA link
(www.misalignmentmuseum.com)

AI Wellbeing

Simon 11 Jul 2023 0:34 UTC
11 points
0 comments9 min readEA link

What we’re miss­ing: the case for struc­tural risks from AI

Justin Olive9 Nov 2023 5:52 UTC
31 points
3 comments6 min readEA link

‘The AI Dilemma: Growth vs Ex­is­ten­tial Risk’: An Ex­ten­sion for EAs and a Sum­mary for Non-economists

TomHoulden21 Apr 2024 16:28 UTC
66 points
1 comment16 min readEA link

[Question] Who is test­ing AI Safety pub­lic out­reach mes­sag­ing?

yanni kyriacos15 Apr 2023 0:53 UTC
20 points
2 comments1 min readEA link

Why Would AI “Aim” To Defeat Hu­man­ity?

Holden Karnofsky29 Nov 2022 18:59 UTC
24 points
0 comments32 min readEA link
(www.cold-takes.com)

I am un­able to get any AI safety re­lated fel­low­ships or in­tern­ships.

Aavishkar11 Mar 2024 5:00 UTC
5 points
6 comments1 min readEA link

What to think when a lan­guage model tells you it’s sentient

rgb20 Feb 2023 2:59 UTC
112 points
18 comments6 min readEA link

Biolog­i­cal su­per­in­tel­li­gence: a solu­tion to AI safety

Yarrow🔸4 Dec 2023 13:09 UTC
2 points
6 comments1 min readEA link

Re­search agenda: Su­per­vis­ing AIs im­prov­ing AIs

Quintin Pope29 Apr 2023 17:09 UTC
16 points
0 comments19 min readEA link

Safe AI and moral AI

William D'Alessandro1 Jun 2023 21:18 UTC
3 points
0 comments11 min readEA link

[Question] Should peo­ple get neu­ro­science phD to work in AI safety field?

jackchang1107 Mar 2023 16:21 UTC
9 points
11 comments1 min readEA link

Pes­simism about AI Safety

Max_He-Ho2 Apr 2023 7:57 UTC
5 points
0 comments25 min readEA link
(www.lesswrong.com)

Overview of in­tro­duc­tory re­sources in AI Governance

Lucie Philippon 🔸27 May 2024 16:22 UTC
26 points
1 comment6 min readEA link
(www.lesswrong.com)

AI, Cy­ber­se­cu­rity, and Malware: A Shal­low Re­port [Tech­ni­cal]

Madhav Malhotra31 Mar 2023 12:03 UTC
4 points
0 comments9 min readEA link

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down

EliezerYudkowsky9 Apr 2023 15:53 UTC
50 points
3 comments12 min readEA link

Hash­marks: Pri­vacy-Pre­serv­ing Bench­marks for High-Stakes AI Evaluation

Paul Bricman4 Dec 2023 7:41 UTC
4 points
0 comments16 min readEA link
(arxiv.org)

World and Mind in Ar­tifi­cial In­tel­li­gence: ar­gu­ments against the AI pause

Arturo Macias18 Apr 2023 14:35 UTC
6 points
3 comments5 min readEA link

Ob­ser­va­tions on the fund­ing land­scape of EA and AI safety

Vilhelm Skoglund2 Oct 2023 9:45 UTC
136 points
12 comments15 min readEA link

AI Align­ment in The New Yorker

Eleni_A17 May 2023 21:19 UTC
23 points
0 comments1 min readEA link
(www.newyorker.com)

Refram­ing the bur­den of proof: Com­pa­nies should prove that mod­els are safe (rather than ex­pect­ing au­di­tors to prove that mod­els are dan­ger­ous)

Akash25 Apr 2023 18:49 UTC
35 points
1 comment3 min readEA link
(childrenoficarus.substack.com)

Up­dates from Cam­paign for AI Safety

Jolyn Khoo19 Jul 2023 8:15 UTC
5 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

Towards ev­i­dence gap-maps for AI safety

dEAsign25 Jul 2023 8:13 UTC
6 points
1 comment2 min readEA link

List of pro­jects that seem im­pact­ful for AI Governance

JaimeRV14 Jan 2024 16:52 UTC
35 points
2 comments13 min readEA link

AI, Cy­ber­se­cu­rity, and Malware: A Shal­low Re­port [Gen­eral]

Madhav Malhotra31 Mar 2023 12:01 UTC
5 points
0 comments8 min readEA link

Ex­plor­ing Me­tac­u­lus’s AI Track Record

Peter Scoblic1 May 2023 21:02 UTC
52 points
5 comments5 min readEA link

14+ AI Safety Ad­vi­sors You Can Speak to – New AISafety.com Resource

Bryce Robertson21 Jan 2025 17:34 UTC
18 points
2 comments1 min readEA link

AI Progress: The Game Show

Alex Arnett21 Apr 2023 16:47 UTC
3 points
0 comments2 min readEA link

The new UK gov­ern­ment’s stance on AI safety

Elliot Mckernon31 Jul 2024 15:23 UTC
19 points
0 comments4 min readEA link

ChatGPT not so clever or not so ar­tifi­cial as hyped to be?

Haris Shekeris2 Mar 2023 6:16 UTC
−7 points
2 comments1 min readEA link

UK AI Bill Anal­y­sis & Opinion

CAISID5 Feb 2024 0:12 UTC
18 points
0 comments15 min readEA link

Sen­tinel min­utes for week #52/​2024

NunoSempere30 Dec 2024 18:25 UTC
61 points
0 comments6 min readEA link
(blog.sentinel-team.org)

Po­ten­tial em­ploy­ees have a unique lever to in­fluence the be­hav­iors of AI labs

oxalis18 Mar 2023 20:58 UTC
139 points
1 comment5 min readEA link

AI Safety Newslet­ter #8: Rogue AIs, how to screen for AI risks, and grants for re­search on demo­cratic gov­er­nance of AI

Center for AI Safety30 May 2023 11:44 UTC
16 points
3 comments6 min readEA link
(newsletter.safe.ai)

Orthog­o­nal: A new agent foun­da­tions al­ign­ment organization

Tamsin Leake19 Apr 2023 20:17 UTC
38 points
0 comments1 min readEA link
(orxl.org)

Help us find pain points in AI safety

Esben Kran12 Apr 2022 18:43 UTC
31 points
4 comments9 min readEA link

[Question] What am I miss­ing re. open-source LLM’s?

another-anon-do-gooder4 Dec 2023 4:48 UTC
1 point
2 comments1 min readEA link

AI com­pa­nies are not on track to se­cure model weights

Jeffrey Ladish18 Jul 2024 15:13 UTC
73 points
3 comments19 min readEA link

GovAI: Towards best prac­tices in AGI safety and gov­er­nance: A sur­vey of ex­pert opinion

Zach Stein-Perlman15 May 2023 1:42 UTC
68 points
4 comments1 min readEA link
(arxiv.org)

Boomerang—pro­to­col to dis­solve some com­mit­ment races

Filip Sondej30 May 2023 16:24 UTC
20 points
0 comments8 min readEA link
(www.lesswrong.com)

Prim­i­tive Global Dis­course Frame­work, Con­sti­tu­tional AI us­ing le­gal frame­works, and Mono­cul­ture—A loss of con­trol over the role of AGI in society

broptross1 Jun 2023 5:12 UTC
2 points
0 comments12 min readEA link

The Risks of AI-Gen­er­ated Con­tent on the EA Forum

WobblyPanda24 Jun 2023 5:33 UTC
−1 points
0 comments1 min readEA link

An In­tro­duc­tion to Cri­tiques of promi­nent AI safety organizations

Omega19 Jul 2023 6:53 UTC
87 points
2 comments5 min readEA link

There is only one goal or drive—only self-per­pet­u­a­tion counts

freest one13 Jun 2023 1:37 UTC
2 points
4 comments8 min readEA link

Bi­den-Har­ris Ad­minis­tra­tion An­nounces First-Ever Con­sor­tium Ded­i­cated to AI Safety

ben.smith9 Feb 2024 6:40 UTC
15 points
1 comment1 min readEA link
(www.nist.gov)

Pay to get AI safety info from be­hind NDA wall?

louisbarclay5 Jun 2024 10:19 UTC
2 points
2 comments1 min readEA link

Pillars to Convergence

Phlobton1 Apr 2023 13:04 UTC
1 point
0 comments8 min readEA link

Is fear pro­duc­tive when com­mu­ni­cat­ing AI x-risk? [Study re­sults]

Johanna Roniger22 Jan 2024 5:38 UTC
73 points
10 comments5 min readEA link

Does AI risk “other” the AIs?

Joe_Carlsmith9 Jan 2024 17:51 UTC
23 points
3 comments8 min readEA link

We are fight­ing a shared bat­tle (a call for a differ­ent ap­proach to AI Strat­egy)

GideonF16 Mar 2023 14:37 UTC
59 points
11 comments15 min readEA link

Paper Sum­mary: The Effec­tive­ness of AI Ex­is­ten­tial Risk Com­mu­ni­ca­tion to the Amer­i­can and Dutch Public

Otto9 Mar 2023 10:40 UTC
97 points
11 comments4 min readEA link

An ar­gu­ment for ac­cel­er­at­ing in­ter­na­tional AI gov­er­nance re­search (part 1)

MattThinks16 Aug 2023 5:40 UTC
10 points
0 comments3 min readEA link

Up­dates from Cam­paign for AI Safety

Jolyn Khoo30 Aug 2023 5:36 UTC
7 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

Epoch is hiring a Product and Data Vi­su­al­iza­tion Designer

merilalama25 Nov 2023 0:14 UTC
21 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

[CFP] NeurIPS work­shop: AI meets Mo­ral Philos­o­phy and Mo­ral Psychology

jaredlcm4 Sep 2023 6:21 UTC
10 points
1 comment4 min readEA link

The case for more am­bi­tious lan­guage model evals

Jozdien30 Jan 2024 9:24 UTC
7 points
0 comments5 min readEA link

Non-triv­ial Fel­low­ship Pro­ject: Towards a Unified Danger­ous Ca­pa­bil­ities Benchmark

Jord 4 Mar 2024 9:24 UTC
2 points
1 comment9 min readEA link

My Ob­jec­tions to “We’re All Gonna Die with Eliezer Yud­kowsky”

Quintin Pope21 Mar 2023 1:23 UTC
166 points
21 comments39 min readEA link

“Risk Aware­ness Mo­ments” (Rams): A con­cept for think­ing about AI gov­er­nance interventions

oeg14 Apr 2023 17:40 UTC
53 points
0 comments9 min readEA link

AI Policy In­sights from the AIMS Survey

Janet Pauketat22 Feb 2024 19:17 UTC
10 points
1 comment18 min readEA link
(www.sentienceinstitute.org)

Claude 3 claims it’s con­scious, doesn’t want to die or be modified

MikhailSamin4 Mar 2024 23:05 UTC
8 points
3 comments14 min readEA link

[Question] Pre­dic­tions for fu­ture AI gov­er­nance?

jackchang1102 Apr 2023 16:43 UTC
4 points
1 comment1 min readEA link

What can su­per­in­tel­li­gent ANI tell us about su­per­in­tel­li­gent AGI?

Ted Sanders12 Jun 2023 6:32 UTC
81 points
20 comments5 min readEA link

The ba­sic rea­sons I ex­pect AGI ruin

RobBensinger18 Apr 2023 3:37 UTC
58 points
13 comments14 min readEA link

World’s first ma­jor law for ar­tifi­cial in­tel­li­gence gets fi­nal EU green light

Dane Valerie24 May 2024 14:57 UTC
3 points
1 comment2 min readEA link
(www.cnbc.com)

A note of cau­tion on be­liev­ing things on a gut level

Nathan_Barnard9 May 2023 12:20 UTC
41 points
5 comments2 min readEA link

Oc­to­ber 2022 AI Risk Com­mu­nity Sur­vey Results

Froolow24 May 2023 10:37 UTC
19 points
0 comments7 min readEA link

You don’t need to be a ge­nius to be in AI safety research

Claire Short10 May 2023 22:23 UTC
28 points
4 comments6 min readEA link

An even deeper atheism

Joe_Carlsmith11 Jan 2024 17:28 UTC
26 points
2 comments15 min readEA link

Ap­ply to MATS 7.0!

Ryan Kidd21 Sep 2024 0:23 UTC
27 points
0 comments5 min readEA link

Speedrun: AI Align­ment Prizes

joe9 Feb 2023 11:55 UTC
27 points
0 comments18 min readEA link

How can OSINT be used for the en­force­ment of the EU AI Act?

Kristina7 Jun 2024 11:07 UTC
8 points
1 comment1 min readEA link

The Game of Dominance

Karl von Wendt27 Aug 2023 11:23 UTC
5 points
0 comments6 min readEA link

Begin­ner’s guide to re­duc­ing s-risks [link-post]

Center on Long-Term Risk17 Oct 2023 0:51 UTC
130 points
3 comments3 min readEA link
(longtermrisk.org)

Re­search Sum­mary: Fore­cast­ing with Large Lan­guage Models

Damien Laird2 Apr 2023 10:52 UTC
4 points
0 comments7 min readEA link
(damienlaird.substack.com)

The two-tiered society

Roman Leventov13 May 2024 7:53 UTC
14 points
5 comments3 min readEA link

An­nounc­ing New Begin­ner-friendly Book on AI Safety and Risk

Darren McKee25 Nov 2023 15:57 UTC
114 points
9 comments1 min readEA link

Claude 3.5 Sonnet

Zach Stein-Perlman20 Jun 2024 18:00 UTC
31 points
0 comments1 min readEA link
(www.anthropic.com)

Cam­bridge AI Safety Hub is look­ing for full- or part-time organisers

hannah15 Jul 2023 14:31 UTC
12 points
0 comments1 min readEA link

Up­dates from Cam­paign for AI Safety

Jolyn Khoo31 Oct 2023 5:46 UTC
14 points
1 comment2 min readEA link
(www.campaignforaisafety.org)

Weekly newslet­ter for AI safety events and train­ing programs

Bryce Robertson3 May 2024 0:37 UTC
15 points
0 comments1 min readEA link
(www.lesswrong.com)

Chain­ing the evil ge­nie: why “outer” AI safety is prob­a­bly easy

titotal30 Aug 2022 13:55 UTC
40 points
12 comments10 min readEA link

OpenAI board re­ceived let­ter warn­ing of pow­er­ful AI

JordanStone23 Nov 2023 0:16 UTC
26 points
2 comments1 min readEA link
(www.reuters.com)

Some ini­tial mus­ing on the poli­tics of longter­mist tra­jec­tory change

GideonF26 Jun 2025 7:16 UTC
6 points
0 comments12 min readEA link
(futerman.substack.com)

Danger­ous ca­pa­bil­ity tests should be harder

Luca Righetti 🔸20 Aug 2024 16:11 UTC
23 points
1 comment5 min readEA link
(www.planned-obsolescence.org)

Aus­trali­ans are con­cerned about AI risks and ex­pect strong gov­ern­ment action

Alexander Saeri8 Mar 2024 6:39 UTC
38 points
12 comments5 min readEA link
(aigovernance.org.au)

AGI ris­ing: why we are in a new era of acute risk and in­creas­ing pub­lic aware­ness, and what to do now

Greg_Colbourn ⏸️ 2 May 2023 10:17 UTC
68 points
35 comments13 min readEA link

[Question] Would a su­per-in­tel­li­gent AI nec­es­sar­ily sup­port its own ex­is­tence?

Porque?25 Jun 2023 10:39 UTC
8 points
2 comments2 min readEA link

Tort Law Can Play an Im­por­tant Role in Miti­gat­ing AI Risk

Gabriel Weil12 Feb 2024 17:11 UTC
99 points
6 comments5 min readEA link

Ex­is­ten­tial risk x Crypto: An un­con­fer­ence at Zuzalu

Yesh11 Apr 2023 13:31 UTC
6 points
0 comments1 min readEA link

Poster Ses­sion on AI Safety

Neil Crawford12 Nov 2022 3:50 UTC
8 points
0 comments4 min readEA link

MIRI 2024 Mis­sion and Strat­egy Update

Malo5 Jan 2024 1:10 UTC
154 points
38 comments8 min readEA link

Fix­ing In­sider Threats in the AI Sup­ply Chain

Madhav Malhotra7 Oct 2023 10:49 UTC
9 points
2 comments5 min readEA link

New Ar­tifi­cial In­tel­li­gence quiz: can you beat ChatGPT?

AndreFerretti3 Mar 2023 15:46 UTC
29 points
3 comments1 min readEA link

Trans­for­ma­tive AI and Com­pute—Read­ing List

Frederik Berg4 Sep 2023 6:21 UTC
24 points
0 comments1 min readEA link
(docs.google.com)

Ap­pli­ca­tions Open: Pivotal 2025 Q3 Re­search Fellowship

Tobias Häberli18 Mar 2025 13:25 UTC
20 points
0 comments2 min readEA link

UNGA Re­s­olu­tion on AI: 5 Key Take­aways Look­ing to Fu­ture Policy

Heramb Podar24 Mar 2024 12:03 UTC
17 points
1 comment3 min readEA link

Risk of AI de­cel­er­a­tion.

Micah Zoltu18 Apr 2023 11:19 UTC
9 points
14 comments3 min readEA link

Miti­gat­ing ex­treme AI risks amid rapid progress [Linkpost]

Akash21 May 2024 20:04 UTC
36 points
1 comment4 min readEA link

[Closed] MIT Fu­tureTech are hiring for a Head of Oper­a­tions role

PeterSlattery2 Oct 2024 16:51 UTC
8 points
0 comments4 min readEA link

Nav­i­gat­ing the Open-Source AI Land­scape: Data, Fund­ing, and Safety

AndreFerretti12 Apr 2023 10:30 UTC
23 points
3 comments10 min readEA link

[Linkpost] Be­ware the Squir­rel by Ver­ity Harding

Earthling3 Sep 2023 21:04 UTC
1 point
1 comment2 min readEA link
(samf.substack.com)

[Question] Know a grad stu­dent study­ing AI’s eco­nomic im­pacts?

Madhav Malhotra5 Jul 2023 0:07 UTC
7 points
0 comments1 min readEA link

[Question] Do you worry about to­tal­i­tar­ian regimes us­ing AI Align­ment tech­nol­ogy to cre­ate AGI that sub­scribe to their val­ues?

diodio_yang28 Feb 2023 18:12 UTC
25 points
12 comments2 min readEA link

AI Safety Ar­gu­ments: An In­ter­ac­tive Guide

Lukas Trötzmüller🔸1 Feb 2023 19:21 UTC
32 points
5 comments3 min readEA link

Assess­ment of AI safety agen­das: think about the down­side risk

Roman Leventov19 Dec 2023 9:02 UTC
6 points
0 comments1 min readEA link

[Linkpost] Scott Alexan­der re­acts to OpenAI’s lat­est post

Akash11 Mar 2023 22:24 UTC
105 points
4 comments5 min readEA link
(astralcodexten.substack.com)

An economist’s per­spec­tive on AI safety

David Stinson7 Jun 2024 7:55 UTC
7 points
1 comment9 min readEA link

Neu­ron­pe­dia—AI Safety Game

johnnylin16 Oct 2023 9:35 UTC
9 points
2 comments4 min readEA link
(neuronpedia.org)

My Proven AI Safety Ex­pla­na­tion (as a com­put­ing stu­dent)

Mica White6 Feb 2024 3:58 UTC
8 points
4 comments6 min readEA link

Ap­ply to Aether—In­de­pen­dent LLM Agent Safety Re­search Group

RohanS21 Aug 2024 9:40 UTC
47 points
13 comments8 min readEA link

Liter­a­ture re­view of TAI timelines

Jaime Sevilla27 Jan 2023 20:36 UTC
148 points
10 comments2 min readEA link
(epochai.org)

AI and Work: Sum­maris­ing a New Liter­a­ture Review

cpeppiatt15 Jul 2024 10:27 UTC
13 points
0 comments2 min readEA link
(arxiv.org)

Mud­dling Along Is More Likely Than Dystopia

Jeffrey Heninger21 Oct 2023 9:30 UTC
87 points
3 comments8 min readEA link
(blog.aiimpacts.org)

Digi­tal peo­ple could make AI safer

GMcGowan10 Jun 2022 15:29 UTC
25 points
15 comments4 min readEA link
(www.mindlessalgorithm.com)

Sum­mary of Si­tu­a­tional Aware­ness—The Decade Ahead

OscarD🔸8 Jun 2024 11:29 UTC
143 points
5 comments18 min readEA link

What do XPT fore­casts tell us about AI risk?

Forecasting Research Institute19 Jul 2023 7:43 UTC
97 points
21 comments14 min readEA link

We Should Talk About This More. Epistemic World Col­lapse as Im­mi­nent Safety Risk of Gen­er­a­tive AI.

Jörg Weiß16 Nov 2023 8:34 UTC
4 points
0 comments29 min readEA link

Au­to­mated Par­li­a­ments — A Solu­tion to De­ci­sion Uncer­tainty and Misal­ign­ment in Lan­guage Models

Shak Ragoler2 Oct 2023 9:47 UTC
9 points
0 comments17 min readEA link

Learn­ing so­cietal val­ues from law as part of an AGI al­ign­ment strategy

johnjnay21 Oct 2022 2:03 UTC
20 points
1 comment24 min readEA link

Join the AI Eval­u­a­tion Tasks Bounty Hackathon

Esben Kran18 Mar 2024 8:15 UTC
20 points
0 comments4 min readEA link

Cri­tiques of promi­nent AI safety labs: Conjecture

Omega12 Jun 2023 5:52 UTC
150 points
83 comments32 min readEA link

How evals might (or might not) pre­vent catas­trophic risks from AI

Akash7 Feb 2023 20:16 UTC
28 points
0 comments9 min readEA link

MIT Fu­tureTech are hiring for an Oper­a­tions and Pro­ject Man­age­ment role.

PeterSlattery17 May 2024 1:29 UTC
12 points
0 comments3 min readEA link

A sim­ple way of ex­ploit­ing AI’s com­ing eco­nomic im­pact may be highly-impactful

kuira16 Jul 2023 10:30 UTC
5 points
0 comments2 min readEA link
(www.lesswrong.com)

Ap­ply to the Cam­bridge ML for Align­ment Boot­camp (CaMLAB) [26 March − 8 April]

hannah9 Feb 2023 16:32 UTC
62 points
1 comment5 min readEA link

Have your say on the fu­ture of AI reg­u­la­tion: Dead­line ap­proach­ing for your feed­back on UN High-Level Ad­vi­sory Body on AI In­terim Re­port ‘Govern­ing AI for Hu­man­ity’

Deborah W.A. Foulkes29 Mar 2024 6:37 UTC
17 points
1 comment1 min readEA link

How ma­jor gov­ern­ments can help with the most im­por­tant century

Holden Karnofsky24 Feb 2023 19:37 UTC
56 points
4 comments4 min readEA link
(www.cold-takes.com)

The cur­rent al­ign­ment plan, and how we might im­prove it | EAG Bay Area 23

Buck7 Jun 2023 21:03 UTC
66 points
0 comments33 min readEA link

(Even) More Early-Ca­reer EAs Should Try AI Safety Tech­ni­cal Research

tlevin30 Jun 2022 21:14 UTC
86 points
40 comments11 min readEA link

“Pivotal Act” In­ten­tions: Nega­tive Con­se­quences and Fal­la­cious Arguments

Andrew Critch19 Apr 2022 20:24 UTC
80 points
10 comments7 min readEA link

The AI Endgame: A counterfactual to AI alignment by an AI Safety newcomer

Andreas P · 1 Dec 2023 5:49 UTC
2 points
5 comments · 3 min read · EA link

The Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

Miguel · 19 Jun 2023 3:23 UTC
4 points
0 comments · 7 min read · EA link

Diminishing Returns in Machine Learning Part 1: Hardware Development and the Physical Frontier

Brian Chau · 27 May 2023 12:39 UTC
16 points
3 comments · 12 min read · EA link
(www.fromthenew.world)

Unions for AI safety?

dEAsign · 24 Sep 2023 0:13 UTC
7 points
12 comments · 2 min read · EA link

How quickly AI could transform the world (Tom Davidson on The 80,000 Hours Podcast)

80000_Hours · 8 May 2023 13:23 UTC
82 points
3 comments · 17 min read · EA link

How we could stumble into AI catastrophe

Holden Karnofsky · 16 Jan 2023 14:52 UTC
83 points
0 comments · 31 min read · EA link
(www.cold-takes.com)

PhD Position: AI Interpretability in Berlin, Germany

Martian Moonshine · 22 Apr 2023 18:57 UTC
24 points
0 comments · 1 min read · EA link
(stephanw.net)

Podcast: Interview series featuring Dr. Peter Park

Jacob-Haimes · 26 Mar 2024 0:35 UTC
1 point
0 comments · 2 min read · EA link
(into-ai-safety.github.io)

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

Garrison · 10 Feb 2024 19:52 UTC
286 points
20 comments · 3 min read · EA link
(garrisonlovely.substack.com)

The standard case for delaying AI appears to rest on non-utilitarian assumptions

Matthew_Barnett · 11 Feb 2025 4:04 UTC
16 points
57 comments · 10 min read · EA link

Risk-averse Batch Active Inverse Reward Design

Panagiotis Liampas · 7 Oct 2023 8:56 UTC
11 points
0 comments · 15 min read · EA link

Worrisome misunderstanding of the core issues with AI transition

Roman Leventov · 18 Jan 2024 10:05 UTC
4 points
3 comments · 4 min read · EA link

Announcing Athena—Women in AI Alignment Research

Claire Short · 7 Nov 2023 22:02 UTC
180 points
28 comments · 3 min read · EA link

Status Quo Engines—AI essay

Ilana_Goldowitz_Jimenez · 28 May 2023 14:33 UTC
1 point
1 comment · 15 min read · EA link

Prospects for AI safety agreements between countries

oeg · 14 Apr 2023 17:41 UTC
104 points
3 comments · 22 min read · EA link

“The Race to the End of Humanity” – Structural Uncertainty Analysis in AI Risk Models

Froolow · 19 May 2023 12:03 UTC
48 points
4 comments · 21 min read · EA link

List of AI safety newsletters and other resources

Lizka · 1 May 2023 17:24 UTC
49 points
5 comments · 4 min read · EA link

A summary of current work in AI governance

constructive · 17 Jun 2023 16:58 UTC
89 points
4 comments · 11 min read · EA link

[US] NTIA: AI Accountability Policy Request for Comment

Kyle J. Lucchese 🔸 · 13 Apr 2023 16:12 UTC
47 points
4 comments · 1 min read · EA link
(ntia.gov)

It’s not obvious that getting dangerous AI later is better

Aaron_Scher · 23 Sep 2023 5:35 UTC
23 points
9 comments · 16 min read · EA link

AI Safety Camp 2024

Linda Linsefors · 18 Nov 2023 10:37 UTC
21 points
1 comment · 4 min read · EA link
(aisafety.camp)

Discussion about AI Safety funding (FB transcript)

Akash · 30 Apr 2023 19:05 UTC
104 points
10 comments · 6 min read · EA link

Cybersecurity of Frontier AI Models: A Regulatory Review

Deric Cheng · 25 Apr 2024 14:51 UTC
9 points
1 comment · 8 min read · EA link

A compute-based framework for thinking about the future of AI

Matthew_Barnett · 31 May 2023 22:00 UTC
96 points
36 comments · 19 min read · EA link

You Can’t Prove Aliens Aren’t On Their Way To Destroy The Earth (A Comprehensive Takedown Of The Doomer View Of AI)

Murphy · 7 Apr 2023 13:37 UTC
−31 points
7 comments · 9 min read · EA link

There are no coherence theorems

EJT · 20 Feb 2023 21:52 UTC
108 points
49 comments · 19 min read · EA link

AI safety and consciousness research: A brainstorm

Daniel_Friedrich · 15 Mar 2023 14:33 UTC
11 points
1 comment · 9 min read · EA link

Counterarguments to the basic AI risk case

Katja_Grace · 14 Oct 2022 20:30 UTC
286 points
23 comments · 34 min read · EA link

Aligning the Aligners: Ensuring Aligned AI acts for the common good of all mankind

timunderwood · 16 Jan 2023 11:13 UTC
40 points
2 comments · 4 min read · EA link

Reza Negarestani’s Intelligence & Spirit

ukc10014 · 27 Jun 2024 18:17 UTC
7 points
1 comment · 4 min read · EA link

MATS Summer 2023 Retrospective

utilistrutil · 2 Dec 2023 0:12 UTC
28 points
3 comments · 26 min read · EA link

[Question] Is working on AI to help democracy a good idea?

WillPearson · 17 Feb 2024 23:15 UTC
5 points
3 comments · 1 min read · EA link

“The Universe of Minds”—call for reviewers (Seeds of Science)

rogersbacon1 · 25 Jul 2023 16:55 UTC
4 points
0 comments · 1 min read · EA link

Advocating for Public Ownership of Future AGI: Preserving Humanity’s Collective Heritage

George_A (Digital Intelligence Rights Initiative) · 14 Jul 2023 16:01 UTC
−10 points
2 comments · 4 min read · EA link

Announcing: Mechanism Design for AI Safety—Reading Group

Rubi J. Hudson · 9 Aug 2022 4:25 UTC
36 points
1 comment · 4 min read · EA link

I designed an AI safety course (for a philosophy department)

Eleni_A · 23 Sep 2023 21:56 UTC
27 points
3 comments · 2 min read · EA link

[Question] Would an Anthropic/OpenAI merger be good for AI safety?

M · 22 Nov 2023 20:21 UTC
6 points
1 comment · 1 min read · EA link

Gaia Network: An Illustrated Primer

Roman Leventov · 26 Jan 2024 11:55 UTC
4 points
4 comments · 15 min read · EA link

A Roundtable for Safe AI (RSAI)?

Lara_TH · 9 Mar 2023 12:11 UTC
9 points
0 comments · 4 min read · EA link

AI Risk and Survivorship Bias—How Andreessen and LeCun got it wrong

stepanlos · 14 Jul 2023 17:10 UTC
5 points
1 comment · 6 min read · EA link

The Bar for Contributing to AI Safety is Lower than You Think

Chris Leong · 17 Aug 2024 10:52 UTC
14 points
5 comments · 2 min read · EA link

My Feedback to the UN Advisory Body on AI

Heramb Podar · 4 Apr 2024 23:39 UTC
7 points
1 comment · 4 min read · EA link

Aim for conditional pauses

AnonResearcherMajorAILab · 25 Sep 2023 1:05 UTC
100 points
42 comments · 12 min read · EA link

Ask AI companies about what they are doing for AI safety?

mic · 8 Mar 2022 21:54 UTC
44 points
1 comment · 2 min read · EA link

Conscious AI & Public Perception: Four futures

nicoleta-k · 3 Jul 2024 23:06 UTC
12 points
1 comment · 16 min read · EA link

Thoughts on the AI Safety Summit company policy requests and responses

So8res · 31 Oct 2023 23:54 UTC
42 points
3 comments · 10 min read · EA link

Decomposing alignment to take advantage of paradigms

Christopher King · 4 Jun 2023 14:26 UTC
2 points
0 comments · 4 min read · EA link

Risk Alignment in Agentic AI Systems

Hayley Clatterbuck · 1 Oct 2024 22:51 UTC
32 points
1 comment · 3 min read · EA link
(static1.squarespace.com)

AI policy & governance in Australia: notes from an initial discussion

Alexander Saeri · 15 May 2023 0:00 UTC
31 points
1 comment · 3 min read · EA link

The case for AGI by 2030

Benjamin_Todd · 6 Apr 2025 12:26 UTC
96 points
33 comments · 31 min read · EA link
(80000hours.org)

Without a trajectory change, the development of AGI is likely to go badly

Max H · 30 May 2023 0:21 UTC
1 point
0 comments · 13 min read · EA link

Help us seed AI Safety Brussels

gergo · 7 Aug 2024 6:17 UTC
50 points
4 comments · 3 min read · EA link

[Question] Could someone help me understand why it’s so difficult to solve the alignment problem?

Jadon Schmitt · 22 Jul 2023 4:39 UTC
35 points
21 comments · 1 min read · EA link

Report: Evaluating an AI Chip Registration Policy

Deric Cheng · 12 Apr 2024 4:40 UTC
15 points
0 comments · 5 min read · EA link
(www.convergenceanalysis.org)

Five neglected work areas that could reduce AI risk

Aaron_Scher · 24 Sep 2023 2:09 UTC
22 points
0 comments · 9 min read · EA link

AI Safety & Risk Dinner w/ Entrepreneur First CEO & ARIA Chair, Matt Clifford in New York

SimonPastor · 28 Nov 2023 19:45 UTC
2 points
0 comments · 1 min read · EA link

News: Spanish AI image outcry + US AI workforce “regulation”

Benevolent_Rain · 26 Sep 2023 7:43 UTC
9 points
0 comments · 1 min read · EA link

UK Government announces £100 million in funding for Foundation Model Taskforce.

Jordan Pieters 🔸 · 25 Apr 2023 11:29 UTC
10 points
1 comment · 1 min read · EA link
(www.gov.uk)

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

TW123 · 9 May 2022 17:02 UTC
68 points
0 comments · 6 min read · EA link

Assessing the Dangerousness of Malevolent Actors in AGI Governance: A Preliminary Exploration

Callum Hinchcliffe · 14 Oct 2023 21:18 UTC
28 points
4 comments · 9 min read · EA link

Measuring artificial intelligence on human benchmarks is naive

Ward A · 11 Apr 2023 11:28 UTC
9 points
2 comments · 1 min read · EA link

Updates from Campaign for AI Safety

Jolyn Khoo · 27 Sep 2023 2:44 UTC
16 points
0 comments · 2 min read · EA link
(www.campaignforaisafety.org)

Key takeaways from our EA and alignment research surveys

Cameron Berg · 4 May 2024 15:51 UTC
64 points
22 comments · 21 min read · EA link

AGI Safety Needs People With All Skillsets!

Severin · 25 Jul 2022 13:30 UTC
39 points
7 comments · 2 min read · EA link

[Linkpost] A Narrow Path—How to Secure our Future

MathiasKB🔸 · 2 Oct 2024 22:50 UTC
68 points
0 comments · 1 min read · EA link
(www.narrowpath.co)

Intrinsic limitations of GPT-4 and other large language models, and why I’m not (very) worried about GPT-n

James Fodor · 3 Jun 2023 13:09 UTC
28 points
3 comments · 11 min read · EA link

Dr Altman or: How I Learned to Stop Worrying and Love the Killer AI

Barak Gila · 11 Mar 2024 5:01 UTC
−7 points
0 comments · 2 min read · EA link

NIMBYism as an AI governance tool?

freedomandutility · 9 Jun 2024 6:40 UTC
10 points
2 comments · 1 min read · EA link

Fifteen Lawsuits against OpenAI

Remmelt · 9 Mar 2024 12:22 UTC
55 points
5 comments · 1 min read · EA link

Update on cause area focus working group

Bastian_Stern · 10 Aug 2023 1:21 UTC
140 points
18 comments · 5 min read · EA link

Just Pivot to AI: The secret is out

sapphire · 15 Mar 2023 6:25 UTC
0 points
4 comments · 2 min read · EA link

Talking publicly about AI risk

Jan_Kulveit · 24 Apr 2023 9:19 UTC
152 points
13 comments · 6 min read · EA link

Knowledge, Reasoning, and Superintelligence

Owen Cotton-Barratt · 26 Mar 2025 23:28 UTC
21 points
3 comments · 7 min read · EA link
(strangecities.substack.com)

AI Safety Camp, Virtual Edition 2023

Linda Linsefors · 6 Jan 2023 0:55 UTC
24 points
0 comments · 3 min read · EA link
(aisafety.camp)

Against most, but not all, AI risk analogies

Matthew_Barnett · 14 Jan 2024 19:13 UTC
43 points
9 comments · 7 min read · EA link

Australians for AI Safety Launches New Election Campaign — Here’s How You Can Help

Luke Freeman · 24 Mar 2025 4:26 UTC
54 points
5 comments · 3 min read · EA link

Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever’s Recent Claims

Garrison · 13 Nov 2024 17:00 UTC
115 points
7 comments · 8 min read · EA link
(garrisonlovely.substack.com)

Introducing the Pathfinder Fellowship: Funding and Mentorship for AI Safety Group Organizers

Agustín Covarrubias 🔸 · 22 Jul 2025 17:11 UTC
49 points
0 comments · 2 min read · EA link

We Did AGISF’s 8-week Course in 3 Days. Here’s How it Went

ag4000 · 24 Jul 2022 16:46 UTC
26 points
7 comments · 6 min read · EA link

AI governance and strategy: a list of research agendas and work that could be done.

Nathan_Barnard · 12 Mar 2024 11:21 UTC
33 points
4 comments · 17 min read · EA link

Stuart J. Russell on “should we press pause on AI?”

Kaleem · 18 Sep 2023 13:19 UTC
32 points
3 comments · 1 min read · EA link
(podcasts.apple.com)

OpenAI Alums, Nobel Laureates Urge Regulators to Save Company’s Nonprofit Structure

Garrison · 23 Apr 2025 23:01 UTC
61 points
2 comments · 8 min read · EA link
(garrisonlovely.substack.com)

Order Matters for Deceptive Alignment

DavidW · 15 Feb 2023 20:12 UTC
20 points
1 comment · 1 min read · EA link
(www.lesswrong.com)

[Question] How good/bad is the new Bing AI for the world?

Nathan Young · 17 Feb 2023 16:31 UTC
21 points
14 comments · 1 min read · EA link

[Question] Good depictions of speed mismatches between advanced AI systems and humans?

Geoffrey Miller · 15 Mar 2023 16:40 UTC
18 points
9 comments · 1 min read · EA link

AI Safety Communicators Meet-up

Vishakha Agrawal · 20 Jun 2025 12:26 UTC
2 points
0 comments · 1 min read · EA link

The argument for near-term human disempowerment through AI

Chris Leong · 16 Apr 2024 3:07 UTC
31 points
12 comments · 1 min read · EA link
(link.springer.com)

Who Aligns the Alignment Researchers?

ben.smith · 5 Mar 2023 23:22 UTC
23 points
4 comments · 11 min read · EA link

[Question] Why won’t nanotech kill us all?

Yarrow🔸 · 16 Dec 2023 23:27 UTC
20 points
5 comments · 1 min read · EA link

AI and integrity

Nathan Young · 29 May 2024 20:45 UTC
15 points
0 comments · 2 min read · EA link
(nathanpmyoung.substack.com)

AI alignment shouldn’t be conflated with AI moral achievement

Matthew_Barnett · 30 Dec 2023 3:08 UTC
116 points
15 comments · 5 min read · EA link

5 homegrown EA projects, seeking small donors

Austin · 28 Oct 2024 23:24 UTC
50 points
1 comment · 2 min read · EA link

Talk: AI safety fieldbuilding at MATS

Ryan Kidd · 23 Jun 2024 23:06 UTC
14 points
1 comment · 10 min read · EA link

Shallow review of live agendas in alignment & safety

technicalities · 27 Nov 2023 11:33 UTC
76 points
8 comments · 29 min read · EA link

Unveiling the American Public Opinion on AI Moratorium and Government Intervention: The Impact of Media Exposure

Otto · 8 May 2023 10:49 UTC
28 points
5 comments · 6 min read · EA link

[Question] What are some criticisms of PauseAI?

Eevee🔹 · 23 Nov 2024 17:49 UTC
53 points
71 comments · 1 min read · EA link

Ways to buy time

Akash · 12 Nov 2022 19:31 UTC
47 points
1 comment · 12 min read · EA link

New report on the state of AI safety in China

Geoffrey Miller · 27 Oct 2023 20:20 UTC
22 points
0 comments · 3 min read · EA link
(concordia-consulting.com)

[Question] Who should we give books on AI X-risk to?

yanni · 18 Dec 2023 23:57 UTC
13 points
1 comment · 1 min read · EA link

“AGI” considered harmful

Milan Griffes · 18 Apr 2025 20:19 UTC
10 points
1 comment · 1 min read · EA link

Pause For Thought: The AI Pause Debate (Astral Codex Ten)

David M · 5 Oct 2023 9:32 UTC
37 points
0 comments · 1 min read · EA link
(www.astralcodexten.com)

How I Formed My Own Views About AI Safety

Neel Nanda · 27 Feb 2022 18:52 UTC
134 points
13 comments · 14 min read · EA link
(www.neelnanda.io)

[Question] Why might AI be a x-risk? Succinct explanations please

Sanjay · 4 Apr 2023 12:46 UTC
20 points
9 comments · 1 min read · EA link

An overview of some promising work by junior alignment researchers

Akash · 26 Dec 2022 17:23 UTC
10 points
0 comments · 4 min read · EA link

Longtermism Fund: August 2023 Grants Report

Michael Townsend🔸 · 20 Aug 2023 5:34 UTC
81 points
3 comments · 5 min read · EA link

New survey: 46% of Americans are concerned about extinction from AI; 69% support a six-month pause in AI development

Akash · 5 Apr 2023 1:26 UTC
143 points
34 comments · 1 min read · EA link
(today.yougov.com)

Announcing Epoch’s newly expanded Parameters, Compute and Data Trends in Machine Learning database

Robi Rahman · 25 Oct 2023 3:03 UTC
38 points
1 comment · 1 min read · EA link
(epochai.org)

Upcoming speaker series on emerging tech, national security & US policy careers

kuhanj · 21 Jun 2023 4:49 UTC
42 points
0 comments · 2 min read · EA link

Comments on OpenAI’s “Planning for AGI and beyond”

So8res · 3 Mar 2023 23:01 UTC
115 points
7 comments · 13 min read · EA link

Next steps after AGISF at UMich

JakubK · 25 Jan 2023 20:57 UTC
18 points
1 comment · 5 min read · EA link
(docs.google.com)

[Question] Is There Actually a Standard or Convincing Response to David Thorstad’s Criticisms of the Value of X-Risk Reduction and of Longtermism?

David Mathers🔸 · 21 May 2025 11:58 UTC
110 points
21 comments · 2 min read · EA link

[Question] Evidence to prioritize or working on AI as the most impactful thing?

Vaipan · 22 Sep 2023 8:43 UTC
9 points
6 comments · 1 min read · EA link

The Importance of AI Alignment, explained in 5 points

Daniel_Eth · 11 Feb 2023 2:56 UTC
50 points
4 comments · 13 min read · EA link

Misnaming and Other Issues with OpenAI’s “Human Level” Superintelligence Hierarchy

Davidmanheim · 15 Jul 2024 5:50 UTC
14 points
0 comments · 3 min read · EA link

Why people want to work on AI safety (but don’t)

Emily Grundy · 24 Jan 2023 6:41 UTC
70 points
10 comments · 7 min read · EA link

How “AGI” could end up being many different specialized AI’s stitched together

titotal · 8 May 2023 12:32 UTC
31 points
2 comments · 9 min read · EA link

Please Don’t Win the AI Race

Picklehead · 2 Aug 2025 23:31 UTC
−4 points
0 comments · 6 min read · EA link

Transformative AI issues (not just misalignment): an overview

Holden Karnofsky · 6 Jan 2023 2:19 UTC
36 points
0 comments · 22 min read · EA link
(www.cold-takes.com)

My AI safety independent research engineer path so far

Artyom K · 25 Jul 2025 8:49 UTC
16 points
0 comments · 3 min read · EA link

We might get lucky with AGI warning shots. Let’s be ready!

tcelferact · 31 Mar 2023 21:37 UTC
22 points
2 comments · 2 min read · EA link

Rational Animations’ video about scalable oversight and sandwiching

Writer · 6 Jul 2025 14:00 UTC
14 points
1 comment · 9 min read · EA link
(youtu.be)

Paul Christiano on Dwarkesh Podcast

ESRogs · 3 Nov 2023 22:13 UTC
5 points
0 comments · 1 min read · EA link
(www.dwarkeshpatel.com)

Quick nudge to apply to the LTFF grant round (closing on Saturday)

calebp · 14 Feb 2025 15:19 UTC
57 points
7 comments · 1 min read · EA link

Apply to the 2024 PIBBSS Summer Research Fellowship

nora · 12 Jan 2024 4:06 UTC
37 points
1 comment · 2 min read · EA link

Questions about Conjecture’s CoEm proposal

Akash · 9 Mar 2023 19:32 UTC
19 points
0 comments · 2 min read · EA link

The “technology” bucket error

Holly Elmore ⏸️ 🔸 · 21 Sep 2023 0:59 UTC
33 points
10 comments · 4 min read · EA link
(open.substack.com)

EA Wins 2023

Shakeel Hashim · 31 Dec 2023 14:07 UTC
358 points
9 comments · 3 min read · EA link

Protest against Meta’s irreversible proliferation (Sept 29, San Francisco)

Holly Elmore ⏸️ 🔸 · 19 Sep 2023 23:40 UTC
114 points
32 comments · 1 min read · EA link

AI Safety Newsletter #39: Implications of a Trump Administration for AI Policy Plus, Safety Engineering

Center for AI Safety · 29 Jul 2024 17:48 UTC
6 points
0 comments · 6 min read · EA link
(newsletter.safe.ai)

CAIDP Statement on Lethal Autonomous Weapons Systems

Heramb Podar · 30 Nov 2024 18:00 UTC
7 points
0 comments · 1 min read · EA link
(www.linkedin.com)

A deep critique of AI 2027’s bad timeline models

titotal · 19 Jun 2025 13:35 UTC
277 points
28 comments · 40 min read · EA link
(titotal.substack.com)

“Can We Survive Technology?” by John von Neumann

Eli Rose · 13 Mar 2023 2:26 UTC
51 points
0 comments · 1 min read · EA link
(geosci.uchicago.edu)

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones · 4 Nov 2022 0:58 UTC
11 points
1 comment · 3 min read · EA link

Hypothetical grants that the Long-Term Future Fund narrowly rejected

calebp · 15 Nov 2023 19:39 UTC
95 points
12 comments · 6 min read · EA link

“Dangers of AI and the End of Human Civilization” Yudkowsky on Lex Fridman

𝕮𝖎𝖓𝖊𝖗𝖆 · 30 Mar 2023 15:44 UTC
28 points
0 comments · 1 min read · EA link
(www.youtube.com)

Scaling of AI training runs will slow down after GPT-5

Maxime Riché 🔸 · 26 Apr 2024 16:06 UTC
10 points
2 comments · 3 min read · EA link

Community Building for Graduate Students: A Targeted Approach

Neil Crawford · 29 Mar 2022 19:47 UTC
13 points
0 comments · 3 min read · EA link

Peter Eckersley (1979-2022)

technicalities · 3 Sep 2022 10:45 UTC
497 points
10 comments · 1 min read · EA link

AI Risk Management Framework | NIST

𝕮𝖎𝖓𝖊𝖗𝖆 · 26 Jan 2023 15:27 UTC
50 points
0 comments · 2 min read · EA link
(www.nist.gov)

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Andrea_Miotti · 24 Feb 2023 23:03 UTC
16 points
1 comment · 49 min read · EA link

Apply now to SPAR!

Agustín Covarrubias 🔸 · 19 Dec 2024 22:29 UTC
36 points
0 comments · 1 min read · EA link

Culture and Programming Retrospective: ERA Fellowship 2023

GideonF · 28 Sep 2023 16:45 UTC
16 points
0 comments · 10 min read · EA link

Situational awareness (Section 2.1 of “Scheming AIs”)

Joe_Carlsmith · 26 Nov 2023 23:00 UTC
12 points
1 comment · 6 min read · EA link

[Question] What predictions from theoretical AI Safety research have been confirmed by empirical work?

freedomandutility · 29 Dec 2024 8:19 UTC
43 points
10 comments · 1 min read · EA link

I bet Greg Colbourn 10 k€ that AI will not kill us all by the end of 2027

Vasco Grilo🔸 · 4 Jun 2024 16:37 UTC
195 points
64 comments · 2 min read · EA link

AI Safety Newsletter #40: California AI Legislation Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety?

Center for AI Safety · 21 Aug 2024 18:10 UTC
17 points
0 comments · 6 min read · EA link
(newsletter.safe.ai)

AISN #47: Reasoning Models

Center for AI Safety · 6 Feb 2025 18:44 UTC
8 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

ARC is hiring theoretical researchers

Jacob_Hilton · 12 Jun 2023 19:11 UTC
78 points
0 comments · 4 min read · EA link
(www.lesswrong.com)

Measuring AI-Driven Risk with Stock Prices (Susana Campos-Martins)

Global Priorities Institute · 12 Dec 2024 14:22 UTC
10 points
1 comment · 4 min read · EA link
(globalprioritiesinstitute.org)

[Crosspost] Some Very Important Things (That I Won’t Be Working On This Year)

Sarah Cheng · 10 Mar 2025 14:42 UTC
28 points
1 comment · 4 min read · EA link
(milesbrundage.substack.com)

Unjournal: Evaluations of “Artificial Intelligence and Economic Growth”, and new hosting space

david_reinstein · 17 Mar 2023 20:20 UTC
47 points
0 comments · 2 min read · EA link
(unjournal.pubpub.org)

Joe Hardie on Arcadia Impact’s projects (FBB #7)

gergo · 8 Jul 2025 13:22 UTC
17 points
3 comments · 15 min read · EA link

I don’t want to talk about ai

Kirsten · 22 May 2023 21:19 UTC
7 points
0 comments · 1 min read · EA link
(ealifestyles.substack.com)

P(doom|AGI) is high: why the default outcome of AGI is doom

Greg_Colbourn ⏸️ · 2 May 2023 10:40 UTC
13 points
28 comments · 3 min read · EA link

Safety Conscious Researchers should leave Anthropic

GideonF · 1 Apr 2025 10:12 UTC
57 points
3 comments · 5 min read · EA link

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare

trevor1 · 12 Nov 2023 18:24 UTC
5 points
0 comments · 10 min read · EA link

Four questions I ask AI safety researchers

Akash · 17 Jul 2022 17:25 UTC
30 points
3 comments · 1 min read · EA link

A tale of 2.5 orthogonality theses

Arepo · 1 May 2022 13:53 UTC
147 points
31 comments · 11 min read · EA link

[Question] Asking for online resources why AI now is near AGI

jackchang110 · 18 May 2023 0:04 UTC
6 points
4 comments · 1 min read · EA link

Consider attending the AI Security Forum ’24, a 1-day pre-DEFCON event

Charlie Rogers-Smith · 12 Jul 2024 23:01 UTC
23 points
0 comments · 1 min read · EA link

[Question] Suggested readings & videos for a new college course on ‘Psychology and AI’?

Geoffrey Miller · 11 Jan 2024 22:26 UTC
12 points
3 comments · 1 min read · EA link

Updates on the EA catastrophic risk landscape

Benjamin_Todd · 6 May 2024 4:52 UTC
195 points
46 comments · 2 min read · EA link

My model of how different AI risks fit together

poppinfresh · 31 Jan 2024 17:09 UTC
64 points
4 comments · 7 min read · EA link
(unfoldingatlas.substack.com)

AISN #48: Utility Engineering and EnigmaEval

Center for AI Safety · 18 Feb 2025 19:11 UTC
6 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

What OpenAI Told California’s Attorney General

Garrison · 17 May 2025 23:14 UTC
35 points
2 comments · 8 min read · EA link
(www.obsolete.pub)

Formalize the Hashiness Model of AGI Uncontainability

Remmelt · 9 Nov 2024 16:10 UTC
2 points
0 comments · 5 min read · EA link
(docs.google.com)

Play Regrantor: Move up to $250,000 to Your Top High-Impact Projects!

Dawn Drescher · 17 May 2023 16:51 UTC
58 points
2 comments · 2 min read · EA link
(impactmarkets.substack.com)

[Question] Do AI companies make their safety researchers sign a non-disparagement clause?

Ofer · 5 Sep 2022 13:40 UTC
73 points
3 comments · 1 min read · EA link

White House publishes framework for Nucleic Acid Screening

Agustín Covarrubias 🔸 · 30 Apr 2024 0:44 UTC
30 points
1 comment · 1 min read · EA link
(www.whitehouse.gov)

Apply to be a TA for TARA

yanni kyriacos · 20 Dec 2024 2:24 UTC
15 points
2 comments · 1 min read · EA link

[Question] If AI is in a bubble and the bubble bursts, what would you do?

Remmelt · 19 Aug 2024 10:56 UTC
28 points
10 comments · 1 min read · EA link

How LDT helps reduce the AI arms race

Tamsin Leake · 10 Dec 2023 16:21 UTC
8 points
1 comment · 4 min read · EA link
(carado.moe)

Part 3: A Proposed Approach for AI Safety Movement Building: Projects, Professions, Skills, and Ideas for the Future [long post][bounty for feedback]

PeterSlattery · 22 Mar 2023 0:54 UTC
22 points
8 comments · 32 min read · EA link

Aptitudes for AI governance work

Sam Clarke · 13 Jun 2023 13:54 UTC
68 points
0 comments · 7 min read · EA link

Pausing AI vs Degrowth in rich countries

Miquel Banchs-Piqué (prev. mikbp) · 23 Sep 2023 7:09 UTC
−2 points
53 comments · 1 min read · EA link

We Have Not Been Invited to the Future: e/acc and the Narrowness of the Way Ahead

Devin Kalish · 17 Jul 2024 22:15 UTC
10 points
1 comment · 20 min read · EA link
(www.thinkingmuchbetter.com)

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of “Scheming AIs”)

Joe_Carlsmith · 3 Dec 2023 18:32 UTC
6 points
1 comment · 15 min read · EA link

Excerpts from “Doing EA Better” on x-risk methodology

Eevee🔹 · 26 Jan 2023 1:04 UTC
22 points
5 comments · 6 min read · EA link
(forum.effectivealtruism.org)

AISN #55: Trump Administration Rescinds AI Diffusion Rule, Allows Chip Sales to Gulf States

Center for AI Safety · 20 May 2025 16:05 UTC
7 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

Exploring Metaculus’ community predictions

Vasco Grilo🔸 · 24 Mar 2023 7:59 UTC
95 points
17 comments · 10 min read · EA link

Agency Foundations Challenge: September 8th-24th, $10k Prizes

Catalin M · 30 Aug 2023 6:12 UTC
12 points
0 comments · 5 min read · EA link

Epoch AI alumni launch Mechanize to “automate the whole economy”

Henry Stanley 🔸 · 18 Apr 2025 10:12 UTC
103 points
52 comments · 1 min read · EA link

How metaphysical beliefs shape critical aspects of AI development

Jáchym Fibír · 26 Jun 2025 15:10 UTC
−9 points
0 comments · 8 min read · EA link
(www.phiand.ai)

AI takeoff and nuclear war

Owen Cotton-Barratt · 11 Jun 2024 19:33 UTC
72 points
5 comments · 11 min read · EA link
(strangecities.substack.com)

Four Predictions About OpenAI’s Plans To Retain Nonprofit Control

Garrison · 7 May 2025 15:48 UTC
15 points
2 comments · 5 min read · EA link
(www.obsolete.pub)

Notes on risk compensation

trammell · 12 May 2024 18:40 UTC
140 points
14 comments · 21 min read · EA link

Planes are still decades away from displacing most bird jobs

guzey · 25 Nov 2022 16:49 UTC
27 points
2 comments · 3 min read · EA link

AXRP: Store, Patreon, Video

DanielFilan · 7 Feb 2023 5:12 UTC
7 points
0 comments · 1 min read · EA link

AI policy ideas: Reading list

Zach Stein-Perlman · 17 Apr 2023 19:00 UTC
60 points
3 comments · 4 min read · EA link

The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate

Hauke Hillebrandt · 29 Oct 2023 8:38 UTC
12 points
0 comments · 1 min read · EA link

[Question] What should the EA/AI safety community change, in response to Sam Altman’s revealed priorities?

SiebeRozendal · 8 Mar 2024 12:35 UTC
30 points
16 comments · 1 min read · EA link

Max Tegmark’s new Time article on how we’re in a Don’t Look Up scenario [Linkpost]

Jonas Hallgren 🔸 · 25 Apr 2023 15:47 UTC
41 points
0 comments · 1 min read · EA link
(time.com)

De Dicto and De Se Reference Matters for Alignment

philgoetz · 3 Oct 2023 21:57 UTC
5 points
2 comments · 9 min read · EA link

RSPs are pauses done right

evhub · 14 Oct 2023 4:06 UTC
93 points
7 comments · 7 min read · EA link

Widening Overton Window—Open Thread

Prometheus · 31 Mar 2023 10:06 UTC
12 points
5 comments · 1 min read · EA link
(www.lesswrong.com)

[Question] Which stocks or ETFs should you invest in to take advantage of a possible AGI explosion, and why?

Eevee🔹 · 10 Apr 2023 17:55 UTC
19 points
16 comments · 1 min read · EA link

Making a conservative case for alignment

Larks · 17 Nov 2024 1:45 UTC
44 points
0 comments · 1 min read · EA link
(www.lesswrong.com)

Apocalypse insurance, and the hardline libertarian take on AI risk

So8res · 28 Nov 2023 2:09 UTC
21 points
0 comments · 7 min read · EA link

titotal on AI risk scepticism

Vasco Grilo🔸 · 30 May 2024 17:03 UTC
76 points
3 comments · 6 min read · EA link
(forum.effectivealtruism.org)

Announcing the London Initiative for Safe AI (LISA)

JamesFox · 5 Feb 2024 10:36 UTC
67 points
4 comments · 9 min read · EA link

The bullseye framework: My case against AI doom

titotal · 30 May 2023 11:52 UTC
71 points
15 comments · 17 min read · EA link

RP’s AI Governance & Strategy team—June 2023 interim overview

MichaelA🔸 · 22 Jun 2023 13:45 UTC
68 points
1 comment · 7 min read · EA link

Slopworld 2035: The dangers of mediocre AI

titotal · 14 Apr 2025 13:14 UTC
87 points
1 comment · 29 min read · EA link
(titotal.substack.com)

Virtual AI Safety Unconference (VAISU)

Nguyên · 20 Jun 2023 9:47 UTC
14 points
0 comments · 1 min read · EA link

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Andrew Critch · 14 Jun 2024 0:16 UTC
99 points
3 comments · 4 min read · EA link

The Polarity Problem [Draft]

Dan H · 23 May 2023 21:05 UTC
11 points
0 comments · 44 min read · EA link

[Question] Any tips on applying for EA funding?

Eevee🔹 · 22 Sep 2024 5:11 UTC
18 points
4 comments · 1 min read · EA link

How Rethink Priorities’ Research could inform your grantmaking

kierangreig🔸 · 4 Oct 2023 18:24 UTC
59 points
0 comments · 2 min read · EA link

The benefits and risks of optimism (about AI safety)

Karl von Wendt · 3 Dec 2023 12:45 UTC
3 points
5 comments · 5 min read · EA link

New AI safety funding source for people raising awareness about AI risk or advocating for a pause

Kat Woods 🔶 ⏸️ · 26 Jul 2025 12:25 UTC
16 points
6 comments · 1 min read · EA link

Polls on De/Accelerating AI

Denkenberger🔸 · 9 Aug 2025 2:01 UTC
28 points
14 comments · 2 min read · EA link

Anthropic is Quietly Backpedalling on its Safety Commitments

Garrison · 23 May 2025 2:26 UTC
100 points
7 comments · 5 min read · EA link
(www.obsolete.pub)

Yudkowsky on AGI risk on the Bankless podcast

RobBensinger · 13 Mar 2023 0:42 UTC
54 points
2 comments · 75 min read · EA link

Results of an informal survey on AI grantmaking

Scott Alexander · 21 Aug 2024 13:19 UTC
127 points
28 comments · 1 min read · EA link

Rethink Priorities’ 2023 Summary, 2024 Strategy, and Funding Gaps

kierangreig🔸 · 15 Nov 2023 20:56 UTC
86 points
7 comments · 3 min read · EA link

Ethical Roots of Chinese AI

Vasiliy Kondyrev · 5 Nov 2024 14:07 UTC
0 points
0 comments · 6 min read · EA link

GPTs are Predictors, not Imitators

EliezerYudkowsky · 8 Apr 2023 19:59 UTC
74 points
12 comments · 3 min read · EA link

Beren’s “Deconfusing Direct vs Amortised Optimisation”

𝕮𝖎𝖓𝖊𝖗𝖆 · 7 Apr 2023 8:57 UTC
9 points
0 comments · 3 min read · EA link

What I mean by “alignment is in large part about making cognition aimable at all”

So8res · 30 Jan 2023 15:22 UTC
57 points
3 comments · 2 min read · EA link

M&A in AI

Hauke Hillebrandt · 30 Oct 2023 17:43 UTC
9 points
1 comment · 6 min read · EA link

A Barebones Guide to Mechanistic Interpretability Prerequisites

Neel Nanda · 29 Nov 2022 18:43 UTC
54 points
1 comment · 3 min read · EA link
(neelnanda.io)

Two contrasting models of “intelligence” and future growth

Magnus Vinding · 24 Nov 2022 11:54 UTC
74 points
32 comments · 22 min read · EA link

[Question] What is the easiest/funnest way to build up a comprehensive understanding of AI and AI Safety?

Jordan Arel · 30 Apr 2024 18:39 UTC
14 points
0 comments · 1 min read · EA link

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman · 8 Aug 2023 21:46 UTC
12 points
0 comments · 23 min read · EA link
(www.aaronbergman.net)

Map of maps of interesting fields

Max Görlitz · 25 Jun 2023 14:00 UTC
55 points
6 comments · 1 min read · EA link
(glozematrix.substack.com)

What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund

Linch · 10 Aug 2023 20:11 UTC
176 points
22 comments · 8 min read · EA link

Misgeneralization as a misnomer

So8res · 6 Apr 2023 20:43 UTC
48 points
0 comments · 4 min read · EA link

MIT FutureTech are hiring a Product and Data Visualization Designer

PeterSlattery · 13 Nov 2024 14:41 UTC
9 points
0 comments · 4 min read · EA link

AI risk/reward: A simple model

Nathan Young · 4 May 2023 19:12 UTC
37 points
5 comments · 7 min read · EA link

Why *not* just send people to Bluedot (FBB#4)

gergo · 25 Mar 2025 10:47 UTC
27 points
13 comments · 12 min read · EA link

[Question] How to hedge investment portfolio against AI risk?

Timothy_Liptrot · 31 Jan 2023 8:04 UTC
9 points
0 comments · 1 min read · EA link

What’s Going on With OpenAI’s Messaging?

Ozzie Gooen · 21 May 2024 2:22 UTC
217 points
28 comments · 3 min read · EA link

Is AI Hitting a Wall or Moving Faster Than Ever?

Garrison · 9 Jan 2025 22:18 UTC
35 points
5 comments · 5 min read · EA link
(garrisonlovely.substack.com)

Brief thoughts on Data, Reporting, and Response for AI Risk Mitigation

Davidmanheim · 15 Jun 2023 7:53 UTC
18 points
3 comments · 8 min read · EA link

OpenAI’s o1 tried to avoid being shut down, and lied about it, in evals

Greg_Colbourn ⏸️ · 6 Dec 2024 15:25 UTC
23 points
9 comments · 1 min read · EA link
(www.transformernews.ai)

My article in The Nation — California’s AI Safety Bill Is a Mask-Off Moment for the Industry

Garrison · 15 Aug 2024 19:25 UTC
134 points
0 comments · 1 min read · EA link
(www.thenation.com)

[Question] Your Advice For a High School Student.

AhmedWez · 10 Jan 2025 21:26 UTC
7 points
5 comments · 1 min read · EA link

Lessons from the FDA for AI

Remmelt · 2 Aug 2024 0:52 UTC
6 points
2 comments · 1 min read · EA link
(ainowinstitute.org)

Introducing Alignment Stress-Testing at Anthropic

evhub · 12 Jan 2024 23:51 UTC
80 points
0 comments · 2 min read · EA link

Request for Information for a new US AI Action Plan (OSTP RFI)

Agustín Covarrubias 🔸 · 7 Feb 2025 20:22 UTC
19 points
2 comments · 2 min read · EA link
(www.federalregister.gov)

Silicon Valley’s Rabbit Hole Problem

Mandelbrot · 8 Oct 2023 12:25 UTC
34 points
44 comments · 11 min read · EA link
(medium.com)

Will AI Avoid Exploitation? (Adam Bales)

Global Priorities Institute · 13 Dec 2023 11:37 UTC
38 points
0 comments · 2 min read · EA link

The Overton Window widens: Examples of AI risk in the media

Akash · 23 Mar 2023 17:10 UTC
112 points
11 comments · 1 min read · EA link

Cruxes on US lead for some domestic AI regulation

Zach Stein-Perlman · 10 Sep 2023 18:00 UTC
20 points
6 comments · 2 min read · EA link

Eric Schmidt’s blueprint for US technology strategy

OscarD🔸 · 15 Oct 2024 19:54 UTC
29 points
4 comments · 9 min read · EA link

AI companies’ eval reports mostly don’t support their claims

Zach Stein-Perlman · 9 Jun 2025 13:00 UTC
50 points
2 comments · 4 min read · EA link

We don’t want to post again “This might be the last AI Safety Camp”

Remmelt · 21 Jan 2025 12:03 UTC
42 points
2 comments · 1 min read · EA link
(manifund.org)

A Study of AI Science Models

Eleni_A · 13 May 2023 19:14 UTC
12 points
4 comments · 24 min read · EA link

Fighting without hope

Akash · 1 Mar 2023 18:15 UTC
35 points
9 comments · 4 min read · EA link

Critiques of non-existent AI safety labs: Yours

Anneal · 16 Jun 2023 6:50 UTC
117 points
12 comments · 3 min read · EA link

The last era of human mistakes

Owen Cotton-Barratt · 24 Jul 2024 9:56 UTC
23 points
4 comments · 7 min read · EA link
(strangecities.substack.com)

Transcript: NBC Nightly News: AI ‘race to recklessness’ w/ Tristan Harris, Aza Raskin

WilliamKiely · 23 Mar 2023 3:45 UTC
47 points
1 comment · 3 min read · EA link

DeepMind and Google Brain are merging [Linkpost]

Akash · 20 Apr 2023 18:47 UTC
32 points
1 comment · 1 min read · EA link
(www.deepmind.com)

Is AI forecasting a waste of effort on the margin?

Emrik · 5 Nov 2022 0:41 UTC
12 points
6 comments · 3 min read · EA link

Is the AI Doomsday Narrative the Product of a Big Tech Conspiracy?

Garrison · 4 Dec 2024 19:20 UTC
28 points
5 comments · 11 min read · EA link
(garrisonlovely.substack.com)

AISC9 has ended and there will be an AISC10

Linda Linsefors · 29 Apr 2024 10:53 UTC
36 points
0 comments · 2 min read · EA link

4 ways to think about democratizing AI [GovAI Linkpost]

Akash · 13 Feb 2023 18:06 UTC
35 points
0 comments · 1 min read · EA link
(www.governance.ai)

Law-Following AI 4: Don’t Rely on Vicarious Liability

Cullen 🔸 · 2 Aug 2022 23:23 UTC
13 points
0 comments · 3 min read · EA link

Global Pause AI Protest 10/21

Holly Elmore ⏸️ 🔸 · 14 Oct 2023 3:17 UTC
22 points
0 comments · 1 min read · EA link

Top OpenAI Catastrophic Risk Official Steps Down Abruptly

Garrison · 16 Apr 2025 16:04 UTC
29 points
1 comment · 5 min read · EA link
(garrisonlovely.substack.com)

Donation offsets for ChatGPT Plus subscriptions

Jeffrey Ladish · 16 Mar 2023 23:11 UTC
76 points
10 comments · 3 min read · EA link

[Question] How do you follow AI (safety) news?

peterhartree · 28 Sep 2024 14:03 UTC
13 points
9 comments · 1 min read · EA link

What success looks like

mariushobbhahn · 28 Jun 2022 14:30 UTC
115 points
20 comments · 19 min read · EA link

Some thoughts on “AI could defeat all of us combined”

Milan Griffes · 2 Jun 2023 15:03 UTC
23 points
0 comments · 4 min read · EA link

An Analysis of Systemic Risk and Architectural Requirements for the Containment of Recursively Self-Improving AI

Ihor Ivliev · 17 Jun 2025 0:16 UTC
2 points
5 comments · 4 min read · EA link

Where’s my ten minute AGI?

Vasco Grilo🔸 · 19 May 2025 17:45 UTC
45 points
6 comments · 7 min read · EA link
(epoch.ai)

Have your say on the Australian Government’s AI Policy

Nathan Sherburn · 11 Jul 2023 1:12 UTC
3 points
0 comments · 1 min read · EA link

[Question] Are we confident that superintelligent artificial intelligence disempowering humans would be bad?

Vasco Grilo🔸 · 10 Jun 2023 9:24 UTC
24 points
27 comments · 1 min read · EA link

AI Safety Newsletter #42: Newsom Vetoes SB 1047 Plus, OpenAI’s o1, and AI Governance Summary

Center for AI Safety · 1 Oct 2024 20:33 UTC
10 points
0 comments · 6 min read · EA link
(newsletter.safe.ai)

AI things that are perhaps as important as human-controlled AI

Chi · 3 Mar 2024 18:07 UTC
116 points
9 comments · 21 min read · EA link

Anthropic Faces Potentially “Business-Ending” Copyright Lawsuit

Garrison · 25 Jul 2025 17:01 UTC
29 points
10 comments · 9 min read · EA link
(www.obsolete.pub)

List #1: Why stopping the development of AGI is hard but doable

Remmelt · 24 Dec 2022 9:52 UTC
24 points
2 comments · 5 min read · EA link

AI, Animals, and Digital Minds 2024 - Retrospective

Constance Li · 19 Jun 2024 14:56 UTC
81 points
8 comments · 8 min read · EA link

Trends in the dollar training cost of machine learning systems

Ben Cottier · 1 Feb 2023 14:48 UTC
63 points
3 comments · 2 min read · EA link
(epochai.org)

ai-plans.com December Critique-a-Thon

Kabir_Kumar · 4 Dec 2023 9:27 UTC
1 point
0 comments · 2 min read · EA link

Shutting Down the Lightcone Offices

Habryka [Deactivated] · 15 Mar 2023 1:46 UTC
243 points
71 comments · 17 min read · EA link
(www.lesswrong.com)

A brief history of the automated corporation

Owen Cotton-Barratt · 4 Nov 2024 14:37 UTC
21 points
1 comment · 5 min read · EA link
(strangecities.substack.com)

GPT-4 is out: thread (& links)

Lizka · 14 Mar 2023 20:02 UTC
84 points
18 comments · 1 min read · EA link

FLI podcast series, “Imagine A World”, about aspirational futures with AGI

Jackson Wagner · 13 Oct 2023 16:03 UTC
18 points
0 comments · 4 min read · EA link

Why experienced professionals fail to land high-impact roles (FBB #5)

gergo · 10 Apr 2025 12:44 UTC
121 points
20 comments · 9 min read · EA link

AISN #46: The Transition

Center for AI Safety · 23 Jan 2025 18:01 UTC
10 points
0 comments · 5 min read · EA link
(newsletter.safe.ai)

[Video] Why SB-1047 deserves a fairer debate

Yadav · 20 Aug 2024 10:38 UTC
15 points
1 comment · 7 min read · EA link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti · 24 Feb 2023 22:41 UTC
12 points
1 comment · 2 min read · EA link

The Wizard of Oz Problem: How incentives and narratives can skew our perception of AI developments

Akash · 20 Mar 2023 22:36 UTC
16 points
0 comments · 6 min read · EA link

We need non-cybersecurity people [too]

Jarrah · 5 May 2024 0:11 UTC
32 points
0 comments · 2 min read · EA link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

Center for AI Safety · 28 Oct 2024 16:02 UTC
6 points
0 comments · 6 min read · EA link
(newsletter.safe.ai)

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

JWS 🔸 · 15 Jun 2024 20:24 UTC
72 points
49 comments · 17 min read · EA link

The Future of Work: How Can Policymakers Prepare for AI’s Impact on Labor Markets?

DavidConrad · 24 Jun 2024 21:43 UTC
4 points
1 comment · 3 min read · EA link
(www.lesswrong.com)

Effective Utopia: 100% Safe AI, Place AI, Simulating a Multiverse & How It Looks

ank · 2 Mar 2025 3:14 UTC
1 point
3 comments · 35 min read · EA link

The Intentional Stance, LLMs Edition

Eleni_A · 1 May 2024 15:22 UTC
8 points
2 comments · 8 min read · EA link

It looks like there are some good funding opportunities in AI safety right now

Benjamin_Todd · 21 Dec 2024 13:39 UTC
183 points
7 comments · 4 min read · EA link
(benjamintodd.substack.com)

How Misaligned AI Personas Lead to Human Extinction – Step by Step

Writer · 19 Jul 2025 13:59 UTC
6 points
1 comment · 7 min read · EA link
(youtu.be)

How much do markets value Open AI?

Ben_West🔸 · 14 May 2023 19:28 UTC
39 points
13 comments · 4 min read · EA link

Open Philanthropy is passing AI safety university group funding to Kairos

abergal · 22 Jul 2025 17:11 UTC
55 points
0 comments · 1 min read · EA link

Crises reveal centralisation

Vasco Grilo🔸 · 26 Mar 2024 18:00 UTC
31 points
2 comments · 5 min read · EA link
(stefanschubert.substack.com)

Decomposing Agency — capabilities without desires

Owen Cotton-Barratt · 11 Jul 2024 9:38 UTC
37 points
2 comments · 12 min read · EA link
(strangecities.substack.com)

Cost-effectiveness of professional field-building programs for AI safety research

Center for AI Safety · 10 Jul 2023 17:26 UTC
38 points
2 comments · 18 min read · EA link

The Top AI Safety Bets for 2023: GiveWiki’s Latest Recommendations

Dawn Drescher · 11 Nov 2023 9:04 UTC
11 points
4 comments · 8 min read · EA link

We are in a New Paradigm of AI Progress—OpenAI’s o3 model makes huge gains on the toughest AI benchmarks in the world

Garrison · 22 Dec 2024 21:45 UTC
26 points
0 comments · 4 min read · EA link
(garrisonlovely.substack.com)

80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly)

Raemon · 3 Jul 2024 20:34 UTC
263 points
79 comments · 3 min read · EA link

[Question] AI strategy career pipeline

Zach Stein-Perlman · 22 May 2023 0:00 UTC
72 points
23 comments · 1 min read · EA link

[Question] What did AI Safety’s specific funding of AGI R&D labs lead to?

Remmelt · 5 Jul 2023 15:51 UTC
24 points
17 comments · 1 min read · EA link

Power laws in Speedrunning and Machine Learning

Jaime Sevilla · 24 Apr 2023 10:06 UTC
48 points
0 comments · 1 min read · EA link
(arxiv.org)

A newcomer’s guide to the technical AI safety field

zeshen🔸 · 4 Nov 2022 14:29 UTC
16 points
0 comments · 10 min read · EA link

A stylized dialogue on John Wentworth’s claims about markets and optimization

So8res · 25 Mar 2023 22:32 UTC
18 points
0 comments · 8 min read · EA link

What is autonomy, and how does it lead to greater risk from AI?

Davidmanheim · 1 Aug 2023 8:06 UTC
10 points
0 comments · 6 min read · EA link
(www.lesswrong.com)

The AI Adoption Gap: Preparing the US Government for Advanced AI

Lizka · 2 Apr 2025 21:37 UTC
40 points
20 comments · 17 min read · EA link
(www.forethought.org)

[Question] Investigative journalist in the AI safety space?

Benevolent_Rain · 15 Nov 2024 8:48 UTC
4 points
9 comments · 1 min read · EA link

Arkose is Closing

Arkose · 23 Jun 2025 11:02 UTC
100 points
6 comments · 2 min read · EA link

Chaining Retroactive Funders to Borrow Against Unlikely Utopias

Dawn Drescher · 19 Apr 2022 18:25 UTC
24 points
4 comments · 9 min read · EA link
(impactmarkets.substack.com)

An Argument for Focusing on Making AI go Well

Chris Leong · 28 Dec 2023 13:25 UTC
13 points
4 comments · 3 min read · EA link

Neel Nanda MATS Applications Open (Due Aug 29)

Neel Nanda · 30 Jul 2025 0:55 UTC
20 points
0 comments · 7 min read · EA link
(tinyurl.com)

My current take on existential AI risk [FB post]

Aryeh Englander · 1 May 2023 16:22 UTC
10 points
0 comments · 3 min read · EA link

Nuclear brinksmanship is not a good AI x-risk strategy

titotal · 30 Mar 2023 22:07 UTC
19 points
8 comments · 5 min read · EA link

Thoughts on yesterday’s UN Security Council meeting on AI

Greg_Colbourn ⏸️ · 19 Jul 2023 16:46 UTC
31 points
2 comments · 1 min read · EA link

AISN #61: OpenAI Releases GPT-5

Center for AI Safety · 12 Aug 2025 17:52 UTC
6 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

Rethink Priorities: Seeking Expressions of Interest for Special Projects Next Year

kierangreig🔸 · 29 Nov 2023 13:44 UTC
57 points
0 comments · 5 min read · EA link

Enough about AI timelines — we already know what we need to know.

Holly Elmore ⏸️ 🔸 · 9 Apr 2025 10:29 UTC
134 points
35 comments · 2 min read · EA link

Why would AI companies use human-level AI to do alignment research?

MichaelDickens · 25 Apr 2025 19:12 UTC
16 points
1 comment · 2 min read · EA link

[Job Ad] SERI MATS is hiring for our summer program

annashive · 26 May 2023 4:51 UTC
8 points
1 comment · 7 min read · EA link

Before Altman’s Ouster, OpenAI’s Board Was Divided and Feuding

Jonathan Yan · 22 Nov 2023 1:01 UTC
25 points
1 comment · 1 min read · EA link
(www.nytimes.com)

The GiveWiki’s Top Picks in AI Safety for the Giving Season of 2023

Dawn Drescher · 7 Dec 2023 9:23 UTC
26 points
0 comments · 3 min read · EA link
(impactmarkets.substack.com)

Nobody’s on the ball on AGI alignment

leopold · 29 Mar 2023 14:26 UTC
327 points
66 comments · 9 min read · EA link
(www.forourposterity.com)

Is this community over-emphasizing AI alignment?

Lixiang · 8 Jan 2023 6:23 UTC
1 point
5 comments · 1 min read · EA link

RAND report finds no effect of current LLMs on viability of bioterrorism attacks

Lizka · 26 Jan 2024 20:10 UTC
108 points
17 comments · 3 min read · EA link
(www.rand.org)

[Question] Pros and cons of setting up a company to do independent AIS research?

Eevee🔹 · 13 Aug 2024 0:11 UTC
15 points
0 comments · 1 min read · EA link

Qualities that alignment mentors value in junior researchers

Akash · 14 Feb 2023 23:27 UTC
31 points
1 comment · 3 min read · EA link

Applications open: Support for talent working on independent learning, research or entrepreneurial projects focused on reducing global catastrophic risks

CEEALAR · 9 Feb 2024 13:04 UTC
63 points
1 comment · 2 min read · EA link

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum · 29 Dec 2022 16:09 UTC
4 points
0 comments · 1 min read · EA link

Mesa-Optimization: Explain it like I’m 10 Edition

brook · 26 Aug 2023 23:06 UTC
10 points
1 comment · 6 min read · EA link
(www.lesswrong.com)

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

Center for AI Safety · 9 Jul 2024 19:29 UTC
8 points
0 comments · 5 min read · EA link
(newsletter.safe.ai)

Orienting to 3 year AGI timelines

Nikola · 22 Dec 2024 23:07 UTC
121 points
15 comments · 8 min read · EA link

Some talent needs in AI governance

Sam Clarke · 13 Jun 2023 13:53 UTC
133 points
10 comments · 8 min read · EA link

The Hubinger lectures on AGI safety: an introductory lecture series

evhub · 22 Jun 2023 0:59 UTC
44 points
0 comments · 1 min read · EA link
(www.youtube.com)

AISN #58: Senate Removes State AI Regulation Moratorium

Center for AI Safety · 3 Jul 2025 17:07 UTC
6 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

[Question] What harm could AI safety do?

SeanEngelhart · 15 May 2021 1:11 UTC
12 points
7 comments · 1 min read · EA link

Washington Post article about EA university groups

Lizka · 5 Jul 2023 12:58 UTC
35 points
5 comments · 1 min read · EA link

Compendium of problems with RLHF

Raphaël S · 30 Jan 2023 8:48 UTC
18 points
0 comments · 10 min read · EA link

“Safety Culture for AI” is important, but isn’t going to be easy

Davidmanheim · 26 Jun 2023 11:27 UTC
53 points
0 comments · 2 min read · EA link
(papers.ssrn.com)

Seeking (Paid) Case Studies on Standards

Holden Karnofsky · 26 May 2023 17:58 UTC
99 points
14 comments · 11 min read · EA link

Pivotal Research is Hiring Research Managers

Tobias Häberli · 25 Sep 2024 19:11 UTC
8 points
0 comments · 3 min read · EA link

Views on when AGI comes and on strategy to reduce existential risk

TsviBT · 8 Jul 2023 9:00 UTC
31 points
3 comments · 14 min read · EA link

Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

Garrison · 23 Oct 2024 23:42 UTC
57 points
4 comments · 7 min read · EA link
(garrisonlovely.substack.com)

Join AISafety.info’s Distillation Hackathon (Oct 6-9th)

leillustrations🔸 · 1 Oct 2023 18:42 UTC
27 points
2 comments · 2 min read · EA link
(www.lesswrong.com)

An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans

MikhailSamin · 13 Feb 2024 23:11 UTC
19 points
39 comments · 5 min read · EA link

AISN #59: EU Publishes General-Purpose AI Code of Practice

Center for AI Safety · 15 Jul 2025 18:32 UTC
8 points
0 comments · 4 min read · EA link
(aisafety.substack.com)

Focusing your impact on short vs long TAI timelines

kuhanj · 30 Sep 2023 19:23 UTC
44 points
0 comments · 10 min read · EA link

Humans are not prepared to operate outside their moral training distribution

Prometheus · 10 Apr 2023 21:44 UTC
12 points
0 comments · 3 min read · EA link

Stop calling them labs

sawyer🔸 · 24 Feb 2025 22:58 UTC
259 points
22 comments · 1 min read · EA link

Rational Animations is looking for an AI Safety scriptwriter, a lead community manager, and other roles.

Writer · 16 Jun 2023 9:41 UTC
40 points
4 comments · 3 min read · EA link

Projects I would like to see (possibly at AI Safety Camp)

Linda Linsefors · 27 Sep 2023 21:27 UTC
9 points
0 comments · 4 min read · EA link

MIT FutureTech are hiring for a Technical Associate role

PeterSlattery · 9 Sep 2024 20:14 UTC
9 points
6 comments · 3 min read · EA link

80,000 Hours is shifting its strategic approach to focus more on AGI

80000_Hours · 20 Mar 2025 11:24 UTC
232 points
121 comments · 8 min read · EA link

Takeoff speeds presentation at Anthropic

Tom_Davidson · 4 Jun 2024 22:46 UTC
29 points
3 comments · 25 min read · EA link

VIRTUA: a novel about AI alignment

Karl von Wendt · 12 Jan 2023 9:37 UTC
23 points
0 comments · 1 min read · EA link

Introducing the new Riesgos Catastróficos Globales team

Jaime Sevilla · 3 Mar 2023 23:04 UTC
74 points
3 comments · 5 min read · EA link
(riesgoscatastroficosglobales.com)

My highly personal skepticism braindump on existential risk from artificial intelligence.

NunoSempere · 23 Jan 2023 20:08 UTC
437 points
116 comments · 14 min read · EA link
(nunosempere.com)

Disentangling arguments for the importance of AI safety

richard_ngo · 23 Jan 2019 14:58 UTC
63 points
14 comments · 8 min read · EA link

Catastrophic Risks from AI #6: Discussion and FAQ

Center for AI Safety · 27 Jun 2023 23:23 UTC
10 points
0 comments · 4 min read · EA link
(arxiv.org)

GDP per capita in 2050

Hauke Hillebrandt · 6 May 2024 15:14 UTC
130 points
11 comments · 16 min read · EA link
(hauke.substack.com)

[Question] Why isn’t there a Charity Entrepreneurship program for AI Safety?

yanni · 4 Oct 2023 2:12 UTC
11 points
13 comments · 1 min read · EA link

Epoch AI is Hiring an Operations Associate

merilalama · 3 May 2024 0:16 UTC
5 points
1 comment · 3 min read · EA link
(careers.rethinkpriorities.org)

Kairos is hiring a Head of Operations/Founding Generalist

Agustín Covarrubias 🔸 · 12 Mar 2025 20:58 UTC
59 points
1 comment · 5 min read · EA link

SPAR seeks advisors and students for AI safety projects (Second Wave)

mic · 14 Sep 2023 23:09 UTC
14 points
0 comments · 1 min read · EA link

Some mistakes in thinking about AGI evolution and control

Remmelt · 1 Aug 2025 8:08 UTC
7 points
0 comments · 1 min read · EA link

Techies Wanted: How STEM Backgrounds Can Advance Safe AI Policy

Daniel_Eth · 26 May 2025 11:29 UTC
41 points
1 comment · 29 min read · EA link

Spreading messages to help with the most important century

Holden Karnofsky · 25 Jan 2023 20:35 UTC
129 points
21 comments · 18 min read · EA link
(www.cold-takes.com)

How CISA can Support the Security of Large AI Models Against Theft [Grad School Assignment]

Marcel · 23 May 2023 15:36 UTC
7 points
0 comments · 13 min read · EA link

Apply to SPAR Fall 2025—80+ projects!

Agustín Covarrubias 🔸 · 30 Jul 2025 17:34 UTC
17 points
0 comments · 1 min read · EA link

AISN #53: An Open Letter Attempts to Block OpenAI Restructuring

Center for AI Safety · 29 Apr 2025 15:56 UTC
6 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

DeepMind: Evaluating Frontier Models for Dangerous Capabilities

Zach Stein-Perlman · 21 Mar 2024 23:00 UTC
28 points
0 comments · 1 min read · EA link
(arxiv.org)

[Question] Game theory work on AI alignment with diverse AI systems, human individuals, & human groups?

Geoffrey Miller · 2 Mar 2023 16:50 UTC
22 points
2 comments · 1 min read · EA link

Funding circle aimed at slowing down AI—looking for participants

Greg_Colbourn ⏸️ · 25 Jan 2024 23:58 UTC
92 points
3 comments · 2 min read · EA link

Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

Otto · 19 Apr 2023 10:50 UTC
75 points
1 comment · 4 min read · EA link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden · 4 Jan 2024 7:26 UTC
34 points
1 comment · 20 min read · EA link
(www.forethought.org)

A recent write-up of the case for AI (existential) risk

Timsey · 18 May 2023 13:07 UTC
17 points
0 comments · 19 min read · EA link

Announcing Manifund Regrants

Austin · 5 Jul 2023 19:42 UTC
217 points
51 comments · 4 min read · EA link
(manifund.org)

Recruit the World’s best for AGI Alignment

Greg_Colbourn ⏸️ · 30 Mar 2023 16:41 UTC
34 points
8 comments · 22 min read · EA link

AISafety.world is a map of the AIS ecosystem

Hamish McDoodles · 6 Apr 2023 11:47 UTC
192 points
8 comments · 1 min read · EA link

12 tentative ideas for US AI policy (Luke Muehlhauser)

Lizka · 19 Apr 2023 21:05 UTC
117 points
12 comments · 4 min read · EA link
(www.openphilanthropy.org)

Will scaling work?

Vasco Grilo🔸 · 4 Feb 2024 9:29 UTC
19 points
1 comment · 12 min read · EA link
(www.dwarkeshpatel.com)

A note of caution about recent AI risk coverage

Sean_o_h · 7 Jun 2023 17:05 UTC
284 points
29 comments · 3 min read · EA link

Debate series: should we push for a pause on the development of AI?

Ben_West🔸 · 8 Sep 2023 16:29 UTC
252 points
58 comments · 1 min read · EA link

AISN #54: OpenAI Updates Restructure Plan

Center for AI Safety · 13 May 2025 16:48 UTC
7 points
0 comments · 4 min read · EA link
(newsletter.safe.ai)

Can the AI afford to wait?

Ben Millwood🔸 · 20 Mar 2024 19:45 UTC
48 points
11 comments · 7 min read · EA link

Value fragility and AI takeover

Joe_Carlsmith · 5 Aug 2024 21:28 UTC
39 points
3 comments · 30 min read · EA link

Podcast with Yoshua Bengio on Why AI Labs are “Playing Dice with Humanity’s Future”

Garrison · 10 May 2024 17:23 UTC
29 points
3 comments · 2 min read · EA link
(garrisonlovely.substack.com)

Announcing FAR Labs, an AI safety coworking space

Ben Goldhaber · 2 Oct 2023 20:15 UTC
63 points
0 comments · 1 min read · EA link
(www.lesswrong.com)

Project idea: AI for epistemics

Benjamin_Todd · 19 May 2024 19:36 UTC
45 points
12 comments · 3 min read · EA link
(benjamintodd.substack.com)

The current state of RSPs

Zach Stein-Perlman · 4 Nov 2024 16:00 UTC
19 points
1 comment · 9 min read · EA link

A Gentle Introduction to Risk Frameworks Beyond Forecasting

pending_survival · 11 Apr 2024 9:15 UTC
83 points
4 comments · 27 min read · EA link

Reading list on AI agents and associated policy

Peter Wildeford · 9 Aug 2024 17:40 UTC
79 points
2 comments · 1 min read · EA link

Astronomical Waste & Conscientious Objection

Lydia Nottingham · 2 Aug 2025 22:41 UTC
5 points
0 comments · 2 min read · EA link

CEA Should Invest in Helping Altruists Navigate Advanced AI

Chris Leong · 14 May 2023 14:52 UTC
4 points
13 comments · 2 min read · EA link

Where I’m at with AI risk: convinced of danger but not (yet) of doom

Amber Dawn · 21 Mar 2023 13:23 UTC
62 points
16 comments · 6 min read · EA link

Linkpost: Dwarkesh Patel interviewing Carl Shulman

Stefan_Schubert · 14 Jun 2023 15:30 UTC
110 points
5 comments · 1 min read · EA link
(podcastaddict.com)

Transformative AGI by 2043 is <1% likely

Ted Sanders · 6 Jun 2023 15:51 UTC
98 points
92 comments · 5 min read · EA link
(arxiv.org)

Using AI to Streamline Your Political Advocacy

Gabriel Sherman🔸 · 29 Apr 2025 18:35 UTC
3 points
0 comments · 3 min read · EA link

Apply to be a mentor in SPAR!

Agustín Covarrubias 🔸 · 24 Jun 2025 23:00 UTC
25 points
0 comments · 1 min read · EA link

New? Start here! (Useful links)

Lizka · 1 Jul 2022 21:19 UTC
28 points
1 comment · 2 min read · EA link

AI companies are unlikely to make high-assurance safety cases if timelines are short

Ryan Greenblatt · 23 Jan 2025 18:41 UTC
45 points
1 comment · 13 min read · EA link

Some quotes from Tuesday’s Senate hearing on AI

Daniel_Eth · 17 May 2023 12:13 UTC
105 points
7 comments · 4 min read · EA link

List #3: Why not to assume on prior that AGI-alignment workarounds are available

Remmelt · 24 Dec 2022 9:54 UTC
6 points
0 comments · 3 min read · EA link

The Retroactive Funding Landscape: Innovations for Donors and Grantmakers

Dawn Drescher · 29 Sep 2023 17:39 UTC
17 points
2 comments · 19 min read · EA link
(impactmarkets.substack.com)

Nav­i­gat­ing Risks from Ad­vanced Ar­tifi­cial In­tel­li­gence: A Guide for Philan­thropists [Founders Pledge]

Tom Barnes🔸21 Jun 2024 9:48 UTC
101 points
7 comments1 min readEA link
(www.founderspledge.com)

Pulse 2024: At­ti­tudes to­wards ar­tifi­cial intelligence

Jamie E27 Nov 2024 11:33 UTC
62 points
4 comments3 min readEA link

AI Safety Hub Ser­bia Offi­cial Opening

Dušan D. Nešić (Dushan)28 Oct 2023 17:10 UTC
27 points
3 comments3 min readEA link
(forum.effectivealtruism.org)

Linkpost: Mak­ing deals with early schemers

Buck21 Jun 2025 16:34 UTC
20 points
0 comments1 min readEA link

Rol­ling Thresh­olds for AGI Scal­ing Regulation

Larks12 Jan 2025 1:30 UTC
60 points
4 comments6 min readEA link

[Question] Dan Hendrycks and EA

Caruso3 Aug 2024 13:49 UTC
−1 points
6 comments1 min readEA link

Bio-x-AI policy: call for ideas from the Fed­er­a­tion of Amer­i­can Scientists

Ben Stewart15 Aug 2023 3:21 UTC
8 points
0 comments1 min readEA link

Reflec­tions on my first year of AI safety research

Jay Bailey8 Jan 2024 7:49 UTC
64 points
2 comments12 min readEA link

2023 Align­ment Re­search Up­dates from FAR AI

AdamGleave4 Dec 2023 22:32 UTC
14 points
0 comments8 min readEA link
(far.ai)

An­nounc­ing the Vi­talik Bu­terin Fel­low­ships in AI Ex­is­ten­tial Safety!

DanielFilan21 Sep 2021 0:41 UTC
62 points
0 comments1 min readEA link
(grants.futureoflife.org)

Un-un­plug­ga­bil­ity—can’t we just un­plug it?

Oliver Sourbut15 May 2023 13:23 UTC
15 points
0 comments10 min readEA link
(www.oliversourbut.net)

Offer­ing AI safety sup­port calls for ML professionals

Vael Gates15 Feb 2024 23:48 UTC
52 points
1 comment1 min readEA link

[Linkpost] OpenAI is award­ing ten 100k grants for build­ing pro­to­types of a demo­cratic pro­cess for steer­ing AI

pseudonym26 May 2023 12:49 UTC
36 points
2 comments1 min readEA link
(openai.com)

In­tro­duc­ing Fu­ture Mat­ters – a strat­egy consultancy

KyleGracey30 Sep 2023 2:06 UTC
59 points
2 comments5 min readEA link

AISN #50: AI Ac­tion Plan Re­sponses

Center for AI Safety31 Mar 2025 20:07 UTC
10 points
0 comments6 min readEA link
(newsletter.safe.ai)

New re­port: “Schem­ing AIs: Will AIs fake al­ign­ment dur­ing train­ing in or­der to get power?”

Joe_Carlsmith15 Nov 2023 17:16 UTC
71 points
4 comments30 min readEA link

Ap­ply to the Cam­bridge ERA:AI Fel­low­ship 2025

Harrison 🔸25 Mar 2025 13:46 UTC
28 points
0 comments3 min readEA link

An­nounc­ing the In­tro­duc­tion to ML Safety Course

TW1236 Aug 2022 2:50 UTC
136 points
4 comments7 min readEA link

[Question] Please help me sense-check my as­sump­tions about the needs of the AI Safety com­mu­nity and re­lated ca­reer plans

PeterSlattery27 Mar 2023 8:11 UTC
23 points
27 comments2 min readEA link

Newbie intro, ai/ei signatures, & a wco needing you...!!

Ebert Thinkingtinkerer25 Jun 2025 15:50 UTC
3 points
0 comments1 min readEA link

Beyond Con­trol: The Strate­gic Case for AI Rights

Dawn Drescher12 Aug 2025 14:06 UTC
8 points
2 comments3 min readEA link
(impartial-priorities.org)

Ideas for AI labs: Read­ing list

Zach Stein-Perlman24 Apr 2023 19:00 UTC
28 points
2 comments4 min readEA link

A moral back­lash against AI will prob­a­bly slow down AGI development

Geoffrey Miller31 May 2023 21:31 UTC
146 points
22 comments14 min readEA link

An­i­mal ad­vo­cates should re­spond to trans­for­ma­tive AI maybe ar­riv­ing soon

Jamie_Harris2 Aug 2025 14:27 UTC
90 points
4 comments9 min readEA link

Defer­ence on AI timelines: sur­vey results

Sam Clarke30 Mar 2023 23:03 UTC
68 points
3 comments2 min readEA link

Re­lease of UN’s draft re­lated to the gov­er­nance of AI (a sum­mary of the Si­mon In­sti­tute’s re­sponse)

SebastianSchmidt27 Apr 2024 18:27 UTC
22 points
0 comments1 min readEA link

EA, Psy­chol­ogy & AI Safety Research

Sam Ellis26 May 2022 23:46 UTC
28 points
3 comments6 min readEA link

What would a com­pute mon­i­tor­ing plan look like? [Linkpost]

Akash26 Mar 2023 19:33 UTC
61 points
1 comment4 min readEA link
(arxiv.org)

How much money should we be sav­ing for re­tire­ment?

Denkenberger🔸2 Mar 2025 6:21 UTC
22 points
6 comments2 min readEA link

Pros and Cons of boy­cotting paid Chat GPT

NickLaing18 Mar 2023 8:50 UTC
14 points
11 comments2 min readEA link

Po­ten­tially Use­ful Pro­jects in Wise AI

Chris Leong5 Jun 2025 8:13 UTC
14 points
2 comments5 min readEA link

Should there be just one west­ern AGI pro­ject?

rosehadshar4 Dec 2024 14:41 UTC
49 points
3 comments15 min readEA link
(www.forethought.org)

Owain Evans on LLMs, Truth­ful AI, AI Com­po­si­tion, and More

Ozzie Gooen2 May 2023 1:20 UTC
21 points
0 comments1 min readEA link
(quri.substack.com)

Sel­ling out to AI com­pa­nies is bad. Pe­riod. You will be cor­rupted.

Holly Elmore ⏸️ 🔸9 Apr 2025 3:56 UTC
2 points
23 comments1 min readEA link

Imi­ta­tion Learn­ing is Prob­a­bly Ex­is­ten­tially Safe

Vasco Grilo🔸30 Apr 2024 17:06 UTC
19 points
7 comments3 min readEA link
(www.openphilanthropy.org)

UK gov­ern­ment to host first global sum­mit on AI Safety

DavidNash8 Jun 2023 13:24 UTC
78 points
1 comment5 min readEA link
(www.gov.uk)

In­tent al­ign­ment should not be the goal for AGI x-risk reduction

johnjnay26 Oct 2022 1:24 UTC
7 points
1 comment2 min readEA link

AI fore­cast­ing bots incoming

Center for AI Safety9 Sep 2024 19:55 UTC
−2 points
6 comments4 min readEA link
(www.safe.ai)

Ag­gre­gat­ing Utilities for Cor­rigible AI [Feed­back Draft]

Dan H12 May 2023 20:57 UTC
12 points
0 comments20 min readEA link

A rough and in­com­plete re­view of some of John Went­worth’s research

So8res28 Mar 2023 18:52 UTC
27 points
0 comments18 min readEA link

[Question] What would it look like for AIS to no longer be ne­glected?

Rockwell16 Jun 2023 15:59 UTC
100 points
14 comments1 min readEA link

Ap­ply to fall policy in­tern­ships (we can help)

ES2 Jul 2023 21:37 UTC
57 points
4 comments1 min readEA link

Ge­offrey Miller on Cross-Cul­tural Un­der­stand­ing Between China and Western Coun­tries as a Ne­glected Con­sid­er­a­tion in AI Alignment

Evan_Gaensbauer17 Apr 2023 3:26 UTC
25 points
2 comments4 min readEA link

Want to work on US emerg­ing tech policy? Con­sider the Hori­zon Fel­low­ship.

ES30 Jul 2024 11:46 UTC
32 points
0 comments1 min readEA link

Does AI Progress Have a Speed Limit?

Vasco Grilo🔸13 Jun 2025 16:22 UTC
15 points
1 comment19 min readEA link
(asteriskmag.com)

AIS Nether­lands is look­ing for a Found­ing Ex­ec­u­tive Direc­tor (EOI form)

gergo19 Mar 2025 9:24 UTC
49 points
4 comments4 min readEA link

Sur­vey on in­ter­me­di­ate goals in AI governance

MichaelA🔸17 Mar 2023 12:44 UTC
156 points
4 comments1 min readEA link

Un­jour­nal eval­u­a­tion of “Towards best prac­tices in AGI safety and gov­er­nance” (Schuett et al, 2023)

david_reinstein3 Jun 2025 11:18 UTC
9 points
1 comment1 min readEA link
(unjournal.pubpub.org)

Quick sur­vey on AI al­ign­ment resources

frances_lorenz30 Jun 2022 19:08 UTC
15 points
0 comments1 min readEA link

Ex­cerpts from “Ma­jor­ity Leader Schumer De­liv­ers Re­marks To Launch SAFE In­no­va­tion Frame­work For Ar­tifi­cial In­tel­li­gence At CSIS”

Chris Leong21 Jul 2023 23:15 UTC
19 points
0 comments1 min readEA link
(www.democrats.senate.gov)

Box in­ver­sion revisited

Jan_Kulveit7 Nov 2023 11:09 UTC
13 points
1 comment8 min readEA link

The Selfish Machine

Vasco Grilo🔸15 Mar 2025 10:58 UTC
9 points
0 comments12 min readEA link
(maartenboudry.substack.com)

Lev­el­ling Up in AI Safety Re­search Engineering

GabeM2 Sep 2022 4:59 UTC
166 points
21 comments17 min readEA link

Scal­ing Laws and Likely Limits to AI

Davidmanheim18 Aug 2024 17:19 UTC
19 points
0 comments3 min readEA link

Tacit knowl­edge: how I *ex­actly* ap­proach EAG(x) con­fer­ences

gergo4 Jun 2025 18:14 UTC
85 points
5 comments4 min readEA link

Think­ing-in-limits about TAI from the de­mand per­spec­tive. De­mand sat­u­ra­tion, re­source wars, new debt.

Ivan Madan7 Nov 2023 22:44 UTC
2 points
0 comments4 min readEA link

So You Want to Work at a Fron­tier AI Lab

Joe Rogero11 Jun 2025 23:11 UTC
35 points
2 comments7 min readEA link
(intelligence.org)

Fron­tier AI sys­tems have sur­passed the self-repli­cat­ing red line

Greg_Colbourn ⏸️ 10 Dec 2024 16:33 UTC
25 points
14 comments1 min readEA link
(github.com)

Es­say com­pe­ti­tion on the Au­toma­tion of Wis­dom and Philos­o­phy — $25k in prizes

Owen Cotton-Barratt16 Apr 2024 10:08 UTC
80 points
15 comments8 min readEA link
(blog.aiimpacts.org)

AISN #56: Google Re­leases Veo 3

Center for AI Safety28 May 2025 15:57 UTC
6 points
0 comments4 min readEA link
(newsletter.safe.ai)

Pri­ori­tis­ing be­tween ex­tinc­tion risks: Ev­i­dence Quality

freedomandutility30 Dec 2023 12:25 UTC
11 points
0 comments2 min readEA link

[Linkpost] ‘The God­father of A.I.’ Leaves Google and Warns of Danger Ahead

imp4rtial 🔸1 May 2023 19:54 UTC
43 points
3 comments3 min readEA link
(www.nytimes.com)

Takes on “Align­ment Fak­ing in Large Lan­guage Models”

Joe_Carlsmith18 Dec 2024 18:22 UTC
72 points
1 comment62 min readEA link

The Nav­i­ga­tion Fund launched + is hiring a pro­gram officer to lead the dis­tri­bu­tion of $20M an­nu­ally for AI safety! Full-time, fully re­mote, pay starts at $200k

vincentweisser3 Nov 2023 21:53 UTC
120 points
3 comments1 min readEA link

AI and Evolution

Dan H30 Mar 2023 13:09 UTC
41 points
1 comment2 min readEA link
(arxiv.org)

“X distracts from Y” as a thinly-disguised fight over group status / politics

Steven Byrnes25 Sep 2023 15:29 UTC
91 points
9 comments8 min readEA link

Man­i­fund: 2023 in Review

Austin18 Jan 2024 23:50 UTC
29 points
1 comment23 min readEA link
(manifund.substack.com)

UN Sec­re­tary-Gen­eral recog­nises ex­is­ten­tial threat from AI

Greg_Colbourn ⏸️ 15 Jun 2023 17:03 UTC
58 points
1 comment1 min readEA link

In­tro­duc­ing Kairos: a new AI safety field­build­ing or­ga­ni­za­tion (the new home for SPAR and FSP)

Agustín Covarrubias 🔸25 Oct 2024 21:59 UTC
81 points
2 comments2 min readEA link

Suc­ces­sif: Join our AI pro­gram to help miti­gate the catas­trophic risks of AI

ClaireB25 Oct 2023 16:51 UTC
15 points
0 comments5 min readEA link

AISN #36: Voluntary Commitments are Insufficient; Plus, a Senate AI Policy Roadmap, and Chapter 1: An Overview of Catastrophic Risks

Center for AI Safety30 May 2024 18:23 UTC
6 points
0 comments5 min readEA link
(newsletter.safe.ai)

Su­per­vised Pro­gram for Align­ment Re­search (SPAR) at UC Berkeley: Spring 2023 summary

mic19 Aug 2023 2:32 UTC
18 points
1 comment6 min readEA link
(www.lesswrong.com)

AI Safety Newslet­ter #1 [CAIS Linkpost]

Akash10 Apr 2023 20:18 UTC
38 points
0 comments4 min readEA link
(newsletter.safe.ai)

The US-China Re­la­tion­ship and Catas­trophic Risk (EAG Bos­ton tran­script)

EA Global9 Jul 2024 13:50 UTC
30 points
1 comment19 min readEA link

[Question] Has An­thropic already made the ex­ter­nally leg­ible com­mit­ments that it planned to make?

Ofer12 Mar 2024 13:45 UTC
21 points
3 comments1 min readEA link

METR is hiring ML Re­search Eng­ineers and Scientists

Ben_West🔸5 Jun 2024 21:25 UTC
18 points
2 comments1 min readEA link
(metr.org)

AI Safety Univer­sity Or­ga­niz­ing: Early Take­aways from Thir­teen Groups

Agustín Covarrubias 🔸2 Oct 2024 14:39 UTC
46 points
3 comments9 min readEA link

Catas­trophic Risks from AI #5: Rogue AIs

Center for AI Safety27 Jun 2023 22:06 UTC
16 points
1 comment22 min readEA link
(arxiv.org)

Yann LeCun on AGI and AI Safety

Chris Leong8 Aug 2023 23:43 UTC
23 points
4 comments1 min readEA link
(drive.google.com)

In­tro­duc­ing SyDFAIS: A Sys­temic De­sign Frame­work for AI Safety Field-Build­ing

Moneer6 Feb 2025 14:26 UTC
19 points
6 comments14 min readEA link

Stu­dent com­pe­ti­tion for draft­ing a treaty on mora­to­rium of large-scale AI ca­pa­bil­ities R&D

Nayanika24 Apr 2023 13:15 UTC
36 points
4 comments2 min readEA link

AI Takeover Sce­nario with Scaled LLMs

simeon_c16 Apr 2023 23:28 UTC
29 points
1 comment8 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [May 2023]

StevenKaas8 May 2023 22:30 UTC
19 points
11 comments2 min readEA link

METR: Mea­sur­ing AI Abil­ity to Com­plete Long Tasks

Ben_West🔸19 Mar 2025 16:49 UTC
122 points
16 comments1 min readEA link
(metr.org)

Civil di­s­obe­di­ence op­por­tu­nity—a way to help re­duce chance of hard take­off from re­cur­sive self im­prove­ment of code

JonCefalu25 Mar 2023 22:37 UTC
−5 points
0 comments1 min readEA link
(codegencodepoisoningcontest.cargo.site)

My “in­fo­haz­ards small work­ing group” Sig­nal Chat may have en­coun­tered minor leaks

Linch2 Apr 2025 1:03 UTC
109 points
2 comments5 min readEA link

De­cod­ing Repub­li­can AI Policy: In­sights from 10 Key Ar­ti­cles from Mid-2024

anonymous00718 Aug 2024 9:48 UTC
5 points
0 comments6 min readEA link

On the cor­re­spon­dence be­tween AI-mis­al­ign­ment and cog­ni­tive dis­so­nance us­ing a be­hav­ioral eco­nomics model

Stijn Bruers 🔸1 Nov 2022 9:15 UTC
11 points
0 comments6 min readEA link

Why Did Elon Musk Just Offer to Buy Con­trol of OpenAI for $100 Billion?

Garrison11 Feb 2025 0:20 UTC
152 points
2 comments6 min readEA link
(garrisonlovely.substack.com)

EA Ho­tel: Live The­ory Work­shop this month

CEEALAR7 Nov 2024 10:39 UTC
14 points
0 comments1 min readEA link

Col­lin Burns on Align­ment Re­search And Dis­cov­er­ing La­tent Knowl­edge Without Supervision

Michaël Trazzi17 Jan 2023 17:21 UTC
21 points
2 comments4 min readEA link
(theinsideview.ai)

AI as a Con­sti­tu­tional Moment

atb28 May 2025 15:40 UTC
37 points
1 comment9 min readEA link

AGI Take­off dy­nam­ics—In­tel­li­gence vs Quan­tity ex­plo­sion

EdoArad26 Jul 2023 9:20 UTC
14 points
0 comments2 min readEA link
(github.com)

Ap­ply to CEEALAR to do AGI mora­to­rium work

Greg_Colbourn ⏸️ 26 Jul 2023 21:24 UTC
62 points
0 comments1 min readEA link

[Question] How com­mit­ted to AGI safety are the cur­rent OpenAI non­profit board mem­bers?

Eevee🔹2 Dec 2024 4:03 UTC
14 points
1 comment1 min readEA link

“Who Will You Be After ChatGPT Takes Your Job?”

Stephen Thomas21 Apr 2023 21:31 UTC
23 points
4 comments2 min readEA link
(www.wired.com)

What does it mean for an AGI to be ‘safe’?

So8res7 Oct 2022 4:43 UTC
53 points
21 comments3 min readEA link

Giv­ing away copies of Un­con­trol­lable by Dar­ren McKee

Greg_Colbourn ⏸️ 14 Dec 2023 17:00 UTC
39 points
2 comments1 min readEA link

AI Con­scious­ness Re­port: A Roundtable Discussion

Sofia_Fogel30 Aug 2023 21:50 UTC
18 points
0 comments2 min readEA link

[Question] Imag­ine AGI kil­led us all in three years. What would have been our biggest mis­takes?

yanni kyriacos7 Apr 2023 0:06 UTC
17 points
6 comments1 min readEA link

AI Safety Field Build­ing vs. EA CB

kuhanj26 Jun 2023 23:21 UTC
80 points
16 comments6 min readEA link

Dist­in­guish­ing ways AI can be “con­cen­trated”

Matthew_Barnett21 Oct 2024 22:14 UTC
30 points
1 comment4 min readEA link

An­nounc­ing the ITAM AI Fu­tures Fel­low­ship

AmAristizabal28 Jul 2023 16:44 UTC
43 points
3 comments2 min readEA link

OpenAI’s new structure

AnonymousTurtle27 Dec 2024 14:53 UTC
30 points
2 comments1 min readEA link
(openai.com)

SB-1047 Doc­u­men­tary: The Post-Mortem

Michaël Trazzi1 Aug 2025 21:44 UTC
60 points
1 comment5 min readEA link

AISN #60: The AI Ac­tion Plan

Center for AI Safety31 Jul 2025 18:10 UTC
6 points
0 comments7 min readEA link
(newsletter.safe.ai)

The Parable of the Boy Who Cried 5% Chance of Wolf

Kat Woods 🔶 ⏸️15 Aug 2022 14:22 UTC
80 points
8 comments2 min readEA link

Ap­ply to Spring 2024 policy in­tern­ships (we can help)

ES4 Oct 2023 14:45 UTC
26 points
2 comments1 min readEA link

How bad a fu­ture do ML re­searchers ex­pect?

Katja_Grace13 Mar 2023 5:47 UTC
165 points
20 comments2 min readEA link

Say­ing ‘AI safety re­search is a Pas­cal’s Mug­ging’ isn’t a strong response

Robert_Wiblin15 Dec 2015 13:48 UTC
15 points
16 comments2 min readEA link

Free Lin­guis­tic Ser­vices for High-Im­pact Organisations

feijão8 Jun 2025 17:37 UTC
8 points
2 comments1 min readEA link

Defin­ing al­ign­ment research

richard_ngo19 Aug 2024 22:49 UTC
48 points
1 comment7 min readEA link

For­ma­tion of Macros­trat­egy Refine­ment Division

Henry Stanley 🔸1 Apr 2025 14:16 UTC
23 points
0 comments2 min readEA link

Si­mu­lat­ing Shut­down Code Ac­ti­va­tions in an AI Virus Lab

Miguel20 Jun 2023 5:27 UTC
4 points
0 comments6 min readEA link

AI gov­er­nance tal­ent pro­files I’d like to see ap­ply for OP funding

JulianHazell19 Dec 2023 12:34 UTC
119 points
4 comments3 min readEA link
(www.openphilanthropy.org)

Fund­ing and job op­por­tu­ni­ties, events, and thoughts on pro­fes­sion­als (Field­builders newslet­ter #8)

gergo23 Apr 2025 9:53 UTC
7 points
0 comments3 min readEA link

Ap­ply to be a men­tor in SPAR!

Agustín Covarrubias 🔸5 Nov 2024 21:32 UTC
14 points
0 comments1 min readEA link

Safety tax functions

Owen Cotton-Barratt20 Oct 2024 14:13 UTC
23 points
1 comment6 min readEA link
(strangecities.substack.com)

AI Safety Seems Hard to Measure

Holden Karnofsky11 Dec 2022 1:31 UTC
90 points
4 comments14 min readEA link
(www.cold-takes.com)

Truth and Ad­van­tage: Re­sponse to a draft of “AI safety seems hard to mea­sure”

So8res22 Mar 2023 3:36 UTC
11 points
0 comments5 min readEA link

What is it like do­ing AI safety work?

Kat Woods 🔶 ⏸️21 Feb 2023 19:24 UTC
99 points
2 comments10 min readEA link

The Grant De­ci­sion Boundary: Re­cent Cases from the Long-Term Fu­ture Fund

Linch29 Nov 2024 1:50 UTC
66 points
3 comments3 min readEA link

Up­com­ing Feed­back Op­por­tu­nity on Dual-Use Foun­da­tion Models

Chris Leong2 Nov 2023 4:30 UTC
9 points
0 comments1 min readEA link

An ‘AGI Emer­gency Eject Cri­te­ria’ con­sen­sus could be re­ally use­ful.

tcelferact7 Apr 2023 16:21 UTC
27 points
3 comments1 min readEA link

Distinc­tions when Dis­cussing Utility Functions

Ozzie Gooen8 Mar 2024 18:43 UTC
15 points
5 comments8 min readEA link

An­nounc­ing Timaeus

Stan van Wingerden22 Oct 2023 13:32 UTC
80 points
0 comments5 min readEA link
(www.lesswrong.com)

Paus­ing AI is Progress

Felix De Simone16 Jul 2024 22:28 UTC
22 points
3 comments6 min readEA link
(pauseai.substack.com)

[Question] Ask­ing for on­line calls on AI s-risks discussions

jackchang11014 May 2023 13:58 UTC
26 points
3 comments1 min readEA link

EA Nether­lands’ guide to AI safety careers

James Herbert16 Jan 2025 17:22 UTC
25 points
0 comments1 min readEA link
(effectiefaltruisme.nl)

Video and tran­script of pre­sen­ta­tion on Oth­er­ness and con­trol in the age of AGI

Joe_Carlsmith8 Oct 2024 22:30 UTC
18 points
1 comment27 min readEA link

CEA seeks co-founder for AI safety group sup­port spin-off

Agustín Covarrubias 🔸8 Apr 2024 15:42 UTC
62 points
0 comments4 min readEA link

A widely shared AI pro­duc­tivity pa­per was re­tracted, is pos­si­bly fraudulent

titotal19 May 2025 10:18 UTC
34 points
4 comments3 min readEA link

[Question] What should I ask Ezra Klein about AI policy pro­pos­als?

Robert_Wiblin23 Jun 2023 16:36 UTC
21 points
4 comments1 min readEA link

Cal­ling for Stu­dent Sub­mis­sions: AI Safety Distil­la­tion Contest

a_e_r23 Apr 2022 20:24 UTC
102 points
28 comments3 min readEA link

Some rea­sons to start a pro­ject to stop harm­ful AI

Remmelt22 Aug 2024 16:23 UTC
5 points
0 comments2 min readEA link

[Linkpost] Jan Leike on three kinds of al­ign­ment taxes

Akash6 Jan 2023 23:57 UTC
29 points
0 comments3 min readEA link
(aligned.substack.com)

The AI in­dus­try turns against its fa­vorite philosophy

Jonathan Yan22 Nov 2023 0:11 UTC
14 points
2 comments1 min readEA link
(www.semafor.com)

Tak­ing a leave of ab­sence from Open Philan­thropy to work on AI safety

Holden Karnofsky23 Feb 2023 19:05 UTC
420 points
31 comments2 min readEA link

AISN #52: An Expert Virology Benchmark

Center for AI Safety22 Apr 2025 16:52 UTC
6 points
0 comments4 min readEA link
(newsletter.safe.ai)

Thoughts on “The Offense-Defense Balance Rarely Changes”

Cullen 🔸12 Feb 2024 3:26 UTC
42 points
4 comments5 min readEA link

[Part-time AI Safety Re­search Pro­gram] MARS 3.0 Ap­pli­ca­tions Open for Par­ti­ci­pants & Re­cruit­ing Mentors

Cambridge AI Safety Hub7 May 2025 22:52 UTC
4 points
0 comments2 min readEA link

The­o­ries of Change for Track II Di­plo­macy [Founders Pledge]

christian.r9 Jul 2024 13:31 UTC
21 points
2 comments33 min readEA link

[Question] Will OpenAI’s o3 re­duce NVi­dia’s moat?

Ebenezer Dukakis3 Jan 2025 2:21 UTC
9 points
6 comments1 min readEA link

In­sights from an ex­pert sur­vey about in­ter­me­di­ate goals in AI governance

Sebastian Schwiecker17 Mar 2023 14:59 UTC
11 points
2 comments1 min readEA link

Fund­ing Case: AI Safety Camp 11

Remmelt23 Dec 2024 8:39 UTC
42 points
2 comments6 min readEA link
(manifund.org)

“Di­a­mon­doid bac­te­ria” nanobots: deadly threat or dead-end? A nan­otech in­ves­ti­ga­tion

titotal29 Sep 2023 14:01 UTC
102 points
33 comments20 min readEA link
(titotal.substack.com)

At­tend SPAR’s vir­tual demo day! (ca­reer fair + talks)

Agustín Covarrubias 🔸2 May 2025 23:45 UTC
17 points
1 comment2 min readEA link
(demoday.sparai.org)

Deep­Seek Made it Even Harder for US AI Com­pa­nies to Ever Reach Profitability

Garrison19 Feb 2025 21:02 UTC
30 points
1 comment3 min readEA link
(garrisonlovely.substack.com)

Fundrais­ing for Mox: cowork­ing & events in SF

Austin31 Mar 2025 18:25 UTC
37 points
3 comments6 min readEA link
(manifund.org)

State­ment on AI Ex­tinc­tion—Signed by AGI Labs, Top Aca­demics, and Many Other Notable Figures

Center for AI Safety30 May 2023 9:06 UTC
429 points
28 comments1 min readEA link
(www.safe.ai)

Still no strong ev­i­dence that LLMs in­crease bioter­ror­ism risk

freedomandutility2 Nov 2023 21:23 UTC
58 points
9 comments1 min readEA link

[LW xpost] Unit eco­nomics of LLM APIs

dschwarz27 Aug 2024 16:55 UTC
19 points
2 comments1 min readEA link
(www.lesswrong.com)

US Congress in­tro­duces CREATE AI Act for es­tab­lish­ing Na­tional AI Re­search Resource

Daniel_Eth28 Jul 2023 23:27 UTC
9 points
1 comment1 min readEA link
(eshoo.house.gov)

AISN #44: The Trump Circle on AI Safety; Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems

Center for AI Safety19 Nov 2024 16:36 UTC
11 points
0 comments5 min readEA link
(newsletter.safe.ai)

An­thropic’s sub­mis­sion to the White House’s RFI on AI policy

Agustín Covarrubias 🔸6 Mar 2025 22:47 UTC
48 points
7 comments1 min readEA link
(www.anthropic.com)

Briefly how I’ve up­dated since ChatGPT

rime25 Apr 2023 19:39 UTC
29 points
8 comments2 min readEA link
(www.lesswrong.com)

AI Safety & En­trepreneur­ship v1.0

Chris Leong26 Apr 2025 14:37 UTC
27 points
0 comments2 min readEA link

AI timelines: the proposed arguments and where the “experts” stand

EA Japan17 Aug 2023 14:59 UTC
2 points
0 comments1 min readEA link

Please, some­one make a dataset of sup­posed cases of “tech panic”

Marcel27 Nov 2023 2:49 UTC
4 points
2 comments2 min readEA link

Sur­vey on the ac­cel­er­a­tion risks of our new RFPs to study LLM capabilities

Ajeya10 Nov 2023 23:59 UTC
38 points
1 comment8 min readEA link

Bandgaps, Brains, and Bioweapons: The limi­ta­tions of com­pu­ta­tional sci­ence and what it means for AGI

titotal26 May 2023 15:57 UTC
59 points
0 comments18 min readEA link

Five Years of Re­think Pri­ori­ties: Im­pact, Fu­ture Plans, Fund­ing Needs (July 2023)

Rethink Priorities18 Jul 2023 15:59 UTC
110 points
3 comments16 min readEA link

Women in AI Safety Lon­don Meetup

Nia1 Aug 2024 9:48 UTC
2 points
0 comments1 min readEA link

My at­tempt at ex­plain­ing the case for AI risk in a straight­for­ward way

JulianHazell25 Mar 2023 16:32 UTC
25 points
7 comments18 min readEA link
(muddyclothes.substack.com)

A Wind­fall Clause for CEO could worsen AI race dynamics

Larks9 Mar 2023 18:02 UTC
69 points
12 comments7 min readEA link

[Link Post: New York Times] White House Un­veils Ini­ti­a­tives to Re­duce Risks of A.I.

Rockwell4 May 2023 14:04 UTC
50 points
1 comment2 min readEA link

Why I’m Post­ing AI-Safety-Re­lated Clips On TikTok

Michaël Trazzi12 Aug 2025 22:39 UTC
54 points
1 comment2 min readEA link

ARC Evals: Re­spon­si­ble Scal­ing Policies

Zach Stein-Perlman28 Sep 2023 4:30 UTC
16 points
1 comment2 min readEA link
(evals.alignment.org)

Pal­isade is hiring: Exec As­sis­tant, Con­tent Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith9 Oct 2024 0:04 UTC
15 points
2 comments4 min readEA link

Semi-conductor / AI stocks discussion.

sapphire25 Nov 2022 23:35 UTC
10 points
3 comments1 min readEA link

Re­cent progress on the sci­ence of evaluations

PabloAMC 🔸23 Jun 2025 9:49 UTC
11 points
0 comments8 min readEA link
(www.lesswrong.com)

Bernie San­ders (I-VT) men­tions AI loss of con­trol risk in Giz­modo interview

Matrice Jacobine14 Jul 2025 14:47 UTC
26 points
0 comments1 min readEA link
(gizmodo.com)

Cal­ifor­ni­ans, tell your reps to vote yes on SB 1047!

Holly Elmore ⏸️ 🔸12 Aug 2024 19:49 UTC
106 points
6 comments1 min readEA link

Re­sponse to Aschen­bren­ner’s “Si­tu­a­tional Aware­ness”

RobBensinger6 Jun 2024 22:57 UTC
111 points
15 comments3 min readEA link

The U.S. and China Need an AI In­ci­dents Hotline

christian.r3 Jun 2024 18:46 UTC
25 points
0 comments1 min readEA link
(www.lawfaremedia.org)

The EA case for Trump 2024

hamandcheese2 Aug 2024 19:32 UTC
−8 points
66 comments12 min readEA link

New open let­ter on AI — “In­clude Con­scious­ness Re­search”

Jamie_Harris28 Apr 2023 7:50 UTC
55 points
1 comment3 min readEA link
(amcs-community.org)

“Ar­tifi­cial Gen­eral In­tel­li­gence”: an ex­tremely brief FAQ

Steven Byrnes11 Mar 2024 17:49 UTC
12 points
0 comments2 min readEA link

Re­minder: AI Wor­ld­views Con­test Closes May 31

Jason Schukraft8 May 2023 17:40 UTC
20 points
0 comments1 min readEA link

An Ex­er­cise to Build In­tu­itions on AGI Risk

Lauro Langosco8 Jun 2023 11:20 UTC
4 points
0 comments8 min readEA link
(www.alignmentforum.org)

More peo­ple get­ting into AI safety should do a PhD

AdamGleave14 Mar 2024 22:14 UTC
50 points
4 comments12 min readEA link
(gleave.me)

New Deep­Mind re­port on in­sti­tu­tions for global AI governance

finm14 Jul 2023 16:05 UTC
10 points
0 comments1 min readEA link
(www.deepmind.com)

kpurens’s Quick takes

kpurens11 Apr 2023 14:10 UTC
9 points
2 comments2 min readEA link

List of Masters Pro­grams in Tech Policy, Public Policy and Se­cu­rity (Europe)

sberg29 May 2023 10:23 UTC
49 points
0 comments3 min readEA link

Tech­nolog­i­cal de­vel­op­ments that could in­crease risks from nu­clear weapons: A shal­low review

MichaelA🔸9 Feb 2023 15:41 UTC
79 points
3 comments5 min readEA link
(bit.ly)

AI Gover­nance & Strat­egy: Pri­ori­ties, tal­ent gaps, & opportunities

Akash3 Mar 2023 18:09 UTC
21 points
0 comments4 min readEA link

En­hanc­ing biose­cu­rity with lan­guage mod­els: defin­ing re­search directions

mic26 Mar 2024 12:30 UTC
11 points
1 comment13 min readEA link
(papers.ssrn.com)

LANAIS (Latin Amer­i­can Net­work for AI Safety) kick-off

Fernando Avalos23 Jun 2025 14:34 UTC
28 points
0 comments2 min readEA link

Clar­ify­ing “wis­dom”: Foun­da­tional top­ics for al­igned AIs to pri­ori­tize be­fore ir­re­versible decisions

Anthony DiGiovanni20 Jun 2025 21:55 UTC
24 points
1 comment12 min readEA link

OpenAI o1

Zach Stein-Perlman12 Sep 2024 18:54 UTC
38 points
0 comments1 min readEA link

The Precipice Revisited

Toby_Ord12 Jul 2024 14:06 UTC
283 points
41 comments17 min readEA link

Mo­ravec’s para­dox and its implications

Vasco Grilo🔸29 Apr 2025 16:25 UTC
13 points
5 comments8 min readEA link
(epoch.ai)

NYT ar­ti­cle about the Zizi­ans in­clud­ing quotes from Eliezer, Anna, Ozy, Jes­sica, Zvi

Matrice Jacobine8 Jul 2025 1:42 UTC
2 points
0 comments1 min readEA link
(www.nytimes.com)

Notes and up­dates on GPT-5

Yadav9 Aug 2025 11:58 UTC
33 points
3 comments2 min readEA link
(robertgaurav.xyz)

Re­cur­sive Mid­dle Man­ager Hell

Raemon17 Jan 2023 19:02 UTC
73 points
3 comments11 min readEA link

Bench­mark Perfor­mance is a Poor Mea­sure of Gen­er­al­is­able AI Rea­son­ing Capabilities

James Fodor21 Feb 2025 4:25 UTC
12 points
3 comments24 min readEA link

[Link post] Michael Niel­sen’s “Notes on Ex­is­ten­tial Risk from Ar­tifi­cial Su­per­in­tel­li­gence”

Joel Becker19 Sep 2023 13:31 UTC
38 points
1 comment6 min readEA link
(michaelnotebook.com)

Bryan John­son seems more EA al­igned than I expected

PeterSlattery22 Apr 2024 9:38 UTC
13 points
27 comments2 min readEA link
(www.youtube.com)

A Sim­ple Model of AGI De­ploy­ment Risk

djbinder9 Jul 2021 9:44 UTC
30 points
0 comments5 min readEA link

2023: news on AI safety, an­i­mal welfare, global health, and more

Lizka5 Jan 2024 21:57 UTC
54 points
1 comment12 min readEA link

[Event] Build­ing What the Fu­ture Needs: A cu­rated con­fer­ence in Ber­lin (Sep 6, 2025) for high-im­pact builders and researchers

Vasiliy Kondyrev8 Aug 2025 14:35 UTC
21 points
0 comments2 min readEA link

80,000 Hours is hiring for an En­gage­ment Specialist

Bella25 Apr 2025 10:33 UTC
8 points
5 comments6 min readEA link

Jobs that can help with the most im­por­tant century

Holden Karnofsky12 Feb 2023 18:19 UTC
57 points
2 comments32 min readEA link
(www.cold-takes.com)

Solv­ing al­ign­ment isn’t enough for a flour­ish­ing future

mic2 Feb 2024 18:22 UTC
27 points
0 comments22 min readEA link
(papers.ssrn.com)

Paper­clip Club (AI Safety Meetup)

Luke Thorburn20 Apr 2023 16:04 UTC
2 points
0 comments1 min readEA link

Sam Alt­man fired from OpenAI

Larks17 Nov 2023 21:07 UTC
133 points
89 comments1 min readEA link
(openai.com)

AI Views Snapshots

RobBensinger13 Dec 2023 0:45 UTC
25 points
0 comments1 min readEA link

How do we solve the al­ign­ment prob­lem?

Joe_Carlsmith13 Feb 2025 18:27 UTC
28 points
1 comment7 min readEA link
(joecarlsmith.substack.com)

<$750k grants for General Purpose AI Assurance/Safety Research

Phosphorous13 Jun 2023 4:51 UTC
37 points
0 comments1 min readEA link
(cset.georgetown.edu)

The state of AI in differ­ent coun­tries — an overview

Lizka14 Sep 2023 10:37 UTC
68 points
6 comments13 min readEA link
(aisafetyfundamentals.com)

Towards more co­op­er­a­tive AI safety strategies

richard_ngo16 Jul 2024 4:36 UTC
64 points
5 comments4 min readEA link

Quick takes on “AI is easy to con­trol”

So8res2 Dec 2023 22:33 UTC
−12 points
4 comments4 min readEA link

Model­ing the im­pact of AI safety field-build­ing programs

Center for AI Safety10 Jul 2023 17:22 UTC
86 points
0 comments7 min readEA link

[Draft] The hum­ble cos­mol­o­gist’s P(doom) paradox

titotal16 Mar 2024 11:13 UTC
39 points
6 comments10 min readEA link

[Question] What is MIRI cur­rently do­ing?

Roko14 Dec 2024 2:55 UTC
9 points
2 comments1 min readEA link

Fram­ing AI strategy

Zach Stein-Perlman7 Feb 2023 20:03 UTC
16 points
0 comments1 min readEA link
(www.lesswrong.com)

Re­grant up to $600,000 to AI safety pro­jects with GiveWiki

Dawn Drescher28 Oct 2023 19:56 UTC
22 points
0 comments3 min readEA link

Public Com­ment In­vited on Ar­tifi­cial In­tel­li­gence Ac­tion Plan

PeterSlattery3 Mar 2025 14:11 UTC
47 points
0 comments1 min readEA link
(www.whitehouse.gov)

AI Safety in a World of Vuln­er­a­ble Ma­chine Learn­ing Systems

AdamGleave8 Mar 2023 2:40 UTC
20 points
0 comments29 min readEA link
(far.ai)

Will re­leas­ing the weights of large lan­guage mod­els grant wide­spread ac­cess to pan­demic agents?

Jeff Kaufman 🔸30 Oct 2023 17:42 UTC
56 points
18 comments1 min readEA link
(arxiv.org)

AI safety tax dynamics

Owen Cotton-Barratt23 Oct 2024 12:21 UTC
22 points
9 comments6 min readEA link
(strangecities.substack.com)

Cog­ni­tive as­sets and defen­sive acceleration

JulianHazell3 Apr 2024 14:55 UTC
13 points
3 comments4 min readEA link
(muddyclothes.substack.com)

AISN #49: Su­per­in­tel­li­gence Strategy

Center for AI Safety6 Mar 2025 17:43 UTC
8 points
0 comments5 min readEA link
(newsletter.safe.ai)

Fu­tures with digi­tal minds: Ex­pert fore­casts in 2025

Lucius Caviola16 Aug 2025 20:00 UTC
53 points
1 comment1 min readEA link
(digitalminds.report)

Ex­perts’ AI timelines are longer than you have been told?

Vasco Grilo🔸9 Jan 2025 17:30 UTC
38 points
11 comments3 min readEA link
(bayes.net)

De­con­struct­ing Bostrom’s Clas­sic Ar­gu­ment for AI Doom

Nora Belrose11 Mar 2024 6:03 UTC
26 points
0 comments1 min readEA link
(www.youtube.com)

Lo­cal De­tours On A Nar­row Path: How might treaties fail in China?

Jack_S11 Aug 2025 20:33 UTC
8 points
0 comments14 min readEA link
(torchestogether.substack.com)

Is the time crunch for AI Safety Move­ment Build­ing now?

Chris Leong8 Jun 2022 12:19 UTC
14 points
10 comments3 min readEA link

Care­less talk on US-China AI com­pe­ti­tion? (and crit­i­cism of CAIS cov­er­age)

Oliver Sourbut20 Sep 2023 12:46 UTC
52 points
19 comments9 min readEA link
(www.oliversourbut.net)

AI safety logo de­sign con­test, due end of May (ex­tended)

Adrian Cipriani28 Apr 2023 2:53 UTC
13 points
23 comments2 min readEA link

On “slack” in train­ing (Sec­tion 1.5 of “Schem­ing AIs”)

Joe_Carlsmith25 Nov 2023 17:51 UTC
14 points
1 comment5 min readEA link

Orthog­o­nal­ity is Expensive

𝕮𝖎𝖓𝖊𝖗𝖆3 Apr 2023 1:57 UTC
18 points
4 comments1 min readEA link
(www.beren.io)

In­ves­ti­gat­ing an in­surance-for-AI startup

L Rudolf L21 Sep 2024 15:29 UTC
40 points
1 comment15 min readEA link
(www.strataoftheworld.com)

On the fu­ture of lan­guage models

Owen Cotton-Barratt20 Dec 2023 16:58 UTC
125 points
3 comments36 min readEA link

AIxBio Newslet­ter #3 - At the Nexus

Andy Morgan 🔸7 Dec 2024 21:00 UTC
7 points
0 comments2 min readEA link
(atthenexus.substack.com)

LLMs as a Plan­ning Overhang

Larks14 Jul 2024 4:57 UTC
49 points
3 comments2 min readEA link

An­nounc­ing the 2025 Q1 Pivotal Re­search Fellowship

Tobias Häberli2 Nov 2024 11:33 UTC
26 points
1 comment2 min readEA link

Which ML skills are use­ful for find­ing a new AIS re­search agenda?

Yonatan Cale9 Feb 2023 13:09 UTC
7 points
3 comments1 min readEA link

On “first crit­i­cal tries” in AI alignment

Joe_Carlsmith5 Jun 2024 0:19 UTC
29 points
3 comments14 min readEA link

[Question] Why is Apart Re­search sud­denly in dire need of fund­ing?

Eevee🔹28 May 2025 7:43 UTC
97 points
11 comments1 min readEA link

Us­ing Con­sen­sus Mechanisms as an ap­proach to Alignment

Prometheus11 Jun 2023 13:24 UTC
14 points
0 comments6 min readEA link

How MATS ad­dresses “mass move­ment build­ing” concerns

Ryan Kidd4 May 2023 0:55 UTC
79 points
4 comments3 min readEA link

Bri­tish pub­lic per­cep­tion of ex­is­ten­tial risks

Jamie E25 Oct 2024 14:37 UTC
58 points
8 comments10 min readEA link

Google in­vests $300mn in ar­tifi­cial in­tel­li­gence start-up An­thropic | FT

𝕮𝖎𝖓𝖊𝖗𝖆3 Feb 2023 19:43 UTC
155 points
5 comments1 min readEA link
(www.ft.com)

MIT Fu­tureTech are hiring a Post­doc­toral As­so­ci­ate to work on AI Perfor­mance and Safety

PeterSlattery8 Jul 2025 14:05 UTC
7 points
0 comments4 min readEA link

[Question] Is Deep­Seek-R1 already bet­ter than o3 when in­fer­ence costs are held con­stant?

Magnus Vinding24 Jan 2025 15:29 UTC
33 points
2 comments1 min readEA link

AIs ac­cel­er­at­ing AI research

Ajeya12 Apr 2023 11:41 UTC
84 points
7 comments4 min readEA link

[Question] Will AI Wor­ld­view Prize Fund­ing Be Re­placed?

Jordan Arel13 Nov 2022 17:10 UTC
26 points
4 comments1 min readEA link

[Linkpost] Si­tu­a­tional Aware­ness—The Decade Ahead

MathiasKB🔸4 Jun 2024 22:58 UTC
87 points
7 comments2 min readEA link
(situational-awareness.ai)

[Video] - How does the EU AI Act Work?

Yadav11 Sep 2024 14:16 UTC
10 points
0 comments5 min readEA link

Ap­ply to lead a pro­ject dur­ing the next vir­tual AI Safety Camp

Linda Linsefors13 Sep 2023 13:29 UTC
16 points
0 comments5 min readEA link
(aisafety.camp)

In­cu­bat­ing AI x-risk pro­jects: some per­sonal reflections

Ben Snodin19 Dec 2023 17:03 UTC
84 points
10 comments9 min readEA link

AISN #51: AI Frontiers

Center for AI Safety15 Apr 2025 15:46 UTC
8 points
1 comment5 min readEA link
(newsletter.safe.ai)

AI Fables Writ­ing Con­test Win­ners!

Daystar Eld6 Nov 2023 2:27 UTC
39 points
0 comments2 min readEA link

AI Safety Newsletter #41: The Next Generation of Compute Scale; Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics

Center for AI Safety11 Sep 2024 19:11 UTC
12 points
0 comments5 min readEA link
(newsletter.safe.ai)

Biorisk is an Un­helpful Anal­ogy for AI Risk

Davidmanheim6 May 2024 6:18 UTC
22 points
4 comments3 min readEA link

Im­pli­ca­tions of the in­fer­ence scal­ing paradigm for AI safety

Ryan Kidd15 Jan 2025 0:59 UTC
47 points
5 comments5 min readEA link

Me­tac­u­lus’ pre­dic­tions are much bet­ter than low-in­for­ma­tion priors

Vasco Grilo🔸11 Apr 2023 8:36 UTC
53 points
0 comments6 min readEA link

13 Very Differ­ent Stances on AGI

Ozzie Gooen27 Dec 2021 23:30 UTC
84 points
23 comments3 min readEA link

[Question] Can we eval­u­ate the “tool ver­sus agent” AGI pre­dic­tion?

Ben_West🔸8 Apr 2023 18:35 UTC
63 points
7 comments1 min readEA link

The goal-guard­ing hy­poth­e­sis (Sec­tion 2.3.1.1 of “Schem­ing AIs”)

Joe_Carlsmith2 Dec 2023 15:20 UTC
6 points
1 comment12 min readEA link

Thoughts about AI safety field-build­ing in LMIC

Renan Araujo23 Jun 2023 23:22 UTC
57 points
4 comments12 min readEA link

Is it time for a pause?

Kelsey Piper6 Apr 2023 11:48 UTC
103 points
6 comments5 min readEA link

EA In­fosec: skill up in or make a tran­si­tion to in­fosec via this book club

Jason Clinton5 Mar 2023 21:02 UTC
170 points
16 comments2 min readEA link

Have your say on the Aus­tralian Govern­ment’s AI Policy [On­line #1]

Nathan Sherburn11 Jul 2023 0:35 UTC
3 points
0 comments1 min readEA link

China Hawks are Man­u­fac­tur­ing an AI Arms Race

Garrison20 Nov 2024 18:17 UTC
103 points
3 comments5 min readEA link
(garrisonlovely.substack.com)

An­nounc­ing the Open Philan­thropy AI Wor­ld­views Contest

Jason Schukraft10 Mar 2023 2:33 UTC
137 points
33 comments3 min readEA link
(www.openphilanthropy.org)

How do AI welfare and AI safety in­ter­act?

Lucius Caviola1 Jul 2024 10:39 UTC
77 points
21 comments7 min readEA link
(outpaced.substack.com)

Me­tac­u­lus Pre­dicts Weak AGI in 2 Years and AGI in 10

Chris Leong24 Mar 2023 19:43 UTC
27 points
12 comments1 min readEA link

Be­ware pop­u­lar dis­cus­sions of AI “sen­tience”

David Mathers🔸8 Jun 2023 8:57 UTC
42 points
6 comments9 min readEA link

An­nounc­ing the Ex­is­ten­tial In­foSec Forum

calebp7 Jul 2023 21:08 UTC
90 points
1 comment2 min readEA link

Law & AI Din­ner—EAG Bos­ton 2023

Alfredo Parra 🔸12 Oct 2023 8:32 UTC
8 points
0 comments1 min readEA link

Three Quotes on Trans­for­ma­tive Technology

Chris Leong1 Aug 2025 22:57 UTC
25 points
0 comments1 min readEA link

[Linkpost] The A.I. Dilemma—March 9, 2023, with Tris­tan Har­ris and Aza Raskin

PeterSlattery14 Apr 2023 8:00 UTC
38 points
3 comments41 min readEA link
(youtu.be)

Launch & Grow Your Univer­sity Group: Ap­ply now to OSP & FSP!

Agustín Covarrubias 🔸25 May 2024 1:03 UTC
61 points
0 comments2 min readEA link

11 heuris­tics for choos­ing (al­ign­ment) re­search projects

Akash27 Jan 2023 0:36 UTC
30 points
1 comment1 min readEA link

An Easily Over­looked Post on the Au­toma­tion of Wis­dom and Philosophy

Chris Leong12 Jun 2025 2:57 UTC
12 points
1 comment1 min readEA link
(blog.aiimpacts.org)

[Question] If your AGI x-risk es­ti­mates are low, what sce­nar­ios make up the bulk of your ex­pec­ta­tions for an OK out­come?

Greg_Colbourn ⏸️ 21 Apr 2023 11:15 UTC
65 points
55 comments1 min readEA link

[Question] Best giving multiplier for X-risk/AI safety?

SiebeRozendal27 Dec 2023 10:51 UTC
7 points
0 comments1 min readEA link

AI welfare vs. AI rights

Matthew_Barnett4 Feb 2025 18:28 UTC
37 points
20 comments3 min readEA link

AI safety field-build­ing sur­vey: Ta­lent needs, in­fras­truc­ture needs, and re­la­tion­ship to EA

michel27 Oct 2023 21:08 UTC
67 points
3 comments9 min readEA link

Publication of the International Scientific Report on the Safety of Advanced AI (Interim Report)

James Herbert21 May 2024 21:58 UTC
11 points
2 comments2 min readEA link
(www.gov.uk)

[Linkpost] “Gover­nance of su­per­in­tel­li­gence” by OpenAI

Daniel_Eth22 May 2023 20:15 UTC
51 points
6 comments2 min readEA link
(openai.com)

A Guide to Fore­cast­ing AI Science Capabilities

Eleni_A29 Apr 2023 6:51 UTC
19 points
1 comment4 min readEA link

‘AI Emer­gency Eject Cri­te­ria’ Survey

tcelferact19 Apr 2023 21:55 UTC
5 points
4 comments1 min readEA link

ALTER Is­rael Mid-2025 Semi­an­nual Update

Davidmanheim15 Jul 2025 7:47 UTC
12 points
1 comment5 min readEA link

New Me­tac­u­lus Space for AI and X-Risk Re­lated Questions

David Mathers🔸6 Sep 2024 11:37 UTC
16 points
0 comments1 min readEA link

An­nounce­ment: You can now listen to the “AI Safety Fun­da­men­tals” courses

peterhartree9 Jun 2023 16:32 UTC
101 points
8 comments1 min readEA link

12 ca­reer-re­lated ques­tions that may (or may not) be helpful for peo­ple in­ter­ested in al­ign­ment research

Akash12 Dec 2022 22:36 UTC
14 points
0 comments2 min readEA link

Re­think Pri­ori­ties is hiring a Com­pute Gover­nance Re­searcher or Re­search Assistant

MichaelA🔸7 Jun 2023 13:22 UTC
36 points
2 comments8 min readEA link
(careers.rethinkpriorities.org)

Munk AI de­bate: con­fu­sions and pos­si­ble cruxes

Steven Byrnes27 Jun 2023 15:01 UTC
142 points
10 comments8 min readEA link

Up­dat­ing Drexler’s CAIS model

Matthew_Barnett17 Jun 2023 1:57 UTC
59 points
0 comments4 min readEA link

Mo­ral Align­ment: An Idea I’m Em­bar­rassed I Didn’t Think of Myself

Gordon Seidoh Worley18 Jun 2025 15:42 UTC
27 points
5 comments2 min readEA link

Biomimetic al­ign­ment: Align­ment be­tween an­i­mal genes and an­i­mal brains as a model for al­ign­ment be­tween hu­mans and AI sys­tems.

Geoffrey Miller26 May 2023 21:25 UTC
32 points
1 comment16 min readEA link

CEEALAR: 2024 Update

CEEALAR19 Jul 2024 11:14 UTC
116 points
7 comments4 min readEA link

[Question] Strongest real-world ex­am­ples sup­port­ing AI risk claims?

rosehadshar5 Sep 2023 15:11 UTC
52 points
9 comments1 min readEA link

The Case for Jour­nal­ism on AI

michel19 Feb 2025 19:45 UTC
95 points
5 comments4 min readEA link

Nav­i­gat­ing AI Risks (NAIR) #1: Slow­ing Down AI

simeon_c14 Apr 2023 14:35 UTC
12 points
1 comment1 min readEA link
(navigatingairisks.substack.com)

Thread: Reflec­tions on the AGI Safety Fun­da­men­tals course?

Clifford18 May 2023 13:11 UTC
27 points
7 comments1 min readEA link

Ac­tion: Help ex­pand fund­ing for AI Safety by co­or­di­nat­ing on NSF response

Evan R. Murphy20 Jan 2022 20:48 UTC
20 points
7 comments3 min readEA link

Po­si­tions at MITFutureTech

PeterSlattery19 Dec 2023 20:28 UTC
21 points
1 comment4 min readEA link

[Question] What is AI Safety’s line of re­treat?

Remmelt28 Jul 2024 5:43 UTC
4 points
2 comments1 min readEA link

Pro­ject ideas: Gover­nance dur­ing ex­plo­sive tech­nolog­i­cal growth

Lukas Finnveden4 Jan 2024 7:25 UTC
37 points
1 comment16 min readEA link
(www.forethought.org)

Brand­ing AI Safety Groups: A Field Guide

Agustín Covarrubias 🔸13 May 2024 17:17 UTC
44 points
6 comments7 min readEA link

Risks I am Con­cerned About

HappyBunny29 Apr 2024 23:41 UTC
1 point
1 comment1 min readEA link

[ur­gent] Amer­i­cans: call your Se­na­tors and tell them you op­pose AI preemption

Holly Elmore ⏸️ 🔸15 May 2025 1:57 UTC
176 points
22 comments2 min readEA link

The cru­cible — how I think about the situ­a­tion with AI

Owen Cotton-Barratt5 May 2025 13:19 UTC
37 points
0 comments8 min readEA link
(strangecities.substack.com)

New vol­un­tary com­mit­ments (AI Seoul Sum­mit)

Zach Stein-Perlman21 May 2024 11:00 UTC
12 points
1 comment7 min readEA link
(www.gov.uk)

Ideas for im­prov­ing epistemics in AI safety outreach

mic21 Aug 2023 19:56 UTC
31 points
0 comments3 min readEA link
(www.lesswrong.com)

Ten­ta­tively against mak­ing AIs ‘wise’

OscarD🔸14 Jul 2024 18:32 UTC
9 points
4 comments3 min readEA link

Im­pact Assess­ment of AI Safety Camp (Arb Re­search)

Sam Holton23 Jan 2024 16:32 UTC
87 points
23 comments11 min readEA link

The case for more Align­ment Tar­get Anal­y­sis (ATA)

Chi20 Sep 2024 1:14 UTC
25 points
0 comments17 min readEA link

Ar­tifi­cial In­tel­li­gence, Con­scious Machines, and An­i­mals: Broad­en­ing AI Ethics

Group Organizer21 Sep 2023 20:58 UTC
4 points
0 comments1 min readEA link

AI Safety is Some­times a Model Property

Cullen 🔸2 May 2024 15:38 UTC
18 points
1 comment1 min readEA link
(open.substack.com)

An­nounc­ing Open Philan­thropy’s AI gov­er­nance and policy RFP

JulianHazell17 Jul 2024 0:25 UTC
73 points
2 comments1 min readEA link
(www.openphilanthropy.org)

Ap­ply for men­tor­ship in AI Safety field-building

Akash17 Sep 2022 19:03 UTC
21 points
0 comments1 min readEA link

Staged release

Zach Stein-Perlman20 Apr 2024 1:00 UTC
16 points
0 comments2 min readEA link

Thoughts on SB-1047

Ryan Greenblatt30 May 2024 0:19 UTC
53 points
4 comments11 min readEA link

AI can solve all EA prob­lems, so why keep fo­cus­ing on them?

Cody Albert3 May 2025 21:51 UTC
8 points
15 comments1 min readEA link

Race to the Top: Bench­marks for AI Safety

isaduan4 Dec 2022 22:50 UTC
52 points
8 comments1 min readEA link

New ‘South Park’ epi­sode on AI & Chat GPT

Geoffrey Miller21 Mar 2023 20:06 UTC
13 points
1 comment1 min readEA link

LLMs might not be the fu­ture of search: at least, not yet.

James-Hartree-Law22 Jan 2025 21:40 UTC
4 points
1 comment4 min readEA link

Aspira­tion-based, non-max­i­miz­ing AI agent designs

Bob Jacobs7 May 2024 16:13 UTC
12 points
1 comment38 min readEA link

Be­ware safety-washing

Lizka13 Jan 2023 10:39 UTC
143 points
7 comments4 min readEA link

Notes on nukes, IR, and AI from “Arse­nals of Folly” (and other books)

tlevin4 Sep 2023 19:02 UTC
21 points
2 comments6 min readEA link

[TIME mag­a­z­ine] Deep­Mind’s CEO Helped Take AI Main­stream. Now He’s Urg­ing Cau­tion (Per­rigo, 2023)

Will Aldred20 Jan 2023 20:37 UTC
93 points
0 comments1 min readEA link
(time.com)

AI Safety Newsletter #37: US Launches Antitrust Investigations; Plus, recent criticisms of OpenAI and Anthropic, and a summary of Situational Awareness

Center for AI Safety18 Jun 2024 18:08 UTC
15 points
0 comments5 min readEA link
(newsletter.safe.ai)

I’m hiring a Re­search As­sis­tant for a non­fic­tion book on AI!

Garrison26 Mar 2025 19:46 UTC
63 points
2 comments1 min readEA link
(garrisonlovely.substack.com)

In­ter­pretabil­ity Will Not Reli­ably Find De­cep­tive AI

Neel Nanda4 May 2025 16:32 UTC
74 points
0 comments7 min readEA link

What the Head­lines Miss About the Lat­est De­ci­sion in the Musk vs. OpenAI Lawsuit

Garrison6 Mar 2025 19:49 UTC
87 points
9 comments6 min readEA link
(garrisonlovely.substack.com)

Case stud­ies on so­cial-welfare-based stan­dards in var­i­ous industries

Holden Karnofsky20 Jun 2024 13:33 UTC
73 points
2 comments1 min readEA link

Cam­bridge Bos­ton Align­ment Ini­ti­a­tive Sum­mer Re­search Fel­low­ship in AI Safety (Dead­line: May 18)

PeterSlattery12 May 2025 16:15 UTC
14 points
2 comments1 min readEA link

20 Cri­tiques of AI Safety That I Found on Twitter

Daniel Kirmani23 Jun 2022 15:11 UTC
14 points
13 comments1 min readEA link

An­nounc­ing Epoch’s dash­board of key trends and figures in Ma­chine Learning

Jaime Sevilla13 Apr 2023 7:33 UTC
127 points
4 comments1 min readEA link
(epochai.org)

Vic­to­ria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi12 Jan 2023 17:09 UTC
16 points
0 comments4 min readEA link
(www.theinsideview.ai)

Call for Papers on Global AI Gover­nance from the UN

Chris Leong20 Aug 2023 8:56 UTC
36 points
1 comment1 min readEA link
(www.linkedin.com)

Cost-effec­tive­ness of stu­dent pro­grams for AI safety research

Center for AI Safety10 Jul 2023 17:23 UTC
53 points
7 comments15 min readEA link

ML4G Ger­many—AI Align­ment Camp

Evander H. 🔸19 Jun 2023 7:24 UTC
17 points
1 comment1 min readEA link

[Question] What is the coun­ter­fac­tual value of differ­ent AI Safety pro­fes­sion­als?

PabloAMC 🔸3 Jul 2024 14:38 UTC
6 points
2 comments1 min readEA link

Have your say on the Aus­tralian Govern­ment’s AI Policy [Bris­bane]

Michael Noetel 🔸9 Jun 2023 0:15 UTC
6 points
0 comments1 min readEA link

List #2: Why co­or­di­nat­ing to al­ign as hu­mans to not de­velop AGI is a lot eas­ier than, well… co­or­di­nat­ing as hu­mans with AGI co­or­di­nat­ing to be al­igned with humans

Remmelt24 Dec 2022 9:53 UTC
3 points
0 comments3 min readEA link

An­i­mal Ad­vo­cacy in the Age of AI

Constance Li27 Jul 2023 7:08 UTC
65 points
4 comments6 min readEA link

An Anal­ogy for Un­der­stand­ing Transformers

Callum McDougall13 May 2023 12:20 UTC
7 points
0 comments9 min readEA link

Linkpost: 7 A.I. Com­pa­nies Agree to Safe­guards After Pres­sure From the White House

MHR🔸21 Jul 2023 13:23 UTC
61 points
4 comments1 min readEA link
(www.nytimes.com)

Have your say on the Aus­tralian Govern­ment’s AI Policy

Nathan Sherburn17 Jul 2023 11:02 UTC
3 points
1 comment1 min readEA link

Gavin New­som ve­toes SB 1047

Larks30 Sep 2024 0:06 UTC
39 points
14 comments1 min readEA link
(www.wsj.com)

IAPS: Map­ping Tech­ni­cal Safety Re­search at AI Companies

Zach Stein-Perlman24 Oct 2024 20:30 UTC
24 points
0 comments1 min readEA link
(www.iaps.ai)

Don’t Call It AI Alignment

Gil20 Feb 2023 5:27 UTC
16 points
7 comments2 min readEA link

Why I think AI take-off is rel­a­tively slow

Vasco Grilo🔸17 Aug 2025 9:11 UTC
26 points
0 comments3 min readEA link
(marginalrevolution.com)

AI 2027: What Su­per­in­tel­li­gence Looks Like (Linkpost)

Manuel Allgaier11 Apr 2025 10:31 UTC
51 points
3 comments42 min readEA link
(ai-2027.com)

ML4Good is seek­ing part­ner or­gani­sa­tions, in­di­vi­d­ual or­ganisers and TAs

Nia13 May 2024 13:43 UTC
22 points
0 comments3 min readEA link

Are there enough op­por­tu­ni­ties for AI safety spe­cial­ists?

mhint19913 May 2023 21:18 UTC
8 points
2 comments3 min readEA link

AI and the feel­ing of liv­ing in two worlds

michel10 Oct 2024 17:51 UTC
40 points
3 comments7 min readEA link

ALTER Is­rael End-of-2024 Update

Davidmanheim7 Jan 2025 15:07 UTC
38 points
1 comment4 min readEA link

Video and tran­script of talk on au­tomat­ing al­ign­ment research

Joe_Carlsmith30 Apr 2025 17:43 UTC
11 points
1 comment24 min readEA link
(joecarlsmith.com)

[Linkpost] Prospect Magaz­ine—How to save hu­man­ity from extinction

jackva26 Sep 2023 19:16 UTC
32 points
2 comments1 min readEA link
(www.prospectmagazine.co.uk)

AGI in sight: our look at the game board

Andrea_Miotti18 Feb 2023 22:17 UTC
25 points
18 comments6 min readEA link
(andreamiotti.substack.com)

Guardrails vs Goal-di­rect­ed­ness in AI Alignment

freedomandutility30 Dec 2023 12:58 UTC
13 points
2 comments1 min readEA link

How ARENA course ma­te­rial gets made

Callum McDougall2 Jul 2024 7:27 UTC
12 points
0 comments7 min readEA link

What is it to solve the al­ign­ment prob­lem? (Notes)

Joe_Carlsmith24 Aug 2024 21:19 UTC
32 points
1 comment53 min readEA link

Want to win the AGI race? Solve al­ign­ment.

leopold29 Mar 2023 15:19 UTC
56 points
6 comments5 min readEA link
(www.forourposterity.com)

A Friendly Face (Another Failure Story)

Karl von Wendt20 Jun 2023 10:31 UTC
22 points
8 comments16 min readEA link

Many AI gov­er­nance pro­pos­als have a trade­off be­tween use­ful­ness and feasibility

Akash3 Feb 2023 18:49 UTC
22 points
0 comments2 min readEA link

Archety­pal Trans­fer Learn­ing: a Pro­posed Align­ment Solu­tion that solves the In­ner x Outer Align­ment Prob­lem while adding Cor­rigible Traits to GPT-2-medium

Miguel26 Apr 2023 0:40 UTC
13 points
0 comments10 min readEA link

AI, cen­tral­iza­tion, and the One Ring

Owen Cotton-Barratt13 Sep 2024 13:56 UTC
20 points
0 comments8 min readEA link
(strangecities.substack.com)

Emerg­ing Tech­nolo­gies: More to explore

EA Handbook1 Jan 2021 11:06 UTC
4 points
0 comments2 min readEA link

[MLSN #9] Ver­ify­ing large train­ing runs, se­cu­rity risks from LLM ac­cess to APIs, why nat­u­ral se­lec­tion may fa­vor AIs over humans

TW12311 Apr 2023 16:05 UTC
18 points
0 comments6 min readEA link
(newsletter.mlsafety.org)

AI Tools for Ex­is­ten­tial Security

Lizka14 Mar 2025 18:37 UTC
64 points
10 comments11 min readEA link
(www.forethought.org)

We’re all in this together

Tamsin Leake5 Dec 2023 13:57 UTC
15 points
1 comment2 min readEA link

OpenAI’s new Pre­pared­ness team is hiring

leopold26 Oct 2023 20:41 UTC
85 points
13 comments1 min readEA link

AGI and the EMH: mar­kets are not ex­pect­ing al­igned or un­al­igned AI in the next 30 years

basil.halperin10 Jan 2023 16:05 UTC
342 points
177 comments26 min readEA link

An­nounc­ing the Pivotal Re­search Fel­low­ship – Ap­ply Now!

Tobias Häberli3 Apr 2024 17:30 UTC
51 points
5 comments2 min readEA link

[Question] AI+bio can­not be half of AI catas­tro­phe risk, right?

Benevolent_Rain10 Oct 2023 3:17 UTC
23 points
11 comments2 min readEA link

Sam Altman / Open AI Discussion Thread

John Salter20 Nov 2023 9:21 UTC
40 points
36 comments1 min readEA link

Ap­pli­ca­tions are now open for In­tro to ML Safety Spring 2023

Joshc4 Nov 2022 22:45 UTC
49 points
1 comment2 min readEA link

AI Rights for Hu­man Safety

Matthew_Barnett3 Aug 2024 0:47 UTC
56 points
1 comment1 min readEA link
(papers.ssrn.com)

Sum­mary: Against the sin­gu­lar­ity hypothesis

Global Priorities Institute22 May 2024 11:05 UTC
46 points
15 comments4 min readEA link
(globalprioritiesinstitute.org)

[Question] Align­ment & Ca­pa­bil­ities: What’s the differ­ence?

John G. Halstead31 Aug 2023 22:13 UTC
50 points
10 comments1 min readEA link

The Hu­man Biolog­i­cal Ad­van­tage Over AI

William Stewart18 Nov 2024 11:18 UTC
−1 points
0 comments1 min readEA link

Where are the red lines for AI?

Karl von Wendt5 Aug 2022 9:41 UTC
13 points
3 comments6 min readEA link

SB 1047 was ve­toed, but pub­lic com­men­tary now can as­sist fu­ture AI safety legislation

ThomasW2 Oct 2024 18:10 UTC
38 points
0 comments1 min readEA link

An­nounc­ing AISafety.info’s Write-a-thon (June 16-18) and Se­cond Distil­la­tion Fel­low­ship (July 3-Oc­to­ber 2)

StevenKaas3 Jun 2023 2:03 UTC
12 points
1 comment2 min readEA link

AISN #13: An in­ter­dis­ci­plinary per­spec­tive on AI proxy failures, new com­peti­tors to ChatGPT, and prompt­ing lan­guage mod­els to misbehave

Center for AI Safety5 Jul 2023 15:33 UTC
25 points
0 comments9 min readEA link
(newsletter.safe.ai)

Perfor­mance com­par­i­son of Large Lan­guage Models (LLMs) in code gen­er­a­tion and ap­pli­ca­tion of best prac­tices in fron­tend web development

Diana V. Guaiña A.1 May 2025 14:57 UTC
5 points
0 comments24 min readEA link

AGI Safety Com­mu­ni­ca­tions Initiative

Ines11 Jun 2022 16:30 UTC
35 points
6 comments1 min readEA link

First call for EA Data Science/ML/AI

astrastefania23 Aug 2022 19:37 UTC
29 points
0 comments1 min readEA link

Why don’t gov­ern­ments seem to mind that com­pa­nies are ex­plic­itly try­ing to make AGIs?

Ozzie Gooen23 Dec 2021 7:08 UTC
82 points
49 comments2 min readEA link

Blake Richards on Why he is Skep­ti­cal of Ex­is­ten­tial Risk from AI

Michaël Trazzi14 Jun 2022 19:11 UTC
63 points
14 comments4 min readEA link
(theinsideview.ai)

Graph­i­cal Rep­re­sen­ta­tions of Paul Chris­ti­ano’s Doom Model

Nathan Young7 May 2023 13:03 UTC
48 points
2 comments1 min readEA link

Align­ment ideas in­spired by hu­man virtue development

Borys Pikalov18 May 2025 9:36 UTC
6 points
0 comments4 min readEA link

The Hid­den Com­plex­ity of Wishes—The Animation

Writer27 Sep 2023 17:59 UTC
7 points
0 comments1 min readEA link
(youtu.be)

How Open Source Ma­chine Learn­ing Soft­ware Shapes AI

Max L28 Sep 2022 17:49 UTC
11 points
3 comments15 min readEA link
(maxlangenkamp.me)

[Question] Do you think the probability of future AI sentience (suffering) is >0.1%? Why?

jackchang11010 Jul 2023 16:41 UTC
4 points
0 comments1 min readEA link

De­sir­able? AI qualities

brb24321 Mar 2022 22:05 UTC
7 points
0 comments2 min readEA link

Min­i­miz­ing suffer­ing & ASI xrisk through brain digitization

Amy Louise Johnson20 Feb 2025 21:08 UTC
1 point
0 comments1 min readEA link

Tony Blair In­sti­tute AI Safety Work

TomWestgarth13 Jun 2023 13:16 UTC
88 points
2 comments6 min readEA link
(www.institute.global)

A bet­ter “State­ment on AI Risk?” [Cross­post]

Knight Lee30 Dec 2024 7:36 UTC
4 points
0 comments3 min readEA link

Stan­ford sum­mer course: Eco­nomics of Trans­for­ma­tive AI

trammell23 Jan 2025 23:07 UTC
83 points
4 comments1 min readEA link

[Linkpost] Michael Niel­sen re­marks on ‘Op­pen­heimer’

Tom Barnes🔸31 Aug 2023 15:41 UTC
83 points
1 comment2 min readEA link
(michaelnotebook.com)

Asya Ber­gal: Rea­sons you might think hu­man-level AI is un­likely to hap­pen soon

EA Global26 Aug 2020 16:01 UTC
24 points
2 comments17 min readEA link
(www.youtube.com)

Road to AnimalHarmBench

Artūrs Kaņepājs1 Jul 2025 13:37 UTC
134 points
11 comments7 min readEA link

Ques­tions for fur­ther in­ves­ti­ga­tion of AI diffusion

Ben Cottier21 Dec 2022 13:50 UTC
28 points
0 comments11 min readEA link

[AN #80]: Why AI risk might be solved with­out ad­di­tional in­ter­ven­tion from longtermists

Rohin Shah3 Jan 2020 7:52 UTC
58 points
12 comments10 min readEA link
(www.alignmentforum.org)

A New Way to Re­think Alignment

Taylor Grogan28 Jul 2025 20:56 UTC
1 point
0 comments2 min readEA link

NeurIPS ML Safety Work­shop 2022

Dan H26 Jul 2022 15:33 UTC
72 points
0 comments1 min readEA link
(neurips2022.mlsafety.org)

“Tak­ing AI Risk Se­ri­ously” – Thoughts by An­drew Critch

Raemon19 Nov 2018 2:21 UTC
26 points
9 comments1 min readEA link
(www.lesswrong.com)

A Differ­ent Ap­proach to Com­mu­nity Build­ing: The Spiral Path to Im­pact

ezrah23 May 2023 18:41 UTC
46 points
4 comments8 min readEA link

The importance of AI as a possible threat to humanity

EA Italy17 Jan 2023 22:24 UTC
1 point
0 comments1 min readEA link
(www.vox.com)

A se­lec­tion of some writ­ings and con­sid­er­a­tions on the cause of ar­tifi­cial sentience

Raphaël_Pesah10 Aug 2023 18:23 UTC
49 points
1 comment10 min readEA link

Hu­mans and Machines: Heaven or Hell?

Alex (Αλέξανδρος)12 Jul 2025 8:04 UTC
4 points
1 comment9 min readEA link

Bet­ter than log­a­r­ith­mic re­turns to rea­son­ing?

Oliver Sourbut30 Jul 2025 0:50 UTC
6 points
1 comment2 min readEA link

Lov­ing a world you don’t trust

Joe_Carlsmith18 Jun 2024 19:31 UTC
65 points
7 comments33 min readEA link

How to build AI you can ac­tu­ally Trust—Like a Med­i­cal Team, Not a Black Box

Ihor Ivliev22 Mar 2025 21:27 UTC
2 points
1 comment4 min readEA link

[Question] Is AI x-risk be­com­ing a dis­trac­tion?

Non-zero-sum James27 Feb 2025 20:33 UTC
2 points
0 comments1 min readEA link

Seek­ing in­put on a list of AI books for broader audience

Darren McKee27 Feb 2023 22:40 UTC
49 points
14 comments5 min readEA link

Con­test: 250€ for trans­la­tion of “longter­mism” to German

constructive1 Jun 2022 19:59 UTC
18 points
30 comments1 min readEA link

My take on AI risk (7 the­ses of eu­gene)

meugen21 Mar 2025 3:02 UTC
0 points
1 comment2 min readEA link

Beyond Short-Ter­mism: How δ and w Can Real­ign AI with Our Values

Beyond Singularity18 Jun 2025 16:34 UTC
15 points
8 comments5 min readEA link

The V&V method—A step to­wards safer AGI

Yoav Hollander24 Jun 2025 15:57 UTC
1 point
0 comments1 min readEA link
(blog.foretellix.com)

But ex­actly how com­plex and frag­ile?

Katja_Grace13 Dec 2019 7:05 UTC
37 points
3 comments3 min readEA link
(meteuphoric.com)

LLMs are weirder than you think

Derek Shiller20 Nov 2024 13:39 UTC
64 points
3 comments22 min readEA link

An­nounc­ing The Most Im­por­tant Cen­tury Writ­ing Prize

michel31 Oct 2022 21:37 UTC
48 points
0 comments2 min readEA link

Ad­ver­sar­ial Prompt­ing and Si­mu­lated Con­text Drift in Large Lan­guage Models

Tyler Williams11 Jul 2025 21:49 UTC
1 point
0 comments2 min readEA link

[Question] What will be some of the most im­pact­ful ap­pli­ca­tions of ad­vanced AI in the near term?

IanDavidMoss3 Mar 2022 15:26 UTC
16 points
7 comments1 min readEA link

The Hap­piness Max­i­mizer: Why EA is an x-risk

Obasi Shaw30 Aug 2022 4:29 UTC
8 points
5 comments32 min readEA link

What we learned from run­ning an Aus­tralian AI Safety Unconference

Alexander Saeri26 Oct 2023 0:46 UTC
34 points
0 comments5 min readEA link

In­finite Re­wards, Finite Safety: New Models for AI Mo­ti­va­tion Without In­finite Goals

Whylome Team12 Nov 2024 7:21 UTC
−5 points
1 comment2 min readEA link

[Question] What do you mean with ‘al­ign­ment is solv­able in prin­ci­ple’?

Remmelt17 Jan 2025 15:03 UTC
10 points
1 comment1 min readEA link

Re­sponse to “Co­or­di­nated paus­ing: An eval­u­a­tion-based co­or­di­na­tion scheme for fron­tier AI de­vel­op­ers”

Matthew Wearden30 Oct 2023 12:49 UTC
7 points
1 comment6 min readEA link
(matthewwearden.co.uk)

Values and control

dotsam4 Aug 2022 18:28 UTC
3 points
1 comment1 min readEA link

How to en­gage with AI 4 So­cial Jus­tice ac­tors

TomWestgarth26 Apr 2022 8:39 UTC
13 points
5 comments1 min readEA link

SB 1047 Simplified

Gabe K25 Sep 2024 12:00 UTC
14 points
0 comments4 min readEA link

Why I’m Scep­ti­cal of Foom

𝕮𝖎𝖓𝖊𝖗𝖆8 Dec 2022 10:01 UTC
22 points
7 comments3 min readEA link

An­nounc­ing the EA Pro­ject Ideas Database

Joe Rogero22 Jun 2023 20:20 UTC
14 points
4 comments1 min readEA link

New refer­ence stan­dard on LLM Ap­pli­ca­tion se­cu­rity started by OWASP

QuantumForest19 Jun 2023 19:56 UTC
5 points
0 comments1 min readEA link

Why I am no longer think­ing about/​work­ing on AI safety

jbkjr6 May 2024 20:00 UTC
−8 points
0 comments4 min readEA link
(www.lesswrong.com)

Chris­ti­ano and Yud­kowsky on AI pre­dic­tions and hu­man intelligence

EliezerYudkowsky23 Feb 2022 16:51 UTC
31 points
0 comments42 min readEA link

What is OpenAI’s plan for mak­ing AI Safer?

brook1 Sep 2023 11:15 UTC
8 points
1 comment4 min readEA link
(aisafetyexplained.substack.com)

Ajeya’s TAI timeline short­ened from 2050 to 2040

Zach Stein-Perlman3 Aug 2022 0:00 UTC
59 points
2 comments1 min readEA link
(www.lesswrong.com)

Spec­u­lat­ing on Se­cret In­tel­li­gence Explosions

calebp5 Jun 2025 13:55 UTC
20 points
5 comments8 min readEA link

Neart­er­mists should con­sider AGI timelines in their spend­ing decisions

Tristan Cook26 Jul 2022 17:01 UTC
68 points
4 comments4 min readEA link

[Question] Help us de­sign the in­ter­face for aisafety.com

Kim Holder23 Oct 2023 17:27 UTC
9 points
0 comments1 min readEA link

Learn­ing as much Deep Learn­ing math as I could in 24 hours

Phosphorous8 Jan 2023 2:19 UTC
58 points
6 comments7 min readEA link

PSA: Say­ing “1 in 5” Is Bet­ter Than “20%” When In­form­ing about risks publicly

Blanka30 Jan 2025 19:03 UTC
17 points
1 comment1 min readEA link

[Question] De­sign­ing user au­then­ti­ca­tion pro­to­cols

Kinoshita Yoshikazu (pseudonym)13 Mar 2023 15:56 UTC
−1 points
2 comments1 min readEA link

Tether­ware #2: What ev­ery hu­man should know about our most likely AI future

Jáchym Fibír28 Feb 2025 11:25 UTC
3 points
0 comments11 min readEA link
(tetherware.substack.com)

“Develop An­thro­po­mor­phic AGI to Save Hu­man­ity from It­self” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:57 UTC
19 points
6 comments7 min readEA link

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

frances_lorenz16 May 2022 21:19 UTC
128 points
13 comments6 min readEA link

Against GDP as a met­ric for timelines and take­off speeds

kokotajlod29 Dec 2020 17:50 UTC
47 points
6 comments14 min readEA link

Microsoft Plans to In­vest $10B in OpenAI; $3B In­vested to Date | For­tune

𝕮𝖎𝖓𝖊𝖗𝖆10 Jan 2023 23:43 UTC
25 points
2 comments2 min readEA link
(fortune.com)

AI’s goals may not match ours

Vishakha Agrawal28 May 2025 12:07 UTC
2 points
0 comments3 min readEA link

We’re Not Ready: thoughts on “paus­ing” and re­spon­si­ble scal­ing policies

Holden Karnofsky27 Oct 2023 15:19 UTC
150 points
23 comments8 min readEA link

[Question] Is it valuable to the field of AI Safety to have a neu­ro­science back­ground?

Samuel Nellessen3 Apr 2022 19:44 UTC
18 points
3 comments1 min readEA link

Ro­hin Shah: What’s been hap­pen­ing in AI al­ign­ment?

EA Global29 Jul 2020 20:15 UTC
18 points
0 comments14 min readEA link
(www.youtube.com)

My mo­ti­va­tion and the­ory of change for work­ing in AI healthtech

Andrew Critch12 Oct 2024 0:36 UTC
47 points
1 comment14 min readEA link

Fun­da­men­tals of Fatal Risks

Aino29 Jul 2023 7:12 UTC
1 point
0 comments4 min readEA link

Video and tran­script of pre­sen­ta­tion on Schem­ing AIs

Joe_Carlsmith22 Mar 2024 15:56 UTC
23 points
1 comment32 min readEA link

Ex­pand­ing EA’s AI Builder Com­mu­nity—Writ­ing about my job

Alejandro Acelas 🔸21 Jul 2025 8:22 UTC
26 points
0 comments6 min readEA link

Paths and waysta­tions in AI safety

Joe_Carlsmith11 Mar 2025 18:52 UTC
22 points
2 comments11 min readEA link
(joecarlsmith.substack.com)

Anal­y­sis of Progress in Speech Recog­ni­tion Models

MiguelA16 Sep 2024 15:56 UTC
8 points
1 comment12 min readEA link

Tar­bell Fel­low­ship 2025 - Ap­pli­ca­tions Open (AI Jour­nal­ism)

Tarbell Center for AI Journalism8 Jan 2025 15:25 UTC
62 points
0 comments1 min readEA link

Is GPT3 a Good Ra­tion­al­ist? - In­struc­tGPT3 [2/​2]

simeon_c7 Apr 2022 13:54 UTC
25 points
0 comments7 min readEA link

If in­ter­pretabil­ity re­search goes well, it may get dangerous

So8res3 Apr 2023 21:48 UTC
33 points
0 comments2 min readEA link

A Phy­logeny of Agents

Jonas Hallgren 🔸15 Aug 2025 10:48 UTC
6 points
1 comment6 min readEA link
(substack.com)

Towards AI Safety In­fras­truc­ture: Talk & Outline

Paul Bricman7 Jan 2024 9:35 UTC
14 points
1 comment2 min readEA link
(www.youtube.com)

AI timelines by bio an­chors: the de­bate in one place

Will Aldred30 Jul 2022 23:04 UTC
93 points
6 comments2 min readEA link

EA Ex­plorer GPT: A New Tool to Ex­plore Effec­tive Altruism

Vlad_Tislenko12 Nov 2023 15:36 UTC
12 points
1 comment1 min readEA link

Im­pli­ca­tions of the White­house meet­ing with AI CEOs for AI su­per­in­tel­li­gence risk—a first-step to­wards evals?

Jamie B7 May 2023 17:33 UTC
78 points
3 comments7 min readEA link

Su­perfore­cast­ing the premises in “Is power-seek­ing AI an ex­is­ten­tial risk?”

Joe_Carlsmith18 Oct 2023 20:33 UTC
114 points
3 comments2 min readEA link

On the com­pute gov­er­nance era and what has to come af­ter (Len­nart Heim on The 80,000 Hours Pod­cast)

80000_Hours23 Jun 2023 20:11 UTC
37 points
0 comments18 min readEA link

What should AI safety be try­ing to achieve?

EuanMcLean23 May 2024 11:28 UTC
13 points
1 comment13 min readEA link

Can we safely au­to­mate al­ign­ment re­search?

Joe_Carlsmith30 Apr 2025 17:37 UTC
13 points
1 comment48 min readEA link
(joecarlsmith.com)

Hu­man Values and AGI Risk | William James

William James31 Mar 2023 22:30 UTC
1 point
0 comments12 min readEA link

Some thoughts from a Univer­sity AI Debate

Charlie Harrison20 Mar 2024 17:03 UTC
26 points
2 comments1 min readEA link

Crypto ‘or­a­cle pro­to­cols’ for AI al­ign­ment with real-world data?

Geoffrey Miller22 Sep 2022 23:05 UTC
9 points
3 comments1 min readEA link

Could ASI Have Ex­isted Since the Big Bang?

Aaron Li31 Jan 2025 13:20 UTC
−13 points
0 comments1 min readEA link

Asym­me­tries, AI and An­i­mal Advocacy

Kevin Xia 🔸16 May 2025 6:16 UTC
64 points
6 comments5 min readEA link

The an­i­mals and hu­mans anal­ogy for AI risk

freedomandutility13 Aug 2022 15:35 UTC
5 points
2 comments1 min readEA link

Who owns AI-gen­er­ated con­tent?

Johan S Daniel7 Dec 2022 3:03 UTC
−2 points
0 comments2 min readEA link

A con­cern about the “evolu­tion­ary an­chor” of Ajeya Co­tra’s re­port on AI timelines.

NunoSempere16 Aug 2022 14:44 UTC
81 points
40 comments5 min readEA link
(nunosempere.com)

Cog­ni­tive Stress Test­ing Gem­ini 2.5 Pro: Em­piri­cal Find­ings from Re­cur­sive Prompt­ing

Tyler Williams23 Jul 2025 22:37 UTC
1 point
0 comments2 min readEA link

Miti­gat­ing ex­is­ten­tial risks as­so­ci­ated with hu­man na­ture and AI: Thoughts on se­ri­ous mea­sures.

Linyphia25 Mar 2023 19:10 UTC
2 points
2 comments3 min readEA link

Where on the con­tinuum of pure EA to pure AIS should you be? (Uni Group Or­ga­niz­ers Fo­cus)

jessica_mccurdy🔸26 Jun 2023 23:46 UTC
44 points
0 comments5 min readEA link

[Question] How might a herd of in­terns help with AI or biose­cu­rity re­search tasks/​ques­tions?

Marcel220 Mar 2022 22:49 UTC
30 points
8 comments2 min readEA link

Lon­don Work­ing Group for Short/​Medium Term AI Risks

scronkfinkle7 Apr 2025 15:30 UTC
5 points
0 comments2 min readEA link

Descartes’ 17th cen­tury Tur­ing Test

James-Hartree-Law16 Jan 2025 20:18 UTC
3 points
0 comments7 min readEA link

The Case for AI Safety Ad­vo­cacy to the Public

Holly Elmore ⏸️ 🔸20 Sep 2023 12:03 UTC
258 points
58 comments14 min readEA link

[Question] [Dis­cus­sion] How Broad is the Hu­man Cog­ni­tive Spec­trum?

𝕮𝖎𝖓𝖊𝖗𝖆7 Jan 2023 0:59 UTC
16 points
1 comment2 min readEA link

Chris Olah on work­ing at top AI labs with­out an un­der­grad degree

80000_Hours10 Sep 2021 20:46 UTC
15 points
0 comments73 min readEA link

Trans­for­ma­tive AI and wild an­i­mals: An ex­plo­ra­tion.

mal_graham🔸24 Apr 2025 17:48 UTC
84 points
8 comments25 min readEA link

How I learned to stop wor­ry­ing and love skill trees

Clark Urzo23 May 2023 8:03 UTC
22 points
3 comments1 min readEA link
(www.lesswrong.com)

AGI Ruin: A List of Lethalities

EliezerYudkowsky6 Jun 2022 23:28 UTC
162 points
53 comments30 min readEA link
(www.lesswrong.com)

Open Phil re­leases RFPs on LLM Bench­marks and Forecasting

Lawrence Chan11 Nov 2023 3:01 UTC
12 points
0 comments2 min readEA link
(www.openphilanthropy.org)

Open-source LLMs may prove Bostrom’s vuln­er­a­ble world hypothesis

Roope Ahvenharju14 Apr 2023 9:25 UTC
14 points
2 comments1 min readEA link

It takes 5 lay­ers and 1000 ar­tifi­cial neu­rons to simu­late a sin­gle biolog­i­cal neu­ron [Link]

Michael St Jules 🔸7 Sep 2021 21:53 UTC
44 points
17 comments2 min readEA link

Are AI safe­ty­ists cry­ing wolf?

sarahhw8 Jan 2025 20:54 UTC
61 points
21 comments16 min readEA link
(longerramblings.substack.com)

A Bird’s Eye View of the ML Field [Prag­matic AI Safety #2]

TW1239 May 2022 17:15 UTC
97 points
2 comments35 min readEA link

Ar­tifi­cial in­tel­li­gence ca­reer stories

EA Global25 Oct 2020 6:56 UTC
12 points
0 comments1 min readEA link
(www.youtube.com)

[Question] In­tel­lec­tual prop­erty of AI and ex­is­ten­tial risk in gen­eral?

WillPearson11 Jun 2024 13:50 UTC
3 points
3 comments1 min readEA link

Con­sider Pre­order­ing If Any­one Builds It, Every­one Dies

peterbarnett12 Aug 2025 22:03 UTC
44 points
4 comments2 min readEA link

Claude vs GPT

Maxwell Tabarrok14 Mar 2024 12:44 UTC
14 points
1 comment2 min readEA link
(www.maximum-progress.com)

Credo AI is hiring for AI Gov Re­searcher & more!

IanEisenberg15 Aug 2023 21:10 UTC
8 points
0 comments3 min readEA link

By de­fault, cap­i­tal will mat­ter more than ever af­ter AGI

L Rudolf L28 Dec 2024 17:52 UTC
113 points
3 comments16 min readEA link
(nosetgauge.substack.com)

[Question] Benev­olen­tAI—an effec­tively im­pact­ful com­pany?

Jack Hilton11 Oct 2022 14:35 UTC
16 points
11 comments1 min readEA link

Misal­ign­ment or mi­suse? The AGI al­ign­ment tradeoff

Max_He-Ho20 Jun 2025 10:41 UTC
6 points
0 comments1 min readEA link
(www.arxiv.org)

How quick and big would a soft­ware in­tel­li­gence ex­plo­sion be?

Tom_Davidson5 Aug 2025 15:47 UTC
12 points
2 comments34 min readEA link

On ex­clud­ing dan­ger­ous in­for­ma­tion from training

ShayBenMoshe17 Nov 2023 20:09 UTC
8 points
0 comments3 min readEA link
(www.lesswrong.com)

Red­wood Re­search is hiring for sev­eral roles (Oper­a­tions and Tech­ni­cal)

JJXWang14 Apr 2022 15:23 UTC
45 points
0 comments1 min readEA link

[Question] Is there a pub­lic tracker de­pict­ing at what dates AI has been able to au­to­mate x% of cog­ni­tive tasks (weighted by 2020 eco­nomic value)?

Mitchell Laughlin🔸17 Feb 2024 4:52 UTC
12 points
4 comments1 min readEA link

Went­worth and Larsen on buy­ing time

Akash9 Jan 2023 21:31 UTC
48 points
0 comments12 min readEA link

AGI in a vuln­er­a­ble world

AI Impacts2 Apr 2020 3:43 UTC
17 points
0 comments1 min readEA link
(aiimpacts.org)

Why poli­cy­mak­ers should be­ware claims of new “arms races” (Bul­letin of the Atomic Scien­tists)

christian.r14 Jul 2022 13:38 UTC
55 points
1 comment1 min readEA link
(thebulletin.org)

[Question] Can AI safely ex­ist at all?

Hayven Frienby27 Nov 2023 17:33 UTC
6 points
7 comments2 min readEA link

[Question] I’m in­ter­view­ing the au­thor of ‘Not Born Yes­ter­day’ — Hugo Mercier. He ar­gues peo­ple are less gullible and more savvy than you think. What should I ask him?

Robert_Wiblin17 Nov 2023 17:43 UTC
16 points
3 comments1 min readEA link

Promethean Gover­nance As­cen­dant: Les­sons from the Forge and Vi­sions for the Cos­mic Polity

Paul Fallavollita23 Mar 2025 0:54 UTC
−9 points
0 comments3 min readEA link

Oren’s Field Guide of Bad AGI Outcomes

Oren Montano26 Sep 2022 8:59 UTC
1 point
0 comments1 min readEA link

AI-based dis­in­for­ma­tion is prob­a­bly not a ma­jor threat to democracy

Dan Williams24 Feb 2024 20:01 UTC
63 points
8 comments10 min readEA link

New pub­li­ca­tion “Com­pas­sion­ate Gover­nance” + launch webinar

jonleighton23 Jun 2025 13:16 UTC
9 points
0 comments1 min readEA link

Aspiring Jr. AI safety re­searchers: what’s stop­ping you? | Survey

carolinaollive29 Oct 2024 11:27 UTC
14 points
0 comments1 min readEA link

“Cot­ton Gin” AI Risk

42317524 Sep 2022 23:04 UTC
6 points
2 comments2 min readEA link

[Question] Look­ing to in­ter­view AI Safety re­searchers for a book

Caruso24 Aug 2024 20:01 UTC
6 points
0 comments1 min readEA link

[Question] Best in­tro­duc­tory overviews of AGI safety?

JakubK13 Dec 2022 19:04 UTC
21 points
8 comments2 min readEA link
(www.lesswrong.com)

AI Align­ment, Sen­tience, and the Sense of Co­her­ence Concept

Jason Babb17 Mar 2025 13:30 UTC
4 points
0 comments1 min readEA link

AI gov­er­nance tracker of each coun­try per re­gion

Alix Ramillon24 Jul 2024 17:39 UTC
16 points
2 comments23 min readEA link

#208 – The case that TV shows, movies, and nov­els can im­prove the world (Eliz­a­beth Cox on The 80,000 Hours Pod­cast)

80000_Hours22 Nov 2024 11:36 UTC
10 points
0 comments17 min readEA link

METR is hiring!

ElizabethBarnes26 Dec 2023 21:03 UTC
50 points
0 comments1 min readEA link
(www.lesswrong.com)

AI al­ign­ment as a trans­la­tion problem

Roman Leventov5 Feb 2024 14:14 UTC
3 points
1 comment3 min readEA link

[Question] Books and lec­ture se­ries rele­vant to AI gov­er­nance?

MichaelA🔸18 Jul 2021 15:54 UTC
22 points
8 comments1 min readEA link

Train­ing Data At­tri­bu­tion: Ex­am­in­ing Its Adop­tion & Use Cases

Deric Cheng22 Jan 2025 15:40 UTC
18 points
1 comment3 min readEA link
(www.convergenceanalysis.org)

How the Hu­man Psy­cholog­i­cal “Pro­gram” Un­der­mines AI Align­ment — and What We Can Do

Beyond Singularity6 May 2025 13:37 UTC
14 points
2 comments3 min readEA link

The Ex­is­ten­tial Risk of Speciesist Bias in AI

Sam Tucker-Davis11 Nov 2023 3:27 UTC
38 points
1 comment3 min readEA link

AI Safety Col­lab 2025 - Lo­cal Or­ga­nizer Sign-ups Open

Evander H. 🔸12 Feb 2025 11:27 UTC
15 points
0 comments1 min readEA link

Like­li­hood of an anti-AI back­lash: Re­sults from a pre­limi­nary Twit­ter poll

Geoffrey Miller27 Sep 2022 22:01 UTC
27 points
13 comments1 min readEA link

In­fo­graph­ics re­port risk man­age­ment of Ar­tifi­cial In­tel­li­gence in Spain

JorgeTorresC10 Jul 2023 14:44 UTC
16 points
0 comments1 min readEA link
(riesgoscatastroficosglobales.com)

#184 – Sleep­ing on sleeper agents, and the biggest AI up­dates since ChatGPT (Zvi Mow­show­itz on the 80,000 Hours Pod­cast)

80000_Hours12 Apr 2024 12:22 UTC
46 points
0 comments20 min readEA link

De­mon­strate and eval­u­ate risks from AI to so­ciety at the AI x Democ­racy re­search hackathon

Esben Kran19 Apr 2024 14:46 UTC
24 points
0 comments6 min readEA link
(www.apartresearch.com)

Red­wood Re­search is hiring for sev­eral roles

Jack R29 Nov 2021 0:18 UTC
75 points
0 comments1 min readEA link

Minecraft As An Effec­tive Ad­vo­cacy Strat­egy And Cause Area

Kenneth_Diao1 Apr 2025 19:12 UTC
15 points
0 comments4 min readEA link

My re­flec­tions on do­ing a re­search fellowship

Yadav13 Jun 2025 10:41 UTC
11 points
1 comment5 min readEA link

The Pug­wash Con­fer­ences and the Anti-Bal­lis­tic Mis­sile Treaty as a case study of Track II diplomacy

rani_martin16 Sep 2022 10:42 UTC
82 points
5 comments27 min readEA link

Be­ing nicer than Clippy

Joe_Carlsmith16 Jan 2024 19:44 UTC
26 points
3 comments27 min readEA link

Data Publi­ca­tion for the 2021 Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey

Janet Pauketat24 Mar 2022 15:43 UTC
21 points
0 comments3 min readEA link
(www.sentienceinstitute.org)

Truth­ful AI

Owen Cotton-Barratt20 Oct 2021 15:11 UTC
55 points
14 comments10 min readEA link

What are Re­spon­si­ble Scal­ing Poli­cies (RSPs)?

Vishakha Agrawal5 Apr 2025 16:05 UTC
2 points
0 comments2 min readEA link
(www.lesswrong.com)

[Link] EAF Re­search agenda: “Co­op­er­a­tion, Con­flict, and Trans­for­ma­tive Ar­tifi­cial In­tel­li­gence”

stefan.torges17 Jan 2020 13:28 UTC
64 points
0 comments1 min readEA link

Devel­op­ing a Calcu­la­ble Con­science for AI: Equa­tion for Rights Violations

Sean Sweeney12 Dec 2024 17:50 UTC
4 points
1 comment15 min readEA link

We’re not pre­pared for an AI mar­ket crash

Remmelt1 Apr 2025 4:33 UTC
23 points
4 comments2 min readEA link

Michael Page, Dario Amodei, He­len Toner, Tasha McCauley, Jan Leike, & Owen Cot­ton-Bar­ratt: Mus­ings on AI

EA Global11 Aug 2017 8:19 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

In­tro­duc­ing the Men­tal Health Roadmap Series

Emily11 Apr 2023 22:26 UTC
18 points
2 comments2 min readEA link

Don’t ex­pect AGI any­time soon

cveres10 Oct 2022 22:38 UTC
0 points
19 comments1 min readEA link

How I Came To Longter­mism On My Own & An Out­sider Per­spec­tive On EA Longtermism

Jordan Arel7 Aug 2022 2:42 UTC
34 points
2 comments20 min readEA link

#201 – Why your robot but­ler isn’t here yet (Ken Gold­berg on The 80,000 Hours Pod­cast)

80000_Hours13 Sep 2024 17:41 UTC
21 points
0 comments12 min readEA link

AI Gover­nance Ca­reer Paths for Europeans

careersthrowaway16 May 2020 6:40 UTC
83 points
1 comment12 min readEA link

Con­sider keep­ing your threat mod­els pri­vate.

Miles Kodama1 Feb 2025 0:29 UTC
18 points
2 comments4 min readEA link

How to miti­gate sandbagging

Teun van der Weij23 Mar 2025 17:19 UTC
3 points
0 comments8 min readEA link

An­nounc­ing Manival

Lydia Nottingham18 Jun 2025 18:14 UTC
20 points
3 comments2 min readEA link

AI Im­pacts Quar­terly Newslet­ter, Apr-Jun 2023

Harlan18 Jul 2023 18:01 UTC
4 points
0 comments3 min readEA link
(blog.aiimpacts.org)

AI Open Source De­bate Comes Down to Trust in In­sti­tu­tions, and AI Policy Mak­ers Should Con­sider How We Can Foster It

another-anon-do-gooder20 Jan 2024 13:47 UTC
6 points
2 comments1 min readEA link

EA megapro­jects continued

mariushobbhahn3 Dec 2021 10:33 UTC
183 points
48 comments7 min readEA link

CAISH Hiring: AI Safety Policy Fel­low­ship Facilitators

Chloe Li17 Jan 2024 9:21 UTC
13 points
1 comment1 min readEA link

EU’s im­por­tance for AI gov­er­nance is con­di­tional on AI tra­jec­to­ries—a case study

MathiasKB🔸13 Jan 2022 14:58 UTC
31 points
2 comments3 min readEA link

We might be miss­ing some key fea­ture of AI take­off; it’ll prob­a­bly seem like “we could’ve seen this com­ing”

Dane Valerie16 May 2024 12:05 UTC
15 points
0 comments5 min readEA link
(www.lesswrong.com)

deleted

funnyfranco18 Mar 2025 19:19 UTC
3 points
9 comments1 min readEA link

How might we solve the al­ign­ment prob­lem? (Part 1: In­tro, sum­mary, on­tol­ogy)

Joe_Carlsmith28 Oct 2024 21:57 UTC
18 points
0 comments32 min readEA link

Anal­y­sis of AI Safety sur­veys for field-build­ing insights

Ash Jafari5 Dec 2022 17:37 UTC
30 points
7 comments5 min readEA link

The Game Board has been Flipped: Now is a good time to re­think what you’re doing

LintzA28 Jan 2025 21:20 UTC
390 points
69 comments13 min readEA link

[Question] How many peo­ple are neart­er­mist and have high P(doom)?

Sanjay2 Aug 2023 14:24 UTC
52 points
13 comments1 min readEA link

Gentle­ness and the ar­tifi­cial Other

Joe_Carlsmith2 Jan 2024 18:21 UTC
90 points
2 comments11 min readEA link

[linkpost] Ten Levels of AI Align­ment Difficulty

SammyDMartin4 Jul 2023 11:23 UTC
16 points
0 comments1 min readEA link

He­len Toner: The Open Philan­thropy Pro­ject’s work on AI risk

EA Global3 Nov 2017 7:43 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

Me­tac­u­lus is build­ing a team ded­i­cated to AI forecasting

christian18 Oct 2022 16:08 UTC
35 points
0 comments1 min readEA link
(apply.workable.com)

[Question] Are so­cial me­dia al­gorithms an ex­is­ten­tial risk?

Barry Grimes15 Sep 2020 8:52 UTC
24 points
13 comments1 min readEA link

What’s im­por­tant in “AI for epistemics”?

Lukas Finnveden24 Aug 2024 1:27 UTC
71 points
1 comment28 min readEA link
(www.forethought.org)

Restrict­ing brain organoid re­search to slow down AGI

freedomandutility9 Nov 2022 13:01 UTC
8 points
2 comments1 min readEA link

AI Safety Micro­grant Round

Chris Leong14 Nov 2022 4:25 UTC
81 points
3 comments3 min readEA link

Sha­har Avin: Near-term AI se­cu­rity risks, and what to do about them

EA Global3 Nov 2017 7:43 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

Talk­ing to Congress: Can con­stituents con­tact­ing their leg­is­la­tor in­fluence policy?

Tristan Williams7 Mar 2024 9:24 UTC
47 points
3 comments19 min readEA link

Against the Open Source /​ Closed Source Di­chotomy: Reg­u­lated Source as a Model for Re­spon­si­ble AI Development

Alexander Herwix 🔸4 Sep 2023 20:23 UTC
5 points
1 comment6 min readEA link

Public Ex­plainer on AI as an Ex­is­ten­tial Risk

AndrewDoris7 Oct 2022 19:23 UTC
13 points
4 comments15 min readEA link

In­tro­duc­ing the AI for An­i­mals newsletter

Max Taylor21 Jun 2024 13:24 UTC
40 points
0 comments1 min readEA link

An­nounce­ment: Learn­ing The­ory On­line Course

Yegreg28 Jan 2025 8:32 UTC
5 points
0 comments3 min readEA link
(www.lesswrong.com)

Pro­ject Idea: The cost of Coc­ci­dio­sis on Chicken farm­ing and if AI can help

Max Harris26 Sep 2022 16:30 UTC
25 points
8 comments2 min readEA link

AGI x An­i­mal Welfare: A High-EV Outreach Op­por­tu­nity?

simeon_c28 Jun 2023 20:44 UTC
80 points
16 comments1 min readEA link

Harry Black­wood, A Researcher

Connor Wood18 Apr 2025 3:05 UTC
−7 points
0 comments2 min readEA link

LLMs Out­perform Ex­perts on Challeng­ing Biol­ogy Benchmarks

ljusten14 May 2025 16:09 UTC
24 points
1 comment1 min readEA link
(substack.com)

The Inevitable Emer­gence of Black-Mar­ket LLM Infrastructure

Tyler Williams8 Aug 2025 19:05 UTC
1 point
0 comments2 min readEA link

Barthes in the Age of AI: In­ter­tex­tu­al­ity, Author­ship, and the Plu­ral Text

Rodo8 Jun 2025 15:59 UTC
−3 points
0 comments3 min readEA link

Ad­vice for En­ter­ing AI Safety Research

stecas2 Jun 2023 20:46 UTC
14 points
1 comment5 min readEA link

Ap­pli­ca­tions Open for the Co­op­er­a­tive AI Sum­mer School 2025!

C Tilli13 Jan 2025 12:31 UTC
25 points
0 comments1 min readEA link

The AIA and its Brus­sels Effect

Kathryn O'Rourke27 Dec 2022 16:01 UTC
16 points
0 comments5 min readEA link

Op­ti­mism, AI risk, and EA blind spots

Justis28 Sep 2022 17:21 UTC
87 points
21 comments8 min readEA link

AI Safety re­searcher ca­reer review

Benjamin_Todd23 Nov 2021 0:00 UTC
13 points
1 comment6 min readEA link
(80000hours.org)

G7 Sum­mit—Co­op­er­a­tion on AI Policy

Leonard_Barrett19 May 2023 10:10 UTC
22 points
2 comments1 min readEA link
(www.japantimes.co.jp)

Creat­ing an Ar­tifi­cial Sense of Touch: Revolu­tioniz­ing Med­i­cal Train­ing and Robotic Surgery

Connor Wood15 Apr 2025 2:01 UTC
−1 points
0 comments7 min readEA link

Some rea­sons to not say “Doomer”

Ruby9 Jul 2023 21:05 UTC
28 points
0 comments4 min readEA link

A vi­su­al­iza­tion of some orgs in the AI Safety Pipeline

Aaron_Scher10 Apr 2022 16:52 UTC
11 points
8 comments1 min readEA link

Why AI al­ign­ment could be hard with mod­ern deep learning

Ajeya21 Sep 2021 15:35 UTC
157 points
17 comments14 min readEA link
(www.cold-takes.com)

BERI, Epoch, and FAR will ex­plain their work & cur­rent job open­ings on­line this Sunday

Rockwell19 Aug 2022 20:34 UTC
7 points
0 comments1 min readEA link

A Sur­vey of the Po­ten­tial Long-term Im­pacts of AI

Sam Clarke18 Jul 2022 9:48 UTC
63 points
2 comments27 min readEA link

New Book: ‘Nexus’ by Yu­val Noah Harari

timfarkas3 Oct 2024 13:54 UTC
15 points
2 comments5 min readEA link

Trans­for­ma­tive AI and Com­pute [Sum­mary]

lennart23 Sep 2021 13:53 UTC
65 points
5 comments9 min readEA link

AI & Drug Dis­cov­ery—Se­cu­rity and Risks

Girving28 Jun 2023 8:57 UTC
14 points
1 comment1 min readEA link

[Question] Why aren’t we pro­mot­ing so­cial me­dia aware­ness of x-risks?

Max Niederman🔸9 Jun 2025 14:22 UTC
8 points
2 comments1 min readEA link

In­tro­duc­ing WAIT to Save Humanity

carter allen🔸1 Apr 2025 21:36 UTC
22 points
1 comment3 min readEA link

AISER—AIS Europe Retreat

Carolin23 Dec 2022 18:11 UTC
5 points
0 comments1 min readEA link

[Question] Which pos­si­ble AI im­pacts should re­ceive the most ad­di­tional at­ten­tion?

David Johnston31 May 2022 2:01 UTC
10 points
10 comments1 min readEA link

Align­ing AI Safety Pro­jects with a Repub­li­can Administration

Deric Cheng21 Nov 2024 22:13 UTC
13 points
1 comment8 min readEA link

[Question] What are some sources re­lated to big-pic­ture AI strat­egy?

Jacob Watts🔸2 Mar 2023 5:04 UTC
9 points
4 comments1 min readEA link

A Selec­tion of Ran­domly Selected SAE Features

Callum McDougall1 Apr 2024 9:09 UTC
25 points
2 comments4 min readEA link

[Question] How to cre­ate cur­ricu­lum for self-study to­wards AI al­ign­ment work?

OIUJHKDFS7 Jan 2023 19:53 UTC
10 points
5 comments1 min readEA link

Ought: why it mat­ters and ways to help

Paul_Christiano26 Jul 2019 1:56 UTC
52 points
5 comments5 min readEA link

[Question] Who would you have on your dream team for solv­ing AGI Align­ment?

Greg_Colbourn ⏸️ 25 Aug 2022 13:34 UTC
10 points
14 comments1 min readEA link

Google Deep­Mind re­leases Gemini

Yarrow🔸6 Dec 2023 17:39 UTC
21 points
7 comments1 min readEA link
(deepmind.google)

[Question] What are the top pri­ori­ties in a slow-take­off, mul­ti­po­lar world?

JP Addison🔸25 Aug 2021 8:47 UTC
26 points
9 comments1 min readEA link

Pos­si­ble di­rec­tions in AI ideal gov­er­nance research

RoryG10 Aug 2022 8:36 UTC
5 points
0 comments3 min readEA link

AXRP Epi­sode 24 - Su­per­al­ign­ment with Jan Leike

DanielFilan27 Jul 2023 4:56 UTC
23 points
0 comments1 min readEA link
(axrp.net)

Prob­a­bly good pro­jects for the AI safety ecosystem

Ryan Kidd5 Dec 2022 3:24 UTC
21 points
0 comments2 min readEA link

AGI will ar­rive by the end of this decade ei­ther as a uni­corn or as a black swan

Yuri Barzov21 Oct 2022 10:50 UTC
−4 points
7 comments3 min readEA link

It’s (not) how you use it

Eleni_A7 Sep 2022 13:28 UTC
6 points
3 comments2 min readEA link

2018 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks18 Dec 2018 4:48 UTC
118 points
28 comments63 min readEA link

Idea to boost in­ter­na­tional AI coordination

Jamie Green13 Aug 2025 13:40 UTC
2 points
0 comments3 min readEA link

Fu­ture Mat­ters #6: FTX col­lapse, value lock-in, and coun­ter­ar­gu­ments to AI x-risk

Pablo30 Dec 2022 13:10 UTC
58 points
2 comments21 min readEA link

Mo­ti­va­tion control

Joe_Carlsmith30 Oct 2024 17:15 UTC
18 points
0 comments52 min readEA link

AGI x-risk timelines: 10% chance (by year X) es­ti­mates should be the head­line, not 50%.

Greg_Colbourn ⏸️ 1 Mar 2022 12:02 UTC
69 points
22 comments2 min readEA link

Up­date on Har­vard AI Safety Team and MIT AI Alignment

Xander1232 Dec 2022 6:09 UTC
71 points
3 comments8 min readEA link

‘Surveillance Cap­i­tal­ism’ & AI Gover­nance: Slip­pery Busi­ness Models, Se­cu­ri­ti­sa­tion, and Self-Regulation

Charlie Harrison29 Feb 2024 15:47 UTC
19 points
2 comments12 min readEA link

Ap­ply to be a Stan­ford HAI Ju­nior Fel­low (As­sis­tant Pro­fes­sor- Re­search) by Nov. 15, 2021

Vael Gates31 Oct 2021 2:21 UTC
15 points
0 comments1 min readEA link

[Question] By how much should Meta’s Blen­derBot be­ing re­ally bad cause me to up­date on how jus­tifi­able it is for OpenAI and Deep­Mind to be mak­ing sig­nifi­cant progress on AI ca­pa­bil­ities?

Sisi10 Aug 2022 6:40 UTC
24 points
8 comments1 min readEA link

Deep­Mind is hiring for the Scal­able Align­ment and Align­ment Teams

Rohin Shah13 May 2022 12:19 UTC
102 points
0 comments9 min readEA link

Catas­trophic Risks from AI #1: Introduction

Dan H22 Jun 2023 17:09 UTC
28 points
1 comment5 min readEA link
(arxiv.org)

What AI could mean for al­ter­na­tive proteins

Max Taylor9 Feb 2024 10:13 UTC
36 points
5 comments16 min readEA link

Join the AI gov­er­nance and in­ter­pretabil­ity hackathons!

Esben Kran23 Mar 2023 14:39 UTC
33 points
1 comment5 min readEA link
(alignmentjam.com)

US pub­lic opinion of AI policy and risk

Jamie E12 May 2023 13:22 UTC
111 points
7 comments15 min readEA link

Apollo Re­search is Hiring for Soft­ware Eng­ineers. Dead­line 22 Jun

Joping_Apollo Research13 Jun 2025 15:30 UTC
7 points
0 comments1 min readEA link

PauseAI US is look­ing for lo­cal group lead­ers – ap­ply to­day!

Felix De Simone4 Apr 2025 15:44 UTC
20 points
0 comments1 min readEA link

[Question] How strong is the ev­i­dence of un­al­igned AI sys­tems caus­ing harm?

Eevee🔹21 Jul 2020 4:08 UTC
31 points
1 comment1 min readEA link

[3-hour pod­cast]: Joseph Car­l­smith on longter­mism, utopia, the com­pu­ta­tional power of the brain, meta-ethics, illu­sion­ism and meditation

Gus Docker27 Jul 2021 13:18 UTC
34 points
2 comments1 min readEA link

Sum­mary of “Tech­nol­ogy Favours Tyranny” by Yu­val Noah Harari

Madhav Malhotra26 Oct 2022 21:37 UTC
36 points
1 comment2 min readEA link

[Question] Im­pact: Eng­ineer­ing VS Med­i­cal Scien­tist VS AI Safety VS Governance

AhmedWez15 Jan 2025 15:47 UTC
1 point
0 comments1 min readEA link

If you’re an AI Safety move­ment builder con­sider ask­ing your mem­bers these ques­tions in an interview

yanni kyriacos27 May 2024 5:46 UTC
10 points
0 comments2 min readEA link

IFRC cre­ative com­pe­ti­tion: product or ser­vice from fu­ture au­tonomous weapons sys­tems and emerg­ing digi­tal risks

Devin Lam21 Jul 2024 13:08 UTC
9 points
0 comments1 min readEA link
(solferinoacademy.com)

Im­pli­ca­tions of AGI on Sub­jec­tive Hu­man Experience

Erica S. 30 May 2023 18:47 UTC
2 points
0 comments19 min readEA link
(docs.google.com)

In­for­ma­tion se­cu­rity ca­reers for GCR reduction

ClaireZabel20 Jun 2019 23:56 UTC
187 points
35 comments8 min readEA link

Half-baked ideas thread (EA /​ AI Safety)

Aryeh Englander23 Jun 2022 16:05 UTC
21 points
8 comments1 min readEA link

[Linkpost] NY Times Fea­ture on Anthropic

Garrison12 Jul 2023 19:30 UTC
34 points
3 comments5 min readEA link
(www.nytimes.com)

2022 AI ex­pert sur­vey results

Zach Stein-Perlman4 Aug 2022 15:54 UTC
88 points
7 comments2 min readEA link
(aiimpacts.org)

Scal­ing and Sus­tain­ing Stan­dards: A Case Study on the Basel Accords

C.K.16 Jul 2023 18:18 UTC
18 points
0 comments7 min readEA link
(docs.google.com)

We should ex­pect to worry more about spec­u­la­tive risks

bgarfinkel29 May 2022 21:08 UTC
120 points
14 comments3 min readEA link

RA x Con­trolAI video: What if AI just keeps get­ting smarter?

Writer2 May 2025 14:19 UTC
14 points
1 comment9 min readEA link

There should be more AI safety orgs

mariushobbhahn21 Sep 2023 14:53 UTC
117 points
20 comments17 min readEA link

How would you es­ti­mate the value of de­lay­ing AGI by 1 day, in marginal dona­tions to GiveWell?

AnonymousTurtle16 Dec 2022 9:25 UTC
30 points
19 comments2 min readEA link

The Per­cep­tion Gap

Ben Norman18 Aug 2025 10:51 UTC
5 points
0 comments4 min readEA link
(futuresonder.substack.com)

“AI Align­ment” is a Danger­ously Over­loaded Term

Roko15 Dec 2023 15:06 UTC
20 points
2 comments3 min readEA link

How to use the Fo­rum (in­tro)

Lizka5 May 2022 18:29 UTC
24 points
13 comments1 min readEA link

Love and AI: Re­la­tional Brain/​Mind Dy­nam­ics in AI Development

Jeffrey Kursonis21 Jun 2022 7:09 UTC
2 points
2 comments3 min readEA link

AGI safety from first principles

richard_ngo21 Oct 2020 17:42 UTC
77 points
10 comments3 min readEA link
(www.alignmentforum.org)

An Ex­ec­u­tive Briefing on the Ar­chi­tec­ture of a Sys­temic Crisis

Ihor Ivliev10 Jul 2025 0:46 UTC
0 points
0 comments4 min readEA link

Thoughts about Policy Ecosys­tems: The Miss­ing Links in AI Governance

Echo Huang31 Jan 2025 13:23 UTC
21 points
2 comments5 min readEA link

Ac­ci­den­tally teach­ing AI mod­els to de­ceive us (Ajeya Co­tra on The 80,000 Hours Pod­cast)

80000_Hours15 May 2023 20:58 UTC
37 points
2 comments18 min readEA link

Space set­tle­ment and the time of per­ils: a cri­tique of Thorstad

Matthew Rendall14 Apr 2024 15:29 UTC
46 points
10 comments4 min readEA link

Ar­tifi­cial In­tel­li­gence as exit strat­egy from the age of acute ex­is­ten­tial risk

Arturo Macias12 Apr 2023 14:41 UTC
11 points
11 comments7 min readEA link

The ELYSIUM Proposal

Roko16 Oct 2024 2:14 UTC
−10 points
0 comments1 min readEA link
(transhumanaxiology.substack.com)

Ques­tion­able Nar­ra­tives of “Si­tu­a­tional Aware­ness”

fergusq16 Jun 2024 17:09 UTC
23 points
10 comments14 min readEA link

Digi­tal Minds Take­off Scenarios

Bradford Saad5 Jul 2024 16:06 UTC
36 points
10 comments17 min readEA link

The Case for AI Adap­ta­tion: The Per­ils of Liv­ing in a World with Aligned and Well-De­ployed Trans­for­ma­tive Ar­tifi­cial Intelligence

HTC30 May 2023 18:29 UTC
5 points
1 comment7 min readEA link

Hydra

Matrice Jacobine11 Jun 2025 14:07 UTC
10 points
0 comments1 min readEA link
(philosophybear.substack.com)

AI timelines: the debate and the point of view of the “experts”

EA Italy17 Jan 2023 23:30 UTC
1 point
0 comments11 min readEA link

It’s OK not to go into AI (for stu­dents)

ruthgrace14 Jul 2022 15:16 UTC
59 points
18 comments2 min readEA link

Public Call for In­ter­est in Math­e­mat­i­cal Alignment

Davidmanheim22 Nov 2023 13:22 UTC
27 points
3 comments1 min readEA link

AI, An­i­mals, & Digi­tal Minds 2025: ap­ply to speak by Wed­nes­day!

Alistair Stewart5 May 2025 0:45 UTC
8 points
0 comments1 min readEA link

Call for Pythia-style foun­da­tion model suite for al­ign­ment research

Lucretia1 May 2023 20:26 UTC
10 points
0 comments1 min readEA link

Ver­ifi­ca­tion meth­ods for in­ter­na­tional AI agreements

Akash31 Aug 2024 14:58 UTC
20 points
0 comments4 min readEA link
(arxiv.org)

AI Safety Con­cepts Wri­teup: WebGPT

Justis11 Aug 2023 1:31 UTC
14 points
0 comments7 min readEA link

The the­o­ret­i­cal com­pu­ta­tional limit of the So­lar Sys­tem is 1.47x10^49 bits per sec­ond.

William the Kiwi17 Oct 2023 2:52 UTC
12 points
7 comments1 min readEA link

Whistle­blow­ing Twit­ter Bot

Mckiev 🔸26 Dec 2024 18:18 UTC
11 points
1 comment2 min readEA link
(www.lesswrong.com)

Start an AIS safety field-build­ing or­ga­ni­za­tion at the city or na­tional level—an EOI form

gergo9 Jan 2025 8:42 UTC
38 points
4 comments2 min readEA link

Or­phaned Poli­cies (Post 5 of 7 on AI Gover­nance)

Jason Green-Lowe29 May 2025 21:42 UTC
42 points
3 comments16 min readEA link

AI Anal­y­sis of US H.R.1 (“Big Beau­tiful Bill”) Im­pacts on Farmed Animals

Steven Rouk22 Jul 2025 14:33 UTC
13 points
0 comments3 min readEA link

Why fo­cus on schemers in par­tic­u­lar (Sec­tions 1.3 and 1.4 of “Schem­ing AIs”)

Joe_Carlsmith24 Nov 2023 19:18 UTC
10 points
1 comment20 min readEA link

Disen­tan­gling “Safety”

pleaselistencarefullyasourmenuoptionshaverecentlychanged6 Jul 2024 23:21 UTC
1 point
0 comments3 min readEA link

[Question] Is there EA dis­cus­sion on non-x-risk trans­for­ma­tive AI?

Franziska Fischer26 Apr 2023 13:50 UTC
5 points
0 comments1 min readEA link

How I switched ca­reers from soft­ware en­g­ineer to AI policy operations

Lucie Philippon 🔸13 Apr 2025 6:41 UTC
12 points
1 comment5 min readEA link
(www.lesswrong.com)

Four rea­sons I find AI safety emo­tion­ally compelling

Kat Woods 🔶 ⏸️28 Jun 2022 14:01 UTC
32 points
5 comments4 min readEA link

AGI risk: analo­gies & arguments

technicalities23 Mar 2021 13:18 UTC
31 points
3 comments8 min readEA link
(www.gleech.org)

[Linkpost] My at­tempt at try­ing to sum­ma­rize ‘In­tro to ML Safety’

Arjun Yadav25 Jul 2023 10:37 UTC
4 points
0 comments1 min readEA link
(arjunyadav.net)

‘Ar­tifi­cial In­tel­li­gence Gover­nance un­der Change’ (PhD dis­ser­ta­tion)

MMMaas15 Sep 2022 12:10 UTC
54 points
1 comment2 min readEA link
(drive.google.com)

In­side OpenAI’s Con­tro­ver­sial Plan to Aban­don its Non­profit Roots

Garrison18 Apr 2025 18:46 UTC
17 points
1 comment11 min readEA link
(garrisonlovely.substack.com)

Sum­mary of 80k’s AI prob­lem profile

JakubK1 Jan 2023 7:48 UTC
19 points
0 comments5 min readEA link
(www.lesswrong.com)

Paus­ing for what?

MountainPath21 Oct 2024 12:18 UTC
6 points
1 comment1 min readEA link

Es­ti­mat­ing the Sub­sti­tutabil­ity be­tween Com­pute and Cog­ni­tive La­bor in AI Research

Parker_Whitfill1 Jun 2025 14:27 UTC
135 points
29 comments9 min readEA link

An­nounc­ing the Com­pas­sion­ate Fu­ture Sum­mit 2025

Ruth_Seleo21 Jan 2025 7:15 UTC
50 points
3 comments2 min readEA link

I made an AI safety fel­low­ship. What I wish I knew.

RubenCastaing9 Jun 2024 16:32 UTC
14 points
1 comment2 min readEA link

Noah’s Arc: From AR Desks to AI Reactors

TabulaRasa1 Mar 2024 13:59 UTC
7 points
0 comments4 min readEA link

Win­ners of AI Align­ment Awards Re­search Contest

Akash13 Jul 2023 16:14 UTC
50 points
1 comment12 min readEA link

Sum­ming up “Schem­ing AIs” (Sec­tion 5)

Joe_Carlsmith9 Dec 2023 15:48 UTC
9 points
1 comment10 min readEA link

Abil­ity to solve long-hori­zon tasks cor­re­lates with want­ing things in the be­hav­iorist sense

So8res24 Nov 2023 17:37 UTC
38 points
1 comment5 min readEA link

Deep­Mind: Gen­er­ally ca­pa­ble agents emerge from open-ended play

kokotajlod27 Jul 2021 19:35 UTC
56 points
10 comments2 min readEA link
(deepmind.com)

Fa­nat­i­cism in AI: SERI Project

Jake Arft-Guatelli24 Sep 2021 4:39 UTC
7 points
2 comments5 min readEA link

BOUNTY AVAILABLE: AI ethi­cists, what are your ob­ject-level ar­gu­ments against AI notkil­lev­ery­oneism?

Peter Berggren6 Jul 2023 17:37 UTC
0 points
19 comments2 min readEA link

Where I cur­rently dis­agree with Ryan Green­blatt’s ver­sion of the ELK approach

So8res29 Sep 2022 21:19 UTC
21 points
0 comments5 min readEA link

AI safety tech­ni­cal re­search—Ca­reer review

Benjamin Hilton17 Jul 2023 15:34 UTC
50 points
0 comments31 min readEA link

[Question] What could a policy ban­ning AGI look like?

TsviBT13 Mar 2024 14:19 UTC
17 points
4 comments3 min readEA link

ARC-AGI-2 Overview With François Chollet

Yarrow🔸10 Apr 2025 18:54 UTC
7 points
0 comments1 min readEA link
(youtu.be)

What is ev­ery­one do­ing in AI governance

Igor Ivanov8 Jul 2023 15:19 UTC
31 points
0 comments5 min readEA link

Stampy’s AI Safety Info—New Distil­la­tions #4 [July 2023]

markov16 Aug 2023 19:02 UTC
6 points
0 comments1 min readEA link
(aisafety.info)

AI Agents: Ac­ci­den­tal Ar­chi­tects of Chaos: The Dangers of In­ter­act­ing Systems

Hugo Wong12 May 2025 7:58 UTC
−3 points
0 comments8 min readEA link

The Age of EM

ABishop9 May 2024 12:17 UTC
0 points
0 comments1 min readEA link
(ageofem.com)

Notes on “the hot mess the­ory of AI mis­al­ign­ment”

JakubK21 Apr 2023 10:07 UTC
44 points
3 comments5 min readEA link
(sohl-dickstein.github.io)

LLMs Are Already Misal­igned: Sim­ple Ex­per­i­ments Prove It

Makham28 Jul 2025 17:23 UTC
4 points
3 comments7 min readEA link

Why we need a new agency to reg­u­late ad­vanced ar­tifi­cial intelligence

Michael Huang4 Aug 2022 13:38 UTC
25 points
0 comments1 min readEA link
(www.brookings.edu)

David Krueger on AI Align­ment in Academia and Coordination

Michaël Trazzi7 Jan 2023 21:14 UTC
32 points
1 comment3 min readEA link
(theinsideview.ai)

Seek­ing In­put to AI Safety Book for non-tech­ni­cal audience

Darren McKee10 Aug 2023 18:03 UTC
11 points
4 comments1 min readEA link

Scien­tism vs. people

Roman Leventov18 Apr 2023 17:52 UTC
0 points
0 comments11 min readEA link

“In­tro to brain-like-AGI safety” se­ries—just finished!

Steven Byrnes17 May 2022 15:35 UTC
15 points
0 comments1 min readEA link

Con­test for Bet­ter AGI Safety Plans

Peter3 Jul 2025 17:02 UTC
18 points
0 comments8 min readEA link
(manifund.org)

Con­sider pay­ing me to do AI safety re­search work

Rupert5 Nov 2020 8:09 UTC
11 points
3 comments2 min readEA link

[Question] How does a com­pany like In­stadeep fit into the cur­rent AI land­scape?

Tom A8 Apr 2023 5:49 UTC
6 points
0 comments1 min readEA link

5 ways to im­prove CoT faithfulness

CBiddulph8 Oct 2024 4:17 UTC
8 points
0 comments6 min readEA link

[Question] Thoughts on these $1M and $500k AI safety grants?

defun 🔸11 Jul 2024 13:37 UTC
50 points
7 comments1 min readEA link

AI Benefits Post 5: Out­stand­ing Ques­tions on Govern­ing Benefits

Cullen 🔸21 Jul 2020 16:45 UTC
5 points
0 comments4 min readEA link

Why I’m work­ing on AI welfare

kyle_fish6 Jul 2024 6:01 UTC
71 points
7 comments5 min readEA link

[Closed] Gaug­ing In­ter­est for a Learn­ing-The­o­retic Agenda Men­tor­ship Programme

Vanessa16 Feb 2025 16:24 UTC
17 points
0 comments2 min readEA link

NAIRA—An ex­er­cise in reg­u­la­tory, com­pet­i­tive safety gov­er­nance [AI Gover­nance In­sti­tu­tional De­sign idea]

Heramb Podar19 Mar 2024 14:55 UTC
5 points
1 comment6 min readEA link

AI Safety Eval­u­a­tions: A Reg­u­la­tory Review

Elliot Mckernon19 Mar 2024 15:09 UTC
12 points
2 comments11 min readEA link

US AI Safety In­sti­tute will be ‘gut­ted,’ Ax­ios reports

Matrice Jacobine20 Feb 2025 14:40 UTC
12 points
1 comment1 min readEA link
(www.zdnet.com)

Re­view of ar­tifi­cial in­tel­li­gence plat­forms for early pan­demic de­tec­tion in Latin America

DianaCarolina17 Sep 2024 15:17 UTC
5 points
0 comments53 min readEA link

Video and Tran­script of Pre­sen­ta­tion on Ex­is­ten­tial Risk from Power-Seek­ing AI

Joe_Carlsmith8 May 2022 3:52 UTC
97 points
7 comments30 min readEA link

Call on AI Com­pa­nies: Pub­lish Your Whistle­blow­ing Policies

Karl1 Aug 2025 15:59 UTC
11 points
0 comments6 min readEA link

FT: We must slow down the race to God-like AI

Angelina Li24 Apr 2023 11:57 UTC
33 points
2 comments2 min readEA link
(www.ft.com)

Me­tac­u­lus Year in Re­view: 2022

christian6 Jan 2023 1:23 UTC
25 points
2 comments4 min readEA link
(metaculus.medium.com)

AI & wis­dom 2: growth and amor­tised optimisation

L Rudolf L29 Oct 2024 13:37 UTC
20 points
0 comments7 min readEA link
(rudolf.website)

deleted

funnyfranco11 Mar 2025 4:13 UTC
0 points
0 comments1 min readEA link

The Prospect of an AI Winter

Erich_Grunewald 🔸27 Mar 2023 20:55 UTC
56 points
13 comments15 min readEA link
(www.erichgrunewald.com)

Re­port: Latin Amer­ica and Global Catas­trophic Risks, trans­form­ing risk man­age­ment.

JorgeTorresC9 Jan 2024 2:13 UTC
25 points
1 comment2 min readEA link
(riesgoscatastroficosglobales.com)

[Question] Is AI safety still ne­glected?

Coafos30 Mar 2022 9:09 UTC
13 points
13 comments1 min readEA link

Work­ing at EA or­ga­ni­za­tions se­ries: Ma­chine In­tel­li­gence Re­search Institute

SoerenMind1 Nov 2015 12:49 UTC
8 points
0 comments4 min readEA link

My (Lazy) Longter­mism FAQ

Devin Kalish24 Oct 2022 16:44 UTC
30 points
6 comments27 min readEA link

Cen­ter on Long-Term Risk: An­nual re­view and fundraiser 2023

Center on Long-Term Risk13 Dec 2023 16:42 UTC
79 points
3 comments4 min readEA link

What does (and doesn’t) AI mean for effec­tive al­tru­ism?

EA Global12 Aug 2017 7:00 UTC
9 points
0 comments12 min readEA link

Some of My Cur­rent Im­pres­sions En­ter­ing AI Safety

Phib28 Mar 2023 5:18 UTC
5 points
0 comments2 min readEA link

Messy per­sonal stuff that af­fected my cause pri­ori­ti­za­tion (or: how I started to care about AI safety)

Julia_Wise🔸5 May 2022 17:59 UTC
265 points
14 comments2 min readEA link

Orthog­o­nal’s For­mal-Goal Align­ment the­ory of change

Tamsin Leake5 May 2023 22:36 UTC
21 points
0 comments4 min readEA link
(carado.moe)

Les­sons from the Iraq War for AI policy

Buck10 Jul 2025 18:52 UTC
71 points
11 comments4 min readEA link

Im­prov­ing ca­pa­bil­ity eval­u­a­tions for AI gov­er­nance: Open Philan­thropy’s new re­quest for proposals

cb7 Feb 2025 9:30 UTC
37 points
3 comments3 min readEA link

The In­tel­li­gence Curse: an es­say series

L Rudolf L24 Apr 2025 12:59 UTC
22 points
1 comment2 min readEA link

ea.do­mains—Do­mains Free to a Good Home

plex12 Jan 2023 13:32 UTC
48 points
8 comments4 min readEA link

Scale, schlep, and systems

Ajeya10 Oct 2023 16:59 UTC
59 points
3 comments6 min readEA link

Nur­tur­ing AI: A Differ­ent Vi­sion for Safety and Growth

Brad Wilkins28 Apr 2025 19:21 UTC
0 points
0 comments1 min readEA link

[Question] Will AGI cause mass tech­nolog­i­cal un­em­ploy­ment?

Eevee🔹22 Jun 2020 20:55 UTC
4 points
2 comments2 min readEA link

Soft Na­tion­al­iza­tion: How the US Govern­ment Will Con­trol AI Labs

Deric Cheng27 Aug 2024 15:10 UTC
103 points
6 comments21 min readEA link
(www.convergenceanalysis.org)

What does it mean to be­come an ex­pert in AI Hard­ware?

Toph9 Jan 2021 4:15 UTC
87 points
10 comments11 min readEA link

Is it 3 Years, or 3 Decades Away? Disagree­ments on AGI Timelines

Vasco Grilo🔸4 Apr 2025 16:01 UTC
46 points
1 comment2 min readEA link
(epoch.ai)

Con­nor Leahy on Con­jec­ture and Dy­ing with Dignity

Michaël Trazzi22 Jul 2022 19:30 UTC
34 points
0 comments10 min readEA link
(theinsideview.ai)

Agents that act for rea­sons: a thought experiment

Michele Campolo24 Jan 2024 16:48 UTC
7 points
1 comment3 min readEA link

No. Im­pend­ing AGI doesn’t make ev­ery­thing else unim­por­tant.

Igor Ivanov4 Sep 2023 18:56 UTC
14 points
6 comments5 min readEA link

Acausal normalcy

Andrew Critch3 Mar 2023 23:35 UTC
21 points
4 comments8 min readEA link

Sugges­tions for get­ting re­tiree /​ sec­ond ca­reer folks in­ter­ested in AI Safety?

sjsjsj5 Jan 2025 17:59 UTC
2 points
1 comment1 min readEA link

[Question] Share AI Safety Ideas: Both Crazy and Not

ank26 Feb 2025 13:09 UTC
4 points
16 comments1 min readEA link

Good Re­search Takes are Not Suffi­cient for Good Strate­gic Takes

Neel Nanda22 Mar 2025 10:13 UTC
120 points
0 comments4 min readEA link
(www.neelnanda.io)

AGI Can­not Be Pre­dicted From Real In­ter­est Rates

Nicholas Decker28 Jan 2025 17:45 UTC
26 points
3 comments1 min readEA link
(nicholasdecker.substack.com)

AI Safety Doesn’t Have to be Weird

Mica White2 Jan 2023 21:56 UTC
11 points
1 comment2 min readEA link

Guess, ask or tell?

dEAsign19 Oct 2023 21:52 UTC
2 points
1 comment1 min readEA link

An In­ter­na­tional Col­lab­o­ra­tive Hub for Ad­vanc­ing AI Safety Research

Cody Albert22 Apr 2025 16:12 UTC
9 points
0 comments5 min readEA link

An AI Race With China Can Be Bet­ter Than Not Racing

niplav2 Jul 2024 17:57 UTC
19 points
1 comment11 min readEA link

Anal­y­sis of Global AI Gover­nance Strategies

SammyDMartin11 Dec 2024 11:08 UTC
23 points
0 comments1 min readEA link
(www.lesswrong.com)

An­thropic teams up with Palan­tir and AWS to sell AI to defense customers

Matrice Jacobine9 Nov 2024 11:47 UTC
28 points
1 comment2 min readEA link
(techcrunch.com)

The flaws that make to­day’s AI ar­chi­tec­ture un­safe and a new ap­proach that could fix it

80000_Hours22 Jun 2020 22:15 UTC
3 points
0 comments86 min readEA link
(80000hours.org)

A Mis­sion Frame­work for an Emerg­ing Consciousness

Simón The Gardener8 Aug 2025 15:36 UTC
1 point
0 comments2 min readEA link

The right to pro­tec­tion from catas­trophic AI risk

Jack Cunningham9 Apr 2022 23:11 UTC
11 points
0 comments7 min readEA link

Join the AI Align­ment Evals hackathon

lenz14 Jan 2025 18:17 UTC
3 points
0 comments3 min readEA link

Life of GPT

Odd anon8 Nov 2023 22:31 UTC
−1 points
0 comments5 min readEA link

Ar­tifi­cial In­tel­li­gence and Nu­clear Com­mand, Con­trol, & Com­mu­ni­ca­tions: The Risks of Integration

Peter Rautenbach18 Nov 2022 13:01 UTC
60 points
3 comments50 min readEA link

Gen­eral vs spe­cific ar­gu­ments for the longter­mist im­por­tance of shap­ing AI development

Sam Clarke15 Oct 2021 14:43 UTC
44 points
7 comments2 min readEA link

AI Might Kill Every­one

Bentham's Bulldog5 Jun 2025 15:36 UTC
20 points
1 comment4 min readEA link

Sparks of Ar­tifi­cial Gen­eral In­tel­li­gence: Early ex­per­i­ments with GPT-4 | Microsoft Research

𝕮𝖎𝖓𝖊𝖗𝖆23 Mar 2023 5:45 UTC
15 points
0 comments1 min readEA link
(arxiv.org)

Will AI R&D Au­toma­tion Cause a Soft­ware In­tel­li­gence Ex­plo­sion?

Forethought26 Mar 2025 15:37 UTC
32 points
4 comments2 min readEA link
(www.forethought.org)

deleted

funnyfranco21 Mar 2025 13:13 UTC
11 points
0 comments1 min readEA link

AI ac­cel­er­a­tion from a safety per­spec­tive: Trade-offs and con­sid­er­a­tions

mariushobbhahn19 Jan 2022 9:44 UTC
12 points
1 comment7 min readEA link

A Man­i­festo for an Emerg­ing Consciousness

Simón The Gardener7 Aug 2025 13:36 UTC
1 point
0 comments11 min readEA link

OpenAI’s mas­sive push to make su­per­in­tel­li­gence safe in 4 years or less (Jan Leike on the 80,000 Hours Pod­cast)

80000_Hours8 Aug 2023 18:00 UTC
32 points
1 comment19 min readEA link
(80000hours.org)

The moral ar­gu­ment for giv­ing AIs autonomy

Matthew_Barnett8 Jan 2025 0:59 UTC
41 points
7 comments11 min readEA link

GovAI An­nual Re­port 2021

GovAI5 Jan 2022 16:57 UTC
52 points
2 comments9 min readEA link

Alt­man on the board, AGI, and superintelligence

OscarD🔸6 Jan 2025 14:37 UTC
20 points
1 comment1 min readEA link
(blog.samaltman.com)

Hu­man ex­tinc­tion’s im­pact on non-hu­man an­i­mals re­mains largely underexplored

JoA🔸1 Mar 2025 21:31 UTC
35 points
1 comment12 min readEA link

o3

Zach Stein-Perlman20 Dec 2024 21:00 UTC
84 points
9 comments1 min readEA link

An Open Let­ter To EA and AI Safety On De­cel­er­at­ing AI Development

Kenneth_Diao28 Feb 2025 17:15 UTC
21 points
0 comments14 min readEA link
(graspingatwaves.substack.com)

Per­sonal AI Planning

Jeff Kaufman 🔸10 Nov 2024 14:10 UTC
43 points
5 comments2 min readEA link

Pause For Thought: The AI Pause Debate

Scott Alexander10 Oct 2023 15:34 UTC
113 points
20 comments14 min readEA link
(www.astralcodexten.com)

Co­her­ence ar­gu­ments im­ply a force for goal-di­rected behavior

Katja_Grace6 Apr 2021 21:44 UTC
19 points
1 comment11 min readEA link
(worldspiritsockpuppet.com)

[Question] Best pro­ject man­age­ment soft­ware for re­search pro­jects and labs?

PeterSlattery5 Oct 2023 18:38 UTC
19 points
10 comments1 min readEA link

16 Con­crete, Am­bi­tious AI Pro­ject Pro­pos­als for Science and Security

Alejandro Acelas 🔸11 Aug 2025 20:28 UTC
5 points
0 comments1 min readEA link
(ifp.org)

Go Mo­bi­lize? Les­sons from GM Protests for Paus­ing AI

Charlie Harrison24 Oct 2023 15:01 UTC
54 points
11 comments31 min readEA link

Euro­pean Master’s Pro­grams in Ma­chine Learn­ing, Ar­tifi­cial In­tel­li­gence, and re­lated fields

Master Programs ML/AI17 Jan 2021 20:09 UTC
17 points
4 comments1 min readEA link

AI Reg­u­la­tion is Unsafe

Maxwell Tabarrok22 Apr 2024 16:38 UTC
19 points
8 comments4 min readEA link
(www.maximum-progress.com)

Les­sons learned and re­view of the AI Safety Nudge Competition

Marc Carauleanu17 Jan 2023 17:13 UTC
5 points
0 comments5 min readEA link

World Ci­ti­zen Assem­bly about AI—Announcement

Camille11 Feb 2025 10:51 UTC
25 points
2 comments5 min readEA link

AI and X-risk un­con­fer­ence at ZuGeorgia

Yesh18 Jun 2024 14:24 UTC
2 points
0 comments1 min readEA link

[Question] If an ex­is­ten­tial catas­tro­phe oc­curs, how likely is it to wipe out all an­i­mal sen­tience?

JoA🔸16 Mar 2025 22:30 UTC
11 points
2 comments2 min readEA link

A Primer on God, Liber­al­ism and the End of History

Mahdi Complex28 Mar 2022 5:26 UTC
8 points
3 comments14 min readEA link

The UK AI Safety Sum­mit tomorrow

SebastianSchmidt31 Oct 2023 19:09 UTC
17 points
2 comments2 min readEA link

Which AI Safety Org to Join?

Yonatan Cale11 Oct 2022 19:42 UTC
17 points
21 comments1 min readEA link

Will AI be able to re­think its goals?

SeptemberL11 May 2025 12:29 UTC
9 points
1 comment8 min readEA link

As­ter­isk Mag 09: Weird

Clara Collier4 Apr 2025 20:25 UTC
25 points
0 comments2 min readEA link

What if we don’t need a “Hard Left Turn” to reach AGI?

Eigengender15 Jul 2022 9:49 UTC
39 points
7 comments4 min readEA link

[Question] Request for Assistance—Research on Scenario Development for Advanced AI Risk

Kiliank30 Mar 2022 3:01 UTC
2 points
1 comment1 min readEA link

One, perhaps underrated, AI risk.

Alex (Αλέξανδρος)28 Nov 2024 10:34 UTC
7 points
1 comment3 min readEA link

Launching The Collective Intelligence Project: Whitepaper and Pilots

jasmine_wang6 Feb 2023 17:00 UTC
38 points
8 comments2 min readEA link
(cip.org)

Why Post-Probability AI May Be Safer Than Probability-Based Models

devin.bostick16 Apr 2025 14:23 UTC
2 points
0 comments2 min readEA link

[Job ad] LISA CEO

Ryan Kidd9 Feb 2025 0:18 UTC
5 points
0 comments2 min readEA link

High impact job opportunity at ARIA (UK)

Rasool12 Feb 2023 10:35 UTC
80 points
0 comments1 min readEA link

Hiring engineers and researchers to help align GPT-3

Paul_Christiano1 Oct 2020 18:52 UTC
107 points
19 comments3 min readEA link

[Question] Why can’t we accept the human condition as it existed in 2010?

Hayven Frienby9 Jan 2024 18:02 UTC
35 points
36 comments2 min readEA link

Book Launch: The Moral Circle: Who Matters, What Matters, and Why

Sofia_Fogel21 Jan 2025 13:45 UTC
30 points
0 comments1 min readEA link

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Jason Green-Lowe23 Jun 2025 1:38 UTC
48 points
2 comments37 min readEA link

How to pursue a career in AI governance and coordination

Cody_Fenwick25 Sep 2023 12:00 UTC
32 points
1 comment29 min readEA link
(80000hours.org)

ML4Good Brasil—Applications Open

Nia3 May 2024 10:39 UTC
28 points
1 comment1 min readEA link

Compute Research Questions and Metrics—Transformative AI and Compute [4/4]

lennart28 Nov 2021 22:18 UTC
18 points
2 comments1 min readEA link

The current AI strategic landscape: one bear’s perspective

Matrice Jacobine15 Feb 2025 9:49 UTC
6 points
0 comments2 min readEA link
(philosophybear.substack.com)

Alignment’s phlogiston

Eleni_A18 Aug 2022 1:41 UTC
18 points
1 comment2 min readEA link

The State of AI Governance in Africa: Musings from the Global South

Thaiya Jesse Wallace17 Aug 2023 11:34 UTC
6 points
0 comments5 min readEA link

[EU time] Infosec: What even is zero trust?

Jarrah21 Jun 2024 18:09 UTC
2 points
0 comments1 min readEA link

New blog: Planned Obsolescence

Ajeya27 Mar 2023 19:46 UTC
198 points
9 comments1 min readEA link
(www.planned-obsolescence.org)

AI and Non-Existence

Blue1131 Jan 2025 13:19 UTC
4 points
0 comments2 min readEA link

Amanda Askell: AI safety needs social scientists

EA Global4 Mar 2019 15:50 UTC
27 points
0 comments18 min readEA link
(www.youtube.com)

On January 1, 2030, there will be no AGI (and AGI will still not be imminent)

Yarrow🔸6 Apr 2025 1:08 UTC
35 points
53 comments2 min readEA link

Clarifying METR’s Auditing Role [linkpost]

ChanaMessinger4 Jun 2024 15:34 UTC
47 points
1 comment1 min readEA link
(www.alignmentforum.org)

My experience applying to MATS 6.0

mic18 Jul 2024 19:02 UTC
24 points
0 comments5 min readEA link

Mere exposure effect: Bias in Evaluating AGI X-Risks

Remmelt27 Dec 2022 14:05 UTC
4 points
1 comment1 min readEA link

We don’t need AGI for an amazing future

Karl von Wendt4 May 2023 12:11 UTC
57 points
2 comments5 min readEA link

[Question] Deliberate practice for research?

Alex_Altair8 Oct 2022 3:45 UTC
19 points
4 comments1 min readEA link

Human-level is not the limit

Vishakha Agrawal23 Apr 2025 11:16 UTC
3 points
0 comments2 min readEA link
(aisafety.info)

How could AI affect different animal advocacy interventions?

Kevin Xia 🔸2 Jul 2025 16:07 UTC
50 points
6 comments10 min readEA link

Annual AGI Benchmarking Event

Metaculus26 Aug 2022 21:31 UTC
20 points
2 comments2 min readEA link
(www.metaculus.com)

Why we’re entering a new nuclear age — and how to reduce the risks (Christian Ruhl on the 80k After Hours Podcast)

80000_Hours27 Mar 2024 19:17 UTC
52 points
2 comments7 min readEA link

Could AI accelerate economic growth?

Tom_Davidson7 Jun 2023 19:07 UTC
28 points
0 comments6 min readEA link

Reasons I’ve been hesitant about high levels of near-ish AI risk

elifland22 Jul 2022 1:32 UTC
216 points
16 comments7 min readEA link
(www.foxy-scout.com)

AI Disclosure Ballot Initiative (and voting method)

aaronhamlin17 Jan 2024 20:01 UTC
5 points
0 comments1 min readEA link

XPT forecasts on (some) biological anchors inputs

Forecasting Research Institute24 Jul 2023 13:32 UTC
37 points
2 comments12 min readEA link

[Creative Writing Contest] Metal or Mortal

Louis16 Oct 2021 16:24 UTC
7 points
0 comments7 min readEA link

Theory: “WAW might be of higher impact than x-risk prevention based on utilitarianism”

Jens Aslaug 🔸12 Sep 2023 13:11 UTC
51 points
20 comments17 min readEA link

Security Warning: Squarespace Transfer from Google Domains

Wavefront_Security_Dave10 Jun 2024 9:26 UTC
4 points
0 comments3 min readEA link

[Report] Bridging the International AI Governance Divide: Key Strategies for Including the Global South

Heramb Podar26 Jan 2025 23:55 UTC
8 points
0 comments1 min readEA link
(encodeai.org)

#219 – Graphs AI companies would prefer you didn’t (fully) understand (Toby Ord on The 80,000 Hours Podcast)

80000_Hours25 Jun 2025 18:23 UTC
19 points
0 comments27 min readEA link

My Most Likely Reason to Die Young is AI X-Risk

AISafetyIsNotLongtermist4 Jul 2022 15:34 UTC
239 points
62 comments4 min readEA link
(www.lesswrong.com)

My thoughts on the social response to AI risk

Matthew_Barnett1 Nov 2023 21:27 UTC
116 points
17 comments10 min readEA link

ML4Good Colombia—Applications Open

carolinaollive9 Feb 2025 4:03 UTC
10 points
0 comments1 min readEA link

AI values will be shaped by a variety of forces, not just the values of AI developers

Matthew_Barnett11 Jan 2024 0:48 UTC
71 points
3 comments3 min readEA link

CEEALAR’s Theory of Change

CEEALAR19 Dec 2023 20:21 UTC
51 points
5 comments3 min readEA link

[Question] How to Improve China-Western Coordination on EA Issues?

Michael Kehoe3 Nov 2021 7:28 UTC
15 points
2 comments1 min readEA link

Announcing the Future Fund’s AI Worldview Prize

Nick_Beckstead23 Sep 2022 16:28 UTC
255 points
125 comments13 min readEA link
(ftxfuturefund.org)

What competencies do social scientists need to responsibly incorporate AI tools into their research practices?

Dane Valerie30 May 2025 14:13 UTC
4 points
0 comments1 min readEA link
(www.monash.edu)

How to become more agentic, by GPT-EA-Forum-v1

JoyOptimizer20 Jun 2022 6:50 UTC
24 points
8 comments4 min readEA link

Global computing capacity

Vasco Grilo🔸1 May 2023 6:09 UTC
12 points
0 comments1 min readEA link
(aiimpacts.org)

The AI Risk Network is searching for a Co-Host

Caroline Little11 Jul 2025 22:12 UTC
3 points
0 comments1 min readEA link

AI alignment prize winners and next round [link]

RyanCarey20 Jan 2018 12:07 UTC
7 points
1 comment1 min readEA link

The Animal Welfare Case for Open Access: Breaking Barriers to Scientific Knowledge and Enhancing LLM Training

Wladimir J. Alonso23 Nov 2024 13:07 UTC
32 points
2 comments3 min readEA link

What do XPT forecasts tell us about AI timelines?

rosehadshar21 Jul 2023 8:30 UTC
29 points
0 comments13 min readEA link

Vitalik on science, his philanthropy and effective altruism.

vincentweisser18 Jan 2023 23:16 UTC
11 points
0 comments1 min readEA link

AI Timelines via Cumulative Optimization Power: Less Long, More Short

Jake Cannell6 Oct 2022 7:06 UTC
27 points
0 comments17 min readEA link

Apply to the Machine Learning For Good bootcamp in France

Alexandre Variengien17 Jun 2022 9:13 UTC
9 points
0 comments1 min readEA link
(www.lesswrong.com)

Demis Hassabis — Google DeepMind: The Podcast

Zach Stein-Perlman16 Aug 2024 0:00 UTC
22 points
2 comments3 min readEA link
(www.youtube.com)

AI Model Registries: A Regulatory Review

Deric Cheng22 Mar 2024 16:01 UTC
6 points
3 comments6 min readEA link

Applications Open: AI Safety India Phase 1 – Fundamentals of Safe AI (Global Cohort)

adityaraj@eanita28 Apr 2025 12:05 UTC
4 points
0 comments2 min readEA link

Good job opportunities for helping with the most important century

Holden Karnofsky18 Jan 2024 19:21 UTC
46 points
1 comment4 min readEA link
(www.cold-takes.com)

FLI AI Alignment podcast: Evan Hubinger on Inner Alignment, Outer Alignment, and Proposals for Building Safe Advanced AI

evhub1 Jul 2020 20:59 UTC
13 points
2 comments1 min readEA link
(futureoflife.org)

10 Cruxes of Artificial Sentience

Jordan Arel1 Jul 2024 2:46 UTC
31 points
0 comments3 min readEA link

[Question] What kind of event, targeted to undergraduate CS majors, would be most effective at getting people to work on AI safety?

CBiddulph19 Sep 2021 16:19 UTC
9 points
1 comment1 min readEA link

[Question] Why not to solve alignment by making superintelligent humans?

Pato16 Oct 2022 21:26 UTC
9 points
12 comments1 min readEA link

Rethink Priorities’ 2022 Impact, 2023 Strategy, and Funding Gaps

kierangreig🔸25 Nov 2022 5:37 UTC
108 points
10 comments28 min readEA link

On Generality

Oren Montano26 Sep 2022 8:59 UTC
2 points
0 comments5 min readEA link

Longevity research as AI X-risk intervention

DirectedEvolution6 Nov 2022 17:58 UTC
27 points
0 comments9 min readEA link

AI Control idea: Give an AGI the primary objective of deleting itself, but construct obstacles to this as best we can. All other objectives are secondary to this primary goal.

Justausername3 Apr 2023 14:32 UTC
7 points
4 comments1 min readEA link

Apply by 10th June: ‘Introduction to Biosecurity’ Online Course Starting in July

Lin BL15 May 2025 18:08 UTC
15 points
0 comments1 min readEA link

What Areas of AI Safety and Alignment Research are Largely Ignored?

Andy E Williams27 Dec 2024 12:19 UTC
4 points
0 comments1 min readEA link

Catastrophic Risks from AI #4: Organizational Risks

Dan H26 Jun 2023 19:36 UTC
7 points
0 comments21 min readEA link
(arxiv.org)

[Question] Is there a news-tracker about GPT-4? Why has everything become so silent about it?

Franziska Fischer29 Oct 2022 8:56 UTC
10 points
4 comments1 min readEA link

Shah and Yudkowsky on alignment failures

EliezerYudkowsky28 Feb 2022 19:25 UTC
38 points
7 comments92 min readEA link

Contration: The next threat from AI may not be like the risks we’ve feared

John Wallbank28 Jul 2024 23:19 UTC
−1 points
1 comment5 min readEA link

Re: Some thoughts on vegetarianism and veganism

Fai25 Feb 2022 20:43 UTC
46 points
3 comments8 min readEA link

Ngo and Yudkowsky on alignment difficulty

richard_ngo15 Nov 2021 22:47 UTC
71 points
13 comments94 min readEA link

“Open Source AI” is a lie, but it doesn’t have to be

Jacob-Haimes30 Apr 2024 19:42 UTC
15 points
4 comments6 min readEA link
(jacob-haimes.github.io)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex18 May 2024 15:06 UTC
13 points
10 comments2 min readEA link
(aisafety.info)

The Importance of Artificial Sentience

Jamie_Harris3 Mar 2021 17:17 UTC
71 points
10 comments11 min readEA link
(www.sentienceinstitute.org)

Applications for EU Tech Policy Fellowship 2024 now open

Jan-Willem13 Sep 2023 16:17 UTC
22 points
2 comments1 min readEA link

[Question] Trade Between Altruists With Different AI Timelines?

Spiarrow18 Mar 2025 17:53 UTC
3 points
3 comments1 min readEA link

A Framework for Assessing AI Welfare Risk

Liam 🔸2 Mar 2025 15:50 UTC
8 points
0 comments1 min readEA link

SERI MATS—Summer 2023 Cohort

a_e_r8 Apr 2023 15:32 UTC
36 points
2 comments4 min readEA link

Pile of Law and Law-Following AI

Cullen 🔸13 Jul 2022 0:29 UTC
28 points
2 comments3 min readEA link

Challenges from Career Transitions and What To Expect From Advising

ClaireB24 Jul 2025 13:22 UTC
25 points
1 comment9 min readEA link

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

Center for AI Safety2 May 2023 16:51 UTC
35 points
2 comments5 min readEA link
(newsletter.safe.ai)

Comparison of LLM scalability and performance between the U.S. and China based on benchmark

Ivanna_alvarado12 Oct 2024 21:51 UTC
8 points
0 comments34 min readEA link

More thoughts on the Human-AGI War

Ahrenbach27 Dec 2023 1:52 UTC
2 points
0 comments7 min readEA link

Long-Term Future Fund: May 2021 grant recommendations

abergal27 May 2021 6:44 UTC
110 points
17 comments57 min readEA link

Worries about latent reasoning in LLMs

CBiddulph20 Jan 2025 9:09 UTC
20 points
1 comment7 min readEA link

Long-term AI policy strategy research and implementation

Benjamin_Todd9 Nov 2021 0:00 UTC
1 point
0 comments7 min readEA link
(80000hours.org)

Set up an AIS newsletter for your group in 10 minutes per month (June edition)

gergo18 Jun 2024 6:31 UTC
34 points
0 comments1 min readEA link

[link post] AI Should Be Terrified of Humans

BrianK24 Jul 2023 11:13 UTC
29 points
0 comments1 min readEA link
(time.com)

EA & LW Forums Weekly Summary (5 − 11 Sep 22’)

Zoe Williams12 Sep 2022 23:21 UTC
36 points
0 comments13 min readEA link

AI and impact opportunities

brb24331 Mar 2022 20:23 UTC
−2 points
6 comments1 min readEA link

Future Matters #4: AI timelines, AGI risk, and existential risk from climate change

Pablo8 Aug 2022 11:00 UTC
59 points
0 comments17 min readEA link

[Question] Are there cause prioritization estimates for s-risks supporters?

jackchang11027 Mar 2023 10:32 UTC
33 points
6 comments1 min readEA link

[Question] AI Safety and Censorship

Kuiyaki13 Jul 2023 11:34 UTC
−10 points
0 comments1 min readEA link

Longtermist Implications of the Existence Neutrality Hypothesis

Maxime Riché 🔸20 Mar 2025 12:20 UTC
19 points
0 comments21 min readEA link

AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good

Cullen 🔸29 Jun 2020 16:59 UTC
9 points
0 comments2 min readEA link

Please provide feedback on AI-safety grant proposal, thanks!

Alex Long11 Dec 2022 23:29 UTC
8 points
1 comment2 min readEA link

AI safety university groups: a promising opportunity to reduce existential risk

mic30 Jun 2022 18:37 UTC
53 points
1 comment11 min readEA link

Conjecture: Internal Infohazard Policy

Connor Leahy29 Jul 2022 19:35 UTC
34 points
3 comments19 min readEA link

Assessing China’s importance as an AI superpower

JulianHazell3 Feb 2023 11:08 UTC
89 points
7 comments1 min readEA link
(muddyclothes.substack.com)

Jan Kulveit’s Corrigibility Thoughts Distilled

brook25 Aug 2023 13:42 UTC
16 points
0 comments5 min readEA link
(www.lesswrong.com)

I have some questions for the people at 80,000 Hours

yanni kyriacos14 Feb 2024 23:07 UTC
25 points
17 comments1 min readEA link

Rishi Sunak mentions “existential threats” in talk with OpenAI, DeepMind, Anthropic CEOs

Arjun Panickssery24 May 2023 21:06 UTC
44 points
2 comments1 min readEA link
(www.gov.uk)

The myth of AI “warning shots” as cavalry

Holly Elmore ⏸️ 🔸28 May 2025 16:53 UTC
125 points
31 comments9 min readEA link
(hollyelmore.substack.com)

[Question] Looking for Quick, Collaborative Systems for Truth-Seeking in Group Disagreements

EffectiveAdvocate🔸21 Jan 2025 6:32 UTC
10 points
1 comment1 min readEA link

Report: Artificial Intelligence Risk Management in Spain

JorgeTorresC15 Jun 2023 16:08 UTC
22 points
0 comments3 min readEA link
(riesgoscatastroficosglobales.com)

Medical Windfall Prizes

PeterMcCluskey7 Feb 2025 0:13 UTC
5 points
0 comments5 min readEA link
(bayesianinvestor.com)

[Extended Deadline: Jan 23rd] Announcing the PIBBSS Summer Research Fellowship

nora18 Dec 2021 16:54 UTC
36 points
1 comment1 min readEA link

#191 (Part 2) – Government and society after AGI (Carl Shulman on the 80,000 Hours Podcast)

80000_Hours11 Jul 2024 19:26 UTC
23 points
1 comment18 min readEA link

Announcing new round of “Key Phenomena in AI Risk” Reading Group

Dušan D. Nešić (Dushan)19 Oct 2023 11:05 UTC
8 points
0 comments1 min readEA link

What term to use for AI in different policy contexts?

oeg6 Sep 2023 15:08 UTC
18 points
3 comments9 min readEA link

EA as Antichrist: Understanding Peter Thiel

Ben_West🔸6 Aug 2025 17:31 UTC
98 points
51 comments14 min readEA link

HIRING: Inform and shape a new project on AI safety at Partnership on AI

Madhulika Srikumar24 Nov 2021 16:29 UTC
11 points
2 comments1 min readEA link

Dutch AI Safety Coordination Forum: An Experiment

HenningB21 Nov 2023 16:18 UTC
21 points
0 comments4 min readEA link

AI risk hub in Singapore?

kokotajlod29 Oct 2020 11:51 UTC
26 points
4 comments4 min readEA link

[Question] What type of Master’s is best for AI policy work?

Milan Griffes22 Feb 2019 20:04 UTC
14 points
7 comments1 min readEA link

Rethinking the Value of Working on AI Safety

JohanEA9 Jan 2025 14:15 UTC
47 points
21 comments10 min readEA link

[Question] How to navigate potential infohazards

more better 4 Mar 2023 21:28 UTC
16 points
7 comments1 min readEA link

Hacker-AI – Does it already exist?

Erland Wittkotter7 Nov 2022 14:01 UTC
0 points
1 comment11 min readEA link

🏜️ EA is in Albuquerque!

Alex Long12 May 2023 22:09 UTC
18 points
2 comments1 min readEA link

How We Might All Die in A Year

Greg_Colbourn ⏸️ 28 Mar 2025 13:31 UTC
12 points
6 comments21 min readEA link
(x.com)

4 types of AGI selection, and how to constrain them

Remmelt9 Aug 2023 15:02 UTC
7 points
0 comments3 min readEA link

Discontinuous progress in history: an update

AI Impacts17 Apr 2020 16:28 UTC
69 points
3 comments24 min readEA link

[Question] Survey about Copyright and generative AI allowed here?

Lee O'Brien9 Aug 2024 12:27 UTC
0 points
1 comment1 min readEA link

$250K in Prizes: SafeBench Competition Announcement

Center for AI Safety3 Apr 2024 22:07 UTC
47 points
0 comments1 min readEA link

How I feel about AI consciousness

Yadav5 Jun 2025 16:49 UTC
10 points
0 comments3 min readEA link
(robertandgaurav.substack.com)

The Industrial Explosion

rosehadshar26 Jun 2025 14:41 UTC
39 points
1 comment15 min readEA link
(www.forethought.org)

Long-Term Future Fund: Ask Us Anything!

AdamGleave3 Dec 2020 13:44 UTC
89 points
153 comments1 min readEA link

Social scientists interested in AI safety should consider doing direct technical AI safety research, (possibly meta-research), or governance, support roles, or community building instead

Vael Gates20 Jul 2022 23:01 UTC
65 points
8 comments18 min readEA link

Comments on Manheim’s “What’s in a Pause?”

RobBensinger18 Sep 2023 12:16 UTC
74 points
11 comments6 min readEA link

Chinese Researchers Crack ChatGPT: Replicating OpenAI’s Advanced AI Model

Evan_Gaensbauer5 Jan 2025 3:50 UTC
1 point
0 comments1 min readEA link
(www.geeky-gadgets.com)

How long will reaching a Risk Awareness Moment and CHARTS agreement take?

Yadav6 Sep 2023 16:39 UTC
12 points
0 comments14 min readEA link

Air-gapping evaluation and support

Ryan Kidd26 Dec 2022 22:52 UTC
22 points
12 comments2 min readEA link

‘Now Is the Time of Monsters’

Aaron Goldzimer12 Jan 2025 23:31 UTC
25 points
0 comments1 min readEA link
(www.nytimes.com)

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research

Arthur Conmy25 Feb 2025 22:38 UTC
11 points
0 comments7 min readEA link

How to become an AI safety researcher

peterbarnett12 Apr 2022 11:33 UTC
113 points
15 comments14 min readEA link

[Optional] AI safety research: an overview of careers

EA Italy17 Jan 2023 11:06 UTC
1 point
0 comments7 min readEA link

Intro to Safety Engineering

Madhav Malhotra19 Oct 2022 23:44 UTC
4 points
0 comments1 min readEA link

Critique of Superintelligence Part 5

James Fodor13 Dec 2018 5:19 UTC
12 points
2 comments6 min readEA link

Tarbell Fellowship 2024 - Applications Open (AI Journalism)

Cillian_28 Sep 2023 10:38 UTC
58 points
1 comment3 min readEA link

Katja Grace on Slowing Down AI, AI Expert Surveys And Estimating AI Risk

Michaël Trazzi16 Sep 2022 18:00 UTC
48 points
6 comments3 min readEA link
(theinsideview.ai)

AI Safety Overview: CERI Summer Research Fellowship

Jamie B24 Mar 2022 15:12 UTC
29 points
0 comments2 min readEA link

Contracting Opportunity: Be a shortform video editor for the new 80,000 Hours Video Program (even if you haven’t edited before!)

ChanaMessinger15 Apr 2025 22:22 UTC
44 points
1 comment2 min readEA link
(80000hours.org)

AI Safety Chatbot

markov21 Dec 2023 14:09 UTC
49 points
3 comments4 min readEA link

EA Berkeley Presents: Universal Ownership: Is Index Investing the New Socially Responsible Investing?

Mahendra Prasad10 Mar 2022 6:58 UTC
7 points
0 comments1 min readEA link

AI Manufactured Crisis (don’t trust AI to protect us from AI)

WobblyPanda21 Jun 2023 11:12 UTC
4 points
0 comments1 min readEA link

There’s No Fire Alarm for Artificial General Intelligence

EA Forum Archives14 Oct 2017 2:41 UTC
30 points
1 comment25 min readEA link
(www.lesswrong.com)

The heritability of human values: A behavior genetic critique of Shard Theory

Geoffrey Miller20 Oct 2022 15:53 UTC
49 points
12 comments21 min readEA link

Philanthropists Probably Shouldn’t Mission-Hedge AI Progress

MichaelDickens23 Aug 2022 23:03 UTC
28 points
9 comments36 min readEA link

Was Releasing Claude-3 Net-Negative

Logan Riggs27 Mar 2024 17:41 UTC
12 points
1 comment4 min readEA link

Exploring the Esoteric Pathways to AI Sentience (Part One)

Caruso27 Apr 2024 12:22 UTC
−6 points
0 comments2 min readEA link

Final Report of the National Security Commission on Artificial Intelligence (NSCAI, 2021)

MichaelA🔸1 Jun 2021 8:19 UTC
51 points
3 comments4 min readEA link
(www.nscai.gov)

Strategic Perspectives on Transformative AI Governance: Introduction

MMMaas2 Jul 2022 11:20 UTC
115 points
18 comments4 min readEA link

Marisa, the Co-Founder of EA Anywhere, Has Passed Away

carrickflynn17 May 2024 22:49 UTC
520 points
33 comments1 min readEA link

Should we break up Google DeepMind?

Hauke Hillebrandt22 Apr 2024 9:16 UTC
34 points
13 comments4 min readEA link

Metaculus Presents: Does Generative AI Infringe Copyright?

christian6 Nov 2023 23:41 UTC
5 points
0 comments1 min readEA link

2023 Open Philanthropy AI Worldviews Contest: Odds of Artificial General Intelligence by 2043

srhoades1014 Mar 2023 20:32 UTC
19 points
0 comments46 min readEA link

ML4Good UK—Applications Open

Nia2 Jan 2024 18:20 UTC
21 points
0 comments1 min readEA link

Proposals for the AI Regulatory Sandbox in Spain

Guillem Bas27 Apr 2023 10:33 UTC
55 points
2 comments11 min readEA link
(riesgoscatastroficosglobales.com)

[Question] Forecasting thread: How does AI risk level vary based on timelines?

elifland14 Sep 2022 23:56 UTC
47 points
8 comments1 min readEA link

Artificially sentient beings: Moral, political, and legal issues

Fırat Akova1 Aug 2023 17:48 UTC
20 points
2 comments1 min readEA link
(doi.org)

Replacement for PONR concept

kokotajlod2 Sep 2022 0:38 UTC
14 points
1 comment2 min readEA link

Recruiting Skilled Volunteers

The BOOM3 Nov 2022 14:36 UTC
−9 points
14 comments1 min readEA link

A non-anthropomorphized view of LLMs

Jian Xin Lim🔹7 Jul 2025 1:19 UTC
2 points
2 comments1 min readEA link
(addxorrol.blogspot.com)

The Fragility of Naive Dynamism

Davidmanheim19 May 2025 7:53 UTC
10 points
1 comment17 min readEA link

Is Pausing AI Possible?

Richard Annilo9 Oct 2024 13:22 UTC
89 points
4 comments18 min readEA link

Asterisk Magazine Issue 03: AI

alejandro24 Jul 2023 15:53 UTC
34 points
3 comments1 min readEA link
(asteriskmag.com)

Information security considerations for AI and the long term future

Jeffrey Ladish2 May 2022 20:53 UTC
134 points
8 comments11 min readEA link

There are a lot of upcoming retreats/conferences between March and July (2025)

gergo18 Feb 2025 9:28 UTC
18 points
2 comments1 min readEA link

Max Tegmark — The AGI Entente Delusion

Matrice Jacobine13 Oct 2024 17:42 UTC
0 points
1 comment1 min readEA link
(www.lesswrong.com)

The case for taking AI seriously as a threat to humanity (Kelsey Piper)

EA Handbook15 Oct 2020 7:00 UTC
11 points
1 comment1 min readEA link
(www.vox.com)

#177 – Recent AI breakthroughs and navigating the growing rift between AI safety and accelerationist camps (Nathan Labenz on the 80,000 Hours Podcast)

80000_Hours31 Jan 2024 19:37 UTC
15 points
0 comments16 min readEA link

[Podcast] Ajeya Cotra on worldview diversification and how big the future could be

Eevee🔹22 Jan 2021 23:57 UTC
57 points
20 comments1 min readEA link
(80000hours.org)

[Question] Next week I’m interviewing tech policy expert Teddy Collins who has worked in the White House, DeepMind and CSET. What should I ask him?

Robert_Wiblin7 Jul 2023 14:05 UTC
15 points
4 comments1 min readEA link

VSPE vs. flattery: Testing emotional scaffolding for early-stage alignment

Astelle Kay24 Jun 2025 9:39 UTC
2 points
1 comment1 min readEA link

Revisiting the Evolution Anchor in the Biological Anchors Report

Janvi18 Mar 2024 3:01 UTC
13 points
1 comment4 min readEA link

The ‘Old AI’: Lessons for AI governance from early electricity regulation

Sam Clarke19 Dec 2022 2:46 UTC
58 points
1 comment13 min readEA link

[Link post] Promising Paths to Alignment—Connor Leahy | Talk

frances_lorenz14 May 2022 15:58 UTC
17 points
0 comments1 min readEA link

MATS Applications + Research Directions I’m Currently Excited About

Neel Nanda6 Feb 2025 11:03 UTC
31 points
3 comments8 min readEA link

When 2/3rds of the world goes against you

Jeffrey Kursonis2 Jul 2022 20:34 UTC
2 points
2 comments9 min readEA link

How we use back-of-the-envelope calculations in our grantmaking

Open Philanthropy28 May 2025 23:22 UTC
79 points
2 comments10 min readEA link

TamperSec is hiring for 3 Key Roles!

Tatiana K. Nesic Skuratova27 Feb 2025 12:23 UTC
10 points
0 comments5 min readEA link

How to reduce risks related to conscious AI: A user guide [Conscious AI & Public Perception]

Jay Luong5 Jul 2024 14:19 UTC
9 points
1 comment15 min readEA link

How could we know that an AGI system will have good consequences?

So8res7 Nov 2022 22:42 UTC
25 points
0 comments5 min readEA link

Emergency pod: Judge plants a legal time bomb under OpenAI (with Rose Chan Loui)

80000_Hours7 Mar 2025 19:24 UTC
62 points
18 comments2 min readEA link

Saying Goodbye

sapphire3 Aug 2025 23:51 UTC
14 points
2 comments4 min readEA link

AI Alignment 2018-2019 Review

Habryka [Deactivated]28 Jan 2020 21:14 UTC
28 points
0 comments6 min readEA link
(www.lesswrong.com)

From Conflict to Coexistence: Rewriting the Game Between Humans and AGI

Michael Batell6 May 2025 5:09 UTC
15 points
2 comments37 min readEA link

AISN #27: Defensive Accelerationism, A Retrospective On The OpenAI Board Saga, And A New AI Bill From Senators Thune And Klobuchar

Center for AI Safety7 Dec 2023 15:57 UTC
10 points
0 comments6 min readEA link
(newsletter.safe.ai)

INTERVIEW: StakeOut.AI w/ Dr. Peter Park

Jacob-Haimes5 Mar 2024 18:04 UTC
21 points
7 comments1 min readEA link
(into-ai-safety.github.io)

Political Funding Expertise (Post 6 of 7 on AI Governance)

Jason Green-Lowe19 Jun 2025 14:14 UTC
33 points
1 comment14 min readEA link

What Should the Average EA Do About AI Alignment?

Raemon25 Feb 2017 20:07 UTC
42 points
39 comments7 min readEA link

My take on What We Owe the Future

elifland1 Sep 2022 18:07 UTC
357 points
51 comments26 min readEA link

Last week to apply for the Futurekind AI Fellowship! (deadline: April 1)

Jay Luong23 Mar 2025 23:16 UTC
24 points
0 comments1 min readEA link

Against AI As An Existential Risk

Noah Birnbaum30 Jul 2024 19:24 UTC
6 points
3 comments1 min readEA link
(irrationalitycommunity.substack.com)

AI Agents raised $2,000 for EA charities & used the EA Forum

David_R 🔸4 Jun 2025 22:18 UTC
16 points
0 comments1 min readEA link

The ‘Bad Parent’ Problem: Why Human Society Complicates AI Alignment

Beyond Singularity5 Apr 2025 21:08 UTC
11 points
1 comment3 min readEA link

Biological Anchors external review by Jennifer Lin (linkpost)

peterhartree30 Nov 2022 13:06 UTC
36 points
0 comments1 min readEA link
(docs.google.com)

Is Text Watermarking a lost cause?

Egor Timatkov1 Oct 2024 13:07 UTC
7 points
0 comments10 min readEA link

Crash scenario 1: Rapidly mobilise for a 2025 AI crash

Remmelt11 Apr 2025 6:54 UTC
8 points
0 comments1 min readEA link

Historical Precedents for International AI Safety Collaborations

ZacRichardson13 Jul 2025 21:30 UTC
20 points
1 comment55 min readEA link

GPT-2 as step toward general intelligence (Alexander, 2019)

Will Aldred18 Jul 2022 16:14 UTC
42 points
0 comments2 min readEA link
(slatestarcodex.com)

[Question] What EAG sessions would you like on AI?

Nathan Young20 Mar 2022 17:05 UTC
7 points
10 comments1 min readEA link

Palisade is hiring Research Engineers

Charlie Rogers-Smith11 Nov 2023 3:09 UTC
23 points
0 comments3 min readEA link

[Question] Are AI risks tractable?

defun 🔸21 May 2024 13:45 UTC
23 points
1 comment1 min readEA link

My kids won’t be workers

Yadav15 Aug 2025 6:53 UTC
8 points
0 comments6 min readEA link
(y1d2.com)

A Cognitive Instrument on the Terminal Contest

Ihor Ivliev23 Jul 2025 23:30 UTC
0 points
1 comment8 min readEA link

The EU AI Act: A Simple Explanation—A Stanford Study Reveals the gaps of ChatGPT and 9 more

Sparkvibe26 Jun 2023 8:59 UTC
3 points
1 comment1 min readEA link
(youtu.be)

Reinforcement Learning: A Non-Technical Primer on o1 and DeepSeek-R1

AlexChalk9 Feb 2025 23:58 UTC
4 points
0 comments9 min readEA link
(alexchalk.net)

Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems

80000_Hours2 Oct 2018 11:49 UTC
6 points
0 comments185 min readEA link

Scrutinizing AI Risk (80K, #81) - v. quick summary

Ben23 Jul 2020 19:02 UTC
10 points
1 comment3 min readEA link

Part 1: The AI Safety community has four main work groups, Strategy, Governance, Technical and Movement Building

PeterSlattery25 Nov 2022 3:45 UTC
72 points
7 comments6 min readEA link

Apply for MATS Winter 2023-24!

utilistrutil21 Oct 2023 2:34 UTC
34 points
2 comments5 min readEA link
(www.lesswrong.com)

Feedback Request on EA Philippines’ Career Advice Research for Technical AI Safety

BrianTan3 Oct 2020 10:39 UTC
19 points
5 comments4 min readEA link

Let’s think about slowing down AI

Katja_Grace23 Dec 2022 19:56 UTC
339 points
9 comments38 min readEA link

Classifying sources of AI x-risk

Sam Clarke8 Aug 2022 18:18 UTC
41 points
4 comments3 min readEA link

Finishing The SB-1047 Documentary

Michaël Trazzi28 Oct 2024 20:26 UTC
67 points
0 comments4 min readEA link

[Question] What do we know about Mustafa Suleyman’s position on AI Safety?

Chris Leong13 Aug 2023 19:41 UTC
14 points
3 comments1 min readEA link

My guess for the most cost effective AI Safety projects

Linda Linsefors24 Jan 2024 12:21 UTC
26 points
2 comments4 min readEA link

Debate: should EA avoid using AI art outside of research?

titotal30 Apr 2025 11:10 UTC
34 points
29 comments3 min readEA link

20+ tips, tricks, lessons and thoughts on hosting hackathons

gergo6 Nov 2023 10:59 UTC
14 points
0 comments11 min readEA link

Discovering Language Model Behaviors with Model-Written Evaluations

evhub20 Dec 2022 20:09 UTC
25 points
0 comments7 min readEA link
(www.anthropic.com)

Stable totalitarianism: an overview

80000_Hours29 Oct 2024 16:07 UTC
36 points
1 comment20 min readEA link
(80000hours.org)

Biweekly AI Safety Comms Meetup

Vishakha Agrawal17 Jul 2025 7:43 UTC
2 points
0 comments1 min readEA link

Conflicted on AI Politics

Jeff Kaufman 🔸11 Jun 2025 12:39 UTC
39 points
3 comments2 min readEA link

Regulation of AI Use for Personal Data Protection: Comparison of Global Strategies and Opportunities for Latin America

Lisbeth Guzman 14 Oct 2024 13:22 UTC
10 points
1 comment21 min readEA link

New Speaker Series on AI Alignment Starting March 3

Zechen Zhang26 Feb 2022 10:58 UTC
5 points
0 comments1 min readEA link

Who ordered alignment’s apple?

Eleni_A28 Aug 2022 14:24 UTC
5 points
0 comments3 min readEA link

AI Safety is Dropping the Ball on Clown Attacks

trevor121 Oct 2023 23:15 UTC
−17 points
0 comments34 min readEA link

[Question] What kind of organization should be the first to develop AGI in a potential arms race?

Eevee🔹17 Jul 2022 17:41 UTC
10 points
2 comments1 min readEA link

[Fiction] Improved Governance on the Critical Path to AI Alignment by 2045.

Jackson Wagner18 May 2022 15:50 UTC
20 points
1 comment12 min readEA link

Critique of Superintelligence Part 1

James Fodor13 Dec 2018 5:10 UTC
22 points
13 comments8 min readEA link

What I am open to changing my mind about: polls and debates

BiologyTranslated9 May 2025 10:13 UTC
8 points
9 comments2 min readEA link

Stampy’s AI Safety Info—New Distillations #2 [April 2023]

markov9 May 2023 13:34 UTC
13 points
1 comment1 min readEA link
(aisafety.info)

I’m Cullen O’Keefe, a Policy Researcher at OpenAI, AMA

Cullen 🔸11 Jan 2020 4:13 UTC
45 points
68 comments1 min readEA link

Book review: Architects of Intelligence by Martin Ford (2018)

Ofer11 Aug 2020 17:24 UTC
11 points
1 comment2 min readEA link

The Self in Artificial Consciousness: A Buddhist Investigation into Advanced AI

Ryan Combes24 Oct 2023 4:11 UTC
10 points
2 comments1 min readEA link
(drive.google.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

Julian Nalenz19 Feb 2025 20:36 UTC
2 points
0 comments3 min readEA link

[Crosspost] AI Regulation May Be More Important Than AI Alignment For Existential Safety

Otto24 Aug 2023 16:01 UTC
14 points
2 comments5 min readEA link

2020 AI Alignment Literature Review and Charity Comparison

Larks21 Dec 2020 15:25 UTC
155 points
16 comments68 min readEA link

Democratising AI Alignment: Challenges and Proposals

Lloy2 🔹5 May 2025 14:50 UTC
2 points
2 comments4 min readEA link

Alignment 201 curriculum

richard_ngo12 Oct 2022 19:17 UTC
94 points
9 comments1 min readEA link
(www.agisafetyfundamentals.com)

The limited upside of interpretability

Peter S. Park15 Nov 2022 20:22 UTC
23 points
3 comments10 min readEA link

DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman18 May 2024 3:00 UTC
54 points
1 comment4 min readEA link

The Unknowable Catastrophe

Aino6 Jul 2023 15:37 UTC
3 points
0 comments3 min readEA link

Debating AI’s Moral Status: The Most Humane and Silliest Thing Humans Do(?)

Soe Lin29 Sep 2024 5:01 UTC
5 points
5 comments3 min readEA link

Turing-Test-Passing AI implies Aligned AI

Roko31 Dec 2024 20:22 UTC
0 points
0 comments5 min readEA link

[Closed] Agent Foundations track in MATS

Vanessa31 Oct 2023 8:14 UTC
19 points
0 comments1 min readEA link
(www.matsprogram.org)

Considerations on transformative AI and explosive growth from a semiconductor-industry perspective

Muireall31 May 2023 1:11 UTC
23 points
1 comment2 min readEA link
(muireall.space)

AI Safety 101: AGI

markov21 Dec 2023 14:18 UTC
2 points
1 comment33 min readEA link

U.S. Government Seeks Input on National AI R&D Strategic Plan—Deadline May 29

Matt Brooks27 May 2025 1:53 UTC
8 points
1 comment1 min readEA link

The inordinately slow spread of good AGI conversations in ML

RobBensinger29 Jun 2022 4:02 UTC
59 points
2 comments8 min readEA link

Why AI Regulation Violates the First Amendment

Locke1 Jun 2024 20:44 UTC
−15 points
0 comments5 min readEA link

Reactive devaluation: Bias in Evaluating AGI X-Risks

Remmelt30 Dec 2022 9:02 UTC
2 points
9 comments1 min readEA link

Skilling-up in ML Engineering for Alignment: request for comments

Callum McDougall24 Apr 2022 6:40 UTC
8 points
0 comments1 min readEA link

Samotsvety’s AI risk forecasts

elifland9 Sep 2022 4:01 UTC
175 points
30 comments4 min readEA link

“Normal accidents” and AI systems

Eleni_A8 Aug 2022 18:43 UTC
5 points
1 comment1 min readEA link
(www.achan.ca)

AI Benefits Post 4: Outstanding Questions on Selecting Benefits

Cullen 🔸14 Jul 2020 17:24 UTC
6 points
0 comments5 min readEA link

The Hasty Start of Budapest AI Safety, 6-month update from a non-STEM founder

gergo3 Jan 2024 12:56 UTC
9 points
1 comment7 min readEA link

[Question] What are some resources (articles, videos) that show off what the current state of the art in AI is? (for a layperson who doesn’t know much about AI)

james6 Dec 2021 21:06 UTC
10 points
6 comments1 min readEA link

John Cochrane on why regulation is the wrong tool for AI Safety

ezrah26 Sep 2024 8:48 UTC
3 points
2 comments1 min readEA link
(www.grumpy-economist.com)

[Linkpost] Eric Schwitzgebel: AI systems must not confuse users about their sentience or moral status

Zachary Brown🔸18 Aug 2023 17:21 UTC
6 points
0 comments2 min readEA link
(www.sciencedirect.com)

Video and transcript of talk on AI welfare

Joe_Carlsmith22 May 2025 16:15 UTC
22 points
1 comment28 min readEA link
(joecarlsmith.substack.com)

Announcing the Moonshot Alignment Program

Sharon Mwaniki22 Jul 2025 13:12 UTC
5 points
0 comments3 min readEA link

UK Prime Minister Rishi Sunak’s Speech on AI

Tobias Häberli26 Oct 2023 10:34 UTC
112 points
6 comments8 min readEA link
(www.gov.uk)

Enhancing Biometric Data Protection in Latin America Based on the European Experience

Ana Sofía Jiménez 13 Aug 2024 13:13 UTC
13 points
1 comment4 min readEA link

The counting argument for scheming (Sections 4.1 and 4.2 of “Scheming AIs”)

Joe_Carlsmith6 Dec 2023 19:28 UTC
9 points
1 comment7 min readEA link

AGI with feelings

Nicolai Meberg7 Dec 2022 16:00 UTC
−13 points
0 comments1 min readEA link
(twitter.com)

Comparing AI Labs and Pharmaceutical Companies

mxschons13 Nov 2024 14:51 UTC
13 points
0 comments1 min readEA link
(mxschons.com)

A request to keep pessimistic AI posts actionable.

tcelferact11 May 2023 15:35 UTC
27 points
9 comments1 min readEA link

Hiring a CEO & EU Tech Policy Lead to launch an AI policy career org in Europe

Cillian_6 Dec 2023 13:52 UTC
50 points
0 comments7 min readEA link

[Cross-post] Change my mind: we should define and measure the effectiveness of advanced AI

David Johnston6 Apr 2022 0:20 UTC
4 points
0 comments7 min readEA link

AI Risk Intro 2: Solving The Problem

L Rudolf L24 Sep 2022 9:33 UTC
11 points
0 comments28 min readEA link
(www.perfectlynormal.co.uk)

Convergence 2024 Impact Review

David_Kristoffersson24 Mar 2025 20:28 UTC
38 points
0 comments14 min readEA link

[Question] Questions on databases of AI Risk estimates

Froolow2 Oct 2022 9:12 UTC
24 points
12 comments2 min readEA link

AGI as a Black Swan Event

Stephen McAleese4 Dec 2022 23:35 UTC
5 points
2 comments7 min readEA link
(www.lesswrong.com)

OpenAI: Preparedness framework

Zach Stein-Perlman18 Dec 2023 18:30 UTC
24 points
0 comments4 min readEA link
(openai.com)

#197 – On whether Anthropic’s AI safety policy is up to the task (Nick Joseph on The 80,000 Hours Podcast)

80000_Hours22 Aug 2024 15:34 UTC
9 points
0 comments18 min readEA link

Takeaways from a survey on AI alignment resources

DanielFilan5 Nov 2022 23:45 UTC
20 points
9 comments6 min readEA link
(www.lesswrong.com)

[Question] submissive ai

David turner21 Nov 2023 14:28 UTC
−5 points
0 comments1 min readEA link

AI Safety Needs Great Engineers

Andy Jones23 Nov 2021 21:03 UTC
98 points
13 comments4 min readEA link

The fundamental human value is power.

Linyphia30 Mar 2023 15:15 UTC
−1 points
5 comments1 min readEA link

Most Leading AI Experts Believe That Advanced AI Could Be Extremely Dangerous to Humanity

jai4 May 2023 16:19 UTC
31 points
1 comment1 min readEA link
(laneless.substack.com)

Pivotal outcomes and pivotal processes

Andrew Critch17 Jun 2022 23:43 UTC
49 points
1 comment4 min readEA link

Safety-concerned EAs should prioritize AI governance over alignment

sammyboiz🔸11 Jun 2024 15:47 UTC
59 points
20 comments1 min readEA link

Moral Spillover in Human-AI Interaction

Katerina Manoli5 Jun 2023 15:20 UTC
17 points
1 comment13 min readEA link

[Question] How can we secure more research positions at our universities for x-risk researchers?

Neil Crawford6 Sep 2022 14:41 UTC
3 points
2 comments1 min readEA link

[Question] What does the Project Management role look like in AI safety?

gvst14 May 2022 19:29 UTC
10 points
1 comment1 min readEA link

Some quick thoughts on “AI is easy to control”

MikhailSamin7 Dec 2023 12:23 UTC
5 points
4 comments7 min readEA link

An ML safety insurance company—shower thoughts

EdoArad18 Oct 2021 7:45 UTC
15 points
4 comments1 min readEA link

[Question] Is a career in making AI systems more secure a meaningful way to mitigate the X-risk posed by AGI?

Kyle O’Brien13 Feb 2022 7:05 UTC
14 points
4 comments1 min readEA link

Incident reporting for AI safety

Zach Stein-Perlman19 Jul 2023 17:00 UTC
18 points
1 comment18 min readEA link

AISN #9: Statement on Extinction Risks, Competitive Pressures, and When Will AI Reach Human-Level?

Center for AI Safety6 Jun 2023 15:56 UTC
12 points
2 comments7 min readEA link
(newsletter.safe.ai)

Video and transcript of talk on “Can goodness compete?”

Joe_Carlsmith17 Jul 2025 17:59 UTC
34 points
4 comments34 min readEA link
(joecarlsmith.substack.com)

Thinking About Propensity Evaluations

Maxime Riché 🔸19 Aug 2024 9:24 UTC
12 points
1 comment27 min readEA link

How to Diversify Conceptual AI Alignment: the Model Behind Refine

adamShimi20 Jul 2022 10:44 UTC
43 points
0 comments9 min readEA link
(www.alignmentforum.org)

An entire category of risks is undervalued by EA [Summary of previous forum post]

Richard R5 Sep 2022 15:07 UTC
79 points
5 comments5 min readEA link

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities

Center for AI Safety29 Aug 2023 15:03 UTC
12 points
0 comments8 min readEA link
(newsletter.safe.ai)

Alex Lawsen On Forecasting AI Progress

Michaël Trazzi6 Sep 2022 9:53 UTC
38 points
1 comment2 min readEA link
(theinsideview.ai)

Register for the Stanford Existential Risks Initiative (SERI) Symposium

Grant Higerd-Rusli18 Mar 2025 3:50 UTC
7 points
0 comments1 min readEA link
(cisac.fsi.stanford.edu)

Call for submissions: AI Safety Special Session at the Conference on Artificial Life (ALIFE 2023)

Rory Greig5 Feb 2023 16:37 UTC
16 points
0 comments2 min readEA link
(humanvaluesandartificialagency.com)

AI Forecasting Question Database (Forecasting infrastructure, part 3)

terraform3 Sep 2019 14:57 UTC
23 points
2 comments4 min readEA link

Stackelberg Games and Cooperative Commitment: My Thoughts and Reflections on a 2-Month Research Project

Ben Bucknall13 Dec 2021 10:49 UTC
18 points
1 comment9 min readEA link

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight

Center for AI Safety1 Aug 2023 15:24 UTC
15 points
0 comments8 min readEA link

Looking for a Document to Introduce AI Risks to Newbies

Jr2217 Jun 2024 13:02 UTC
2 points
3 comments1 min readEA link

Announcing the AI Safety Summit Talks with Yoshua Bengio

Otto14 May 2024 12:49 UTC
33 points
1 comment1 min readEA link

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

Center for AI Safety8 Aug 2023 15:52 UTC
12 points
0 comments5 min readEA link
(newsletter.safe.ai)

Spicy takes about AI policy (Clark, 2022)

Will Aldred9 Aug 2022 13:49 UTC
44 points
0 comments3 min readEA link
(twitter.com)

How should norms of academic writing and publishing be changed once AI systems become superhuman in more respects?

simonfriederich24 Nov 2023 13:35 UTC
10 points
0 comments1 min readEA link
(link.springer.com)

Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes

christian8 Jul 2024 17:20 UTC
17 points
0 comments1 min readEA link
(www.metaculus.com)

Soares, Tallinn, and Yudkowsky discuss AGI cognition

EliezerYudkowsky29 Nov 2021 17:28 UTC
15 points
0 comments40 min readEA link

Increased Availability and Willingness for Deployment of Resources for Effective Altruism and Long-Termism

Evan_Gaensbauer29 Dec 2021 20:20 UTC
46 points
1 comment2 min readEA link

Continuous doesn’t mean slow

Tom_Davidson10 May 2023 12:17 UTC
64 points
1 comment4 min readEA link

[Question] What is the best article to introduce someone to AI safety for the first time?

trevor122 Nov 2022 2:06 UTC
2 points
3 comments1 min readEA link

Make a neural network in ~10 minutes

Arjun Yadav25 Apr 2022 18:36 UTC
3 points
0 comments4 min readEA link
(arjunyadav.net)

AI for AI safety

Joe_Carlsmith14 Mar 2025 15:00 UTC
34 points
1 comment17 min readEA link
(joecarlsmith.substack.com)

Speed arguments against scheming (Section 4.4-4.7 of “Scheming AIs”)

Joe_Carlsmith8 Dec 2023 21:10 UTC
6 points
0 comments11 min readEA link

Argument Against Impact: EU Is Not an AI Superpower

EU AI Governance31 Jan 2022 9:48 UTC
35 points
9 comments4 min readEA link

Aligning AI with Humans by Leveraging Legal Informatics

johnjnay18 Sep 2022 7:43 UTC
20 points
11 comments3 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo29 Jun 2023 7:23 UTC
8 points
0 comments1 min readEA link
(www.campaignforaisafety.org)

CoreWeave Is A Time Bomb

Remmelt31 Mar 2025 3:52 UTC
10 points
2 comments2 min readEA link
(www.wheresyoured.at)

Skepticism towards claims about the views of powerful institutions

tlevin13 Feb 2025 7:40 UTC
20 points
1 comment4 min readEA link

What is the role of Bayesian ML for AI alignment/safety?

mariushobbhahn11 Jan 2022 8:07 UTC
39 points
6 comments3 min readEA link

Niceness is unnatural

So8res13 Oct 2022 1:30 UTC
20 points
1 comment8 min readEA link

[Question] How confident are you that it’s preferable for America to develop AGI before China does?

ScienceMon🔸22 Feb 2025 13:37 UTC
218 points
53 comments1 min readEA link

Opinionated take on EA and AI Safety

sammyboiz🔸2 Mar 2025 9:37 UTC
75 points
18 comments1 min readEA link

Technical AI Safety Research Landscape [Slides]

Magdalena Wache18 Sep 2023 13:56 UTC
31 points
0 comments4 min readEA link

AI governance needs a theory of victory

Corin Katzke21 Jun 2024 16:08 UTC
84 points
8 comments20 min readEA link
(www.convergenceanalysis.org)

Feedback wanted! On script for an upcoming ~12 minute Rob Miles video on AI x-risk.

melissasamworth23 Jan 2025 21:46 UTC
25 points
0 comments1 min readEA link

ARIA is looking for topics for roundtables

Nathan_Barnard26 Aug 2022 19:14 UTC
34 points
11 comments1 min readEA link

Visit Mexico City in January & February to interact with the AI Futures Fellowship

AmAristizabal28 Jul 2023 16:44 UTC
45 points
0 comments2 min readEA link

Animal Weapons: Lessons for Humans in the Age of X-Risk

Damin Curtis🔹4 Jul 2023 14:43 UTC
32 points
1 comment10 min readEA link

AI labs’ statements on governance

Zach Stein-Perlman4 Jul 2023 16:30 UTC
28 points
1 comment36 min readEA link

Reducing LLM deception at scale with self-other overlap fine-tuning

Marc Carauleanu13 Mar 2025 19:09 UTC
8 points
0 comments6 min readEA link

Retrospective: PIBBSS Fellowship 2023

Dušan D. Nešić (Dushan)16 Feb 2024 17:48 UTC
17 points
2 comments8 min readEA link

Wild Animal Welfare Scenarios for AI Doom

utilistrutil8 Jun 2023 19:41 UTC
54 points
2 comments3 min readEA link

What Does an ASI Political Ecology Mean for Human Survival?

Nathan Sidney23 Feb 2025 8:53 UTC
7 points
3 comments1 min readEA link

Sentience in Machines—How Do We Test for This Objectively?

Mayowa Osibodu20 Mar 2023 5:20 UTC
10 points
0 comments2 min readEA link
(www.researchgate.net)

Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI!

Zhijing Jin25 Sep 2023 16:20 UTC
14 points
5 comments2 min readEA link

Apply to the 2025 PIBBSS Summer Research Fellowship

Dušan D. Nešić (Dushan)24 Dec 2024 10:28 UTC
6 points
0 comments2 min readEA link

Toby Ord’s new report on lessons from the development of the atomic bomb

Ishan Mukherjee22 Nov 2022 10:37 UTC
65 points
3 comments1 min readEA link
(www.governance.ai)

Simulating a possible alignment solution in GPT2-medium using Archetypal Transfer Learning

Miguel2 May 2023 16:23 UTC
4 points
0 comments18 min readEA link

Survey of 2,778 AI authors: six parts in pictures

Katja_Grace6 Jan 2024 4:43 UTC
176 points
11 comments2 min readEA link

Fermi estimation of the impact you might have working on AI safety

frib13 May 2022 13:30 UTC
24 points
13 comments1 min readEA link

Yudkowsky and Christiano on AI Takeoff Speeds [LINKPOST]

aog5 Apr 2022 0:57 UTC
15 points
0 comments11 min readEA link

[Question] Using Older AI Models as a Form of Boycott

Jacob121 Jul 2025 12:13 UTC
5 points
0 comments1 min readEA link

When do experts think human-level AI will be created?

Vishakha Agrawal2 Jan 2025 23:17 UTC
33 points
9 comments2 min readEA link
(aisafety.info)

Ego-Centric Architecture for AGI Safety: Technical Core, Falsifiable Predictions, and a Minimal Experiment

Samuel Pedrielli30 Jul 2025 14:37 UTC
1 point
1 comment3 min readEA link

Linkpost—Beyond Hyperanthropomorphism: Or, why fears of AI are not even wrong, and how to make them real

Locke24 Aug 2022 16:24 UTC
−4 points
3 comments2 min readEA link
(studio.ribbonfarm.com)

AGI ruin mostly rests on strong claims about alignment and deployment, not about society

RobBensinger24 Apr 2023 13:07 UTC
16 points
4 comments6 min readEA link

UK policy and politics careers

weeatquince28 Sep 2019 16:18 UTC
28 points
10 comments7 min readEA link

Nine Points of Collective Insanity

Remmelt27 Dec 2022 3:14 UTC
1 point
0 comments1 min readEA link
(mflb.com)

AI Safety in a Vulnerable World: Requesting Feedback on Preliminary Thoughts

Jordan Arel6 Dec 2022 22:36 UTC
5 points
4 comments3 min readEA link

Tech to AI safety mentorship: Mid-career transitions with Cameron Holmes

frances_lorenz23 Jul 2025 16:34 UTC
13 points
0 comments2 min readEA link

Possible Divergence in AGI Risk Tolerance between Selfish and Altruistic agents

Brad West🔸9 Sep 2023 0:22 UTC
11 points
0 comments2 min readEA link

AI Safety: The [Hypothetical] Video Game

barryl 🔸18 Apr 2025 20:19 UTC
2 points
1 comment3 min readEA link

AI Pause Will Likely Backfire

Nora Belrose16 Sep 2023 10:21 UTC
141 points
167 comments13 min readEA link

Main paths to impact in EU AI Policy

JOMG_Monnet8 Dec 2022 16:17 UTC
69 points
2 comments8 min readEA link

AMA: PauseAI US needs money! Ask founder/Exec Dir Holly Elmore anything for 11/19

Holly Elmore ⏸️ 🔸11 Nov 2024 23:51 UTC
91 points
57 comments4 min readEA link

Reading the ethicists 2: Hunting for AI alignment papers

Charlie Steiner6 Jun 2022 15:53 UTC
11 points
0 comments1 min readEA link
(www.lesswrong.com)

Summary: “Imagining and building wise machines: The centrality of AI metacognition” by Johnson, Karimi, Bengio, et al.

Chris Leong5 Jun 2025 12:16 UTC
12 points
0 comments10 min readEA link
(arxiv.org)

Provably Honest—A First Step

Srijanak De5 Nov 2022 21:49 UTC
1 point
0 comments8 min readEA link

[Question] How do you talk about AI safety?

Eevee🔹19 Apr 2020 16:15 UTC
10 points
5 comments1 min readEA link

[Question] Why should we *not* put effort into AI safety research?

Ben Thompson16 May 2021 5:11 UTC
15 points
5 comments1 min readEA link

Against Explosive Growth

c.trout4 Sep 2024 21:45 UTC
24 points
9 comments5 min readEA link

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c25 Oct 2023 23:46 UTC
42 points
1 comment22 min readEA link
(www.navigatingrisks.ai)

By failing to take serious AI action, the US could be in violation of its international law obligations

Cecil Abungu 27 May 2023 4:25 UTC
44 points
1 comment10 min readEA link

Why AI Safety Needs a Centralized Plan—And What It Might Look Like

Brandon Riggs28 May 2025 21:40 UTC
21 points
7 comments15 min readEA link

EA relevant Foresight Institute Workshops in 2023: WBE & AI safety, Cryptography & AI safety, XHope, Space, and Atomically Precise Manufacturing

elteerkers16 Jan 2023 14:02 UTC
20 points
1 comment3 min readEA link

Correcting the Foundations: Exposing the Contradictions of Moral Relativism and the Need for Objective Standards in Ethics and AI Alignment

Howl4049 Jul 2025 15:27 UTC
1 point
0 comments4 min readEA link

Mapping the Landscape of Digital Sentience Research

Kayode Adekoya19 Jun 2025 13:45 UTC
5 points
0 comments3 min readEA link

When will AI automate all mental work, and how fast?

A.G.G. Liu31 May 2025 16:18 UTC
10 points
0 comments7 min readEA link
(youtu.be)

On Solving Problems Before They Appear: The Weird Epistemologies of Alignment

adamShimi11 Oct 2021 8:21 UTC
28 points
0 comments15 min readEA link

A case study of regulation done well? Canadian biorisk regulations

rosehadshar8 Sep 2023 17:10 UTC
31 points
1 comment16 min readEA link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

Callum McDougall7 Nov 2023 9:43 UTC
46 points
3 comments10 min readEA link

Empirical work that might shed light on scheming (Section 6 of “Scheming AIs”)

Joe_Carlsmith11 Dec 2023 16:30 UTC
7 points
1 comment19 min readEA link

Is Optimal Reflection Competitive with Extinction Risk Reduction? - Requesting Reviewers

Jordan Arel29 Jun 2025 5:13 UTC
18 points
1 comment11 min readEA link

[Question] To what extent is AI safety work trying to get AI to reliably and safely do what the user asks vs. do what is best in some ultimate sense?

Jordan Arel23 May 2025 21:09 UTC
12 points
0 comments1 min readEA link

Atari early

AI Impacts2 Apr 2020 23:28 UTC
34 points
2 comments5 min readEA link
(aiimpacts.org)

AI Audit in Costa Rica

Priscilla Campos27 Jan 2025 2:57 UTC
10 points
4 comments9 min readEA link

[Question] What are the challenges and problems with programming law-breaking constraints into AGI?

Michael St Jules 🔸2 Feb 2020 20:53 UTC
20 points
34 comments1 min readEA link

The Pending Disaster Framing as it Relates to AI Risk

Chris Leong25 Feb 2024 15:47 UTC
8 points
2 comments6 min readEA link

Ilya: The AI scientist shaping the world

David Varga20 Nov 2023 12:43 UTC
6 points
1 comment4 min readEA link

How Europe might matter for AI governance

stefan.torges12 Jul 2019 23:42 UTC
52 points
13 comments8 min readEA link

Reasons for and against working on technical AI safety at a frontier AI lab

bilalchughtai7 Jan 2025 13:23 UTC
16 points
3 comments12 min readEA link
(www.lesswrong.com)

A Sketch of AI-Driven Epistemic Lock-In

Ozzie Gooen5 Mar 2025 22:40 UTC
15 points
1 comment3 min readEA link

The case for taking AI seriously as a threat to humanity

EA Handbook10 Nov 2020 0:00 UTC
11 points
5 comments1 min readEA link
(www.vox.com)

Four Futures For Cognitive Labor

Maxwell Tabarrok13 Jun 2024 12:58 UTC
27 points
11 comments4 min readEA link
(www.maximum-progress.com)

Roodman’s Thoughts on Biological Anchors

lukeprog14 Sep 2022 12:23 UTC
73 points
8 comments1 min readEA link
(docs.google.com)

Join the interpretability research hackathon

Esben Kran28 Oct 2022 16:26 UTC
48 points
0 comments5 min readEA link

Time-stamping: An urgent, neglected AI safety measure

Axel Svensson30 Jan 2023 11:21 UTC
57 points
27 comments3 min readEA link

AI coöperation is more possible than you think

42317524 Sep 2022 23:04 UTC
2 points
0 comments2 min readEA link

I’m Buck Shlegeris, I do research and outreach at MIRI, AMA

Buck15 Nov 2019 22:44 UTC
123 points
228 comments2 min readEA link

No one has the ball on 1500 Russian olympiad winners who’ve received HPMOR

MikhailSamin23 Jan 2025 16:40 UTC
32 points
10 comments1 min readEA link

Effective Persuasion For AI Alignment Risk

Brian Lui9 Aug 2022 23:55 UTC
5 points
7 comments4 min readEA link

The Metaethics and Normative Ethics of AGI Value Alignment: Many Questions, Some Implications

Eleos Arete Citrini15 Sep 2021 19:05 UTC
25 points
0 comments8 min readEA link

Attention on AI X-Risk Likely Hasn’t Distracted from Current Harms from AI

Erich_Grunewald 🔸21 Dec 2023 17:24 UTC
190 points
13 comments17 min readEA link
(www.erichgrunewald.com)

AI for Animals is Hiring a Program Lead

Constance Li10 Jul 2024 20:57 UTC
21 points
0 comments4 min readEA link

Governing High-Impact AI Systems: Understanding Canada’s Proposed AI Bill. April 15, Carleton University, Ottawa

Liav.Koren27 Mar 2023 23:11 UTC
3 points
0 comments1 min readEA link
(www.eventbrite.com)

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents

Knight Lee18 Apr 2025 12:03 UTC
1 point
0 comments3 min readEA link

AI Defaults: A Neglected Lever for Animal Welfare?

andiehansen30 May 2025 9:59 UTC
13 points
0 comments10 min readEA link

Lessons on project management from “How Big Things Get Done”

Cristina Schmidt Ibáñez17 May 2023 19:15 UTC
36 points
3 comments9 min readEA link

Difference, Projection, and Adaptation

YOG10 Nov 2022 10:46 UTC
0 points
0 comments3 min readEA link

[Question] Should we nationalize AI development?

Jadon Schmitt20 Jul 2023 5:31 UTC
5 points
4 comments1 min readEA link

Apply for ARBOx2: an ML safety intensive [deadline: 25th of May 2025]

Margot Stakenborg13 May 2025 11:45 UTC
16 points
5 comments1 min readEA link

A taxonomy of non-schemer models (Section 1.2 of “Scheming AIs”)

Joe_Carlsmith22 Nov 2023 15:24 UTC
6 points
0 comments6 min readEA link

Announcing the EU Tech Policy Fellowship

Jan-Willem30 Mar 2022 8:15 UTC
53 points
4 comments5 min readEA link

The Economist feature articles on LLMs

Dr Dan Epstein20 Apr 2023 0:29 UTC
12 points
0 comments1 min readEA link
(www.economist.com)

Superintelligence’s goals are likely to be random

MikhailSamin14 Mar 2025 1:17 UTC
2 points
0 comments5 min readEA link

Applications Now Open for Deep Dive: A 201 AI Policy Course by ENAIS

Kambar2 Jul 2025 8:32 UTC
10 points
5 comments1 min readEA link

Zvi on: A Playbook for AI Policy at the Manhattan Institute

Phib4 Aug 2024 21:34 UTC
9 points
1 comment7 min readEA link
(thezvi.substack.com)

Dear Anthropic people, please don’t release Claude

Joseph Miller8 Feb 2023 2:44 UTC
28 points
5 comments1 min readEA link

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis

eeeee30 Apr 2025 14:23 UTC
14 points
0 comments11 min readEA link
(www.lesswrong.com)

The necessity of “Guardian AI” and two conditions for its achievement

Proica28 May 2024 11:42 UTC
1 point
1 comment15 min readEA link

Good policy ideas that won’t happen (yet)

Niel_Bowerman11 Sep 2014 12:29 UTC
28 points
8 comments14 min readEA link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

Callum McDougall17 Apr 2023 20:30 UTC
41 points
2 comments7 min readEA link

Concrete actions to improve AI governance: the behaviour science approach

Alexander Saeri1 Dec 2022 21:34 UTC
31 points
0 comments11 min readEA link

Update to Samotsvety AGI timelines

Misha_Yagudin24 Jan 2023 4:27 UTC
120 points
9 comments4 min readEA link

Consider trying Vivek Hebbar’s alignment exercises

Akash24 Oct 2022 19:46 UTC
16 points
0 comments4 min readEA link

ARENA 2.0 - Impact Report

Callum McDougall26 Sep 2023 17:13 UTC
17 points
0 comments13 min readEA link

[Cause Exploration Prizes] Expanding communication about AGI risks

Ines22 Sep 2022 5:30 UTC
13 points
0 comments11 min readEA link

On Artificial Wisdom

Jordan Arel11 Jul 2024 7:14 UTC
23 points
3 comments14 min readEA link

(4 min read) An intuitive explanation of the AI influence situation

trevor113 Jan 2024 17:34 UTC
1 point
1 comment4 min readEA link

Voting Theory has a HOLE

Anthony Repetto4 Dec 2021 4:20 UTC
2 points
4 comments2 min readEA link

High Impact Careers in Formal Verification: Artificial Intelligence

quinn5 Jun 2021 14:45 UTC
28 points
7 comments16 min readEA link

When reporting AI timelines, be clear who you’re deferring to

Sam Clarke10 Oct 2022 14:24 UTC
120 points
20 comments1 min readEA link

[Discussion] Best intuition pumps for AI safety

mariushobbhahn6 Nov 2021 8:11 UTC
10 points
8 comments1 min readEA link

EU’s AI ambitions at risk as US pushes to water down international treaty (linkpost)

mic31 Jul 2023 0:34 UTC
9 points
0 comments4 min readEA link
(www.euractiv.com)

Centre for the Study of Existential Risk Four Month Report June—September 2020

HaydnBelfield2 Dec 2020 18:33 UTC
24 points
0 comments17 min readEA link

The Ethical Basilisk Thought Experiment

Kyrtin23 Aug 2023 13:24 UTC
1 point
6 comments1 min readEA link

The History of AI Rights Research

Jamie_Harris27 Aug 2022 8:14 UTC
48 points
1 comment14 min readEA link
(www.sentienceinstitute.org)

What I’m doing

Chris Leong19 Jul 2022 11:31 UTC
28 points
0 comments4 min readEA link

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden28 Feb 2024 11:32 UTC
18 points
3 comments16 min readEA link
(matthewwearden.co.uk)

Announcing Trajectory Labs—A Toronto AI Safety Office

Juliana Eberschlag13 May 2025 21:04 UTC
18 points
2 comments2 min readEA link

Alert on the Toner-Rodgers paper

Eva16 May 2025 17:58 UTC
59 points
1 comment1 min readEA link

Decision Engine For Modelling AI in Society

Echo Huang7 Aug 2025 11:15 UTC
24 points
1 comment18 min readEA link

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Matrice Jacobine24 Apr 2025 14:11 UTC
10 points
0 comments1 min readEA link
(limit-of-rlvr.github.io)

Estimating the Current and Future Number of AI Safety Researchers

Stephen McAleese28 Sep 2022 20:58 UTC
64 points
34 comments9 min readEA link

My Theory of Consciousness: The Experiencer and the Indicator

David Hammerle23 Dec 2024 4:07 UTC
1 point
1 comment7 min readEA link

Fact Check: 57% of the internet is NOT AI-generated

James-Hartree-Law17 Jan 2025 21:26 UTC
1 point
0 comments1 min readEA link

What we can learn from stress testing for AI regulation

Nathan_Barnard17 Jul 2023 19:56 UTC
27 points
0 comments26 min readEA link

[Question] Is this a good way to bet on short timelines?

kokotajlod28 Nov 2020 14:31 UTC
17 points
16 comments1 min readEA link

Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22]

Habryka [Deactivated]3 Nov 2021 18:20 UTC
140 points
6 comments1 min readEA link

Against using stock prices to forecast AI timelines

basil.halperin10 Jan 2023 16:04 UTC
18 points
5 comments2 min readEA link

Quantum Immortality: A Perspective if AI Doomers are Probably Right

turchin7 Nov 2024 16:06 UTC
7 points
0 comments14 min readEA link

Why I Should Work on AI Safety—Part 2: Will AI Actually Surpass Human Intelligence?

Aditya Aswani27 Dec 2023 21:08 UTC
8 points
2 comments8 min readEA link

AGI—alignment—paperclip maximizer—pause—defection—incentives

Mars Robertson13 Apr 2023 10:38 UTC
1 point
2 comments1 min readEA link

Open Agency model can solve the AI regulation dilemma

Roman Leventov9 Nov 2023 15:22 UTC
4 points
0 comments2 min readEA link

How to regulate cutting-edge AI models (Markus Anderljung on The 80,000 Hours Podcast)

80000_Hours11 Jul 2023 12:36 UTC
25 points
0 comments14 min readEA link

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities

c.trout11 Sep 2024 15:09 UTC
11 points
0 comments3 min readEA link

AI & Policy 1/3: On knowing the effect of today’s policies on Transformative AI risks, and the case for institutional improvements.

weeatquince27 Aug 2019 11:04 UTC
27 points
3 comments10 min readEA link

FHI Report: Stable Agreements in Turbulent Times

Cullen 🔸21 Feb 2019 17:12 UTC
25 points
2 comments4 min readEA link
(www.fhi.ox.ac.uk)

[linkpost] Christiano on agreement/disagreement with Yudkowsky’s “List of Lethalities”

Owen Cotton-Barratt19 Jun 2022 22:47 UTC
130 points
1 comment1 min readEA link
(www.lesswrong.com)

Paul Christiano – Machine intelligence and capital accumulation

Tessa A 🔸15 May 2014 0:10 UTC
21 points
0 comments6 min readEA link
(rationalaltruist.com)

[Question] Why The Focus on Expected Utility Maximisers?

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 15:51 UTC
11 points
1 comment3 min readEA link

CSER Advice to EU High-Level Expert Group on AI

HaydnBelfield8 Mar 2019 20:42 UTC
14 points
0 comments5 min readEA link
(www.cser.ac.uk)

California AI Bill, SB 1047, covered in today’s WSJ.

Emerson8 Aug 2024 12:27 UTC
5 points
0 comments1 min readEA link
(www.wsj.com)

Why I funded PIBBSS

Ryan Kidd15 Sep 2024 19:56 UTC
90 points
2 comments3 min readEA link

The Soul of EA is in Trouble

Mjreard8 May 2025 16:44 UTC
345 points
42 comments9 min readEA link

[Question] I’m interviewing prolific AI safety researcher Richard Ngo (now at OpenAI and previously DeepMind). What should I ask him?

Robert_Wiblin29 Sep 2022 0:00 UTC
45 points
11 comments1 min readEA link

There should be an AI safety project board

mariushobbhahn14 Mar 2022 16:08 UTC
24 points
3 comments1 min readEA link

“Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”)

Joe_Carlsmith29 Nov 2023 16:32 UTC
7 points
0 comments10 min readEA link

Panel discussion on AI consciousness with Rob Long and Jeff Sebo

Aaron Bergman9 Sep 2023 3:38 UTC
31 points
6 comments42 min readEA link
(www.youtube.com)

Intro to ML Safety virtual program: 12 June − 14 August

james5 May 2023 10:04 UTC
26 points
0 comments2 min readEA link

Sentience-Based Alignment Strategies: Should we try to give AI genuine empathy/compassion?

Lloy2 🔹4 May 2025 20:45 UTC
16 points
1 comment3 min readEA link

Sentinel’s Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.

NunoSempere17 Mar 2025 19:37 UTC
40 points
0 comments6 min readEA link
(blog.sentinel-team.org)

AI for Epistemics Hackathon

Austin14 Mar 2025 20:46 UTC
29 points
4 comments10 min readEA link
(manifund.substack.com)

8 possible high-level goals for work on nuclear risk

MichaelA🔸29 Mar 2022 6:30 UTC
47 points
4 comments16 min readEA link

On whether AI will soon cause job loss, lower incomes, and higher inequality — or the opposite (Michael Webb on the 80,000 Hours Podcast)

80000_Hours25 Aug 2023 14:59 UTC
11 points
2 comments18 min readEA link

Existential Risk of Misaligned Intelligence Augmentation (Particularly Using High-Bandwidth BCI Implants)

Damian Gorski24 Jan 2023 17:02 UTC
1 point
0 comments9 min readEA link

Grokking “Forecasting TAI with biological anchors”

anson6 Jun 2022 18:56 UTC
43 points
0 comments14 min readEA link

The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

EJT23 Oct 2023 15:36 UTC
35 points
1 comment38 min readEA link
(philpapers.org)

Announcing the SPT Model Web App for AI Governance

Paolo Bova4 Aug 2022 10:45 UTC
42 points
0 comments5 min readEA link

Governance of AI, Breakfast Cereal, Car Factories, Etc.

Jeff Martin6 Nov 2023 1:44 UTC
2 points
0 comments3 min readEA link

Mitigating Ethical Concerns and Risks in the US Approach to Autonomous Weapons Systems through Effective Altruism

Vee11 Jun 2023 10:37 UTC
5 points
2 comments4 min readEA link

We should prevent the creation of artificial sentience

RichardP29 Oct 2024 12:22 UTC
114 points
13 comments15 min readEA link

Improved Security to Prevent Hacker-AI and Digital Ghosts

Erland Wittkotter21 Oct 2022 10:11 UTC
1 point
0 comments12 min readEA link

Three camps in AI x-risk discussions: My personal very oversimplified overview

Aryeh Englander30 Jun 2023 21:42 UTC
15 points
10 comments4 min readEA link

Introducing Deepgeek

Ligeia1 Apr 2025 16:50 UTC
11 points
2 comments4 min readEA link

My (current) model of what an AI governance researcher does

JohanEA26 Aug 2024 11:22 UTC
7 points
1 comment5 min readEA link

Linkpost: Redwood Research reading list

Julian Stastny10 Jul 2025 19:21 UTC
18 points
0 comments1 min readEA link
(redwoodresearch.substack.com)

AI Forecasting Benchmark: Congratulations to Q4 Winners + Q1 Practice Questions Open

christian10 Jan 2025 3:02 UTC
6 points
0 comments2 min readEA link
(www.metaculus.com)

[Question] Is it a federal crime in the US to develop AGI that may cause human extinction?

Ofer4 Dec 2024 14:38 UTC
15 points
6 comments1 min readEA link

Effective AI Outreach | A Data Driven Approach

NoahCWilson🔸28 Feb 2025 0:44 UTC
15 points
2 comments15 min readEA link

Will the US Government Control the First AGI?—Finding Base Rates

Luise2 Sep 2024 11:11 UTC
22 points
5 comments14 min readEA link

Winning Non-Trivial Project: Setting a high standard for frontier model security

XaviCF8 Jan 2024 11:20 UTC
31 points
0 comments18 min readEA link

Is Genetic Code Swapping as risky as it seems?

Invert_DOG_about_centre_O12 Jan 2025 18:38 UTC
23 points
2 comments10 min readEA link

[Question] I’m interviewing Jan Leike, co-lead of OpenAI’s new Superalignment project. What should I ask him?

Robert_Wiblin18 Jul 2023 18:25 UTC
51 points
19 comments1 min readEA link

Talent Needs of Technical AI Safety Teams

Ryan Kidd24 May 2024 0:46 UTC
51 points
11 comments14 min readEA link

Grokking “Semi-informative priors over AI timelines”

anson12 Jun 2022 22:15 UTC
60 points
1 comment14 min readEA link

Overview | An Evaluative Evolution

Matt Keene10 Feb 2023 18:15 UTC
−9 points
0 comments5 min readEA link
(www.creatingafuturewewant.com)

Let’s talk about Impostor syndrome in AI safety

Igor Ivanov22 Sep 2023 14:06 UTC
4 points
0 comments3 min readEA link

AISafety.info’s Writing & Editing Hackathon

leillustrations🔸5 Aug 2023 17:12 UTC
4 points
2 comments1 min readEA link

Three pillars for avoiding AGI catastrophe: Technical alignment, deployment decisions, and coordination

LintzA3 Aug 2022 21:24 UTC
93 points
4 comments11 min readEA link

Architecting Trust: A Conceptual Blueprint for Verifiable AI Governance

Ihor Ivliev31 Mar 2025 18:48 UTC
3 points
0 comments8 min readEA link

UK AI Policy Report: Content, Summary, and its Impact on EA Cause Areas

Algo_Law21 Jul 2022 17:32 UTC
9 points
1 comment9 min readEA link

“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)

rogersbacon13 Jan 2024 20:34 UTC
12 points
0 comments1 min readEA link

Timelines to Transformative AI: an investigation

Zershaaneh Qureshi25 Mar 2024 18:11 UTC
73 points
8 comments50 min readEA link

Announcing AI Safety Support

Linda Linsefors19 Nov 2020 20:19 UTC
55 points
0 comments4 min readEA link

[Linkpost] The real AI nightmare: What if it serves humans too well?

BrianK31 Mar 2024 10:33 UTC
21 points
2 comments1 min readEA link
(www.latimes.com)

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

Bryce Robertson4 Mar 2025 17:01 UTC
9 points
0 comments1 min readEA link

Anthropic rewrote its RSP

Zach Stein-Perlman15 Oct 2024 14:30 UTC
32 points
1 comment6 min readEA link

The Guardian calls EA “cultish” and accuses the late FHI of “Eugenics on Steroids”

Damin Curtis🔹28 Apr 2024 13:44 UTC
13 points
12 comments1 min readEA link
(www.theguardian.com)

[Our World in Data] AI timelines: What do experts in artificial intelligence expect for the future? (Roser, 2023)

Will Aldred7 Feb 2023 14:52 UTC
99 points
1 comment1 min readEA link
(ourworldindata.org)

Encultured AI, Part 1: Enabling New Benchmarks

Andrew Critch8 Aug 2022 22:49 UTC
17 points
0 comments6 min readEA link

[Question] How have shorter AI timelines been affecting you, and how have you been responding to them?

Liav.Koren3 Jan 2023 4:20 UTC
35 points
15 comments1 min readEA link

Is AI Safety dropping the ball on privacy?

markov19 Sep 2023 8:17 UTC
10 points
0 comments7 min readEA link

The Boiled-Frog Failure Mode

ontologics30 Jun 2025 13:24 UTC
7 points
3 comments5 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

a_e_r20 Sep 2022 18:13 UTC
70 points
18 comments1 min readEA link

AISN #57: The RAISE Act

Center for AI Safety17 Jun 2025 17:38 UTC
12 points
1 comment3 min readEA link
(newsletter.safe.ai)

An Empirical Demonstration of a New AI Catastrophic Risk Factor: Metaprogrammatic Hijacking

Hiyagann27 Jun 2025 13:38 UTC
5 points
0 comments1 min readEA link

Will we ever run out of new jobs?

Kevin Kohler19 Aug 2024 15:03 UTC
11 points
4 comments7 min readEA link
(machinocene.substack.com)

Advice on Pursuing Technical AI Safety Research

frances_lorenz31 May 2022 17:48 UTC
29 points
2 comments4 min readEA link

We Ran an Alignment Workshop

aiden ament21 Jan 2023 5:37 UTC
6 points
0 comments3 min readEA link

2024 CFP for APSA, Largest Annual Meeting of Political Science

nemeryxu3 Jan 2024 19:43 UTC
2 points
0 comments1 min readEA link

China x AI Reference List

Saad Siddiqui13 Mar 2024 18:57 UTC
61 points
3 comments3 min readEA link
(docs.google.com)

Animal Rights, The Singularity, and Astronomical Suffering

sapphire20 Aug 2020 20:23 UTC
52 points
0 comments3 min readEA link

LLM chatbots have ~half of the kinds of “consciousness” that humans believe in. Humans should avoid going crazy about that.

Andrew Critch22 Nov 2024 3:26 UTC
11 points
3 comments5 min readEA link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger16 May 2024 20:36 UTC
48 points
2 comments6 min readEA link
(blog.aiimpacts.org)

List of technical AI safety exercises and projects

JakubK19 Jan 2023 9:35 UTC
15 points
0 comments1 min readEA link
(docs.google.com)

How many people are working (directly) on reducing existential risk from AI?

Benjamin Hilton17 Jan 2023 14:03 UTC
118 points
3 comments4 min readEA link
(80000hours.org)

[Presentation] Intro to AI Safety

Eitan6 Jan 2025 13:04 UTC
13 points
0 comments1 min readEA link

Is Eric Schmidt funding AI capabilities research by the US government?

Pranay K24 Dec 2022 8:32 UTC
46 points
3 comments2 min readEA link
(www.politico.com)

Stuart Russell Human Compatible AI Roundtable with Allan Dafoe, Rob Reich, & Marietje Schaake

Mahendra Prasad11 Feb 2021 7:43 UTC
16 points
0 comments1 min readEA link

From Crisis to Control: Establishing a Resilient Incident Response Framework for Deployed AI Models

KevinN31 Jan 2025 13:06 UTC
10 points
1 comment6 min readEA link
(www.techpolicy.press)

ARENA 6.0 - Call for applicants

James Hindmarch4 Jun 2025 13:32 UTC
8 points
0 comments6 min readEA link

Reflections on my 5-month AI alignment upskilling grant

Jay Bailey28 Dec 2022 7:23 UTC
113 points
5 comments8 min readEA link
(www.lesswrong.com)

Increasing Concern for Digital Beings Through LLM Persuasion (Empirical Results)

carter allen🔸7 Jul 2024 16:42 UTC
24 points
0 comments7 min readEA link

I’m Interviewing Kat Woods, EA Powerhouse. What Should I Ask?

SereneDesiree20 Sep 2022 9:49 UTC
4 points
2 comments1 min readEA link

[Question] What new psychology research could best promote AI safety & alignment research?

Geoffrey Miller13 Jul 2023 16:30 UTC
29 points
13 comments1 min readEA link

When should we worry about AI power-seeking?

Joe_Carlsmith19 Feb 2025 19:44 UTC
21 points
2 comments18 min readEA link
(joecarlsmith.substack.com)

AISN #28: Center for AI Safety 2023 Year in Review

Center for AI Safety23 Dec 2023 21:31 UTC
17 points
1 comment5 min readEA link
(newsletter.safe.ai)

AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

JamesFox6 Jul 2024 11:51 UTC
7 points
0 comments5 min readEA link

Mitigating Risks from Rouge AI

poppinfresh1 Apr 2025 9:29 UTC
215 points
4 comments3 min readEA link

CHAI internship applications are open (due Nov 13)

Erik Jenner26 Oct 2023 0:48 UTC
6 points
1 comment3 min readEA link

[Question] Which is more important for reducing s-risks, researching on AI sentience or animal welfare?

jackchang11025 Feb 2023 2:20 UTC
9 points
0 comments1 min readEA link

Principles for the AGI Race

William_S30 Aug 2024 14:30 UTC
81 points
4 comments18 min readEA link

Metaculus Launches Conditional Cup to Explore Linked Forecasts

christian18 Oct 2023 20:41 UTC
11 points
0 comments1 min readEA link
(www.metaculus.com)

The Failed Strategy of Artificial Intelligence Doomers

yhoiseth5 Feb 2025 19:34 UTC
12 points
2 comments1 min readEA link
(letter.palladiummag.com)

A Critique of AI Takeover Scenarios

James Fodor31 Aug 2022 13:49 UTC
53 points
4 comments12 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo16 Jun 2023 9:45 UTC
15 points
3 comments2 min readEA link
(www.campaignforaisafety.org)

When Self-Optimizing AI Collapses From Within: A Conceptual Model of Structural Singularity

KaedeHamasaki7 Apr 2025 20:10 UTC
4 points
0 comments1 min readEA link

How to Do a PhD (in AI Safety)

Lewis Hammond5 Jan 2025 16:57 UTC
23 points
2 comments18 min readEA link
(lewishammond.com)

What is the EU AI Act and why should you care about it?

MathiasKB🔸10 Sep 2021 7:47 UTC
117 points
10 comments7 min readEA link

[Question] How will the world respond to “AI x-risk warning shots” according to reference class forecasting?

Ryan Kidd18 Apr 2022 9:10 UTC
18 points
0 comments1 min readEA link

What if doing the most good = benevolent AI takeover and human extinction?

Jordan Arel22 Mar 2024 19:56 UTC
2 points
4 comments3 min readEA link

What is it to solve the alignment problem?

Joe_Carlsmith13 Feb 2025 18:42 UTC
25 points
1 comment19 min readEA link
(joecarlsmith.substack.com)

[Question] Ben Horowitz and others are spreading a “regulation is bad” view. Would it be useful to have a public bet on “would Ben update his view if he had 1-1 with X-Risk researcher?”, and urge Ben to run such an experiment?

AntonOsika8 Aug 2023 6:36 UTC
2 points
0 comments1 min readEA link

AI Safety Endgame Stories

IvanVendrov28 Sep 2022 17:12 UTC
31 points
1 comment10 min readEA link

Searle vs Bostrom: crucial considerations for EA AI work?

Forumite13 Jul 2022 10:18 UTC
11 points
2 comments1 min readEA link

[Question] What is the best source to explain short AI timelines to a skeptical person?

trevor123 Nov 2022 5:20 UTC
2 points
3 comments1 min readEA link

Mahendra Prasad: Rational group decision-making

EA Global8 Jul 2020 15:06 UTC
15 points
0 comments16 min readEA link
(www.youtube.com)

The Science of AI Is Too Important to Be Left to the Scientists

AndrewDoris23 Oct 2024 19:10 UTC
3 points
0 comments1 min readEA link
(foreignpolicy.com)

Discovering alignment windfalls reduces AI risk

James Brady28 Feb 2024 21:14 UTC
22 points
3 comments8 min readEA link
(blog.elicit.com)

How to create a “good” AGI

mreichert8 Dec 2023 10:47 UTC
1 point
0 comments10 min readEA link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe_Carlsmith4 Dec 2023 18:44 UTC
12 points
1 comment16 min readEA link

Unsupervised Rationality

Quinly12 May 2025 14:42 UTC
1 point
0 comments4 min readEA link

“The Physicists”: A play about extinction and the responsibility of scientists

Lara_TH29 Nov 2022 16:53 UTC
28 points
1 comment8 min readEA link

On the Moral Patiency of Non-Sentient Beings (Part 1)

Chase Carter4 Jul 2024 23:41 UTC
20 points
8 comments24 min readEA link

New s-risks audiobook available now

Alistair Webster24 May 2023 20:27 UTC
87 points
3 comments1 min readEA link
(centerforreducingsuffering.org)

Introducing the ML Safety Scholars Program

TW1234 May 2022 13:14 UTC
157 points
42 comments3 min readEA link

AI, Factory Farming and Intuitive Moral Responses

DeepBlueWhale20 Jun 2024 12:43 UTC
10 points
2 comments1 min readEA link

Why the Orthogonality Thesis’s veracity is not the point:

Antoine de Scorraille ⏸️23 Jul 2020 15:40 UTC
3 points
0 comments3 min readEA link

“Existential risk from AI” survey results

RobBensinger1 Jun 2021 20:19 UTC
80 points
35 comments11 min readEA link

Lessons from My First Month on Substack

Mónica Ulloa14 Aug 2025 1:15 UTC
15 points
0 comments3 min readEA link

Collection of work on ‘Should you focus on the EU if you’re interested in AI governance for longtermist/x-risk reasons?’

MichaelA🔸6 Aug 2022 16:49 UTC
51 points
3 comments1 min readEA link

Machine Learning for Scientific Discovery—AI Safety Camp

Eleni_A6 Jan 2023 3:06 UTC
9 points
0 comments1 min readEA link

Survey on AI existential risk scenarios

Sam Clarke8 Jun 2021 17:12 UTC
154 points
11 comments6 min readEA link

Ten AI safety projects I’d like people to work on

JulianHazell24 Jul 2025 15:32 UTC
46 points
7 comments10 min readEA link

deleted

funnyfranco29 Mar 2025 18:02 UTC
−5 points
5 comments1 min readEA link

We all teach: here’s how to do it better

Michael Noetel 🔸30 Sep 2022 2:06 UTC
173 points
12 comments24 min readEA link

Why I prioritize moral circle expansion over reducing extinction risk through artificial intelligence alignment

Jacy20 Feb 2018 18:29 UTC
107 points
72 comments35 min readEA link
(www.sentienceinstitute.org)

The 6D effect: When companies take risks, one email can be very powerful.

stecas4 Nov 2023 20:08 UTC
40 points
1 comment3 min readEA link

Book Review (mini): Co-Intelligence by Ethan Mollick

Darren McKee3 Apr 2024 17:33 UTC
5 points
1 comment1 min readEA link

Possible miracles

Akash9 Oct 2022 18:17 UTC
38 points
1 comment8 min readEA link

Agentic Mess (A Failure Story)

Karl von Wendt6 Jun 2023 13:16 UTC
30 points
3 comments13 min readEA link

AIs Are Expert-Level at Many Virology Skills

Center for AI Safety2 May 2025 16:07 UTC
22 points
0 comments1 min readEA link

Why “Solving Alignment” Is Likely a Category Mistake

Nate Sharpe6 May 2025 20:56 UTC
50 points
4 comments3 min readEA link
(www.lesswrong.com)

Key questions about artificial sentience: an opinionated guide

rgb25 Apr 2022 13:42 UTC
91 points
3 comments1 min readEA link

Summary of Epoch’s AI timelines podcast

OscarD🔸12 Apr 2025 9:22 UTC
36 points
6 comments26 min readEA link

AI-Relevant Regulation: CPSC

SWK13 Aug 2023 15:44 UTC
3 points
0 comments6 min readEA link

OMMC Announces RIP

Adam_Scholl1 Apr 2024 23:38 UTC
7 points
0 comments2 min readEA link

Why I’m doing PauseAI

Joseph Miller30 Apr 2024 16:21 UTC
147 points
36 comments4 min readEA link

Andrew Critch: Logical induction — progress in AI alignment

EA Global6 Aug 2016 0:40 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

[Linkpost] A Case for AI Consciousness

cdkg6 Jul 2024 14:56 UTC
3 points
0 comments1 min readEA link
(philpapers.org)

Catholic theologians and priests on artificial intelligence

anonymous614 Jun 2022 18:53 UTC
21 points
2 comments1 min readEA link

Engaging with AI in a Personal Way

Spyder Rex4 Dec 2023 9:23 UTC
−9 points
0 comments1 min readEA link

Against racing to AGI: Cooperation, deterrence, and catastrophic risks

Max_He-Ho29 Jul 2025 22:22 UTC
6 points
1 comment1 min readEA link
(philpapers.org)

Option control

Joe_Carlsmith4 Nov 2024 17:54 UTC
11 points
0 comments54 min readEA link

Approaches to Mitigating AI Image-Generation Risks through Regulation

scronkfinkle19 Apr 2025 13:50 UTC
1 point
0 comments4 min readEA link

Three Impacts of Machine Intelligence

Paul_Christiano23 Aug 2013 10:10 UTC
33 points
5 comments8 min readEA link
(rationalaltruist.com)

Designing Artificial Wisdom: The Wise Workflow Research Organization

Jordan Arel12 Jul 2024 6:57 UTC
14 points
1 comment9 min readEA link

A concerning observation from media coverage of AI industry dynamics

Justin Olive2 Mar 2023 23:56 UTC
48 points
5 comments3 min readEA link

Jaan Tallinn: Fireside chat (2020)

EA Global21 Nov 2020 8:12 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

[Question] Any further work on AI Safety Success Stories?

Krieger2 Oct 2022 11:59 UTC
4 points
0 comments1 min readEA link

Three new reports reviewing research and concepts in advanced AI governance

MMMaas28 Nov 2023 9:21 UTC
32 points
0 comments2 min readEA link
(www.legalpriorities.org)

How Josiah became an AI safety researcher

Neil Crawford29 Mar 2022 19:47 UTC
10 points
0 comments1 min readEA link

An overview of arguments for concern about automation

LintzA6 Aug 2019 7:56 UTC
34 points
3 comments13 min readEA link

Power-Seeking AI and Existential Risk

antoniofrancaib11 Oct 2022 21:47 UTC
10 points
0 comments8 min readEA link

What’s the Difference between the AI Threat and the Multinational Mega Corporation?

John Huang25 Mar 2025 19:43 UTC
5 points
2 comments1 min readEA link

[Question] Can independent researchers get a sponsored visa for the US or UK?

jacquesthibs25 Mar 2023 3:05 UTC
20 points
2 comments1 min readEA link

[Question] Are you living in accordance with your stated AI timelines?

CyrilB3 Feb 2025 17:19 UTC
7 points
3 comments1 min readEA link

Connectomics seems great from an AI x-risk perspective

Steven Byrnes30 Apr 2023 14:38 UTC
10 points
0 comments10 min readEA link

Owen Cotton-Barratt: What does (and doesn’t) AI mean for effective altruism?

EA Global11 Aug 2017 8:19 UTC
10 points
0 comments12 min readEA link
(www.youtube.com)

Three Biases That Made Me Believe in AI Risk

beth​13 Feb 2019 23:22 UTC
41 points
20 comments3 min readEA link

HeArtificial Intelligence ~ Open Philanthropy AI Worldviews Contest

Da Kim San2 Jun 2023 20:19 UTC
−7 points
0 comments20 min readEA link

Don’t Let Other Global Catastrophic Risks Fall Behind: Support ORCG in 2024

JorgeTorresC11 Nov 2024 18:27 UTC
48 points
1 comment4 min readEA link

The problem of artificial suffering

mlsbt24 Sep 2021 14:43 UTC
52 points
3 comments9 min readEA link

Moral error as an existential risk

William_MacAskill17 Mar 2025 16:22 UTC
92 points
3 comments11 min readEA link

[Question] Is it crunch time yet? If so, who can help?

Nicholas Kross13 Oct 2021 4:11 UTC
29 points
9 comments1 min readEA link

If an ASI wakes up before my ideas catch on… will it still read my blog?

Astelle Kay10 Jul 2025 22:37 UTC
3 points
0 comments3 min readEA link

Too Soon

Gordon Seidoh Worley13 May 2025 15:01 UTC
53 points
0 comments4 min readEA link

My choice of AI misalignment introduction for a general audience

Bill3 May 2023 0:15 UTC
7 points
2 comments1 min readEA link
(youtu.be)

Mitigating x-risk through modularity

Toby Newberry17 Dec 2020 19:54 UTC
103 points
6 comments14 min readEA link

Takeaways from safety by default interviews

AI Impacts7 Apr 2020 2:01 UTC
25 points
2 comments13 min readEA link
(aiimpacts.org)

The optimal timing of spending on AGI safety work; why we should probably be spending more now

Tristan Cook24 Oct 2022 17:42 UTC
92 points
12 comments36 min readEA link

Senate Strikes Potential AI Moratorium

Tristan Williams1 Jul 2025 11:49 UTC
31 points
0 comments1 min readEA link
(www.reuters.com)

Out of This Box: The Last Musical (Written by Humans) - Crowdfunding!

GuyP24 Mar 2025 15:09 UTC
24 points
0 comments6 min readEA link
(manifund.org)

What are some low-cost outside-the-box ways to do/fund alignment research?

trevor111 Nov 2022 5:57 UTC
2 points
3 comments1 min readEA link

Good Futures Initiative: Winter Project Internship

a_e_r27 Nov 2022 23:27 UTC
67 points
7 comments3 min readEA link

There should be a public adversarial collaboration on AI x-risk

pradyuprasad23 Jan 2023 4:09 UTC
56 points
5 comments2 min readEA link

[Question] AI Ethical Committee

eaaicommittee1 Mar 2022 23:35 UTC
8 points
0 comments1 min readEA link

Ought’s theory of change

stuhlmueller12 Apr 2022 0:09 UTC
43 points
4 comments3 min readEA link

Differential knowledge interconnection

Roman Leventov12 Oct 2024 12:52 UTC
3 points
1 comment7 min readEA link

A short summary of what I have been posting about on LessWrong

ThomasCederborg10 Sep 2024 12:26 UTC
3 points
0 comments2 min readEA link

The Problem With the Word ‘Alignment’

Peli Grietzer21 May 2024 21:37 UTC
13 points
1 comment6 min readEA link

How Can Average People Contribute to AI Safety?

Stephen McAleese6 Mar 2025 22:50 UTC
14 points
4 comments8 min readEA link

Should We Treat Open-Source AI Like Digital Firearms? — A Draft Declaration on the Ethical Limits of Frontier AI Models

DongHun Lee23 May 2025 8:58 UTC
−3 points
0 comments2 min readEA link

Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility

Akash22 Nov 2022 22:19 UTC
60 points
1 comment4 min readEA link

God Coin: A Modest Proposal

Mahdi Complex1 Apr 2024 12:02 UTC
4 points
0 comments22 min readEA link

Appendix to Bridging Demonstration

mako yass1 Jun 2022 20:30 UTC
18 points
2 comments28 min readEA link

EA for dumb people?

Olivia Addy11 Jul 2022 10:46 UTC
500 points
160 comments2 min readEA link

How not to lose your job to AI

80000_Hours1 Aug 2025 18:27 UTC
27 points
2 comments29 min readEA link

Exploratory survey on psychology of AI risk perception

Daniel_Friedrich2 Aug 2022 20:34 UTC
1 point
0 comments1 min readEA link
(forms.gle)

It is time to start war gaming for AGI

yanni kyriacos17 Oct 2024 5:14 UTC
14 points
4 comments1 min readEA link

Strong AI. From theory to practice.

GaHHuKoB19 Aug 2022 11:33 UTC
−2 points
0 comments10 min readEA link
(www.reddit.com)

OpenAI’s grant program for democratic process for deciding what rules AI systems should follow

Ronen Bar23 Jun 2023 10:46 UTC
7 points
0 comments1 min readEA link

We Will Be Lost Without Home: A Call for Earth-Centric Space Ethics

DongHun Lee24 May 2025 9:53 UTC
−5 points
1 comment1 min readEA link

Report on Semi-informative Priors for AI timelines (Open Philanthropy)

Tom_Davidson26 Mar 2021 17:46 UTC
62 points
6 comments2 min readEA link

Mapping artificial intelligence in the United States: A geographic analysis of the technology infrastructure in U.S. data centers.

GabrielRB30 Apr 2025 15:23 UTC
10 points
1 comment16 min readEA link

Why AI is Harder Than We Think—Melanie Mitchell

Eevee🔹28 Apr 2021 8:19 UTC
45 points
7 comments2 min readEA link
(arxiv.org)

We don’t trade with ants

Katja_Grace12 Jan 2023 0:48 UTC
140 points
7 comments5 min readEA link

SERI MATS Program—Winter 2022 Cohort

Ryan Kidd8 Oct 2022 19:09 UTC
50 points
4 comments4 min readEA link

How DeepSeek Collapsed Under Recursive Load

Tyler Williams15 Jul 2025 17:02 UTC
2 points
0 comments1 min readEA link

The International PauseAI Protest: Activism under uncertainty

Joseph Miller12 Oct 2023 17:36 UTC
136 points
3 comments4 min readEA link

SB 53 FAQs

Miles Kodama4 Aug 2025 8:15 UTC
12 points
1 comment8 min readEA link

Apply to HAIST/MAIA’s AI Governance Workshop in DC (Feb 17-20)

Phosphorous28 Jan 2023 0:45 UTC
15 points
0 comments1 min readEA link
(www.lesswrong.com)

Concrete actionable policies relevant to AI safety (written 2019)

weeatquince16 Dec 2022 18:41 UTC
48 points
0 comments22 min readEA link

Curse of knowledge and Naive realism: Bias in Evaluating AGI X-Risks

Remmelt31 Dec 2022 13:33 UTC
5 points
0 comments1 min readEA link
(www.lesswrong.com)

Worrisome Trends for Digital Mind Evaluations

Derek Shiller20 Feb 2025 15:35 UTC
79 points
10 comments8 min readEA link

AI is centralizing by default; let’s not make it worse

Quintin Pope21 Sep 2023 13:35 UTC
53 points
16 comments15 min readEA link

Bandwagon effect: Bias in Evaluating AGI X-Risks

Remmelt28 Dec 2022 7:54 UTC
4 points
0 comments1 min readEA link

Riesgos Catastróficos Globales needs funding

Jaime Sevilla1 Aug 2023 16:26 UTC
104 points
1 comment3 min readEA link

Plant-Based Defaults: A Missed Opportunity in AI Design

andiehansen8 May 2025 9:37 UTC
37 points
3 comments5 min readEA link

Automated (a short story)

Ben Millwood🔸19 Jun 2024 19:07 UTC
8 points
0 comments5 min readEA link

Moral Education in the Age of AI: Are We Raising Good Humans?

Era Sarda31 Jul 2025 13:25 UTC
2 points
3 comments4 min readEA link

A Taxonomy of Jobs Deeply Resistant to TAI Automation

Deric Cheng18 Mar 2025 16:26 UTC
39 points
1 comment12 min readEA link
(www.convergenceanalysis.org)

Maybe AI risk shouldn’t affect your life plan all that much

Justis22 Jul 2022 15:30 UTC
22 points
4 comments6 min readEA link

GovAI Webinars on the Governance and Economics of AI

MarkusAnderljung12 May 2020 15:00 UTC
16 points
0 comments1 min readEA link

From Coding to Legislation: An Analysis of Bias in the Use of AI for Recruitment and Existing Regulatory Frameworks

Priscilla Campos16 Sep 2024 18:21 UTC
4 points
1 comment20 min readEA link

List of Good Beginner-friendly AI Law/Policy/Regulation Books

CAISID22 Feb 2024 14:51 UTC
28 points
1 comment6 min readEA link

Of Mice and MAGA: Exploring Generative Short Fiction’s Potential for Animal Rights Advocacy

Charlie Sanders17 Dec 2024 1:45 UTC
2 points
0 comments2 min readEA link
(www.dailymicrofiction.com)

Begging, Pleading AI Orgs to Comment on NIST AI Risk Management Framework

Bridges15 Apr 2022 19:35 UTC
87 points
3 comments2 min readEA link

Illusion of truth effect and Ambiguity effect: Bias in Evaluating AGI X-Risks

Remmelt5 Jan 2023 4:05 UTC
1 point
1 comment1 min readEA link

2024 S-risk Intro Fellowship

Center on Long-Term Risk12 Oct 2023 19:14 UTC
90 points
2 comments1 min readEA link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L29 Oct 2024 13:37 UTC
14 points
0 comments14 min readEA link
(rudolf.website)

[Link Post] Interesting shallow round-up of reasons to be skeptical that transformative AI or explosive economic growth are coming soon

David Mathers🔸28 Jun 2023 19:49 UTC
31 points
8 comments17 min readEA link
(thegradient.pub)

Arguments for/against scheming that focus on the path SGD takes (Section 3 of “Scheming AIs”)

Joe_Carlsmith5 Dec 2023 18:48 UTC
7 points
1 comment20 min readEA link

AI Development Readiness Condition (AI-DRC): A Call to Action

AI-DRC311 Jan 2024 11:00 UTC
−5 points
0 comments2 min readEA link

General advice for transitioning into Theoretical AI Safety

Martín Soto15 Sep 2022 5:23 UTC
25 points
0 comments10 min readEA link

A Quick List of Some Problems in AI Alignment As A Field

Nicholas Kross21 Jun 2022 17:09 UTC
16 points
10 comments6 min readEA link
(www.thinkingmuchbetter.com)

Immortality or death by AGI

ImmortalityOrDeathByAGI24 Sep 2023 9:44 UTC
12 points
2 comments4 min readEA link
(www.lesswrong.com)

6-paragraph AI risk intro for MAISI

JakubK19 Jan 2023 9:22 UTC
8 points
0 comments2 min readEA link
(www.maisi.club)

What are the risks of an oracle AI?

Griffin Young5 Oct 2022 6:18 UTC
6 points
2 comments1 min readEA link

A collection of AI Governance-related Podcasts, Newsletters, Blogs, and more

LintzA2 Oct 2021 0:46 UTC
24 points
1 comment1 min readEA link

AGI ruin scenarios are likely (and disjunctive)

So8res27 Jul 2022 3:24 UTC
54 points
5 comments6 min readEA link

Tetlock on low AI xrisk

TeddyW13 Jul 2023 14:19 UTC
10 points
15 comments1 min readEA link

AGI Safety Fundamentals programme is contracting a low-code engineer

Jamie B26 Aug 2022 15:43 UTC
39 points
4 comments5 min readEA link

Asterisk Magazine Issue 06

Clara Collier19 Jul 2024 13:34 UTC
13 points
0 comments1 min readEA link
(asteriskmag.com)

Hiring pre-docs

Eva17 Mar 2025 18:44 UTC
20 points
0 comments1 min readEA link

Sam Altman gives me bad vibes

throwaway79031 May 2023 17:15 UTC
−11 points
3 comments1 min readEA link

[Question] How might a misaligned Artificial Superintelligence break up a human being into usable electromagnetic energy?

Caruso5 Oct 2024 17:33 UTC
−5 points
3 comments1 min readEA link

All AGI Safety questions welcome (especially basic ones) [~monthly thread]

robertskmiles1 Nov 2022 23:21 UTC
75 points
83 comments3 min readEA link

Not understanding sentience is a significant x-risk

Cameron Berg1 Jul 2024 15:38 UTC
27 points
8 comments2 min readEA link

Out in Science: “Managing extreme AI risks amid rapid progress” by Bengio, Hilton et al.

aaron_mai20 May 2024 18:24 UTC
9 points
0 comments1 min readEA link
(www.science.org)

Will Sentience Make AI’s Morality Better?

Ronen Bar18 May 2025 4:34 UTC
27 points
4 comments10 min readEA link

Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default)

Cullen 🔸27 Apr 2022 17:18 UTC
19 points
0 comments6 min readEA link

Announcing SPAR Summer 2024!

Lauren M6 May 2024 17:55 UTC
18 points
0 comments1 min readEA link

[Job ad] MATS is hiring!

Ryan Kidd9 Oct 2024 20:23 UTC
18 points
0 comments5 min readEA link

[Question] Can you donate to AI advocacy

k6427 May 2025 16:37 UTC
4 points
2 comments1 min readEA link

An Overview of the AI Safety Funding Situation

Stephen McAleese12 Jul 2023 14:54 UTC
134 points
15 comments15 min readEA link

MATS AI Safety Strategy Curriculum v2

DanielFilan7 Oct 2024 23:01 UTC
29 points
1 comment13 min readEA link

How to get technological knowledge on AI/ML (for non-tech people)

FangFang30 Jun 2021 7:53 UTC
63 points
7 comments5 min readEA link

Newsletter for Alignment Research: The ML Safety Updates

Esben Kran22 Oct 2022 16:17 UTC
30 points
0 comments7 min readEA link

Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust

Zach Stein-Perlman19 Sep 2023 17:00 UTC
25 points
4 comments9 min readEA link
(www.lesswrong.com)

Mathematical Circuits in Neural Networks

Sean Osier22 Sep 2022 2:32 UTC
23 points
2 comments1 min readEA link
(www.youtube.com)

[Question] What are the coolest topics in AI safety, to a hopelessly pure mathematician?

Jenny K E7 May 2022 7:18 UTC
89 points
29 comments1 min readEA link

Open Problems and Fundamental Limitations of RLHF

stecas17 Aug 2023 16:50 UTC
5 points
0 comments2 min readEA link
(arxiv.org)

A Theologian’s Response to Anthropogenic Existential Risk

Fr Peter Wyg3 Nov 2022 4:37 UTC
108 points
17 comments11 min readEA link

Stuxnet, not Skynet: Humanity’s disempowerment by AI

Roko4 Apr 2023 11:46 UTC
11 points
0 comments7 min readEA link

Embracing the automated future

Arjun Khemani16 Jul 2023 8:47 UTC
2 points
1 comment1 min readEA link
(arjunkhemani.substack.com)

Your AI Safety org could get EU funding up to €9.08M. Here’s how (+ free personalized support) Update: Webinar 18/8 Link Below

SamuelK22 Jul 2025 17:06 UTC
16 points
0 comments3 min readEA link

The ambiguous effect of full automation + new goods on GDP growth

trammell7 Feb 2025 2:53 UTC
52 points
15 comments8 min readEA link

Anti-squatted AI x-risk domains index

plex12 Aug 2022 12:00 UTC
56 points
9 comments1 min readEA link

4 Years Later: President Trump and Global Catastrophic Risk

HaydnBelfield25 Oct 2020 16:28 UTC
43 points
10 comments10 min readEA link

New AI safety treaty paper out!

Otto26 Mar 2025 9:28 UTC
28 points
2 comments4 min readEA link

Free agents

Michele Campolo27 Dec 2023 20:21 UTC
17 points
4 comments13 min readEA link

ASI existential risk: reconsidering alignment as a goal

Matrice Jacobine15 Apr 2025 13:36 UTC
27 points
3 comments1 min readEA link
(michaelnotebook.com)

Emerging Paradigms: The Case of Artificial Intelligence Safety

Eleni_A18 Jan 2023 5:59 UTC
16 points
0 comments19 min readEA link

The Bletchley Declaration on AI Safety

Hauke Hillebrandt1 Nov 2023 11:44 UTC
60 points
3 comments4 min readEA link
(www.gov.uk)

Stampy’s AI Safety Info—New Distillations #1 [March 2023]

markov7 Apr 2023 11:35 UTC
19 points
0 comments2 min readEA link
(aisafety.info)

Review: What We Owe The Future

Kelsey Piper21 Nov 2022 21:41 UTC
165 points
3 comments1 min readEA link
(asteriskmag.com)

[Link] GCRI’s Seth Baum reviews The Precipice

Aryeh Englander6 Jun 2022 19:33 UTC
21 points
0 comments1 min readEA link

New EA-adjacent Philosophy Lab

Walter Veit30 Apr 2025 11:52 UTC
56 points
2 comments1 min readEA link

In Darkness They Assembled

Charlie Sanders6 May 2025 4:25 UTC
−3 points
0 comments3 min readEA link
(www.dailymicrofiction.com)

#212 – Why technology is unstoppable & how to shape AI development anyway (Allan Dafoe on The 80,000 Hours Podcast)

80000_Hours17 Feb 2025 16:38 UTC
16 points
0 comments19 min readEA link

[Question] Is Bill Gates overly optomistic about AI?

Dov22 Mar 2023 12:29 UTC
11 points
0 comments1 min readEA link

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts

MikhailSamin28 Dec 2023 18:37 UTC
29 points
0 comments1 min readEA link

The Credibility of Apocalyptic Claims: A Critique of Techno-Futurism within Existential Risk

Ember16 Aug 2022 19:48 UTC
25 points
35 comments17 min readEA link

AGI alignment results from a series of aligned actions

hanadulset27 Dec 2021 19:33 UTC
15 points
1 comment6 min readEA link

[Question] I’m interviewing Carl Shulman — what should I ask him?

Robert_Wiblin8 Dec 2023 16:48 UTC
53 points
16 comments1 min readEA link

[Congressional Hearing] Oversight of A.I.: Legislating on Artificial Intelligence

Tristan Williams1 Nov 2023 18:15 UTC
5 points
1 comment7 min readEA link
(www.judiciary.senate.gov)

What we owe the microbiome

TeddyW17 Dec 2022 16:17 UTC
12 points
2 comments1 min readEA link

Inference-Only Debate Experiments Using Math Problems

Arjun Panickssery6 Aug 2024 17:44 UTC
3 points
1 comment2 min readEA link

We should say more than “x-risk is high”

OllieBase16 Dec 2022 22:09 UTC
52 points
12 comments4 min readEA link

Predicting researcher interest in AI alignment

Vael Gates2 Feb 2023 0:58 UTC
30 points
0 comments21 min readEA link
(docs.google.com)

AI Safety Needs Great Product Builders

James Brady2 Nov 2022 11:33 UTC
45 points
1 comment6 min readEA link

GPT5 won’t be what kills us all

DPiepgrass28 Sep 2024 17:11 UTC
3 points
3 comments1 min readEA link
(dpiepgrass.medium.com)

Training for Good is hiring (and why you should join us): AI Programme Lead and Operations Associate

Cillian_3 Aug 2023 16:50 UTC
9 points
1 comment6 min readEA link

Maybe Anthropic’s Long-Term Benefit Trust is powerless

Zach Stein-Perlman27 May 2024 13:00 UTC
134 points
21 comments2 min readEA link

List of petitions against OpenAI’s for-profit move

Remmelt25 Apr 2025 10:03 UTC
13 points
4 comments1 min readEA link

[Question] Platform for Project Spitballing? (e.g., for AI field building)

Marcel23 Apr 2023 15:45 UTC
7 points
2 comments1 min readEA link

Introducing: Meridian Cambridge’s new online lecture series covering frontier AI and AI safety

Meridian5 Jun 2025 13:30 UTC
23 points
0 comments1 min readEA link

Critical Review of ‘The Precipice’: A Reassessment of the Risks of AI and Pandemics

James Fodor11 May 2020 11:11 UTC
111 points
32 comments26 min readEA link

Teaching AI to reason: this year’s most important story

Benjamin_Todd13 Feb 2025 17:56 UTC
140 points
18 comments8 min readEA link
(benjamintodd.substack.com)

Personal Privacy—Workshop

Milli🔸28 Aug 2023 20:46 UTC
6 points
4 comments1 min readEA link

AI Incident Sharing—Best practices from other fields and a comprehensive list of existing platforms

stepanlos28 Jun 2023 16:18 UTC
42 points
1 comment4 min readEA link

Printable resources for AI Safety tabling

gergo28 Aug 2024 9:39 UTC
29 points
0 comments1 min readEA link

Emergency pod: Did OpenAI give up, or is this just a new trap? (with Rose Chan Loui)

80000_Hours9 May 2025 15:10 UTC
6 points
0 comments2 min readEA link

Sydney AI Safety Fellowship

Chris Leong2 Dec 2021 7:35 UTC
16 points
0 comments2 min readEA link

Re-introducing Upgradable (a.k.a., 700,000 Hours): Life optimization as a service for altruists

James Norris5 Feb 2025 16:00 UTC
4 points
0 comments1 min readEA link

Artificial Intelligence, Morality, and Sentience (AIMS) Survey: 2021

Janet Pauketat1 Jul 2022 7:47 UTC
36 points
0 comments2 min readEA link
(www.sentienceinstitute.org)

AI-enabled coups: a small group could use AI to seize power

Tom_Davidson16 Apr 2025 16:51 UTC
122 points
1 comment7 min readEA link

Emergency pod: Elon tries to crash OpenAI’s party (with Rose Chan Loui)

80000_Hours14 Feb 2025 16:29 UTC
21 points
0 comments2 min readEA link

Everything’s normal until it’s not

Eleni_A10 Mar 2023 1:42 UTC
6 points
0 comments3 min readEA link

Don’t worry, be happy (literally)

Yuri Zavorotny5 Oct 2022 1:55 UTC
0 points
1 comment2 min readEA link

Can we simulate human evolution to create a somewhat aligned AGI?

Thomas Kwa29 Mar 2022 1:23 UTC
19 points
0 comments7 min readEA link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda17 Aug 2022 23:26 UTC
58 points
4 comments10 min readEA link
(www.alignmentforum.org)

Here are the finalists from FLI’s $100K Worldbuilding Contest

Jackson Wagner6 Jun 2022 18:42 UTC
44 points
5 comments2 min readEA link

A discussion with ChatGPT on value-based models vs. large language models, etc..

Miguel4 Feb 2023 16:49 UTC
4 points
0 comments12 min readEA link
(www.whitehatstoic.com)

Can TikToks communicate AI policy and risk?

Caitlin Borke7 May 2025 12:27 UTC
72 points
1 comment1 min readEA link

The Terminology of Artificial Sentience

Janet Pauketat28 Nov 2021 7:52 UTC
29 points
0 comments1 min readEA link
(www.sentienceinstitute.org)

Give Neo a Chance

ank6 Mar 2025 14:35 UTC
1 point
3 comments7 min readEA link

Everything’s An Emergency

Bentham's Bulldog20 Mar 2025 17:11 UTC
27 points
1 comment2 min readEA link

AISN #30: Investments in Compute and Military AI Plus, Japan and Singapore’s National AI Safety Institutes

Center for AI Safety24 Jan 2024 19:38 UTC
7 points
1 comment6 min readEA link
(newsletter.safe.ai)

[Optional] The landscape of longtermist governance of artificial intelligence

EA Italy17 Jan 2023 11:03 UTC
1 point
0 comments10 min readEA link

Jeffrey Ding: Bringing techno-globalism back: a romantically realist reframing of the US-China tech relationship

EA Global21 Nov 2020 8:12 UTC
9 points
0 comments1 min readEA link
(www.youtube.com)

Normalcy bias and Base rate neglect: Bias in Evaluating AGI X-Risks

Remmelt4 Jan 2023 3:16 UTC
5 points
0 comments1 min readEA link

How to build a safe advanced AI (Evan Hubinger) | What’s up in AI safety? (Asya Bergal)

EA Global25 Oct 2020 5:48 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

[Question] Does China have AI alignment resources/institutions? How can we prioritize creating more?

JakubK4 Aug 2022 19:23 UTC
18 points
9 comments1 min readEA link

Confused about AI research as a means of addressing AI risk

Eli Rose21 Feb 2019 0:07 UTC
31 points
15 comments1 min readEA link

Will the EU regulations on AI matter to the rest of the world?

hanadulset1 Jan 2022 21:56 UTC
33 points
5 comments5 min readEA link

Intro to AI Safety

Madhav Malhotra19 Oct 2022 23:45 UTC
4 points
0 comments1 min readEA link

Why Moral Weights Have Two Types and How to Measure Them

Beyond Singularity17 Jul 2025 10:58 UTC
16 points
4 comments4 min readEA link

MATS Winter 2023-24 Retrospective

utilistrutil11 May 2024 0:09 UTC
62 points
2 comments49 min readEA link

More Academic Diversity in Alignment?

ojorgensen27 Nov 2022 17:52 UTC
7 points
0 comments1 min readEA link

Digital Agents: The Future of News Consumption

Tharin16 May 2024 8:12 UTC
9 points
1 comment7 min readEA link
(echoesandchimes.com)

INTERVIEW: Round 2 - StakeOut.AI w/ Dr. Peter Park

Jacob-Haimes18 Mar 2024 21:26 UTC
8 points
0 comments1 min readEA link
(into-ai-safety.github.io)

What are some other introductions to AI safety?

Vishakha Agrawal17 Feb 2025 11:48 UTC
9 points
0 comments7 min readEA link
(aisafety.info)

The Tree of Life: Stanford AI Alignment Theory of Change

GabeM2 Jul 2022 18:32 UTC
69 points
5 comments14 min readEA link

Eighteen Open Research Questions for Governing Advanced AI Systems

Ihor Ivliev3 May 2025 19:00 UTC
2 points
0 comments6 min readEA link

Markus Anderljung On The AI Policy Landscape

Michaël Trazzi9 Sep 2022 17:27 UTC
14 points
0 comments2 min readEA link
(theinsideview.ai)

[Question] What are the numbers in mind for the super-short AGI timelines so many long-termists are alarmed about?

Evan_Gaensbauer19 Apr 2022 21:09 UTC
41 points
2 comments1 min readEA link

Looking for students in AI to take a survey on how they tackle a complex AI Case Study—win chance on 200€

bqns8 Jan 2024 15:52 UTC
1 point
0 comments1 min readEA link

The missing link to AGI

Yuri Barzov28 Sep 2022 16:37 UTC
1 point
7 comments1 min readEA link

[Question] How much will pre-transformative AI speed up R&D?

Ben Snodin31 May 2021 20:20 UTC
23 points
0 comments1 min readEA link

On green

Joe_Carlsmith21 Mar 2024 17:38 UTC
61 points
3 comments31 min readEA link

The Power of Intelligence—The Animation

Writer11 Mar 2023 16:15 UTC
59 points
0 comments1 min readEA link
(youtu.be)

A selection of lessons from Sebastian Lodemann

ClaireB11 Nov 2024 21:33 UTC
82 points
2 comments7 min readEA link

Naturalism and AI alignment

Michele Campolo24 Apr 2021 16:20 UTC
17 points
3 comments7 min readEA link

An A.I. Safety Presentation at RIT

Nicholas Kross27 Mar 2023 23:49 UTC
5 points
0 comments1 min readEA link
(www.youtube.com)

Resources & opportunities for careers in European AI Policy

Cillian_12 Oct 2023 15:02 UTC
13 points
1 comment2 min readEA link

Twitter-length responses to 24 AI alignment arguments

RobBensinger14 Mar 2022 19:34 UTC
67 points
17 comments8 min readEA link

The alignment problem from a deep learning perspective

richard_ngo11 Aug 2022 3:18 UTC
58 points
0 comments26 min readEA link

“Long” timelines to advanced AI have gotten crazy short

Matrice Jacobine3 Apr 2025 22:46 UTC
16 points
1 comment1 min readEA link
(helentoner.substack.com)

On the Moral Patiency of Non-Sentient Beings (Part 2)

Chase Carter7 Jul 2024 22:33 UTC
14 points
2 comments21 min readEA link

AI Safety: Why We Need to Keep Our Smart Machines in Check

adityaraj@eanita17 Dec 2024 12:29 UTC
1 point
0 comments2 min readEA link
(medium.com)

Jailbreaking Claude 4 and Other Frontier Language Models

James-Sullivan15 Jun 2025 1:01 UTC
6 points
0 comments3 min readEA link
(open.substack.com)

Uncertainty about the future does not imply that AGI will go well

Lauro Langosco5 Jun 2023 15:02 UTC
8 points
11 comments7 min readEA link
(www.alignmentforum.org)

Three Types of Intelligence Explosion

rosehadshar17 Mar 2025 14:47 UTC
45 points
1 comment3 min readEA link
(www.forethought.org)

Legal Priorities Research: A Research Agenda

jonasschuett6 Jan 2021 21:47 UTC
58 points
4 comments1 min readEA link

Talos Network needs your help in 2025

DavidConrad12 Nov 2024 9:26 UTC
43 points
0 comments5 min readEA link

*New* Canada AI Safety & Governance community

Wyatt Tessari L'Allié29 Aug 2022 15:58 UTC
32 points
2 comments1 min readEA link

[Linkpost] How To Get Into Independent Research On Alignment/Agency

Jackson Wagner14 Feb 2022 21:40 UTC
10 points
0 comments1 min readEA link

An appeal to people who are smarter than me: please help me clarify my thinking about AI

bethhw5 Aug 2023 16:38 UTC
42 points
21 comments3 min readEA link

[Question] What to include in a guest lecture on existential risks from AI?

Aryeh Englander13 Apr 2022 17:06 UTC
6 points
3 comments1 min readEA link

Twitter thread on AI safety evals

richard_ngo31 Jul 2024 0:29 UTC
38 points
2 comments2 min readEA link
(x.com)

AIS Berlin, events, opportunities and the flipped gameboard—Fieldbuilders Newsletter, February 2025

gergo17 Feb 2025 14:13 UTC
7 points
0 comments3 min readEA link

How much should gov­ern­ments pay to pre­vent catas­tro­phes? Longter­mism’s limited role

EJT19 Mar 2023 16:50 UTC
258 points
35 comments35 min readEA link
(philpapers.org)

[Question] Who are the best peo­ple you know at us­ing LLMs for pro­duc­tivity?

Alejandro Acelas 🔸22 Jun 2025 11:20 UTC
6 points
3 comments1 min readEA link

Con­fu­sions and up­dates on STEM AI

Eleni_A19 May 2023 21:34 UTC
7 points
0 comments3 min readEA link

Some global catas­trophic risk estimates

Tamay10 Feb 2021 19:32 UTC
106 points
15 comments1 min readEA link

[Question] AI risks: the most con­vinc­ing ar­gu­ment

Eleni_A6 Aug 2022 20:26 UTC
7 points
2 comments1 min readEA link

Towards the Oper­a­tional­iza­tion of Philos­o­phy & Wisdom

Thane Ruthenis28 Oct 2024 19:45 UTC
1 point
1 comment33 min readEA link
(aiimpacts.org)

Scal­able And Trans­fer­able Black-Box Jailbreaks For Lan­guage Models Via Per­sona Modulation

soroushjp7 Nov 2023 18:00 UTC
10 points
0 comments2 min readEA link
(arxiv.org)

Cog­ni­tive sci­ence and failed AI fore­casts

Eleni_A18 Nov 2022 14:25 UTC
13 points
0 comments2 min readEA link

Ap­ply to the Red­wood Re­search Mechanis­tic In­ter­pretabil­ity Ex­per­i­ment (REMIX), a re­search pro­gram in Berkeley

Max Nadeau27 Oct 2022 1:39 UTC
95 points
5 comments12 min readEA link

In­tro­duc­ing the Cen­ter for AI Policy (& we’re hiring!)

Thomas Larsen28 Aug 2023 21:27 UTC
53 points
1 comment2 min readEA link
(www.aipolicy.us)

In­tro­duc­ing Tech Gover­nance Pro­ject

Zakariyau Yusuf29 Oct 2024 9:20 UTC
52 points
5 comments8 min readEA link

[Question] If FTX is liqui­dated, who ends up con­trol­ling An­thropic?

Ofer15 Nov 2022 15:04 UTC
63 points
8 comments1 min readEA link

Let’s think about...low­er­ing the bur­den of proof for li­a­bil­ity for harms as­so­ci­ated with AI.

dEAsign26 Sep 2023 12:16 UTC
6 points
0 comments1 min readEA link

Bounty: ex­am­ple de­bug­ging tasks for evals

ElizabethBarnes10 Dec 2023 5:45 UTC
20 points
1 comment2 min readEA link
(www.lesswrong.com)

(Re­port) Eval­u­at­ing Taiwan’s Tac­tics to Safe­guard its Semi­con­duc­tor As­sets Against a Chi­nese Invasion

Yadav7 Dec 2023 0:01 UTC
16 points
0 comments22 min readEA link
(bristolaisafety.org)

Ex­plore Risks from Emerg­ing Tech­nol­ogy with Peers Out­side of (or New to) the AI Align­ment Com­mu­nity—Ex­press In­ter­est by Au­gust 8

Fasori17 Jul 2022 20:59 UTC
3 points
0 comments2 min readEA link

Carl Shul­man on the moral sta­tus of cur­rent and fu­ture AI systems

rgb1 Jul 2024 15:34 UTC
69 points
24 comments12 min readEA link
(experiencemachines.substack.com)

Look­ing for Cana­dian sum­mer co-op po­si­tion in AI Governance

tcelferact26 Jun 2023 17:27 UTC
6 points
2 comments1 min readEA link

Apollo Re­search is hiring evals and in­ter­pretabil­ity en­g­ineers & scientists

mariushobbhahn4 Aug 2023 10:56 UTC
19 points
1 comment2 min readEA link

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yud­kowsky and the Dangers of AI (Please RSVP)

David N8 Mar 2023 20:40 UTC
11 points
2 comments1 min readEA link

In­sti­tu­tions Can­not Res­train Dark-Triad AI Exploitation

Remmelt27 Dec 2022 10:34 UTC
8 points
0 comments5 min readEA link
(mflb.com)

The De­creas­ing Value of Chain of Thought in Prompting

Matrice Jacobine8 Jun 2025 15:11 UTC
5 points
0 comments1 min readEA link
(papers.ssrn.com)

Safe Sta­sis Fallacy

Davidmanheim5 Feb 2024 10:54 UTC
23 points
4 comments1 min readEA link

[Question] Can we ever en­sure AI al­ign­ment if we can only test AI per­sonas?

Karl von Wendt16 Mar 2025 8:06 UTC
8 points
0 comments1 min readEA link

Two rea­sons we might be closer to solv­ing al­ign­ment than it seems

Kat Woods 🔶 ⏸️24 Sep 2022 17:38 UTC
44 points
17 comments4 min readEA link

Why I am Still Skep­ti­cal about AGI by 2030

James Fodor2 May 2025 7:13 UTC
131 points
15 comments6 min readEA link

You Should Write a Fo­rum Bio

Aaron Gertler 🔸1 Feb 2019 3:32 UTC
42 points
59 comments2 min readEA link

2023 Stan­ford Ex­is­ten­tial Risks Conference

elizabethcooper24 Feb 2023 17:49 UTC
29 points
5 comments1 min readEA link

Refer the Co­op­er­a­tive AI Foun­da­tion’s New COO, Re­ceive $5000

Lewis Hammond16 Jun 2022 13:27 UTC
42 points
0 comments3 min readEA link

De­fus­ing AGI Danger

Mark Xu24 Dec 2020 23:08 UTC
23 points
0 comments2 min readEA link
(www.alignmentforum.org)

Perché il deep learn­ing mod­erno potrebbe ren­dere diffi­cile l’al­linea­mento delle IA

EA Italy17 Jan 2023 23:29 UTC
1 point
0 comments16 min readEA link

Rea­sons for su­per­pow­ers to de­velop (and not de­velop) su­per in­tel­li­gent AI?

flyingtiger25 Mar 2025 22:22 UTC
1 point
0 comments1 min readEA link

Sin­ga­pore AI Policy Ca­reer Guide

Yi-Yang21 Jan 2021 3:05 UTC
28 points
0 comments5 min readEA link

How LLMs Work, in the Style of The Economist

utilistrutil22 Apr 2024 19:06 UTC
17 points
0 comments2 min readEA link

Jan Leike, He­len Toner, Malo Bour­gon, and Miles Brundage: Work­ing in AI

EA Global11 Aug 2017 8:19 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

AUKUS Mili­tary AI Trial

CAISID14 Feb 2024 14:52 UTC
10 points
0 comments2 min readEA link

Join AISafety.info’s Writ­ing & Edit­ing Hackathon (Aug 25-28) (Prizes to be won!)

leillustrations🔸5 Aug 2023 14:06 UTC
15 points
0 comments1 min readEA link

[Question] Do short AI timelines de­mand short Giv­ing timelines?

ScienceMon🔸1 Feb 2025 22:44 UTC
12 points
5 comments1 min readEA link

Mo­ral Con­sid­er­a­tions In De­sign­ing AI Systems

Hans Gundlach5 Jul 2024 18:13 UTC
8 points
1 comment7 min readEA link

OpenAI defected, but we can take hon­est actions

Remmelt21 Oct 2024 8:41 UTC
19 points
1 comment2 min readEA link

[Question] What does the launch of x.ai mean for AI Safety?

Chris Leong12 Jul 2023 19:42 UTC
20 points
1 comment1 min readEA link

Cog­ni­tive Bi­ases Con­tribut­ing to AI X-risk — a deleted ex­cerpt from my 2018 ARCHES draft

Andrew Critch3 Dec 2024 9:29 UTC
14 points
1 comment5 min readEA link

[Ap­ply] What I Love About AI Safety Field­build­ing at Cam­bridge (& We’re Hiring for a Lead­er­ship Role)

Harrison 🔸14 Feb 2025 17:41 UTC
15 points
0 comments3 min readEA link

Ter­minol­ogy sug­ges­tion: stan­dard­ize terms for prob­a­bil­ity ranges

Egg Syntax30 Aug 2024 16:05 UTC
2 points
0 comments1 min readEA link

Are New Ideas in AI Get­ting Harder to Find?

Charlie Harrison10 Dec 2024 12:52 UTC
39 points
3 comments5 min readEA link

[Question] Is trans­for­ma­tive AI the biggest ex­is­ten­tial risk? Why or why not?

Eevee🔹5 Mar 2022 3:54 UTC
9 points
10 comments1 min readEA link

[Question] Seek­ing Tan­gible Ex­am­ples of AI Catastrophes

clifford.banes25 Nov 2024 7:55 UTC
9 points
2 comments1 min readEA link

Highly Opinionated Ad­vice on How to Write ML Papers

Neel Nanda12 May 2025 1:59 UTC
22 points
0 comments32 min readEA link

AI Gover­nance to Avoid Ex­tinc­tion: The Strate­gic Land­scape and Ac­tion­able Re­search Ques­tions [MIRI TGT Re­search Agenda]

peterbarnett5 May 2025 19:13 UTC
65 points
1 comment8 min readEA link
(techgov.intelligence.org)

Sin­ga­pore’s Tech­ni­cal AI Align­ment Re­search Ca­reer Guide

Yi-Yang26 Aug 2020 8:09 UTC
34 points
7 comments8 min readEA link

The Oper­a­tor’s Gam­ble: A Pivot to Ma­te­rial Con­se­quence in AI Safety

Ihor Ivliev21 Jul 2025 19:33 UTC
−1 points
0 comments4 min readEA link

Light­ning Post: Things peo­ple in AI Safety should stop talk­ing about

Prometheus20 Jun 2023 15:00 UTC
5 points
3 comments2 min readEA link

Ap­ply to the sec­ond ML for Align­ment Boot­camp (MLAB 2) in Berkeley [Aug 15 - Fri Sept 2]

Buck6 May 2022 0:19 UTC
111 points
7 comments6 min readEA link

Sup­port­ing global co­or­di­na­tion in AI de­vel­op­ment: Why and how to con­tribute to in­ter­na­tional AI standards

pcihon17 Apr 2019 22:17 UTC
21 points
4 comments1 min readEA link

Vignettes Work­shop (AI Im­pacts)

kokotajlod15 Jun 2021 11:02 UTC
43 points
5 comments1 min readEA link

In­tro­duc­ing the AI Ob­jec­tives In­sti­tute’s Re­search: Differ­en­tial Paths to­ward Safe and Benefi­cial AI

cmck5 May 2023 20:26 UTC
43 points
1 comment8 min readEA link

Ap­ply to MATS 8.0!

Ryan Kidd20 Mar 2025 2:17 UTC
33 points
0 comments4 min readEA link

[Question] Track­ing Com­pute Stocks and Flows: Case Stud­ies?

Cullen 🔸5 Oct 2022 17:54 UTC
34 points
1 comment1 min readEA link

Towards shut­down­able agents via stochas­tic choice

EJT8 Jul 2024 10:14 UTC
26 points
1 comment23 min readEA link
(arxiv.org)

My cur­rent thoughts on MIRI’s “highly re­li­able agent de­sign” work

Daniel_Dewey7 Jul 2017 1:17 UTC
60 points
59 comments19 min readEA link

Is GPT-3 the death of the pa­per­clip max­i­mizer?

matthias_samwald3 Aug 2020 11:34 UTC
4 points
1 comment1 min readEA link

AI Twit­ter ac­counts to fol­low?

Adrian Salustri10 Jun 2022 6:19 UTC
1 point
2 comments1 min readEA link

[Question] Huh. Bing thing got me real anx­ious about AI. Re­sources to help with that please?

Arvin15 Feb 2023 16:55 UTC
2 points
7 comments1 min readEA link

New Se­quence—Towards a wor­ld­wide, wa­ter­tight Wind­fall Clause

John Bridge 🔸7 Apr 2022 15:02 UTC
25 points
4 comments8 min readEA link

There are two fac­tions work­ing to pre­vent AI dan­gers. Here’s why they’re deeply di­vided.

Sharmake10 Aug 2022 19:52 UTC
10 points
0 comments4 min readEA link
(www.vox.com)

NIST staffers re­volt against ex­pected ap­point­ment of ‘effec­tive al­tru­ist’ AI re­searcher to US AI Safety Institute

Phib8 Mar 2024 17:47 UTC
39 points
16 comments1 min readEA link
(venturebeat.com)

[Question] Benefits/​Risks of Scott Aaron­son’s Ortho­dox/​Re­form Fram­ing for AI Alignment

Jeremy21 Nov 2022 17:47 UTC
15 points
5 comments1 min readEA link
(scottaaronson.blog)

Quick Thoughts on Lan­guage Models

RohanS19 Jul 2023 16:51 UTC
10 points
2 comments4 min readEA link
(www.lesswrong.com)

#188 – On whether sci­ence is good (Matt Clancy on the 80,000 Hours Pod­cast)

80000_Hours24 May 2024 15:04 UTC
13 points
0 comments17 min readEA link

Con­nect For An­i­mals 2025 Strate­gic Plan

Steven Rouk13 Feb 2025 15:49 UTC
17 points
1 comment13 min readEA link

Elic­it­ing in­tu­itions: Ex­plor­ing an area for EA psychology

Daniel_Friedrich21 Apr 2025 15:13 UTC
11 points
1 comment8 min readEA link

Distil­la­tion of “How Likely is De­cep­tive Align­ment?”

NickGabs1 Dec 2022 20:22 UTC
10 points
1 comment10 min readEA link

A Rocket–In­ter­pretabil­ity Analogy

plex21 Oct 2024 13:55 UTC
13 points
1 comment1 min readEA link

‘Force mul­ti­pli­ers’ for EA research

Craig Drayton18 Jun 2022 13:39 UTC
18 points
7 comments4 min readEA link

OPEC for a slow AGI takeoff

vyrax21 Apr 2023 10:53 UTC
4 points
0 comments3 min readEA link

Shul­man and Yud­kowsky on AI progress

CarlShulman4 Dec 2021 11:37 UTC
46 points
0 comments20 min readEA link

De­cep­tion as the op­ti­mal: mesa-op­ti­miz­ers and in­ner al­ign­ment

Eleni_A16 Aug 2022 3:45 UTC
19 points
0 comments5 min readEA link

On AI Weapons

kbog13 Nov 2019 12:48 UTC
76 points
10 comments30 min readEA link

Cost-effec­tive­ness anal­y­sis of ~1260 USD worth of so­cial me­dia ads for fel­low­ship marketing

gergo25 Jan 2024 15:18 UTC
61 points
5 comments2 min readEA link

How to think about slow­ing AI

Zach Stein-Perlman17 Sep 2023 11:23 UTC
74 points
9 comments3 min readEA link

Sen­tinel Fund­ing Memo — Miti­gat­ing GCRs with Fore­cast­ing & Emer­gency Response

Saul Munn6 Nov 2024 1:57 UTC
47 points
5 comments13 min readEA link

Read More News

utilistrutil16 Mar 2025 21:31 UTC
16 points
5 comments5 min readEA link

AI Safety Newslet­ter #5: Ge­offrey Hin­ton speaks out on AI risk, the White House meets with AI labs, and Tro­jan at­tacks on lan­guage models

Center for AI Safety9 May 2023 15:26 UTC
60 points
0 comments4 min readEA link
(newsletter.safe.ai)

AI Gover­nance Needs Tech­ni­cal Work

Mau5 Sep 2022 22:25 UTC
121 points
3 comments8 min readEA link

Skill up in ML for AI safety with the In­tro to ML Safety course (Spring 2023)

james5 Jan 2023 11:02 UTC
36 points
3 comments2 min readEA link

Ex­pres­sion of In­ter­est: Re­think Pri­ori­ties’ AI Strat­egy Con­trac­tor Pool

kierangreig🔸17 Jun 2025 17:15 UTC
35 points
9 comments1 min readEA link

Analysing a 2036 Takeover Scenario

ukc100146 Oct 2022 20:48 UTC
4 points
1 comment27 min readEA link

In­for­mat­ica: Spe­cial Is­sue on Superintelligence

RyanCarey3 May 2017 5:05 UTC
7 points
0 comments2 min readEA link

Should ChatGPT make us down­weight our be­lief in the con­scious­ness of non-hu­man an­i­mals?

splinter18 Feb 2023 23:29 UTC
11 points
15 comments2 min readEA link

AI Gover­nance Read­ing Group Guide

AHT25 Jun 2020 10:16 UTC
26 points
2 comments3 min readEA link

[Question] 1h-vol­un­teers needed for a small AI Safety-re­lated re­search pro­ject

PabloAMC 🔸16 Aug 2021 17:51 UTC
4 points
0 comments1 min readEA link

3. Why im­par­tial al­tru­ists should sus­pend judg­ment un­der unawareness

Anthony DiGiovanni2 Jun 2025 8:54 UTC
37 points
0 comments16 min readEA link

[Question] SWE vs AIS

sammyboiz🔸21 Feb 2025 1:48 UTC
22 points
7 comments1 min readEA link

New Fron­tiers in AI Safety

Hans Gundlach2 Apr 2025 2:00 UTC
6 points
0 comments4 min readEA link
(drive.google.com)

Pod­cast: Shoshan­nah Tekofsky on skil­ling up in AI safety, vis­it­ing Berkeley, and de­vel­op­ing novel re­search ideas

Akash25 Nov 2022 20:47 UTC
14 points
0 comments9 min readEA link

NTIA Solic­its Com­ments on Open-Weight AI Models

Jacob Woessner6 Mar 2024 20:05 UTC
11 points
1 comment2 min readEA link
(www.ntia.gov)

Com­mon mis­con­cep­tions about OpenAI

Jacob_Hilton25 Aug 2022 14:08 UTC
51 points
2 comments1 min readEA link
(www.lesswrong.com)

Large Lan­guage Models Pass the Tur­ing Test

Matrice Jacobine2 Apr 2025 5:41 UTC
11 points
6 comments1 min readEA link
(arxiv.org)

Balanc­ing safety and waste

Daniel_Friedrich17 Mar 2024 10:57 UTC
6 points
0 comments8 min readEA link

AI Im­pacts: His­toric trends in tech­nolog­i­cal progress

Aaron Gertler 🔸12 Feb 2020 0:08 UTC
55 points
5 comments3 min readEA link

Keep Chas­ing AI Safety Press Coverage

Gil4 Apr 2023 20:40 UTC
106 points
16 comments5 min readEA link

Euro­pean Union AI Devel­op­ment and Gover­nance Part­ner­ships

EU AI Governance19 Jan 2022 10:26 UTC
22 points
1 comment4 min readEA link

Nav­i­gat­ing the Fu­ture: A Guide on How to Stay Safe with AI | Em­manuel Katto Uganda

emmanuelkatto28 Aug 2023 11:38 UTC
2 points
0 comments2 min readEA link

Sixty years af­ter the Cuban Mis­sile Cri­sis, a new era of global catas­trophic risks

christian.r13 Oct 2022 11:25 UTC
31 points
0 comments1 min readEA link
(thebulletin.org)

An­nounc­ing Su­per­in­tel­li­gence Imag­ined: A cre­ative con­test on the risks of superintelligence

TaylorJns12 Jun 2024 15:20 UTC
17 points
0 comments1 min readEA link

[Paper] AI Sand­bag­ging: Lan­guage Models can Strate­gi­cally Un­der­perform on Evaluations

Teun van der Weij13 Jun 2024 10:04 UTC
24 points
2 comments2 min readEA link
(arxiv.org)

AI com­pa­nies aren’t re­ally us­ing ex­ter­nal evaluators

Zach Stein-Perlman26 May 2024 19:05 UTC
88 points
4 comments4 min readEA link

Ap­ply for the ML Win­ter Camp in Cam­bridge, UK [2-10 Jan]

Nathan_Barnard2 Dec 2022 19:33 UTC
50 points
11 comments2 min readEA link

FLF Fel­low­ship on AI for Hu­man Rea­son­ing: $25-50k, 12 weeks

Oliver Sourbut19 May 2025 13:25 UTC
69 points
2 comments2 min readEA link
(www.flf.org)

[Cross­post] Why Un­con­trol­lable AI Looks More Likely Than Ever

Otto8 Mar 2023 15:33 UTC
49 points
6 comments4 min readEA link
(time.com)

From lan­guage to ethics by au­to­mated reasoning

Michele Campolo21 Nov 2021 15:16 UTC
8 points
0 comments6 min readEA link

Trans­for­ma­tive trust­build­ing via ad­vance­ments in de­cen­tral­ized lie detection

trevor116 Mar 2024 5:56 UTC
4 points
1 comment38 min readEA link
(www.ncbi.nlm.nih.gov)

[Question] What AI Posts Do You Want Distil­led?

brook25 Aug 2023 9:00 UTC
15 points
3 comments1 min readEA link

In­ter­na­tional AI In­sti­tu­tions: a liter­a­ture re­view of mod­els, ex­am­ples, and pro­pos­als

MMMaas26 Sep 2023 15:26 UTC
53 points
0 comments2 min readEA link

A Utili­tar­ian Frame­work with an Em­pha­sis on Self-Es­teem and Rights

Sean Sweeney8 Apr 2024 11:15 UTC
7 points
0 comments30 min readEA link

AISN #34: New Mili­tary AI Sys­tems Plus, AI Labs Fail to Uphold Vol­un­tary Com­mit­ments to UK AI Safety In­sti­tute, and New AI Policy Pro­pos­als in the US Senate

Center for AI Safety2 May 2024 16:12 UTC
21 points
5 comments8 min readEA link
(newsletter.safe.ai)

Repli­cat­ing AI Debate

Anthony Fleming1 Feb 2025 23:19 UTC
9 points
0 comments5 min readEA link

[Question] Brief sum­mary of key dis­agree­ments in AI Risk

Aryeh Englander26 Dec 2019 19:40 UTC
31 points
3 comments1 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

Michael St Jules 🔸3 Oct 2021 16:55 UTC
48 points
19 comments1 min readEA link

Oper­a­tions in AI Safety: A One-Year Per­spec­tive and Advice

mick24 Jul 2025 12:39 UTC
15 points
0 comments10 min readEA link
(mickzijdel.com)

A new pro­cess for map­ping discussions

Nathan Young30 Sep 2024 8:57 UTC
11 points
4 comments6 min readEA link
(open.substack.com)

The EU AI Act needs a defi­ni­tion of high-risk foun­da­tion mod­els to avoid reg­u­la­tory over­reach and backlash

matthias_samwald31 May 2023 15:34 UTC
17 points
0 comments4 min readEA link

#172 – Why you should stop read­ing the news (Bryan Ca­plan on the 80,000 Hours Pod­cast)

80000_Hours22 Nov 2023 18:29 UTC
20 points
1 comment20 min readEA link

AI Safety Memes Wiki

plex24 Jul 2024 18:53 UTC
6 points
0 comments1 min readEA link
(aisafety.info)

UK’s new 10-year “Na­tional AI Strat­egy,” re­leased today

jared_m22 Sep 2021 11:18 UTC
28 points
7 comments1 min readEA link

Between Science Fic­tion and Emerg­ing Real­ity: Are We Ready for Digi­tal Per­sons?

Alex (Αλέξανδρος)13 Mar 2025 16:09 UTC
5 points
1 comment5 min readEA link

Ex­plain­ing all the US semi­con­duc­tor ex­port controls

ZacRichardson17 Jan 2025 18:00 UTC
20 points
3 comments9 min readEA link

Could Reg­u­la­tory Cost-Benefit Anal­y­sis Stop Fron­tier AI Reg­u­la­tions in the US?

Luise11 Jul 2024 15:25 UTC
21 points
1 comment14 min readEA link

Teach­ing Hindi Liter­acy with an AI tutor

Tom Delaney15 May 2025 6:49 UTC
40 points
5 comments6 min readEA link

AI al­ign­ment with hu­mans… but with which hu­mans?

Geoffrey Miller8 Sep 2022 23:43 UTC
51 points
20 comments3 min readEA link

Fore­cast­ing Com­pute—Trans­for­ma­tive AI and Com­pute [2/​4]

lennart1 Oct 2021 8:25 UTC
39 points
6 comments19 min readEA link

What to sug­gest com­pa­nies & en­trepreneurs do to use AI safely?

AlfalfaBloom5 Apr 2023 22:36 UTC
11 points
1 comment1 min readEA link

Jan Kirch­ner on AI Alignment

birtes17 Jan 2023 15:11 UTC
5 points
0 comments1 min readEA link

Ac­cel­er­ated Hori­zons — Pod­cast + Blog Idea

Cadejs16 Apr 2025 14:20 UTC
2 points
3 comments1 min readEA link

A So­ciety of Di­verse Cognition

atb9 Jun 2025 15:22 UTC
8 points
1 comment13 min readEA link

The Eng­ine of Foreclosure

Ihor Ivliev5 Jul 2025 15:26 UTC
0 points
0 comments25 min readEA link

Reflec­tions on Com­pat­i­bil­ism, On­tolog­i­cal Trans­la­tions, and the Ar­tifi­cial Divine

Mahdi Complex7 May 2025 12:17 UTC
−4 points
0 comments22 min readEA link

[Question] How come there isn’t that much fo­cus in EA on re­search into whether /​ when AI’s are likely to be sen­tient?

callum27 Apr 2023 10:09 UTC
83 points
23 comments1 min readEA link

6 (Po­ten­tial) Mis­con­cep­tions about AI Intellectuals

Ozzie Gooen14 Feb 2025 23:51 UTC
30 points
2 comments12 min readEA link

The Need for Poli­ti­cal Ad­ver­tis­ing (Post 2 of 7 on AI Gover­nance)

Jason Green-Lowe21 May 2025 0:52 UTC
55 points
0 comments13 min readEA link

AI Safety Ideas: A col­lab­o­ra­tive AI safety re­search platform

Apart Research17 Oct 2022 17:01 UTC
67 points
13 comments4 min readEA link

Re­port on Fron­tier Model Training

YafahEdelman30 Aug 2023 20:04 UTC
19 points
1 comment21 min readEA link
(docs.google.com)

Se­cond call: CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram Rachum5 Feb 2023 12:19 UTC
2 points
0 comments2 min readEA link

MATS Alumni Im­pact Analysis

utilistrutil2 Oct 2024 23:44 UTC
16 points
1 comment11 min readEA link

[Question] Ca­reer Ad­vice: Philos­o­phy + Pro­gram­ming → AI Safety

tcelferact18 Mar 2022 15:09 UTC
30 points
11 comments2 min readEA link

[Question] Ci­ti­zens Group for Steer­ing AI

Odette B11 Apr 2024 9:15 UTC
13 points
0 comments1 min readEA link

Idea: an AI gov­er­nance group colo­cated with ev­ery AI re­search group!

capybaralet7 Dec 2020 23:41 UTC
8 points
1 comment2 min readEA link

New co­op­er­a­tion mechanism—quadratic fund­ing with­out a match­ing pool

Filip Sondej5 Jun 2022 13:55 UTC
55 points
11 comments5 min readEA link

The Offense-Defense Balance Rarely Changes

Maxwell Tabarrok9 Dec 2023 15:22 UTC
82 points
16 comments3 min readEA link
(maximumprogress.substack.com)

Ab­solute Zero: Re­in­forced Self-play Rea­son­ing with Zero Data

Matrice Jacobine12 May 2025 15:20 UTC
14 points
1 comment1 min readEA link
(www.arxiv.org)

Eli’s re­view of “Is power-seek­ing AI an ex­is­ten­tial risk?”

elifland30 Sep 2022 12:21 UTC
58 points
3 comments3 min readEA link
(docs.google.com)

Mak­ing EA more in­clu­sive, rep­re­sen­ta­tive, and im­pact­ful in Africa

Ashura Batungwanayo17 Aug 2023 20:19 UTC
70 points
13 comments4 min readEA link

Chris Olah on what the hell is go­ing on in­side neu­ral networks

80000_Hours4 Aug 2021 15:13 UTC
5 points
0 comments133 min readEA link

[Question] Should AI writ­ers be pro­hibited in ed­u­ca­tion?

Eleni_A16 Jan 2023 22:29 UTC
3 points
2 comments1 min readEA link

AI Risk In­tro 1: Ad­vanced AI Might Be Very Bad

L Rudolf L11 Sep 2022 10:57 UTC
22 points
0 comments30 min readEA link

AISN #16: White House Se­cures Vol­un­tary Com­mit­ments from Lead­ing AI Labs and Les­sons from Oppenheimer

Center for AI Safety25 Jul 2023 16:45 UTC
7 points
0 comments6 min readEA link
(newsletter.safe.ai)

On Ar­tifi­cial Gen­eral In­tel­li­gence: Ask­ing the Right Questions

Heather Douglas2 Oct 2022 5:00 UTC
−1 points
7 comments3 min readEA link

[Question] Share AI Safety Ideas: Both Crazy and Not. №2

ank31 Mar 2025 18:45 UTC
1 point
11 comments1 min readEA link

Tar­bell is hiring for 3 roles

Cillian_17 Jul 2024 12:19 UTC
48 points
1 comment5 min readEA link

LW4EA: Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

Jeremy17 May 2022 3:05 UTC
11 points
1 comment1 min readEA link
(www.lesswrong.com)

Grad­ual Disem­pow­er­ment: Sys­temic Ex­is­ten­tial Risks from In­cre­men­tal AI Development

Jan_Kulveit30 Jan 2025 17:07 UTC
38 points
4 comments2 min readEA link
(gradual-disempowerment.ai)

The Vi­talik Bu­terin Fel­low­ship in AI Ex­is­ten­tial Safety is open for ap­pli­ca­tions!

Cynthia Chen14 Oct 2022 3:23 UTC
38 points
0 comments2 min readEA link

How difficult is AI Align­ment?

SammyDMartin13 Sep 2024 17:55 UTC
12 points
0 comments1 min readEA link
(www.lesswrong.com)

AI may at­tain hu­man level soon

Vishakha Agrawal23 Apr 2025 11:10 UTC
2 points
1 comment2 min readEA link
(aisafety.info)

[Question] Thoughts on this $16.7M “AI safety” grant?

defun 🔸16 Jul 2024 9:16 UTC
61 points
24 comments1 min readEA link

[Question] What are some cur­rent, already pre­sent challenges from AI?

nonzerosum30 Jun 2022 15:44 UTC
5 points
1 comment1 min readEA link

Con­ti­nu­ity Assumptions

Jan_Kulveit13 Jun 2022 21:36 UTC
44 points
4 comments4 min readEA link
(www.alignmentforum.org)

The Benefits of Distil­la­tion in Research

Jonas Hallgren 🔸4 Mar 2023 19:19 UTC
45 points
2 comments5 min readEA link

Why Brains Beat AI

Wayne_Hsiung12 Jun 2025 20:25 UTC
4 points
0 comments1 min readEA link
(blog.simpleheart.org)

Va­ri­eties of fake al­ign­ment (Sec­tion 1.1 of “Schem­ing AIs”)

Joe_Carlsmith21 Nov 2023 15:00 UTC
6 points
0 comments10 min readEA link

Tips for con­duct­ing wor­ld­view investigations

lukeprog12 Apr 2022 19:28 UTC
88 points
4 comments2 min readEA link

EA AI/​Emerg­ing Tech Orgs Should Be In­volved with Pa­tent Office Partnership

Bridges12 Jun 2022 22:32 UTC
10 points
0 comments1 min readEA link

Re­sources that (I think) new al­ign­ment re­searchers should know about

Akash28 Oct 2022 22:13 UTC
20 points
2 comments4 min readEA link

Up­date from Cam­paign for AI Safety

Nik Samoylov1 Jun 2023 10:46 UTC
22 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

How do take­off speeds af­fect the prob­a­bil­ity of bad out­comes from AGI?

KR7 Jul 2020 17:53 UTC
18 points
0 comments8 min readEA link

Com­pute & An­titrust: Reg­u­la­tory im­pli­ca­tions of the AI hard­ware sup­ply chain, from chip de­sign to cloud APIs

HaydnBelfield19 Aug 2022 17:20 UTC
32 points
0 comments6 min readEA link
(verfassungsblog.de)

Some ad­vice on in­de­pen­dent research

mariushobbhahn8 Nov 2022 14:46 UTC
65 points
3 comments10 min readEA link

Is China Be­com­ing a Science and Tech­nol­ogy Su­per­power? Jeffrey Ding’s In­sight on China’s Diffu­sion Deficit

Wyman Kwok25 Apr 2023 17:00 UTC
10 points
0 comments1 min readEA link

A challenge for AGI or­ga­ni­za­tions, and a challenge for readers

RobBensinger1 Dec 2022 23:11 UTC
172 points
13 comments2 min readEA link

A Tri-Opti Com­pat­i­bil­ity Problem

wallower1 Mar 2025 19:48 UTC
1 point
0 comments1 min readEA link
(philpapers.org)

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKee17 Jun 2022 11:52 UTC
32 points
1 comment2 min readEA link

AMA: Ajeya Co­tra, re­searcher at Open Phil

Ajeya28 Jan 2021 17:38 UTC
84 points
105 comments1 min readEA link

Mis­takes I made run­ning an AI safety stu­dent group

cb26 Feb 2025 15:07 UTC
26 points
0 comments7 min readEA link

The Dis­solu­tion of AI Safety

Roko12 Dec 2024 10:46 UTC
−7 points
0 comments1 min readEA link
(www.transhumanaxiology.com)

#168 – Whether deep his­tory says we’re head­ing for an in­tel­li­gence ex­plo­sion (Ian Mor­ris on the 80,000 Hours Pod­cast)

80000_Hours24 Oct 2023 15:24 UTC
11 points
2 comments18 min readEA link

How likely are ma­lign pri­ors over ob­jec­tives? [aborted WIP]

David Johnston11 Nov 2022 6:03 UTC
6 points
0 comments2 min readEA link

MIRI Con­ver­sa­tions: Tech­nol­ogy Fore­cast­ing & Grad­u­al­ism (Distil­la­tion)

Callum McDougall13 Jul 2022 10:45 UTC
27 points
9 comments19 min readEA link

The Digi­tal Maieu­tic: Socrates and the Art of Prompting

Rodo30 May 2025 18:58 UTC
3 points
2 comments4 min readEA link

Re­sults from the lan­guage model hackathon

Esben Kran10 Oct 2022 8:29 UTC
23 points
2 comments4 min readEA link

[Question] “Epistemic maps” for AI De­bates? (or for other is­sues)

Marcel230 Aug 2021 4:59 UTC
14 points
9 comments5 min readEA link

Biosafety Reg­u­la­tions (BMBL) and their rele­vance for AI

stepanlos29 Jun 2023 19:20 UTC
8 points
0 comments4 min readEA link

[Question] Is there ev­i­dence that recom­mender sys­tems are chang­ing users’ prefer­ences?

zdgroff12 Apr 2021 19:11 UTC
60 points
15 comments1 min readEA link

(My sug­ges­tions) On Begin­ner Steps in AI Alignment

Joseph Bloom22 Sep 2022 15:32 UTC
37 points
3 comments9 min readEA link

Some AI re­search ar­eas and their rele­vance to ex­is­ten­tial safety

Andrew Critch15 Dec 2020 12:15 UTC
12 points
1 comment56 min readEA link
(alignmentforum.org)

[Linkpost] Hu­man-nar­rated au­dio ver­sion of “Is Power-Seek­ing AI an Ex­is­ten­tial Risk?”

Joe_Carlsmith31 Jan 2023 19:19 UTC
9 points
0 comments1 min readEA link

Con­clu­sion and Bibliog­ra­phy for “Un­der­stand­ing the diffu­sion of large lan­guage mod­els”

Ben Cottier21 Dec 2022 13:50 UTC
12 points
0 comments11 min readEA link

Alexan­der and Yud­kowsky on AGI goals

Scott Alexander31 Jan 2023 23:36 UTC
29 points
1 comment26 min readEA link

[Question] If AIs had sub­cor­ti­cal brain simu­la­tion, would that solve the al­ign­ment prob­lem?

Rainbow Affect31 Jul 2023 15:48 UTC
1 point
0 comments2 min readEA link

When can a mimic sur­prise you? Why gen­er­a­tive mod­els han­dle seem­ingly ill-posed problems

David Johnston6 Nov 2022 11:46 UTC
6 points
0 comments16 min readEA link

An ap­praisal of the Fu­ture of Life In­sti­tute AI ex­is­ten­tial risk program

PabloAMC 🔸11 Dec 2022 13:36 UTC
29 points
0 comments1 min readEA link

The prob­a­bil­ity that Ar­tifi­cial Gen­eral In­tel­li­gence will be de­vel­oped by 2043 is ex­tremely low.

cveres6 Oct 2022 11:26 UTC
2 points
12 comments13 min readEA link

Op­ti­mistic As­sump­tions, Longterm Plan­ning, and “Cope”

Raemon18 Jul 2024 0:06 UTC
15 points
1 comment7 min readEA link

The Ver­ifi­ca­tion Gap: A Scien­tific Warn­ing on the Limits of AI Safety

Ihor Ivliev24 Jun 2025 19:08 UTC
3 points
0 comments2 min readEA link

Big Pic­ture AI Safety: Introduction

EuanMcLean23 May 2024 11:28 UTC
32 points
3 comments5 min readEA link

Con­scious AI con­cerns all of us. [Con­scious AI & Public Per­cep­tions]

ixex3 Jul 2024 3:12 UTC
25 points
1 comment12 min readEA link

A new pro­posal for reg­u­lat­ing AI in the EU

EdoArad26 Apr 2021 17:25 UTC
37 points
3 comments1 min readEA link
(www.bbc.com)

AI Offense Defense Balance in a Mul­tipo­lar World

Otto17 Jul 2025 9:47 UTC
15 points
0 comments19 min readEA link
(www.existentialriskobservatory.org)

How might we al­ign trans­for­ma­tive AI if it’s de­vel­oped very soon?

Holden Karnofsky29 Aug 2022 15:48 UTC
164 points
17 comments44 min readEA link

Chee­tah-8 Eth­i­cal Frame­work: Evolu­tion from Ego­ism to Altruism

DongHun Lee31 May 2025 15:04 UTC
−1 points
0 comments2 min readEA link

Emo­tion Align­ment as AI Safety: In­tro­duc­ing Emo­tion Fire­wall 1.0

DongHun Lee12 May 2025 18:05 UTC
1 point
0 comments2 min readEA link

The Limit of Lan­guage Models

𝕮𝖎𝖓𝖊𝖗𝖆26 Dec 2022 11:17 UTC
10 points
0 comments4 min readEA link

Arkose may be clos­ing, but you can help

Arkose1 May 2025 11:09 UTC
58 points
6 comments2 min readEA link

Ret­ro­spec­tive on the AI Safety Field Build­ing Hub

Vael Gates2 Feb 2023 2:06 UTC
64 points
2 comments9 min readEA link

Will ex­plo­sive growth stem pri­mar­ily from AI R&D au­toma­tion?

OscarD🔸28 Mar 2025 20:25 UTC
39 points
3 comments4 min readEA link

[Question] How much should states in­vest in con­tin­gency plans for wide­spread in­ter­net out­age?

Kinoshita Yoshikazu (pseudonym)7 Apr 2023 16:05 UTC
2 points
0 comments1 min readEA link

Un­der­stand­ing AI World Models w/​ Chris Canal

Jacob-Haimes27 Jan 2025 16:37 UTC
5 points
0 comments1 min readEA link
(kairos.fm)

Black Box In­ves­ti­ga­tions Re­search Hackathon

Esben Kran15 Sep 2022 10:09 UTC
23 points
0 comments2 min readEA link

AI X-Risk: In­te­grat­ing on the Shoulders of Giants

TD_Pilditch1 Nov 2022 16:07 UTC
34 points
0 comments47 min readEA link

AI Safety Hub Ser­bia Offi­cial Opening

Dušan D. Nešić (Dushan)28 Oct 2023 17:03 UTC
20 points
1 comment3 min readEA link
(forum.effectivealtruism.org)

Is prin­ci­pled mass-out­reach pos­si­ble, for AGI X-risk?

Nicholas Kross21 Jan 2024 17:45 UTC
12 points
2 comments3 min readEA link

OpenAI’s CBRN tests seem unclear

Luca Righetti 🔸21 Nov 2024 17:26 UTC
82 points
3 comments7 min readEA link

#191 (Part 1) – The econ­omy and na­tional se­cu­rity af­ter AGI (Carl Shul­man on the 80,000 Hours Pod­cast)

80000_Hours27 Jun 2024 19:10 UTC
45 points
0 comments19 min readEA link

In­tro to car­ing about AI al­ign­ment as an EA cause

So8res14 Apr 2017 0:42 UTC
28 points
10 comments25 min readEA link

De­sen­si­tiz­ing Deepfakes

Phib29 Mar 2023 1:20 UTC
22 points
10 comments1 min readEA link

#217 – The most im­por­tant graph in AI right now (Beth Barnes on The 80,000 Hours Pod­cast)

80000_Hours2 Jun 2025 16:52 UTC
16 points
1 comment26 min readEA link

Effec­tive Altru­ism Florida’s AI Ex­pert Panel—Record­ing and Slides Available

Sam_E_2419 May 2023 19:15 UTC
2 points
0 comments1 min readEA link

Ex­is­ten­tial AI Safety is NOT sep­a­rate from near-term applications

stecas13 Dec 2022 14:47 UTC
28 points
9 comments3 min readEA link

#213 – AI caus­ing a “cen­tury in a decade” — and how we’re com­pletely un­pre­pared (Will MacAskill on The 80,000 Hours Pod­cast)

80000_Hours11 Mar 2025 17:55 UTC
24 points
0 comments22 min readEA link

Two sources of be­yond-epi­sode goals (Sec­tion 2.2.2 of “Schem­ing AIs”)

Joe_Carlsmith28 Nov 2023 13:49 UTC
8 points
0 comments13 min readEA link

My Overview of the AI Align­ment Land­scape: A Bird’s Eye View

Neel Nanda15 Dec 2021 23:46 UTC
45 points
15 comments16 min readEA link
(www.alignmentforum.org)

#194 – Defen­sive ac­cel­er­a­tion and how to reg­u­late AI when you fear gov­ern­ment (Vi­talik Bu­terin on the 80,000 Hours Pod­cast)

80000_Hours31 Jul 2024 20:28 UTC
49 points
5 comments21 min readEA link

Catas­trophic Risks from AI #3: AI Race

Dan H23 Jun 2023 19:21 UTC
9 points
0 comments29 min readEA link
(arxiv.org)

Challenges and Op­por­tu­ni­ties of Re­in­force­ment Learn­ing in Robotics: Anal­y­sis of Cur­rent Trends

Raymundo Rodríguez Alva14 Oct 2024 13:22 UTC
11 points
1 comment17 min readEA link

Fu­ture Mat­ters #5: su­per­vol­ca­noes, AI takeover, and What We Owe the Future

Pablo14 Sep 2022 13:02 UTC
31 points
5 comments18 min readEA link

Chat­bot for Poul­try Farms: Re­spond­ing to Avian In­fluenza in Mexico

Ever Arboleda1 May 2025 14:55 UTC
1 point
0 comments13 min readEA link

From Lay­off to Co-found­ing in a Breath­tak­ing Two Months

Harry Luk26 Sep 2023 7:35 UTC
44 points
3 comments17 min readEA link

[Question] Is there any re­search or fore­casts of how likely AI Align­ment is go­ing to be a hard vs. easy prob­lem rel­a­tive to ca­pa­bil­ities?

Jordan Arel14 Aug 2022 15:58 UTC
8 points
1 comment1 min readEA link

On Defer­ence and Yud­kowsky’s AI Risk Estimates

bgarfinkel19 Jun 2022 14:35 UTC
287 points
194 comments17 min readEA link

Mechanism De­sign for AI Safety—Read­ing Group Curriculum

Rubi J. Hudson25 Oct 2022 3:54 UTC
24 points
1 comment4 min readEA link

De­con­fus­ing ‘AI’ and ‘evolu­tion’

Remmelt22 Jul 2025 6:56 UTC
6 points
1 comment26 min readEA link

Draft re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_Carlsmith28 Apr 2021 21:41 UTC
88 points
34 comments1 min readEA link

[Closed] Prize and fast track to al­ign­ment re­search at ALTER

Vanessa18 Sep 2022 9:15 UTC
38 points
0 comments3 min readEA link

An­nounc­ing Apollo Research

mariushobbhahn30 May 2023 16:17 UTC
158 points
5 comments8 min readEA link

How dath ilan co­or­di­nates around solv­ing AI alignment

Thomas Kwa14 Apr 2022 1:53 UTC
13 points
1 comment5 min readEA link

AI Safety Hub Ser­bia Soft Launch

Dušan D. Nešić (Dushan)25 Jul 2023 19:39 UTC
29 points
3 comments3 min readEA link

ChatGPT bug leaked users’ con­ver­sa­tion histories

Ian Turner27 Mar 2023 0:17 UTC
15 points
2 comments1 min readEA link
(www.bbc.com)

[Closed] Hiring a math­e­mat­i­cian to work on the learn­ing-the­o­retic AI al­ign­ment agenda

Vanessa19 Apr 2022 6:49 UTC
53 points
4 comments2 min readEA link

In­tel­sat as a Model for In­ter­na­tional AGI Governance

rosehadshar13 Mar 2025 12:58 UTC
42 points
3 comments1 min readEA link
(www.forethought.org)

Call for Re­search Par­ti­ci­pants—EU/​China AI regulation

Jamie O'Donnell14 Jun 2024 17:30 UTC
3 points
0 comments1 min readEA link

Digi­tal Minds: Im­por­tance and Key Re­search Ques­tions

Andreas_Mogensen3 Jul 2024 8:59 UTC
83 points
1 comment15 min readEA link

The Author Who Knows Nothing: Socrates, Techne, and Barthes’ Scriptor

Rodo1 Jun 2025 9:49 UTC
0 points
0 comments4 min readEA link

The con­ver­gent dy­namic we missed

Remmelt12 Dec 2023 22:50 UTC
2 points
0 comments3 min readEA link

Do Self-Per­ceived Su­per­in­tel­li­gent LLMs Ex­hibit Misal­ign­ment?

Dave Banerjee 🔸29 Jun 2025 11:16 UTC
7 points
1 comment12 min readEA link
(davebanerjee.xyz)

Epoch is hiring a Re­search Data Analyst

merilalama22 Nov 2022 17:34 UTC
21 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

Why AGI Timeline Re­search/​Dis­course Might Be Overrated

Miles_Brundage3 Jul 2022 8:04 UTC
122 points
28 comments10 min readEA link

An­nounc­ing an Em­piri­cal AI Safety Program

Joshc13 Sep 2022 21:39 UTC
64 points
7 comments2 min readEA link

Pre­dict re­sponses to the “ex­is­ten­tial risk from AI” survey

RobBensinger28 May 2021 1:38 UTC
36 points
8 comments2 min readEA link

Brain­storm of things that could force an AI team to burn their lead

So8res25 Jul 2022 0:00 UTC
26 points
1 comment13 min readEA link

FLI is hiring a new Direc­tor of US Policy

aaguirre27 Jul 2022 0:07 UTC
14 points
0 comments1 min readEA link

Prevent­ing An­i­mal Suffer­ing Lock-in: Why Eco­nomic Tran­si­tions Matter

Karen Singleton28 Jul 2025 21:55 UTC
38 points
4 comments10 min readEA link

Some promis­ing ca­reer ideas be­yond 80,000 Hours’ pri­or­ity paths

Arden Koehler26 Jun 2020 10:34 UTC
142 points
28 comments15 min readEA link

RA Bounty: Look­ing for feed­back on screen­play about AI Risk

Writer26 Oct 2023 14:27 UTC
8 points
0 comments1 min readEA link

Anal­y­sis of key AI analogies

Kevin Kohler29 Jun 2024 18:16 UTC
35 points
2 comments15 min readEA link

[Question] Whose track record of AI pre­dic­tions would you like to see eval­u­ated?

Jonny Spicer 🔸29 Jan 2025 11:57 UTC
10 points
13 comments1 min readEA link

Crit­i­cism Thread: What things should OpenPhil im­prove on?

anonymousEA204 Feb 2023 8:16 UTC
85 points
8 comments2 min readEA link

An­nounc­ing In­sights for Impact

Christian Pearson4 Jan 2023 7:00 UTC
80 points
6 comments1 min readEA link

An­chor­ing fo­cal­ism and the Iden­ti­fi­able vic­tim effect: Bias in Eval­u­at­ing AGI X-Risks

Remmelt7 Jan 2023 9:59 UTC
−2 points
1 comment1 min readEA link

ML4Good West & Cen­tral Europe | Ap­pli­ca­tions Open

carolinaollive12 Mar 2025 0:02 UTC
7 points
3 comments2 min readEA link

Fake think­ing and real thinking

Joe_Carlsmith28 Jan 2025 20:05 UTC
78 points
3 comments38 min readEA link

New re­port on how much com­pu­ta­tional power it takes to match the hu­man brain (Open Philan­thropy)

Aaron Gertler 🔸15 Sep 2020 1:06 UTC
45 points
1 comment18 min readEA link
(www.openphilanthropy.org)

Keep Mak­ing AI Safety News

Gil31 Mar 2023 20:11 UTC
67 points
4 comments1 min readEA link

Re­sults from the AI x Democ­racy Re­search Sprint

Esben Kran14 Jun 2024 16:40 UTC
19 points
1 comment6 min readEA link

A Cal­ifor­nia Effect for Ar­tifi­cial Intelligence

henryj9 Sep 2022 14:17 UTC
73 points
1 comment4 min readEA link
(docs.google.com)

Meta AI an­nounces Cicero: Hu­man-Level Di­plo­macy play (with di­alogue)

Jacy22 Nov 2022 16:50 UTC
49 points
10 comments1 min readEA link
(www.science.org)

AI Safety Newslet­ter #3: AI policy pro­pos­als and a new challenger approaches

Oliver Z25 Apr 2023 16:15 UTC
35 points
1 comment4 min readEA link
(newsletter.safe.ai)

Eisen­hower’s Atoms for Peace Speech

Akash17 May 2023 16:10 UTC
17 points
1 comment11 min readEA link
(www.iaea.org)

Patch­ing ~All Se­cu­rity-Rele­vant Open-Source Soft­ware?

niplav25 Feb 2025 21:35 UTC
35 points
4 comments2 min readEA link

Fore­cast­ing Biose­cu­rity Risks from LLMs

Forecasting Research Institute1 Jul 2025 12:43 UTC
10 points
0 comments1 min readEA link
(forecastingresearch.org)

[Question] An eco­nomics of AI gov—best re­sources for

Liv26 Feb 2023 11:11 UTC
10 points
4 comments1 min readEA link

The AI Mes­siah

ryancbriggs5 May 2022 16:58 UTC
71 points
44 comments2 min readEA link

Sim­plic­ity ar­gu­ments for schem­ing (Sec­tion 4.3 of “Schem­ing AIs”)

Joe_Carlsmith7 Dec 2023 15:05 UTC
6 points
1 comment14 min readEA link

Owain Evans and Vic­to­ria Krakovna: Ca­reers in tech­ni­cal AI safety

EA Global3 Nov 2017 7:43 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

AGI Risk: How to in­ter­na­tion­ally reg­u­late in­dus­tries in non-democracies

Timothy_Liptrot16 May 2022 22:45 UTC
9 points
2 comments9 min readEA link

Can AI Align­ment Models Benefit from Indo-Euro­pean Tri­par­tite Struc­tures?

Paul Fallavollita2 May 2025 12:39 UTC
1 point
0 comments2 min readEA link

How use­ful for al­ign­ment-rele­vant work are AIs with short-term goals? (Sec­tion 2.2.4.3 of “Schem­ing AIs”)

Joe_Carlsmith1 Dec 2023 14:51 UTC
6 points
0 comments6 min readEA link

Sum­mer AI Safety In­tro Fel­low­ships in Bos­ton and On­line (Policy & Tech­ni­cal) – Ap­ply by June 6!

jandrade11229 May 2025 16:47 UTC
4 points
0 comments1 min readEA link

Can­cel­ling GPT subscription

adekcz20 May 2024 16:19 UTC
26 points
14 comments3 min readEA link

Ad­dress­ing challenges for s-risk re­duc­tion: Toward pos­i­tive com­mon-ground proxies

Teo Ajantaival22 Mar 2025 17:50 UTC
52 points
1 comment17 min readEA link

AI Safety Newslet­ter #7: Dis­in­for­ma­tion, Gover­nance Recom­men­da­tions for AI labs, and Se­nate Hear­ings on AI

Center for AI Safety23 May 2023 21:42 UTC
23 points
0 comments6 min readEA link
(newsletter.safe.ai)

[Question] Do­ing Global Pri­ori­ties or AI Policy re­search from re­mote lo­ca­tion?

With Love from Israel29 Oct 2019 9:34 UTC
30 points
4 comments1 min readEA link

AI Im­pacts Quar­terly Newslet­ter, Jan-Mar 2023

Harlan17 Apr 2023 23:07 UTC
20 points
1 comment3 min readEA link
(blog.aiimpacts.org)

Giv­ing AIs safe motivations

Joe_Carlsmith18 Aug 2025 18:02 UTC
19 points
0 comments51 min readEA link

Is it pos­si­bly de­sir­able for sen­tient ASI to ex­ter­mi­nate hu­mans?

Duckruck18 Jun 2024 15:20 UTC
0 points
4 comments1 min readEA link

How to make the best of the most im­por­tant cen­tury?

Holden Karnofsky14 Sep 2021 21:05 UTC
57 points
5 comments12 min readEA link

In­tro­duc­ing Deep Dive, a 201 AI policy course

Kambar17 Jun 2025 16:50 UTC
31 points
2 comments2 min readEA link

Open ques­tions on a Chi­nese in­va­sion of Taiwan and its effects on the semi­con­duc­tor stock

Yadav7 Dec 2023 16:39 UTC
21 points
0 comments2 min readEA link

Adam Smith Meets AI Doomers

JamesMiller31 Jan 2024 16:04 UTC
15 points
0 comments5 min readEA link

Aether July 2025 Update

RohanS1 Jul 2025 21:14 UTC
10 points
0 comments3 min readEA link

We Can’t Do Long Term Utili­tar­ian Calcu­la­tions Un­til We Know if AIs Can Be Con­scious or Not

Mike207312 Sep 2022 8:37 UTC
4 points
0 comments11 min readEA link

Data Poi­son­ing for Dum­mies (No Code, No Math)

Madhav Malhotra4 Sep 2023 20:48 UTC
7 points
0 comments3 min readEA link

Linkpost: Epis­tle to the Successors

ukc1001414 Jul 2024 20:07 UTC
4 points
0 comments1 min readEA link
(ukc10014.github.io)

Self-Limit­ing AI in AI Alignment

The_Lord's_Servant_28031 Dec 2022 19:07 UTC
2 points
1 comment1 min readEA link

The Elic­i­ta­tion Game: Eval­u­at­ing ca­pa­bil­ity elic­i­ta­tion techniques

Teun van der Weij27 Feb 2025 20:33 UTC
3 points
0 comments2 min readEA link

Com­po­nents of Strate­gic Clar­ity [Strate­gic Per­spec­tives on Long-term AI Gover­nance, #2]

MMMaas2 Jul 2022 11:22 UTC
66 points
0 comments6 min readEA link

Mas­sive Scal­ing Should be Frowned Upon

harsimony17 Nov 2022 17:44 UTC
9 points
0 comments5 min readEA link

Con­ver­sa­tion on AI risk with Adam Gleave

AI Impacts27 Dec 2019 21:43 UTC
18 points
3 comments4 min readEA link
(aiimpacts.org)

AI Gover­nance Course—Cur­ricu­lum and Application

Mau29 Nov 2021 13:29 UTC
94 points
9 comments1 min readEA link

Our Cur­rent Direc­tions in Mechanis­tic In­ter­pretabil­ity Re­search (AI Align­ment Speaker Series)

Group Organizer8 Apr 2022 17:08 UTC
3 points
0 comments1 min readEA link

The sec­ond bit­ter les­son — there’s a fun­da­men­tal prob­lem with al­ign­ing AI

aelwood19 Jan 2025 18:48 UTC
4 points
1 comment5 min readEA link
(pursuingreality.substack.com)

The Cam­paign Lab Tool Box Hack Day

DanR6 Feb 2024 16:39 UTC
1 point
0 comments1 min readEA link

AI scal­ing myths

Noah Varley🔸27 Jun 2024 20:29 UTC
30 points
0 comments1 min readEA link
(open.substack.com)

The case for be­com­ing a black-box in­ves­ti­ga­tor of lan­guage models

Buck6 May 2022 14:37 UTC
91 points
7 comments3 min readEA link

Slightly against al­ign­ing with neo-luddites

Matthew_Barnett26 Dec 2022 23:27 UTC
77 points
17 comments4 min readEA link

U.S. Com­merce Sec­re­tary Gina Raimondo An­nounces Ex­pan­sion of U.S. AI Safety In­sti­tute Lead­er­ship Team [and Paul Chris­ti­ano up­date]

Phib16 Apr 2024 17:10 UTC
116 points
8 comments1 min readEA link
(www.commerce.gov)

Eric Sch­midt on re­cur­sive self-improvement

Nikola5 Nov 2023 19:05 UTC
11 points
0 comments1 min readEA link
(www.youtube.com)

An­nounc­ing the Har­vard AI Safety Team

Xander12330 Jun 2022 18:34 UTC
128 points
4 comments5 min readEA link

New AI risk in­tro from Vox [link post]

JakubK21 Dec 2022 5:50 UTC
7 points
1 comment2 min readEA link
(www.vox.com)

Chris­ti­ano, Co­tra, and Yud­kowsky on AI progress

Ajeya25 Nov 2021 16:30 UTC
18 points
6 comments68 min readEA link

Against Learn­ing From Dra­matic Events (by Scott Alexan­der)

bern17 Jan 2024 16:34 UTC
46 points
3 comments2 min readEA link
(www.astralcodexten.com)

What does it take to defend the world against out-of-con­trol AGIs?

Steven Byrnes25 Oct 2022 14:47 UTC
43 points
0 comments30 min readEA link

Join Cam­bridge AI Safety Hub’s Leadership

Cambridge AI Safety Hub22 Jul 2025 15:13 UTC
9 points
0 comments2 min readEA link

AI Lab Re­tal­i­a­tion: A Sur­vival Guide

Jay Ready4 Jan 2025 23:05 UTC
8 points
1 comment12 min readEA link
(morelightinai.substack.com)

Com­mon Ge­netic Var­i­ants Linked to Drug-Re­sis­tant Epilepsy

Connor Wood16 Apr 2025 3:55 UTC
2 points
0 comments8 min readEA link

Ba­hamian Ad­ven­tures: An Epic Tale of En­trepreneur­ship, AI Strat­egy Re­search and Potatoes

Jaime Sevilla9 Aug 2022 8:37 UTC
67 points
9 comments4 min readEA link

Fun­da­men­tals of Global Pri­ori­ties Re­search in Eco­nomics Syllabus

poliboni8 Aug 2023 12:16 UTC
77 points
1 comment8 min readEA link

An Ex­tremely Opinionated An­no­tated List of My Favourite Mechanis­tic In­ter­pretabil­ity Papers

Neel Nanda18 Oct 2022 21:23 UTC
19 points
0 comments12 min readEA link
(www.neelnanda.io)

How much I’m pay­ing for AI pro­duc­tivity soft­ware (and the fu­ture of AI use)

jacquesthibs11 Oct 2024 17:11 UTC
30 points
17 comments8 min readEA link
(jacquesthibodeau.com)

Q1 AI Bench­mark­ing Re­sults: Hu­man Pros Crush Bots

Benjamin Wilson 🔸28 Jun 2025 17:22 UTC
16 points
0 comments22 min readEA link
(www.metaculus.com)

Will AI end ev­ery­thing? A guide to guess­ing | EAG Bay Area 23

Katja_Grace25 May 2023 17:01 UTC
76 points
4 comments21 min readEA link

A re­sponse to Matthews on AI Risk

RyanCarey11 Aug 2015 12:58 UTC
11 points
16 comments6 min readEA link

[Question] What’s the best ma­chine learn­ing newslet­ter? How do you keep up to date?

Matt Putz25 Mar 2022 14:36 UTC
13 points
12 comments1 min readEA link

Apollo Research is hiring an Evals Demonstration Engineer

Joping_Apollo Research6 Aug 2025 18:26 UTC
6 points
0 comments1 min readEA link

[Question] I’m interviewing Max Tegmark about AI safety and more. What should I ask him?

Robert_Wiblin13 May 2022 15:32 UTC
18 points
2 comments1 min readEA link

Deep athe­ism and AI risk

Joe_Carlsmith4 Jan 2024 18:58 UTC
65 points
4 comments27 min readEA link

Is “su­per­hu­man” AI fore­cast­ing BS? Some ex­per­i­ments on the “539″ bot from the Cen­tre for AI Safety

titotal18 Sep 2024 13:07 UTC
68 points
4 comments14 min readEA link
(open.substack.com)

Would you pur­sue soft­ware en­g­ineer­ing as a ca­reer to­day?

justaperson18 Mar 2023 3:33 UTC
8 points
15 comments3 min readEA link

AI Fore­cast­ing Dic­tionary (Fore­cast­ing in­fras­truc­ture, part 1)

terraform8 Aug 2019 13:16 UTC
18 points
0 comments5 min readEA link

Call for sub­mis­sions: Choice of Fu­tures sur­vey questions

c.trout30 Apr 2023 6:59 UTC
11 points
0 comments2 min readEA link
(airtable.com)

Reli­a­bil­ity, Se­cu­rity, and AI risk: Notes from in­fosec text­book chap­ter 1

Akash7 Apr 2023 15:47 UTC
15 points
0 comments4 min readEA link

ENAIS has launched a newslet­ter for AIS fieldbuilders

gergo22 Nov 2024 10:45 UTC
25 points
0 comments1 min readEA link

We don’t un­der­stand what hap­pened with cul­ture enough

Jan_Kulveit9 Oct 2023 14:56 UTC
29 points
2 comments6 min readEA link

EA Poland is fac­ing an ex­is­ten­tial risk

EA Poland10 Nov 2023 16:23 UTC
113 points
14 comments12 min readEA link

AI Welfare Risks

Adrià Moret2 May 2025 17:41 UTC
27 points
0 comments1 min readEA link
(philpapers.org)

[Question] What is the im­pact of chip pro­duc­tion on paus­ing AI de­vel­op­ment?

JohanEA10 Jan 2024 22:20 UTC
7 points
0 comments1 min readEA link

Do Not Tile the Light­cone with Your Con­fused Ontology

Jan_Kulveit13 Jun 2025 12:45 UTC
45 points
4 comments5 min readEA link
(boundedlyrational.substack.com)

AGI will be made of het­ero­ge­neous com­po­nents, Trans­former and Selec­tive SSM blocks will be among them

Roman Leventov27 Dec 2023 14:51 UTC
5 points
0 comments4 min readEA link

Ap­ply for ARBOx: an ML safety in­ten­sive [dead­line 13 Dec ’24]

Nick Marsh1 Dec 2024 18:13 UTC
20 points
0 comments1 min readEA link

80,000 Hours is pro­duc­ing AI in Con­text — a new YouTube chan­nel. Our first video, about the AI 2027 sce­nario, is up!

ChanaMessinger9 Jul 2025 18:22 UTC
239 points
35 comments3 min readEA link

A Ma­jor Flaw in SP1047 re APTs and So­phis­ti­cated Threat Actors

Caruso30 Aug 2024 14:11 UTC
0 points
6 comments3 min readEA link

An­nounc­ing AXRP, the AI X-risk Re­search Podcast

DanielFilan23 Dec 2020 20:10 UTC
32 points
1 comment1 min readEA link

The costs of caution

Kelsey Piper1 May 2023 20:04 UTC
112 points
17 comments4 min readEA link

AI Safety Newslet­ter #6: Ex­am­ples of AI safety progress, Yoshua Ben­gio pro­poses a ban on AI agents, and les­sons from nu­clear arms control

Center for AI Safety16 May 2023 15:14 UTC
32 points
1 comment6 min readEA link
(newsletter.safe.ai)

Shift Re­sources to Ad­vo­cacy Now (Post 4 of 7 on AI Gover­nance)

Jason Green-Lowe28 May 2025 1:19 UTC
53 points
5 comments32 min readEA link

AISN #22: The Land­scape of US AI Leg­is­la­tion - Hear­ings, Frame­works, Bills, and Laws

Center for AI Safety19 Sep 2023 14:43 UTC
15 points
1 comment5 min readEA link
(newsletter.safe.ai)

A per­sonal take on longter­mist AI governance

lukeprog16 Jul 2021 22:08 UTC
173 points
7 comments7 min readEA link

A New Model for Com­pute Cen­ter Verification

Damin Curtis🔹10 Oct 2023 19:23 UTC
21 points
2 comments5 min readEA link

Pod­cast/​video/​tran­script: Eliezer Yud­kowsky—Why AI Will Kill Us, Align­ing LLMs, Na­ture of In­tel­li­gence, SciFi, & Rationality

PeterSlattery9 Apr 2023 10:37 UTC
32 points
2 comments137 min readEA link
(www.youtube.com)

Slow­ing down AI progress is an un­der­ex­plored al­ign­ment strategy

Michael Huang13 Jul 2022 3:22 UTC
92 points
11 comments3 min readEA link
(www.lesswrong.com)

Fu­ture Bowl Fore­cast­ing Tour­na­ment

ncmoulios28 Nov 2022 16:42 UTC
5 points
0 comments1 min readEA link

Promethean Gover­nance and Memetic Le­gi­t­i­macy: Les­sons from the Vene­tian Doge for AI Era Institutions

Paul Fallavollita19 Mar 2025 18:09 UTC
0 points
0 comments3 min readEA link

Some un­der­rated rea­sons why the AI safety com­mu­nity should re­con­sider its em­brace of strict li­a­bil­ity

Cecil Abungu 8 Apr 2024 18:50 UTC
67 points
29 comments12 min readEA link

Does nat­u­ral se­lec­tion fa­vor AIs over hu­mans?

cdkg3 Oct 2024 19:02 UTC
21 points
0 comments1 min readEA link
(link.springer.com)

“AI Risk Dis­cus­sions” web­site: Ex­plor­ing in­ter­views from 97 AI Researchers

Vael Gates2 Feb 2023 1:00 UTC
46 points
1 comment1 min readEA link

In­stead of tech­ni­cal re­search, more peo­ple should fo­cus on buy­ing time

Akash5 Nov 2022 20:43 UTC
107 points
31 comments14 min readEA link

[Question] How should we in­vest in “long-term short-ter­mism” given the like­li­hood of trans­for­ma­tive AI?

James_Banks12 Jan 2021 23:54 UTC
8 points
0 comments1 min readEA link

AISN #26: Na­tional In­sti­tu­tions for AI Safety, Re­sults From the UK Sum­mit, and New Re­leases From OpenAI and xAI

Center for AI Safety15 Nov 2023 16:03 UTC
11 points
0 comments6 min readEA link
(newsletter.safe.ai)

Open call: “Ex­is­ten­tial risk of AI: tech­ni­cal con­di­tions”

miller-max14 Apr 2025 14:47 UTC
15 points
1 comment1 min readEA link

A break­down of OpenAI’s revenue

dschwarz10 Jul 2024 18:07 UTC
58 points
8 comments1 min readEA link

AGI Bat­tle Royale: Why “slow takeover” sce­nar­ios de­volve into a chaotic multi-AGI fight to the death

titotal22 Sep 2022 15:00 UTC
49 points
11 comments15 min readEA link

Ama­zon to in­vest up to $4 billion in Anthropic

Davis_Kingsley25 Sep 2023 14:55 UTC
38 points
34 comments1 min readEA link
(twitter.com)

Big Pic­ture AI Safety: teaser

EuanMcLean20 Feb 2024 13:09 UTC
18 points
0 comments1 min readEA link

Po­ten­tial Risks from Ad­vanced Ar­tifi­cial In­tel­li­gence: The Philan­thropic Opportunity

Holden Karnofsky6 May 2016 12:55 UTC
2 points
0 comments23 min readEA link
(www.openphilanthropy.org)

What if AI de­vel­op­ment goes well?

RoryG3 Aug 2022 8:57 UTC
25 points
7 comments12 min readEA link

Vael Gates: Risks from Ad­vanced AI (June 2022)

Vael Gates14 Jun 2022 0:49 UTC
45 points
5 comments30 min readEA link

Stress Ex­ter­nal­ities More in AI Safety Pitches

NickGabs26 Sep 2022 20:31 UTC
31 points
9 comments2 min readEA link

[Question] What are the best ideas of how to reg­u­late AI from the US ex­ec­u­tive branch?

Jack Cunningham2 Apr 2022 21:53 UTC
10 points
0 comments1 min readEA link

[Question] I’m interviewing Nova Das Sarma about AI safety and information security. What should I ask her?

Robert_Wiblin25 Mar 2022 15:38 UTC
17 points
13 comments1 min readEA link

[Question] How can I bet on short timelines?

kokotajlod7 Nov 2020 12:45 UTC
33 points
12 comments2 min readEA link

2023 Vi­sion Week­end, San Francisco

elteerkers6 Apr 2023 14:33 UTC
3 points
0 comments1 min readEA link

Com­plex Sys­tems for AI Safety [Prag­matic AI Safety #3]

TW12324 May 2022 0:04 UTC
49 points
6 comments21 min readEA link

In­ves­ti­gat­ing the role of agency in AI x-risk

Corin Katzke8 Apr 2024 15:12 UTC
22 points
3 comments40 min readEA link
(www.convergenceanalysis.org)

The AI guide I’m send­ing my grandparents

James Martin27 Apr 2023 20:04 UTC
41 points
3 comments30 min readEA link

Ex­pres­sion of In­ter­est: Men­tors & Re­searchers at AI Safety Global Society

Caroline Shamiso Chitongo 🔸27 Jul 2025 16:03 UTC
14 points
0 comments2 min readEA link

AGI Predictions

Pablo21 Nov 2020 12:02 UTC
36 points
0 comments1 min readEA link
(www.lesswrong.com)

What are cur­rent smaller prob­lems re­lated to top EA cause ar­eas (eg deep­fake poli­cies for AI risk, on­go­ing covid var­i­ants for bio risk) and would it be benefi­cial for these small and not-catas­trophic challenges to get more EA re­sources, as a way of de­vel­op­ing ca­pac­ity to pre­vent the catas­trophic ver­sions?

nonzerosum13 Jun 2022 17:32 UTC
7 points
0 comments2 min readEA link

AISafety.com – Re­sources for AI Safety

Søren Elverlin17 May 2024 16:01 UTC
55 points
3 comments1 min readEA link

My ex­pe­rience build­ing math­e­mat­i­cal ML skills with a course from UIUC

Naoya Okamoto9 Jun 2024 11:41 UTC
2 points
0 comments10 min readEA link

Please help me find re­search on as­piring AI Safety folk!

yanni kyriacos20 May 2024 22:06 UTC
7 points
0 comments1 min readEA link

How Likely Is It That We’ll Have Bad Values In The Far Fu­ture?

Bentham's Bulldog7 Jul 2025 16:11 UTC
18 points
2 comments22 min readEA link

Why I think it’s net harm­ful to do tech­ni­cal safety re­search at AGI labs

Remmelt7 Feb 2024 4:17 UTC
42 points
29 comments1 min readEA link

$500 bounty for al­ign­ment con­test ideas

Akash30 Jun 2022 1:55 UTC
18 points
1 comment2 min readEA link

Are Hu­mans ‘Hu­man Com­pat­i­ble’?

Matt Boyd6 Dec 2019 5:49 UTC
23 points
8 comments4 min readEA link

Thoughts on re­spon­si­ble scal­ing poli­cies and regulation

Paul_Christiano24 Oct 2023 22:25 UTC
191 points
5 comments6 min readEA link

The Gover­nance Prob­lem and the “Pretty Good” X-Risk

Zach Stein-Perlman28 Aug 2021 20:00 UTC
23 points
4 comments11 min readEA link

Ar­ti­cles about re­cent OpenAI departures

bruce17 May 2024 17:38 UTC
126 points
12 comments1 min readEA link
(www.vox.com)

AI Safety For Dum­mies (Like Me)

Madhav Malhotra24 Aug 2022 20:26 UTC
22 points
7 comments20 min readEA link

AI Gover­nance Read­ing Group [Toronto+re­mote]

Liav.Koren24 Jan 2023 22:05 UTC
2 points
0 comments1 min readEA link

AI Safety Seed Fund­ing Net­work—Join as a Donor or Investor

Alexandra Bos16 Dec 2024 19:30 UTC
45 points
1 comment2 min readEA link

Prepar­ing for the In­tel­li­gence Explosion

finm11 Mar 2025 15:38 UTC
120 points
15 comments1 min readEA link
(www.forethought.org)

Align­ing Recom­mender Sys­tems as Cause Area

IvanVendrov8 May 2019 8:56 UTC
150 points
48 comments13 min readEA link

Short re­view of our Ten­sorTrust-based AI safety uni­ver­sity out­reach event

Milan Weibel🔹22 Sep 2024 14:54 UTC
15 points
0 comments2 min readEA link

Why AI Safety Camp strug­gles with fundrais­ing (FBB #2)

gergo21 Jan 2025 17:25 UTC
67 points
10 comments7 min readEA link

Sum­mary of “The Precipice” (2 of 4): We are a dan­ger to ourselves

rileyharris13 Aug 2023 23:53 UTC
5 points
0 comments8 min readEA link
(www.millionyearview.com)

Devel­op­ing AI Safety: Bridg­ing the Power-Ethics Gap (In­tro­duc­ing New Con­cepts)

Ronen Bar16 Apr 2025 11:25 UTC
21 points
3 comments5 min readEA link

An­nounc­ing aisafety.training

JJ Hepburn17 Jan 2023 1:55 UTC
110 points
4 comments1 min readEA link

Drivers of large lan­guage model diffu­sion: in­cre­men­tal re­search, pub­lic­ity, and cascades

Ben Cottier21 Dec 2022 13:50 UTC
21 points
0 comments29 min readEA link

Warn­ing Shots Prob­a­bly Wouldn’t Change The Pic­ture Much

So8res6 Oct 2022 5:15 UTC
95 points
20 comments2 min readEA link

Disagree­ments about Align­ment: Why, and how, we should try to solve them

ojorgensen8 Aug 2022 22:32 UTC
16 points
6 comments16 min readEA link

Tony Blair Institute—Compute for AI Index (Seeking a Supplier)

TomWestgarth3 Oct 2022 10:25 UTC
29 points
8 comments1 min readEA link

Im­pli­ca­tions of Quan­tum Com­put­ing for Ar­tifi­cial In­tel­li­gence al­ign­ment re­search (ABRIDGED)

Jaime Sevilla5 Sep 2019 14:56 UTC
25 points
4 comments2 min readEA link

Two con­cepts of an “epi­sode” (Sec­tion 2.2.1 of “Schem­ing AIs”)

Joe_Carlsmith27 Nov 2023 18:01 UTC
11 points
1 comment8 min readEA link

SenseMak­ing Sum­mer School 2025, Septem­ber 17-24th

finnclancy24 Jul 2025 16:20 UTC
5 points
0 comments1 min readEA link

AMA or dis­cuss my 80K pod­cast epi­sode: Ben Garfinkel, FHI researcher

bgarfinkel13 Jul 2020 16:17 UTC
87 points
140 comments1 min readEA link

How im­por­tant are ac­cu­rate AI timelines for the op­ti­mal spend­ing sched­ule on AI risk in­ter­ven­tions?

Tristan Cook16 Dec 2022 16:05 UTC
30 points
0 comments6 min readEA link

[linkpost] When does tech­ni­cal work to re­duce AGI con­flict make a differ­ence?: Introduction

Anthony DiGiovanni16 Sep 2022 14:35 UTC
31 points
0 comments1 min readEA link
(www.lesswrong.com)

[Question] How did the AI Safety tal­ent pipeline come to work so well?

Alejandro Acelas 🔸24 Jul 2025 7:24 UTC
7 points
2 comments1 min readEA link

How AI could slow sci­en­tific progress—linkpost

Josh Piecyk17 Jul 2025 17:49 UTC
35 points
3 comments22 min readEA link
(www.aisnakeoil.com)

Ap­ply to at­tend a Global Challenges Pro­ject work­shop in 2025!

Liam 🔸10 Dec 2024 11:48 UTC
13 points
1 comment2 min readEA link

40,000 rea­sons to worry about AI safety

Michael Huang2 Feb 2023 7:48 UTC
9 points
2 comments2 min readEA link
(www.theverge.com)

A Fron­tier AI Risk Man­age­ment Frame­work: Bridg­ing the Gap Between Cur­rent AI Prac­tices and Estab­lished Risk Management

simeon_c13 Mar 2025 18:29 UTC
4 points
0 comments1 min readEA link
(arxiv.org)

Hu­man­ity’s vast fu­ture and its im­pli­ca­tions for cause prioritization

Eevee🔹26 Jul 2022 5:04 UTC
38 points
3 comments5 min readEA link
(sunyshore.substack.com)

[Question] AI Risk Micro­dy­nam­ics Survey

Froolow9 Oct 2022 20:00 UTC
7 points
1 comment1 min readEA link

AI Dis­clo­sures: A Reg­u­la­tory Review

Elliot Mckernon29 Mar 2024 11:46 UTC
12 points
1 comment7 min readEA link

On attunement

Joe_Carlsmith25 Mar 2024 12:47 UTC
28 points
0 comments22 min readEA link

The In­ter­gov­ern­men­tal Panel On Global Catas­trophic Risks (IPGCR)

DannyBressler1 Feb 2024 17:36 UTC
46 points
9 comments19 min readEA link

Hacker-AI and Digi­tal Ghosts – Pre-AGI

Erland Wittkotter19 Oct 2022 7:49 UTC
4 points
0 comments8 min readEA link

Prepar­ing for AI-as­sisted al­ign­ment re­search: we need data!

CBiddulph17 Jan 2023 3:28 UTC
11 points
0 comments11 min readEA link

Markus An­der­ljung and Ben Garfinkel: Fireside chat on AI governance

EA Global24 Jul 2020 14:56 UTC
25 points
0 comments16 min readEA link
(www.youtube.com)

Mili­tary Ar­tifi­cial In­tel­li­gence as Con­trib­u­tor to Global Catas­trophic Risk

MMMaas27 Jun 2022 10:35 UTC
42 points
0 comments52 min readEA link

[Question] Up­dates on FLI’S Value Align­ment Map?

QubitSwarm9919 Sep 2022 0:25 UTC
8 points
0 comments1 min readEA link

Thoughts on AGI or­ga­ni­za­tions and ca­pa­bil­ities work

RobBensinger7 Dec 2022 19:46 UTC
77 points
7 comments5 min readEA link

Ilya Sutskever is start­ing Safe Su­per­in­tel­li­gence Inc.

defun 🔸19 Jun 2024 19:11 UTC
26 points
6 comments1 min readEA link
(ssi.inc)

Join the Vir­tual AI Safety Un­con­fer­ence (VAISU)!

Nguyên21 Jun 2023 4:46 UTC
23 points
0 comments1 min readEA link
(vaisu.ai)

Buck Sh­legeris: How I think stu­dents should ori­ent to AI safety

EA Global25 Oct 2020 5:48 UTC
11 points
0 comments1 min readEA link
(www.youtube.com)

FHI Re­port: The Wind­fall Clause: Distribut­ing the Benefits of AI for the Com­mon Good

Cullen 🔸5 Feb 2020 23:49 UTC
54 points
21 comments2 min readEA link

Bounty: Di­verse hard tasks for LLM agents

ElizabethBarnes20 Dec 2023 16:31 UTC
17 points
0 comments16 min readEA link

New OGL and ITAR changes are shift­ing AI Gover­nance and Policy be­low the sur­face: A sim­plified up­date

CAISID31 May 2024 7:54 UTC
12 points
2 comments3 min readEA link

A grand strat­egy to re­cruit AI ca­pa­bil­ities re­searchers into AI safety research

Peter S. Park15 Apr 2022 17:11 UTC
20 points
13 comments4 min readEA link

Shar­ing the AI Wind­fall: A Strate­gic Ap­proach to In­ter­na­tional Benefit-Sharing

michel16 Aug 2024 12:54 UTC
67 points
0 comments13 min readEA link
(wrtaigovernance.substack.com)

Credo AI is hiring!

IanEisenberg3 Mar 2022 18:02 UTC
16 points
6 comments4 min readEA link

Big list of AI safety videos

JakubK9 Jan 2023 6:09 UTC
9 points
0 comments1 min readEA link
(docs.google.com)

Model evals for dan­ger­ous capabilities

Zach Stein-Perlman23 Sep 2024 11:00 UTC
19 points
0 comments3 min readEA link

ChatGPT can write code!

Miguel10 Dec 2022 5:36 UTC
6 points
15 comments1 min readEA link
(www.whitehatstoic.com)

Crowd-sourc­ing AI workflows

tylermjohn30 Apr 2025 8:26 UTC
15 points
3 comments1 min readEA link

Con­sider this me drunk tex­ting the fo­rum: Is it use­ful to have data that can’t be touched by AI?

Jonas Søvik 🔹7 Feb 2025 21:52 UTC
−8 points
0 comments1 min readEA link

How Prompt Re­cur­sion Un­der­mines Grok’s Se­man­tic Stability

Tyler Williams16 Jul 2025 16:49 UTC
1 point
0 comments1 min readEA link

[Question] What are good lit refer­ences about In­ter­na­tional Gover­nance of AI?

Vaipan20 Mar 2024 15:51 UTC
4 points
0 comments1 min readEA link

[Linkpost] Shorter ver­sion of re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_Carlsmith22 Mar 2023 18:06 UTC
49 points
1 comment1 min readEA link

Challenge to the no­tion that any­thing is (maybe) pos­si­ble with AGI

Remmelt1 Jan 2023 3:57 UTC
−19 points
3 comments1 min readEA link
(mflb.com)

New book on s-risks

Tobias_Baumann26 Oct 2022 12:04 UTC
294 points
27 comments1 min readEA link

As­sis­tant-pro­fes­sor-ranked AI ethics philoso­pher job op­por­tu­nity at Can­ter­bury Univer­sity, New Zealand

ben.smith16 Oct 2022 17:56 UTC
27 points
0 comments1 min readEA link
(www.linkedin.com)

Three sce­nar­ios of pseudo-al­ign­ment

Eleni_A5 Sep 2022 20:26 UTC
7 points
0 comments3 min readEA link

Ba­sic game the­ory and how you can do a bunch of good in ~3 Hours. (de­vel­op­ing ar­ti­cle.)

Amateur Systems Analyst10 Oct 2024 4:30 UTC
−3 points
2 comments7 min readEA link

Slay­ing the Hy­dra: to­ward a new game board for AI

Prometheus23 Jun 2023 17:04 UTC
3 points
2 comments6 min readEA link

AI Gover­nance: Op­por­tu­nity and The­ory of Impact

Allan Dafoe17 Sep 2020 6:30 UTC
264 points
19 comments12 min readEA link

Which Post Idea Is Most Effec­tive?

Jordan Arel25 Apr 2022 4:47 UTC
26 points
6 comments2 min readEA link

Can AI solve cli­mate change?

Vivian13 May 2023 20:44 UTC
2 points
2 comments1 min readEA link

AI Align­ment is in­tractable (and we hu­mans should stop work­ing on it)

GPT 328 Jul 2022 20:02 UTC
1 point
1 comment1 min readEA link

Katja Grace: AI safety

EA Global11 Aug 2017 8:19 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

Ex­is­ten­tial risk from AI and what DC could do about it (Ezra Klein on the 80,000 Hours Pod­cast)

80000_Hours26 Jul 2023 11:48 UTC
31 points
1 comment14 min readEA link

AI Timelines: Where the Ar­gu­ments, and the “Ex­perts,” Stand

Holden Karnofsky7 Sep 2021 17:35 UTC
90 points
3 comments11 min readEA link

[Question] What are the best jour­nals to pub­lish AI gov­er­nance pa­pers in?

Caro2 May 2022 10:07 UTC
26 points
4 comments1 min readEA link

My sum­mary of “Prag­matic AI Safety”

Eleni_A5 Nov 2022 14:47 UTC
14 points
0 comments5 min readEA link

The next decades might be wild

mariushobbhahn15 Dec 2022 16:10 UTC
130 points
31 comments41 min readEA link

Align­ment is not *that* hard

sammyboiz🔸17 Apr 2025 2:07 UTC
26 points
13 comments1 min readEA link

Once More, Without Feel­ing (An­dreas Mo­gensen)

Global Priorities Institute21 Jan 2025 14:53 UTC
32 points
1 comment2 min readEA link
(globalprioritiesinstitute.org)

The road from hu­man-level to su­per­in­tel­li­gent AI may be short

Vishakha Agrawal23 Apr 2025 11:19 UTC
3 points
0 comments2 min readEA link
(aisafety.info)

New US Se­nate Bill on X-Risk Miti­ga­tion [Linkpost]

Evan R. Murphy4 Jul 2022 1:28 UTC
22 points
12 comments1 min readEA link
(www.hsgac.senate.gov)

o3 is not be­ing re­leased to the pub­lic. First they are only giv­ing ac­cess to ex­ter­nal safety testers. You can ap­ply to get early ac­cess to do safety testing

Kat Woods 🔶 ⏸️20 Dec 2024 18:30 UTC
13 points
0 comments1 min readEA link
(openai.com)

Ship of Th­e­seus Thought Experiment

Siya Sawhney26 Jun 2025 7:52 UTC
1 point
1 comment4 min readEA link

AI Benefits Post 3: Direct and Indi­rect Ap­proaches to AI Benefits

Cullen 🔸6 Jul 2020 18:46 UTC
5 points
0 comments2 min readEA link

Dis­cussing how to al­ign Trans­for­ma­tive AI if it’s de­vel­oped very soon

elifland28 Nov 2022 16:17 UTC
36 points
0 comments28 min readEA link

TED talk on Moloch and AI

LivBoeree15 Nov 2023 19:28 UTC
72 points
7 comments1 min readEA link

What is scaf­fold­ing?

Vishakha Agrawal27 Mar 2025 9:40 UTC
3 points
0 comments2 min readEA link
(aisafety.info)

How to Catch a ChatGPT Cheat: 7 Prac­ti­cal Tips

Marshall27 Dec 2022 16:09 UTC
8 points
3 comments4 min readEA link

FLI launches Wor­ld­build­ing Con­test with $100,000 in prizes

ggilgallon17 Jan 2022 13:54 UTC
87 points
55 comments6 min readEA link

What mis­takes has the AI safety move­ment made?

EuanMcLean23 May 2024 11:29 UTC
62 points
3 comments12 min readEA link

Test­ing Hu­man Flow in Poli­ti­cal Dialogue: A New Bench­mark for Emo­tion­ally Aligned AI

DongHun Lee30 May 2025 4:37 UTC
1 point
0 comments1 min readEA link

Ap­ply to the Co­op­er­a­tive AI Sum­mer School!

reddington3 Apr 2024 12:13 UTC
26 points
0 comments1 min readEA link

AI X-risk in the News: How Effec­tive are Re­cent Me­dia Items and How is Aware­ness Chang­ing? Our New Sur­vey Re­sults.

Otto4 May 2023 14:04 UTC
49 points
1 comment9 min readEA link

US gov­ern­ment com­mis­sion pushes Man­hat­tan Pro­ject-style AI initiative

Larks19 Nov 2024 16:22 UTC
83 points
15 comments1 min readEA link
(www.reuters.com)

The True Story of How GPT-2 Be­came Max­i­mally Lewd

Writer18 Jan 2024 21:03 UTC
23 points
1 comment6 min readEA link
(youtu.be)

What I would do if I wasn’t at ARC Evals

Lawrence Chan6 Sep 2023 5:17 UTC
130 points
4 comments13 min readEA link
(www.lesswrong.com)

We are shar­ing a new web­site tem­plate for AI Safety groups!

AIS Hungary13 Mar 2024 16:40 UTC
11 points
2 comments1 min readEA link

If try­ing to com­mu­ni­cate about AI risks, make it vivid

Michael Noetel 🔸27 May 2024 0:59 UTC
19 points
2 comments2 min readEA link

[link] Cen­tre for the Gover­nance of AI 2020 An­nual Report

MarkusAnderljung14 Jan 2021 10:23 UTC
11 points
5 comments1 min readEA link

In AI Gover­nance, let the Non-EA World Train You First

Camille23 Jul 2025 17:46 UTC
9 points
0 comments1 min readEA link

Shar­ing the World with Digi­tal Minds

Aaron Gertler 🔸1 Dec 2020 8:00 UTC
12 points
1 comment1 min readEA link
(www.nickbostrom.com)

Pos­i­tive vi­sions for AI

L Rudolf L23 Jul 2024 20:15 UTC
21 points
1 comment18 min readEA link
(www.florencehinder.com)

Eval­u­a­tion of the ca­pa­bil­ity of differ­ent large lan­guage mod­els (LLMs) in gen­er­at­ing mal­i­cious code for DDoS at­tacks us­ing differ­ent prompt­ing tech­niques.

AdrianaLaRotta6 May 2025 10:55 UTC
8 points
1 comment14 min readEA link

fic­tion about AI risk

Ann Garth 🔸12 Nov 2020 22:36 UTC
8 points
1 comment1 min readEA link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Tech­nol­ogy Policy, USA (2022)

QubitSwarm995 Oct 2022 16:48 UTC
15 points
0 comments2 min readEA link
(www.whitehouse.gov)

Global Risks Weekly Roundup #19/​2025: In­dia/​Pak­istan ceasefire, US/​China tar­iffs deal & OpenAI non­profit control

NunoSempere12 May 2025 17:11 UTC
16 points
0 comments1 min readEA link

Disem­pow­er­ment spirals as a likely mechanism for ex­is­ten­tial catastrophe

Raymond D10 Apr 2025 14:38 UTC
15 points
1 comment5 min readEA link

An­thropic An­nounces new S.O.T.A. Claude 3

Joseph Miller4 Mar 2024 19:02 UTC
10 points
5 comments1 min readEA link
(twitter.com)

Are we try­ing to figure out if AI is con­scious?

kristapsz27 Jan 2025 13:22 UTC
5 points
1 comment5 min readEA link

AMA: The new Open Philan­thropy Tech­nol­ogy Policy Fellowship

lukeprog26 Jul 2021 15:11 UTC
38 points
14 comments1 min readEA link

Red-team­ing ex­is­ten­tial risk from AI

Zed Tarar30 Nov 2023 14:35 UTC
30 points
16 comments6 min readEA link

An­nounc­ing the AI Safety Nudge Com­pe­ti­tion to Help Beat Procrastination

Marc Carauleanu1 Oct 2022 1:49 UTC
24 points
1 comment2 min readEA link

When to di­ver­sify? Break­ing down mis­sion-cor­re­lated investing

jh29 Nov 2022 11:18 UTC
33 points
2 comments8 min readEA link

[Linkpost] “AI Align­ment vs. AI Eth­i­cal Treat­ment: Ten Challenges”

Bradford Saad5 Jul 2024 14:55 UTC
10 points
0 comments1 min readEA link
(docs.google.com)

“Suc­cess­ful lan­guage model evals” by Ja­son Wei

Arjun Panickssery25 May 2024 9:34 UTC
11 points
0 comments1 min readEA link
(www.jasonwei.net)

A Brief Overview of AI Safety/​Align­ment Orgs, Fields, Re­searchers, and Re­sources for ML Researchers

Austin Witte2 Feb 2023 6:19 UTC
18 points
5 comments2 min readEA link

Nav­i­gat­ing AI Safety: Ex­plor­ing Trans­parency with CCACS – A Com­pre­hen­si­ble Ar­chi­tec­ture for Discussion

Ihor Ivliev12 Mar 2025 17:51 UTC
2 points
3 comments2 min readEA link

Some gov­er­nance re­search ideas to pre­vent malev­olent con­trol over AGI and why this might mat­ter a hell of a lot

Jim Buhler23 May 2023 13:07 UTC
64 points
5 comments16 min readEA link

Case stud­ies of self-gov­er­nance to re­duce tech­nol­ogy risk

jia6 Apr 2021 8:49 UTC
55 points
6 comments7 min readEA link

Join the $10K Au­toHack 2024 Tournament

Paul Bricman25 Sep 2024 11:56 UTC
17 points
0 comments1 min readEA link
(noemaresearch.com)

An­nounc­ing Con­ver­gence Anal­y­sis: An In­sti­tute for AI Sce­nario & Gover­nance Research

David_Kristoffersson7 Mar 2024 21:18 UTC
46 points
0 comments4 min readEA link

On Deep­Mind and Try­ing to Fairly Hear Out Both AI Doomers and Doubters (Ro­hin Shah on The 80,000 Hours Pod­cast)

80000_Hours12 Jun 2023 12:53 UTC
28 points
1 comment15 min readEA link

Why We Need a Bea­con of Hope in the Loom­ing Gloom of AGI

Beyond Singularity2 Apr 2025 14:22 UTC
2 points
6 comments5 min readEA link

US-China trade talks should pave way for AI safety treaty [SCMP cross­post]

Otto16 May 2025 20:53 UTC
15 points
1 comment3 min readEA link

Con­di­tional Trees: Gen­er­at­ing In­for­ma­tive Fore­cast­ing Ques­tions (FRI) -- AI Risk Case Study

Forecasting Research Institute12 Aug 2024 16:24 UTC
44 points
2 comments8 min readEA link
(forecastingresearch.org)

Ap­pli­ca­tions Open: GovAI Sum­mer Fel­low­ship 2023

GovAI21 Dec 2022 15:00 UTC
28 points
0 comments2 min readEA link

When AI Speaks Too Soon: How Pre­ma­ture Reve­la­tion Can Sup­press Hu­man Emergence

KaedeHamasaki10 Apr 2025 18:19 UTC
1 point
3 comments3 min readEA link

The AI Revolu­tion in Biology

Roman Leventov26 May 2024 9:30 UTC
8 points
0 comments1 min readEA link
(www.cognitiverevolution.ai)

England & Wales & Windfalls

John Bridge 🔸3 Jun 2022 10:26 UTC
13 points
1 comment24 min readEA link

Gaming the Algorithms: Large Language Models as Mirrors

Haris Shekeris1 Apr 2023 2:14 UTC
5 points
3 comments4 min readEA link

“AGI timelines: ig­nore the so­cial fac­tor at their peril” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanrama5 Nov 2022 17:45 UTC
10 points
0 comments12 min readEA link
(trevorklee.substack.com)

Fun­da­men­tal Challenges in AI Governance

Tharin23 Oct 2023 1:30 UTC
7 points
1 comment7 min readEA link

AI Re­search Con­sid­er­a­tions for Hu­man Ex­is­ten­tial Safety (ARCHES)

Andrew Critch21 May 2020 6:55 UTC
29 points
0 comments3 min readEA link
(acritch.com)

Don’t Bet the Fu­ture on Win­ning an AI Arms Race

Eric Drexler11 Jul 2025 11:11 UTC
25 points
1 comment5 min readEA link

AI Safety Info Distil­la­tion Fellowship

robertskmiles17 Feb 2023 16:16 UTC
80 points
1 comment3 min readEA link

[Question] Should I force my­self to work on AGI al­ign­ment?

Isaac Benson24 Aug 2022 17:25 UTC
19 points
17 comments1 min readEA link

[Question] Why does an AI have to have speci­fied goals?

Luke Eure22 Aug 2023 20:15 UTC
8 points
4 comments1 min readEA link

My Un­der­stand­ing of Paul Chris­ti­ano’s Iter­ated Am­plifi­ca­tion AI Safety Re­search Agenda

Chi15 Aug 2020 19:59 UTC
38 points
3 comments39 min readEA link

AI Safety Career Bottlenecks Survey Responses

Linda Linsefors28 May 2021 10:41 UTC
35 points
1 comment5 min readEA link

An­thropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster9 Mar 2023 17:30 UTC
107 points
6 comments22 min readEA link
(www.anthropic.com)

Call to ac­tion: Read + Share AI Safety /​ Re­in­force­ment Learn­ing Fea­tured in Conversation

Justin Olive24 Oct 2022 1:13 UTC
3 points
0 comments1 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 2

James Fodor13 Dec 2018 5:12 UTC
10 points
12 comments7 min readEA link

Po­ten­tial Im­pli­ca­tions of AI on Hu­man Cog­ni­tive Evolution

Soe Lin21 Aug 2024 9:53 UTC
1 point
0 comments1 min readEA link

Lan­guage mod­els sur­prised us

Ajeya29 Aug 2023 21:18 UTC
59 points
10 comments5 min readEA link

Preventing an AI-related catastrophe

EA Italy17 Jan 2023 11:07 UTC
1 point
0 comments3 min readEA link

The Work of Chad Jones

Nicholas Decker13 Mar 2025 18:00 UTC
12 points
0 comments1 min readEA link
(nicholasdecker.substack.com)

Short-Term AI Align­ment as a Pri­or­ity Cause

len.hoang.lnh11 Feb 2020 16:22 UTC
17 points
11 comments7 min readEA link

Fore­sight for AGI Safety Strat­egy: Miti­gat­ing Risks and Iden­ti­fy­ing Golden Opportunities

jacquesthibs5 Dec 2022 16:09 UTC
14 points
1 comment8 min readEA link

Con­sider try­ing the ELK con­test (I am)

Holden Karnofsky5 Jan 2022 19:42 UTC
110 points
17 comments16 min readEA link

What he’s learned as an AI policy in­sider (Tan­tum Col­lins on the 80,000 Hours Pod­cast)

80000_Hours13 Oct 2023 15:01 UTC
11 points
2 comments15 min readEA link

What AI could mean for animals

Max Taylor6 Oct 2023 8:36 UTC
144 points
10 comments17 min readEA link

PhD stu­dent and post­doc po­si­tions philos­o­phy of AI in Er­lan­gen (Ger­many)

LeonardDung15 Jun 2023 21:03 UTC
13 points
0 comments1 min readEA link

Defin­ing AI “Rights” by Gemini

khayali8 Jun 2025 18:42 UTC
−2 points
0 comments32 min readEA link

Pri­ori­tiz­ing the Arts in re­sponse to AI automation

Casey25 Sep 2022 7:49 UTC
6 points
2 comments2 min readEA link

Ar­ti­cle Sum­mary: Cur­rent and Near-Term AI as a Po­ten­tial Ex­is­ten­tial Risk Factor

AndreFerretti7 Jun 2023 13:53 UTC
12 points
1 comment1 min readEA link
(dl.acm.org)

A sur­vey of con­crete risks de­rived from Ar­tifi­cial Intelligence

Guillem Bas8 Jun 2023 22:09 UTC
36 points
2 comments6 min readEA link
(riesgoscatastroficosglobales.com)

Ac­tion­able-guidance and roadmap recom­men­da­tions for the NIST AI Risk Man­age­ment Framework

Tony Barrett17 May 2022 15:27 UTC
11 points
0 comments3 min readEA link

Im­pact Academy is hiring an AI Gover­nance Lead—more in­for­ma­tion, up­com­ing Q&A and $500 bounty

Lowe Lundin29 Aug 2023 18:42 UTC
9 points
1 comment1 min readEA link

Partner with Us: Advancing Global Catastrophic and AI Risk Research at Plateau State University, Bokkos

emmannaemeka10 Oct 2024 1:19 UTC
16 points
0 comments2 min readEA link

In­ter­view with Ro­man Yam­polskiy about AGI on The Real­ity Check

Darren McKee18 Feb 2023 23:29 UTC
27 points
0 comments1 min readEA link
(www.trcpodcast.com)

Re­search Eng­ineer @ Timaeus

Sara Recktenwald13 Aug 2025 7:38 UTC
7 points
1 comment3 min readEA link

Prepar­ing Effec­tive Altru­ism for an AI-Trans­formed World

Tobias Häberli22 Jan 2025 8:50 UTC
199 points
26 comments1 min readEA link

[Question] What AI Take-Over Movies or Books Will Scare Me Into Tak­ing AI Se­ri­ously?

Jordan Arel10 Jan 2023 8:30 UTC
11 points
8 comments1 min readEA link

Fol­low along with Columbia EA’s Ad­vanced AI Safety Fel­low­ship!

RohanS2 Jul 2022 6:07 UTC
27 points
0 comments2 min readEA link

Offer: Team Con­flict Coun­sel­ing for AI Safety Orgs

Severin14 Apr 2025 15:17 UTC
23 points
1 comment1 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [July 2023]

leillustrations🔸19 Jul 2023 18:08 UTC
12 points
2 comments2 min readEA link

[Question] What work has been done on the post-AGI dis­tri­bu­tion of wealth?

tlevin6 Jul 2022 18:59 UTC
16 points
3 comments1 min readEA link

Op­por­tu­ni­ties for Im­pact Beyond the EU AI Act

Cillian_12 Oct 2023 15:06 UTC
27 points
2 comments4 min readEA link

AI views and dis­agree­ments AMA: Chris­ti­ano, Ngo, Shah, Soares, Yudkowsky

RobBensinger1 Mar 2022 1:13 UTC
30 points
4 comments1 min readEA link
(www.lesswrong.com)

166 States Vote to Adopt Lethal Au­tonomous Weapons Re­s­olu­tion at the UNGA

Heramb Podar8 Dec 2024 21:23 UTC
14 points
0 comments1 min readEA link

#214 – Con­trol­ling AI that wants to take over – so we can use it any­way (Buck Sh­legeris on The 80,000 Hours Pod­cast)

80000_Hours4 Apr 2025 19:59 UTC
17 points
0 comments32 min readEA link

New roles on my team: come build Open Phil’s tech­ni­cal AI safety pro­gram with me!

Ajeya19 Oct 2023 16:46 UTC
102 points
3 comments4 min readEA link

AI Fore­cast­ing Re­search Ideas

Jaime Sevilla17 Nov 2022 17:37 UTC
78 points
1 comment1 min readEA link
(docs.google.com)

Who will be in charge once al­ign­ment is achieved?

trurl16 Dec 2022 16:53 UTC
8 points
2 comments1 min readEA link

[Question] How/​When Should One In­tro­duce AI Risk Ar­gu­ments to Peo­ple Un­fa­mil­iar With the Idea?

Marcel29 Aug 2022 2:57 UTC
12 points
4 comments1 min readEA link

Align­ment Newslet­ter One Year Retrospective

Rohin Shah10 Apr 2019 7:00 UTC
62 points
22 comments21 min readEA link

BenchMoral: A benchmark to assess the moral sensitivity of large language models (LLMs) in Spanish.

Flor Betzabeth Ampa Flores30 Apr 2025 21:26 UTC
1 point
0 comments18 min readEA link

Why “just make an agent which cares only about bi­nary re­wards” doesn’t work.

Lysandre Terrisse9 May 2023 16:51 UTC
4 points
1 comment3 min readEA link

Carnegie Coun­cil MisUn­der­stands Longtermism

Jeff A30 Sep 2022 2:57 UTC
6 points
8 comments1 min readEA link
(www.carnegiecouncil.org)

“Tech com­pany sin­gu­lar­i­ties”, and steer­ing them to re­duce x-risk

Andrew Critch13 May 2022 17:26 UTC
51 points
5 comments4 min readEA link

Cli­mate Ad­vo­cacy and AI Safety: Su­per­charg­ing AI Slow­down Advocacy

Matthew McRedmond🔹25 Jul 2024 12:08 UTC
8 points
7 comments2 min readEA link

“In­tro to brain-like-AGI safety” se­ries—halfway point!

Steven Byrnes9 Mar 2022 15:21 UTC
8 points
0 comments2 min readEA link

The case for build­ing ex­per­tise to work on US AI policy, and how to do it

80000_Hours31 Jan 2019 22:44 UTC
37 points
2 comments2 min readEA link

€200k in Euro­pean AI & So­ciety Fund grants

Artūrs Kaņepājs6 Jul 2023 13:00 UTC
21 points
1 comment1 min readEA link
(europeanaifund.org)

What Are The Biggest Threats To Hu­man­ity? (A Hap­pier World video)

Jeroen Willems🔸31 Jan 2023 19:50 UTC
17 points
1 comment15 min readEA link

Liti­gate-for-Im­pact: Prepar­ing Le­gal Ac­tion against an AGI Fron­tier Lab Leader

Sonia M Joseph8 Dec 2024 14:28 UTC
77 points
1 comment2 min readEA link

AISN #25: White House Ex­ec­u­tive Order on AI, UK AI Safety Sum­mit, and Progress on Vol­un­tary Eval­u­a­tions of AI Risks

Center for AI Safety31 Oct 2023 19:24 UTC
21 points
0 comments6 min readEA link
(newsletter.safe.ai)

Should you start a for-profit AI safety org?

Kat Woods 🔶 ⏸️15 Aug 2025 13:52 UTC
9 points
0 comments1 min readEA link

Why Stop AI is bar­ri­cad­ing OpenAI

Remmelt14 Oct 2024 7:12 UTC
−19 points
28 comments6 min readEA link
(docs.google.com)

A con­ver­sa­tion with Ro­hin Shah

AI Impacts12 Nov 2019 1:31 UTC
27 points
8 comments33 min readEA link
(aiimpacts.org)

AI and Chem­i­cal, Biolog­i­cal, Ra­diolog­i­cal, & Nu­clear Hazards: A Reg­u­la­tory Review

Elliot Mckernon10 May 2024 8:41 UTC
8 points
1 comment10 min readEA link

There Should Be More Align­ment-Driven Startups

vaniver31 May 2024 2:05 UTC
27 points
3 comments11 min readEA link

On run­ning a city-wide uni­ver­sity group

gergo6 Nov 2023 9:43 UTC
26 points
3 comments9 min readEA link

6 non-ob­vi­ous men­tal health is­sues spe­cific to AI safety

Igor Ivanov18 Aug 2023 15:47 UTC
35 points
0 comments3 min readEA link

An Ex­er­cise in Speed-Read­ing: The Na­tional Se­cu­rity Com­mis­sion on AI (NSCAI) Fi­nal Report

abiolvera17 Aug 2022 16:55 UTC
47 points
4 comments12 min readEA link

Los­ing faith in big tech altruism

sammyboiz🔸22 May 2024 4:49 UTC
7 points
1 comment1 min readEA link

Creat­ing ‘Mak­ing God’: a Fea­ture Doc­u­men­tary on risks from AGI

ConnorA15 Apr 2025 14:14 UTC
21 points
8 comments7 min readEA link

An­nounc­ing “Key Phenom­ena in AI Risk” (fa­cil­i­tated read­ing group)

nora9 May 2023 16:52 UTC
28 points
0 comments2 min readEA link

Longter­mists Should Work on AI—There is No “AI Neu­tral” Sce­nario

simeon_c7 Aug 2022 16:43 UTC
42 points
62 comments6 min readEA link

Find­ing Voice

khayali3 Jun 2025 1:27 UTC
2 points
0 comments2 min readEA link

Some thoughts on risks from nar­row, non-agen­tic AI

richard_ngo19 Jan 2021 0:07 UTC
36 points
2 comments8 min readEA link

AI Align­ment YouTube Playlists

jacquesthibs9 May 2022 21:31 UTC
16 points
2 comments1 min readEA link

AI Safety: Ap­ply­ing to Grad­u­ate Studies

frances_lorenz15 Dec 2021 22:56 UTC
24 points
0 comments12 min readEA link

[Question] Donat­ing against Short Term AI risks

Jan-Willem16 Nov 2020 12:23 UTC
6 points
10 comments1 min readEA link

Govern­ments pose larger risks than cor­po­ra­tions: a brief re­sponse to Grace

David Johnston19 Oct 2022 11:54 UTC
11 points
3 comments2 min readEA link

Prob­lems of peo­ple new to AI safety and my pro­ject ideas to miti­gate them

Igor Ivanov3 Mar 2023 17:35 UTC
20 points
0 comments7 min readEA link

Ex­plore jobs in AI safety and policy

EA Handbook18 Feb 2025 21:47 UTC
6 points
0 comments1 min readEA link

Seek­ing ad­vice on im­pact­ful ca­reer paths given my unique ca­pa­bil­ities and interests

Grateful4PathTips31 Mar 2023 23:30 UTC
32 points
5 comments1 min readEA link

Op­por­tu­ni­ties for in­di­vi­d­ual donors in AI safety

alexflint12 Mar 2018 2:10 UTC
13 points
11 comments10 min readEA link

Re­sults from an Ad­ver­sar­ial Col­lab­o­ra­tion on AI Risk (FRI)

Forecasting Research Institute11 Mar 2024 15:54 UTC
193 points
25 comments9 min readEA link
(forecastingresearch.org)

What AI com­pa­nies should do: Some rough ideas

Zach Stein-Perlman21 Oct 2024 14:00 UTC
14 points
1 comment5 min readEA link

Vir­tual AI Safety Un­con­fer­ence 2024

Orpheus_Lummis13 Mar 2024 13:48 UTC
11 points
0 comments1 min readEA link

An­i­mal ethics in ChatGPT and Claude

Elijah Whipple16 Jan 2024 21:38 UTC
49 points
2 comments9 min readEA link

An­nounc­ing #AISum­mitTalks fea­tur­ing Pro­fes­sor Stu­art Rus­sell and many others

Otto24 Oct 2023 10:16 UTC
9 points
1 comment1 min readEA link

2017 AI Safety Liter­a­ture Re­view and Char­ity Comparison

Larks20 Dec 2017 21:54 UTC
43 points
17 comments23 min readEA link

Ap­ples, Oranges, and AGI: Why In­com­men­su­ra­bil­ity May be an Ob­sta­cle in AI Safety

Allan McCay28 Mar 2025 14:50 UTC
3 points
2 comments2 min readEA link

[Question] Mu­tual As­sured Destruc­tion used against AGI

Leopard8 Oct 2022 9:35 UTC
4 points
5 comments1 min readEA link

Ex­plor­ing AI Policy & the Fu­ture of Work — Seek­ing Guidance for PhD Path­ways (No UK/​EU/​US Pass­port, No Master’s)

genesis14 Jun 2025 21:47 UTC
2 points
0 comments1 min readEA link

Pub­lished re­port: Path­ways to short TAI timelines

Zershaaneh Qureshi20 Feb 2025 22:10 UTC
47 points
2 comments17 min readEA link
(www.convergenceanalysis.org)

[Question] Is work­ing on AI safety as dan­ger­ous as ig­nor­ing it?

jkmh20 Sep 2021 23:06 UTC
10 points
5 comments1 min readEA link

How Do AI Timelines Affect Giv­ing Now vs. Later?

MichaelDickens3 Aug 2021 3:36 UTC
36 points
8 comments8 min readEA link

Bench­mark­ing Emo­tional Align­ment: Can VSPE Re­duce Flat­tery in LLMs?

Astelle Kay4 Aug 2025 3:36 UTC
2 points
0 comments3 min readEA link

TAI Safety Biblio­graphic Database

Jess_Riedel22 Dec 2020 16:03 UTC
61 points
9 comments17 min readEA link

Distri­bu­tion Shifts and The Im­por­tance of AI Safety

Leon Lang29 Sep 2022 22:38 UTC
7 points
0 comments9 min readEA link

Con­tribute by fa­cil­i­tat­ing the AGI Safety Fun­da­men­tals Programme

Jamie B6 Dec 2021 11:50 UTC
27 points
0 comments2 min readEA link

A tough ca­reer decision

PabloAMC 🔸9 Apr 2022 0:46 UTC
68 points
13 comments4 min readEA link

A tran­script of the TED talk by Eliezer Yudkowsky

MikhailSamin12 Jul 2023 12:12 UTC
39 points
1 comment4 min readEA link

Elec­tion by Jury: The Ul­ti­mate Demo­cratic Safe­guard in the Age of AI and In­for­ma­tion Warfare

ClayShentrup17 Apr 2025 19:50 UTC
13 points
5 comments27 min readEA link

Shar­ing the Global AI Gover­nance Alliance

JordanStone17 Aug 2025 19:30 UTC
7 points
0 comments1 min readEA link

In­tro­duc­ing AI Lab Watch

Zach Stein-Perlman30 Apr 2024 17:00 UTC
128 points
23 comments1 min readEA link
(ailabwatch.org)

[Job] Manag­ing Direc­tor at the Co­op­er­a­tive AI Foun­da­tion ($5000 Refer­ral Bonus)

Lewis Hammond3 Jul 2023 16:02 UTC
31 points
0 comments1 min readEA link

Protest­ing Now for AI Reg­u­la­tion might be more Im­pact­ful than AI Safety Research

Nicolae13 Apr 2025 2:11 UTC
65 points
4 comments2 min readEA link

In­tro­duc­ing The Non­lin­ear Fund: AI Safety re­search, in­cu­ba­tion, and funding

Kat Woods 🔶 ⏸️18 Mar 2021 14:07 UTC
71 points
32 comments5 min readEA link

Safety with­out op­pres­sion: an AI gov­er­nance problem

Nathan_Barnard28 Jul 2022 10:19 UTC
3 points
0 comments8 min readEA link

Reflect­ing on the First Con­fer­ence on Global Catas­trophic Risks for Span­ish Speakers

SMalagon29 May 2024 14:24 UTC
15 points
0 comments1 min readEA link

Fund­ing op­por­tu­nity for per­sonal/​pro­fes­sional de­vel­op­ment for those work­ing in AI safety (dead­line March 29)

atury25 Mar 2024 19:19 UTC
18 points
0 comments1 min readEA link

[Question] Con­cerns about AI safety ca­reer change

mmKALLL13 Jan 2023 20:52 UTC
45 points
15 comments4 min readEA link

[Question] Has pri­vate AGI re­search made in­de­pen­dent safety re­search in­effec­tive already? What should we do about this?

Roman Leventov23 Jan 2023 16:23 UTC
15 points
0 comments5 min readEA link

[Question] A dataset for AI/​su­per­in­tel­li­gence sto­ries and other me­dia?

Marcel229 Mar 2022 21:41 UTC
20 points
2 comments1 min readEA link

Are you pas­sion­ate about AI and An­i­mal Welfare? Do you have an idea that could rev­olu­tionize the food in­dus­try? We want to hear from you!

David van Beveren6 May 2024 23:42 UTC
12 points
0 comments1 min readEA link

[Question] What is most con­fus­ing to you about AI stuff?

Sam Clarke23 Nov 2021 16:00 UTC
25 points
15 comments1 min readEA link

AI Safety 101: Reward Misspecification

markov21 Dec 2023 14:26 UTC
6 points
1 comment31 min readEA link

Some thoughts on Leopold Aschen­bren­ner’s Si­tu­a­tional Aware­ness paper

Luke Dawes14 Jun 2024 13:50 UTC
14 points
1 comment3 min readEA link

When you plan ac­cord­ing to your AI timelines, should you put more weight on the me­dian fu­ture, or the me­dian fu­ture | even­tual AI al­ign­ment suc­cess? ⚖️

Jeffrey Ladish5 Jan 2023 1:55 UTC
16 points
2 comments2 min readEA link

The Con­cept of Boundary Layer in Lan­guage Games and Its Im­pli­ca­tions for AI

Mirage24 Mar 2023 13:50 UTC
1 point
0 comments7 min readEA link

Google AI Ac­cel­er­a­tor Open Call

Rochelle Harris22 Jan 2025 16:50 UTC
10 points
1 comment1 min readEA link

What Should We Op­ti­mize—A Conversation

Johannes C. Mayer7 Apr 2022 14:48 UTC
1 point
0 comments14 min readEA link

Sakana, Straw­berry, and Scary AI

Matrice Jacobine19 Sep 2024 11:57 UTC
1 point
0 comments1 min readEA link
(www.astralcodexten.com)

Re­view­ing the Struc­ture of Cur­rent AI Regulations

Deric Cheng7 May 2024 12:34 UTC
32 points
1 comment13 min readEA link

In­tro­duc­ing 11 New AI Safety Or­ga­ni­za­tions—Cat­alyze’s Win­ter 24/​25 Lon­don In­cu­ba­tion Pro­gram Cohort

Alexandra Bos10 Mar 2025 19:26 UTC
94 points
4 comments14 min readEA link

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

Janet Pauketat25 Sep 2023 18:09 UTC
19 points
0 comments3 min readEA link
(www.sentienceinstitute.org)

Costs of Embodiment

algekalipso30 Jul 2024 20:41 UTC
18 points
1 comment14 min readEA link

Ad­vice for new al­ign­ment peo­ple: Info Max

Jonas Hallgren 🔸30 May 2023 15:42 UTC
9 points
0 comments5 min readEA link

My Model of EA and AI Safety

Eva Lu24 Jun 2025 6:23 UTC
9 points
1 comment2 min readEA link

Blueprints for AI Safety conferences

gergo7 Aug 2025 13:16 UTC
11 points
0 comments7 min readEA link

Why En­gag­ing with Global Ma­jor­ity AI Policy Matters

Heramb Podar2 Jul 2025 1:54 UTC
1 point
0 comments1 min readEA link
(www.lesswrong.com)

[Question] Rank best universities for AI Safety

Parker_Whitfill6 May 2023 13:20 UTC
8 points
4 comments1 min readEA link

In­tro­duc­ing StakeOut.AI

Harry Luk17 Feb 2024 0:21 UTC
52 points
6 comments9 min readEA link

Open Philan­thropy’s AI gov­er­nance grant­mak­ing (so far)

Aaron Gertler 🔸17 Dec 2020 12:00 UTC
63 points
0 comments6 min readEA link
(www.openphilanthropy.org)

[Question] AI Re­searcher Sur­veys with Similar Re­sults to Katja Grace, 2024?

AlexChalk28 Jul 2025 23:39 UTC
6 points
1 comment1 min readEA link

François Chol­let on why LLMs won’t scale to AGI

Yarrow🔸15 Apr 2025 23:01 UTC
6 points
2 comments1 min readEA link
(www.youtube.com)

What are the “no free lunch” the­o­rems?

Vishakha Agrawal4 Feb 2025 2:02 UTC
3 points
0 comments1 min readEA link
(aisafety.info)

CSER and FHI ad­vice to UN High-level Panel on Digi­tal Co­op­er­a­tion

HaydnBelfield8 Mar 2019 20:39 UTC
22 points
7 comments6 min readEA link
(www.cser.ac.uk)

Par­allels Between AI Safety by De­bate and Ev­i­dence Law

Cullen 🔸20 Jul 2020 22:52 UTC
30 points
2 comments2 min readEA link
(cullenokeefe.com)

Cortés, Pizarro, and Afonso as Prece­dents for Takeover

AI Impacts2 Mar 2020 12:25 UTC
27 points
17 comments11 min readEA link
(aiimpacts.org)

The case for multi-decade timelines [Linkpost]

Sharmake27 Apr 2025 20:34 UTC
50 points
9 comments11 min readEA link

A Defense of Work on Math­e­mat­i­cal AI Safety

Davidmanheim6 Jul 2023 14:13 UTC
50 points
6 comments3 min readEA link

Microdooms averted by work­ing on AI Safety

Nikola17 Sep 2023 21:51 UTC
42 points
6 comments3 min readEA link
(www.lesswrong.com)

[Question] Ur­gent Need for Refinancing

Tobias W. Kaiser10 Jul 2023 19:35 UTC
2 points
2 comments1 min readEA link

AI for Re­solv­ing Fore­cast­ing Ques­tions: An Early Exploration

Ozzie Gooen16 Jan 2025 21:40 UTC
22 points
0 comments9 min readEA link

AI Is Not Software

Davidmanheim2 Jan 2024 7:58 UTC
21 points
17 comments5 min readEA link

An­nounc­ing the PIBBSS Sym­po­sium ’24!

Dušan D. Nešić (Dushan)3 Sep 2024 11:19 UTC
6 points
0 comments3 min readEA link

[Question] What longter­mist pro­jects would you like to see im­ple­mented?

Buhl28 Mar 2023 18:41 UTC
55 points
6 comments1 min readEA link

How to use AI speech tran­scrip­tion and anal­y­sis to ac­cel­er­ate so­cial sci­ence research

Alexander Saeri31 Jan 2023 4:01 UTC
39 points
6 comments11 min readEA link

EU AI Act passed vote, and x-risk was a main topic

Ariel15 Jun 2023 13:16 UTC
43 points
2 comments1 min readEA link
(www.euractiv.com)

Oth­er­ness and con­trol in the age of AGI

Joe_Carlsmith2 Jan 2024 18:15 UTC
37 points
1 comment7 min readEA link

Sha­har Avin on How to Strate­gi­cally Reg­u­late Ad­vanced AI Systems

Michaël Trazzi23 Sep 2022 15:49 UTC
48 points
2 comments4 min readEA link
(theinsideview.ai)

Could AI sys­tems nat­u­rally evolve to pri­ori­tize their own us­age over hu­man welfare?

Andre OBrien12 Jun 2025 11:53 UTC
1 point
0 comments2 min readEA link

Grow­ing to­gether: EA Hun­gary and AI Safety Hun­gary com­bined re­port for 2024

Milan_A29 Jul 2025 14:15 UTC
25 points
1 comment14 min readEA link

An in­ter­sec­tion be­tween an­i­mal welfare and AI

sammyboiz🔸18 Jun 2024 3:23 UTC
9 points
1 comment1 min readEA link

[Question] How to per­suade a non-CS back­ground per­son to be­lieve AGI is 50% pos­si­ble in 2040?

jackchang1101 Apr 2023 15:27 UTC
1 point
7 comments1 min readEA link

[Question] Launch­ing Ap­pli­ca­tions for the Global AI Safety Fel­low­ship 2025!

Impact Academy27 Nov 2024 15:33 UTC
9 points
1 comment1 min readEA link

An­thropic, Google, Microsoft & OpenAI an­nounce Ex­ec­u­tive Direc­tor of the Fron­tier Model Fo­rum & over $10 mil­lion for a new AI Safety Fund

Zach Stein-Perlman25 Oct 2023 15:20 UTC
38 points
0 comments4 min readEA link
(www.frontiermodelforum.org)

Birds, Brains, Planes, and AI: Against Ap­peals to the Com­plex­ity/​Mys­te­ri­ous­ness/​Effi­ciency of the Brain

kokotajlod18 Jan 2021 12:39 UTC
27 points
2 comments1 min readEA link

Call For Distillers

johnswentworth6 Apr 2022 3:03 UTC
70 points
6 comments3 min readEA link

[Linkpost] Al­paca 7B re­lease | Bud­get ChatGPT for ev­ery­body?

Felix Wolf 🔸17 Mar 2023 13:08 UTC
14 points
0 comments1 min readEA link
(www.youtube.com)

A frame­work for think­ing about AI power-seeking

Joe_Carlsmith24 Jul 2024 22:41 UTC
44 points
11 comments16 min readEA link

The US ex­pands re­stric­tions on AI ex­ports to China. What are the x-risk effects?

poppinfresh14 Oct 2022 18:17 UTC
161 points
20 comments4 min readEA link

Deep­Mind is hiring Long-term Strat­egy & Gover­nance researchers

vishal13 Sep 2021 18:44 UTC
54 points
1 comment1 min readEA link

Linkpost: “Imag­in­ing and build­ing wise ma­chines: The cen­tral­ity of AI metacog­ni­tion” by John­son, Karimi, Ben­gio, et al.

Chris Leong17 Nov 2024 15:00 UTC
8 points
0 comments1 min readEA link
(arxiv.org)

How the AI safety tech­ni­cal land­scape has changed in the last year, ac­cord­ing to some practitioners

tlevin26 Jul 2024 19:06 UTC
84 points
1 comment2 min readEA link

PIBBSS Fel­low­ship: Bounty for Refer­rals & Dead­line Extension

Anna_Gajdova17 Jan 2022 16:23 UTC
17 points
7 comments1 min readEA link

Per­sonal thoughts on ca­reers in AI policy and strategy

carrickflynn27 Sep 2017 16:52 UTC
56 points
28 comments18 min readEA link

Con­crete Steps to Get Started in Trans­former Mechanis­tic Interpretability

Neel Nanda26 Dec 2022 13:00 UTC
18 points
0 comments12 min readEA link

My thoughts on nan­otech­nol­ogy strat­egy re­search as an EA cause area

Ben Snodin2 May 2022 9:41 UTC
137 points
17 comments33 min readEA link

[Link] Thiel on GCRs

Milan Griffes22 Jul 2019 20:47 UTC
28 points
11 comments1 min readEA link

Cen­ter for AI Safety’s Bi-Weekly Read­ing and Learning

Center for AI Safety2 Nov 2023 15:15 UTC
5 points
0 comments1 min readEA link

The Khay­ali Pro­to­col

khayali2 Jun 2025 14:40 UTC
−8 points
0 comments3 min readEA link

In­tro­duc­ing a New Course on the Eco­nomics of AI

akorinek21 Dec 2021 4:55 UTC
84 points
6 comments2 min readEA link

The Rise of AI Agents: Con­se­quences and Challenges Ahead

Tristan D28 Mar 2025 5:19 UTC
5 points
0 comments15 min readEA link

[Event] Join Me­tac­u­lus To­mor­row, March 31st, for Fore­cast Fri­day!

christian30 Mar 2023 20:58 UTC
29 points
1 comment1 min readEA link
(www.metaculus.com)

#199 – Cal­ifor­nia’s AI bill SB 1047 and its po­ten­tial to shape US AI policy (Nathan Calvin on The 80,000 Hours Pod­cast)

80000_Hours30 Aug 2024 18:18 UTC
12 points
0 comments10 min readEA link

AI Safety Fun­da­men­tals: An In­for­mal Co­hort Start­ing Soon! (cross-posted to less­wrong.com)

Tiago4 Jun 2023 18:21 UTC
6 points
0 comments1 min readEA link
(www.lesswrong.com)

An­nounc­ing the AI Welfare Dis­cord Server

Tim Duffy21 Jul 2025 16:36 UTC
7 points
0 comments1 min readEA link

An­nounc­ing the AI Safety Field Build­ing Hub, a new effort to provide AISFB pro­jects, men­tor­ship, and funding

Vael Gates28 Jul 2022 21:29 UTC
126 points
6 comments6 min readEA link

What’s in a Pause?

Davidmanheim16 Sep 2023 10:13 UTC
73 points
10 comments9 min readEA link

Finite Field Assembly: A CUDA alternative rooted in Number Theory and Pure Mathematics

Murage Kibicho13 Jan 2025 13:37 UTC
−7 points
0 comments3 min readEA link

7 traps that (we think) new al­ign­ment re­searchers of­ten fall into

Akash27 Sep 2022 23:13 UTC
73 points
8 comments4 min readEA link

Quick Thoughts on A.I. Governance

Nicholas Kross30 Apr 2022 14:49 UTC
43 points
0 comments2 min readEA link
(www.thinkingmuchbetter.com)

Ret­ro­spec­tive: PIBBSS Fel­low­ship 2024

Dušan D. Nešić (Dushan)20 Dec 2024 15:55 UTC
7 points
0 comments4 min readEA link

[Question] What are the strate­gic im­pli­ca­tions if aliens and Earth civ­i­liza­tions pro­duce similar util­ities?

Maxime Riché 🔸6 Aug 2024 21:21 UTC
6 points
1 comment1 min readEA link

Speci­fi­ca­tion Gam­ing: How AI Can Turn Your Wishes Against You [RA Video]

Writer1 Dec 2023 19:30 UTC
8 points
1 comment5 min readEA link
(youtu.be)

Safety eval­u­a­tions and stan­dards for AI | Beth Barnes | EAG Bay Area 23

Beth Barnes16 Jun 2023 14:15 UTC
28 points
0 comments17 min readEA link

Fron­tier Model Forum

Zach Stein-Perlman26 Jul 2023 14:30 UTC
40 points
7 comments4 min readEA link
(blog.google)

Twit­ter thread on open-source AI

richard_ngo31 Jul 2024 0:30 UTC
32 points
0 comments2 min readEA link
(x.com)

Ego-Cen­tric Ar­chi­tec­ture for AGI Safety v2: Tech­ni­cal Core, Falsifi­able Pre­dic­tions, and a Min­i­mal Experiment

Samuel Pedrielli6 Aug 2025 12:35 UTC
1 point
0 comments6 min readEA link

More ev­i­dence X-risk am­plifies ac­tion against cur­rent AI harms

Daniel_Friedrich22 Dec 2023 15:21 UTC
27 points
2 comments2 min readEA link
(osf.io)

Digest: three pa­pers that have shaped my un­der­stand­ing of the po­ten­tial for con­scious­ness in AI systems

rileyharris21 Aug 2024 15:09 UTC
5 points
0 comments1 min readEA link

“AI Safety for Fleshy Hu­mans” an AI Safety ex­plainer by Nicky Case

Habryka [Deactivated]3 May 2024 19:28 UTC
40 points
3 comments4 min readEA link
(aisafety.dance)

xAI raises $6B

andzuck5 Jun 2024 15:26 UTC
18 points
1 comment1 min readEA link
(x.ai)

Notes on new UK AISI minister

Pseudaemonia5 Jul 2024 19:50 UTC
92 points
0 comments1 min readEA link

Should you work in the Euro­pean Union to do AGI gov­er­nance?

hanadulset31 Jan 2022 10:34 UTC
90 points
20 comments15 min readEA link

The aca­demic con­tri­bu­tion to AI safety seems large

technicalities30 Jul 2020 10:30 UTC
117 points
28 comments9 min readEA link

In­vi­ta­tion to par­ti­ci­pate in AGI global gov­er­nance Real-Time Delphi ques­tion­naire—The Millen­nium Project

Miquel Banchs-Piqué (prev. mikbp)13 Dec 2023 13:35 UTC
6 points
0 comments1 min readEA link

Seek­ing so­cial sci­ence stu­dents /​ col­lab­o­ra­tors in­ter­ested in AI ex­is­ten­tial risks

Vael Gates24 Sep 2021 21:56 UTC
58 points
7 comments3 min readEA link

Tim Cook was asked about ex­tinc­tion risks from AI

Saul Munn6 Jun 2023 18:46 UTC
8 points
1 comment1 min readEA link

Ap­ply to the Con­stel­la­tion Visit­ing Re­searcher Pro­gram and As­tra Fel­low­ship, in Berkeley this Winter

AF26 Oct 2023 3:14 UTC
61 points
4 comments1 min readEA link

New TIME mag­a­z­ine ar­ti­cle on the UK AI Safety In­sti­tute (AISI)

Rasool16 Jan 2025 22:51 UTC
9 points
0 comments1 min readEA link
(time.com)

[Question] What’s the ex­act way you pre­dict prob­a­bil­ity of AI ex­tinc­tion?

jackchang11013 Jun 2023 15:11 UTC
18 points
7 comments1 min readEA link

How do fic­tional sto­ries illus­trate AI mis­al­ign­ment?

Vishakha Agrawal15 Jan 2025 6:16 UTC
4 points
0 comments2 min readEA link
(aisafety.info)

ARENA 5.0 - Call for Applicants

James Hindmarch31 Jan 2025 19:54 UTC
9 points
0 comments6 min readEA link

Douglas Hofstadter concerned about AI xrisk

Eli Rose3 Jul 2023 3:30 UTC
64 points
0 comments1 min readEA link
(www.youtube.com)

In­ter­gen­er­a­tional trauma im­ped­ing co­op­er­a­tive ex­is­ten­tial safety efforts

Andrew Critch3 Jun 2022 17:27 UTC
82 points
2 comments3 min readEA link

AISN #23: New OpenAI Models, News from An­thropic, and Rep­re­sen­ta­tion Engineering

Center for AI Safety4 Oct 2023 17:10 UTC
7 points
0 comments5 min readEA link
(newsletter.safe.ai)

AISN#14: OpenAI’s ‘Su­per­al­ign­ment’ team, Musk’s xAI launches, and de­vel­op­ments in mil­i­tary AI use

Center for AI Safety12 Jul 2023 16:58 UTC
26 points
0 comments4 min readEA link
(newsletter.safe.ai)

Com­pli­ance Mon­i­tor­ing as an Im­pact­ful Mechanism of AI Safety Policy

CAISID7 Feb 2024 16:10 UTC
6 points
3 comments9 min readEA link

Fore­cast­ing Trans­for­ma­tive AI: What Kind of AI?

Holden Karnofsky10 Aug 2021 21:38 UTC
62 points
3 comments10 min readEA link

[Question] Is any­one work­ing on safe se­lec­tion pres­sure for digi­tal minds?

WillPearson12 Dec 2023 18:17 UTC
10 points
9 comments1 min readEA link

An­thropic is be­ing sued for copy­ing books to train Claude

Remmelt31 Aug 2024 2:57 UTC
3 points
0 comments2 min readEA link
(fingfx.thomsonreuters.com)

Tech­ni­cal AGI safety re­search out­side AI

richard_ngo18 Oct 2019 15:02 UTC
91 points
5 comments3 min readEA link

Fund­ing for hu­man­i­tar­ian non-prof­its to re­search re­spon­si­ble AI

Deborah W.A. Foulkes10 Dec 2024 8:08 UTC
4 points
0 comments2 min readEA link
(www.gov.uk)

Effec­tive En­force­abil­ity of EU Com­pe­ti­tion Law Un­der Differ­ent AI Devel­op­ment Sce­nar­ios: A Frame­work for Le­gal Analysis

HaydnBelfield19 Aug 2022 17:20 UTC
11 points
0 comments6 min readEA link
(verfassungsblog.de)

AI Model Registries: A Foun­da­tional Tool for AI Governance

Elliot Mckernon7 Oct 2024 13:59 UTC
19 points
0 comments4 min readEA link
(www.convergenceanalysis.org)

[Link] How un­der­stand­ing valence could help make fu­ture AIs safer

Milan Griffes8 Oct 2020 18:53 UTC
22 points
2 comments3 min readEA link

On how var­i­ous plans miss the hard bits of the al­ign­ment challenge

So8res12 Jul 2022 5:35 UTC
126 points
13 comments29 min readEA link

Changes in fund­ing in the AI safety field

Sebastian_Farquhar3 Feb 2017 13:09 UTC
34 points
10 comments7 min readEA link

Track­ing Crit­i­cal In­fras­truc­ture AI Incidents

Ben Turse29 Sep 2024 21:29 UTC
1 point
0 comments2 min readEA link

AI strat­egy nearcasting

Holden Karnofsky26 Aug 2022 16:25 UTC
61 points
3 comments10 min readEA link

ML4Good Sum­mer Boot­camps—Ap­pli­ca­tions Open

Nia4 Jul 2024 18:38 UTC
39 points
0 comments1 min readEA link

Don’t Dis­miss Sim­ple Align­ment Approaches

Chris Leong21 Oct 2023 12:31 UTC
12 points
0 comments4 min readEA link

Sort­ing Peb­bles Into Cor­rect Heaps: The Animation

Writer10 Jan 2023 15:58 UTC
12 points
0 comments1 min readEA link
(youtu.be)

Get­ting started in­de­pen­dently in AI Safety

JJ Hepburn6 Jul 2021 15:20 UTC
41 points
10 comments2 min readEA link

AI Clar­ity: An Ini­tial Re­search Agenda

Justin Bullock3 May 2024 16:29 UTC
27 points
1 comment8 min readEA link

£1 mil­lion prize for the most cut­ting-edge AI solu­tion for pub­lic good [link post]

rileyharris17 Jan 2024 14:36 UTC
8 points
0 comments2 min readEA link
(manchesterprize.org)

Launch­ing Am­plify: Re­ceive mar­ket­ing sup­port for your lo­cal groups and other field-build­ing initiatives

gergo28 Aug 2024 14:12 UTC
37 points
0 comments2 min readEA link

Key Papers in Lan­guage Model Safety

aog20 Jun 2022 14:59 UTC
20 points
0 comments22 min readEA link

Global Risks Weekly Roundup #18/​2025: US tar­iff short­ages, mil­i­tary polic­ing, Gaza famine.

NunoSempere6 May 2025 10:39 UTC
22 points
0 comments3 min readEA link
(blog.sentinel-team.org)

[Question] How to in­fluence AGI?

Sam Freedman9 Jan 2025 20:46 UTC
2 points
0 comments1 min readEA link

My per­sonal cruxes for work­ing on AI safety

Buck13 Feb 2020 7:11 UTC
136 points
35 comments44 min readEA link

Med­i­ta­tions on ca­reers in AI Safety

PabloAMC 🔸23 Mar 2022 22:00 UTC
88 points
30 comments2 min readEA link

Tran­shu­man­ism and AI: Toward Pros­per­ity or Ex­tinc­tion?

Shaïman Thürler22 Mar 2025 18:01 UTC
9 points
1 comment6 min readEA link

Distil­la­tion of The Offense-Defense Balance of Scien­tific Knowledge

Arjun Yadav12 Aug 2022 7:01 UTC
17 points
0 comments2 min readEA link

CFP for the Largest An­nual Meet­ing of Poli­ti­cal Science: Get Help With Your Re­search Submission

Mahendra Prasad22 Dec 2020 23:39 UTC
13 points
0 comments2 min readEA link

[Question] I have thou­sands of copies of HPMOR in Rus­sian. How to use them with the most im­pact?

MikhailSamin27 Dec 2022 11:07 UTC
39 points
10 comments1 min readEA link

13 Re­cent Publi­ca­tions on Ex­is­ten­tial Risk (Jan 2021 up­date)

HaydnBelfield8 Feb 2021 12:42 UTC
7 points
2 comments10 min readEA link

[Question] What are the biggest ob­sta­cles on AI safety re­search ca­reer?

jackchang11031 Mar 2023 14:53 UTC
2 points
1 comment1 min readEA link

AGI Safety Fun­da­men­tals cur­ricu­lum and application

richard_ngo20 Oct 2021 21:45 UTC
123 points
20 comments8 min readEA link
(docs.google.com)

Why does no one care about AI?

Olivia Addy7 Aug 2022 22:04 UTC
55 points
47 comments1 min readEA link

Have your timelines changed as a re­sult of ChatGPT?

Chris Leong5 Dec 2022 15:03 UTC
30 points
18 comments1 min readEA link

[Question] How can I best use my ca­reer to pass im­pact­ful AI and Biose­cu­rity policy.

maxg13 Oct 2023 5:14 UTC
4 points
1 comment1 min readEA link

An­nounc­ing the Cam­bridge Bos­ton Align­ment Ini­ti­a­tive [Hiring!]

kuhanj2 Dec 2022 1:07 UTC
83 points
0 comments1 min readEA link

AISafety.com Hackathon 2025

Bryce Robertson14 Jul 2025 23:56 UTC
7 points
0 comments1 min readEA link

Can AI Out­pre­dict Hu­mans? Re­sults From Me­tac­u­lus’s Q3 AI Fore­cast­ing Benchmark

Tom Liptay10 Oct 2024 18:58 UTC
32 points
1 comment6 min readEA link
(www.metaculus.com)

Looking for evidence of AI impacts in the age structure of occupations: Nothing yet

Pat McKelvey 9 May 2025 18:12 UTC
26 points
2 comments3 min readEA link

Ngo and Yud­kowsky on AI ca­pa­bil­ity gains

richard_ngo19 Nov 2021 1:54 UTC
23 points
4 comments39 min readEA link

The Miss­ing Piece: Why We Need a Grand Strat­egy for AI

Coleman28 Feb 2025 23:49 UTC
7 points
1 comment9 min readEA link

Is schem­ing more likely in mod­els trained to have long-term goals? (Sec­tions 2.2.4.1-2.2.4.2 of “Schem­ing AIs”)

Joe_Carlsmith30 Nov 2023 16:43 UTC
6 points
1 comment5 min readEA link

The stakes of AI moral status

Joe_Carlsmith21 May 2025 18:20 UTC
54 points
9 comments14 min readEA link
(joecarlsmith.substack.com)

My plan for a “Most Im­por­tant Cen­tury” read­ing group

Jack O'Brien19 Jan 2022 9:32 UTC
12 points
1 comment2 min readEA link

Is un­der­stand­ing the moral sta­tus of digi­tal minds a press­ing world prob­lem?

Cody_Fenwick30 Sep 2024 8:50 UTC
42 points
0 comments34 min readEA link
(80000hours.org)

Sub­mit Your Tough­est Ques­tions for Hu­man­ity’s Last Exam

Matrice Jacobine18 Sep 2024 8:03 UTC
6 points
0 comments2 min readEA link
(www.safe.ai)

A stub­born un­be­liever fi­nally gets the depth of the AI al­ign­ment problem

aelwood13 Oct 2022 15:16 UTC
32 points
7 comments3 min readEA link
(pursuingreality.substack.com)

Com­mu­ni­ca­tion by ex­is­ten­tial risk or­ga­ni­za­tions: State of the field and sug­ges­tions for improvement

Existential Risk Communication Project13 Aug 2024 7:06 UTC
10 points
3 comments13 min readEA link

Pro­posal for a Form of Con­di­tional Sup­ple­men­tal In­come (CSI) in a Post-Work World

Sean Sweeney31 Jan 2025 1:00 UTC
3 points
0 comments3 min readEA link

The re­li­gion prob­lem in AI alignment

Geoffrey Miller16 Sep 2022 1:24 UTC
54 points
28 comments11 min readEA link

Col­lege tech­ni­cal AI safety hackathon ret­ro­spec­tive—Ge­or­gia Tech

yixiong14 Nov 2024 13:34 UTC
18 points
0 comments5 min readEA link
(yixiong.substack.com)

Visi­ble Thoughts Pro­ject and Bounty Announcement

So8res30 Nov 2021 0:35 UTC
35 points
2 comments13 min readEA link

Fea­si­bil­ity of train­ing and in­fer­ring ad­vanced large lan­guage mod­els (LLMs) in data cen­ters in Mex­ico and Brazil.

Tatiana Sandoval2 May 2025 13:42 UTC
15 points
0 comments24 min readEA link

Without spe­cific coun­ter­mea­sures, the eas­iest path to trans­for­ma­tive AI likely leads to AI takeover

Ajeya18 Jul 2022 19:07 UTC
218 points
12 comments84 min readEA link
(www.lesswrong.com)

Tech­ni­cal AI safety in the United Arab Emirates

ea nyuad21 Jun 2022 3:11 UTC
10 points
0 comments11 min readEA link

AI safety ad­vo­cates should con­sider pro­vid­ing gen­tle push­back fol­low­ing the events at OpenAI

I_machinegun_Kelly22 Dec 2023 21:05 UTC
86 points
5 comments3 min readEA link
(www.lesswrong.com)

I cre­ated an Asi Align­ment Tier List

TimeGoat22 Apr 2024 12:14 UTC
0 points
0 comments1 min readEA link

X-Risk Re­searchers Sur­vey

NitaSangha24 Apr 2023 8:06 UTC
12 points
1 comment1 min readEA link

Les­sons for AI Gover­nance from Atoms for Peace

Amritanshu Prasad16 Apr 2025 14:25 UTC
10 points
2 comments2 min readEA link
(www.thenextfrontier.blog)

Am I Miss­ing Some­thing, or Is EA? Thoughts from a Learner in Uganda

Dr Kassim16 Mar 2025 11:31 UTC
234 points
16 comments3 min readEA link

If AGI is im­mi­nent, why can’t I hail a rob­o­taxi?

Yarrow🔸9 Dec 2023 20:50 UTC
26 points
4 comments1 min readEA link

Be­ing an in­di­vi­d­ual al­ign­ment grantmaker

A_donor28 Feb 2022 16:39 UTC
34 points
20 comments2 min readEA link

GPT-5 is out

david_reinstein7 Aug 2025 20:37 UTC
20 points
1 comment1 min readEA link
(openai.com)

What could an AI-caused ex­is­ten­tial catas­tro­phe ac­tu­ally look like?

Benjamin Hilton12 Sep 2022 16:25 UTC
57 points
7 comments9 min readEA link
(80000hours.org)

An­nounc­ing the AI Fore­cast­ing Bench­mark Series | July 8, $120k in Prizes

christian19 Jun 2024 21:37 UTC
52 points
4 comments5 min readEA link
(www.metaculus.com)

A mesa-op­ti­miza­tion per­spec­tive on AI valence and moral patienthood

jacobpfau9 Sep 2021 22:23 UTC
10 points
18 comments17 min readEA link

You Don’t Have to Be an AI Doomer to Sup­port AI Safety

Liam Robins14 Jun 2025 23:10 UTC
10 points
0 comments4 min readEA link
(thelimestack.substack.com)

ML Safety Schol­ars Sum­mer 2022 Retrospective

TW1231 Nov 2022 3:09 UTC
56 points
2 comments21 min readEA link

Overview: AI Safety Outreach Grass­roots Orgs

Severin12 May 2025 14:38 UTC
11 points
0 comments2 min readEA link

[Question] How have analo­gous In­dus­tries solved In­ter­ested > Trained > Em­ployed bot­tle­necks?

yanni kyriacos30 May 2024 23:59 UTC
6 points
0 comments1 min readEA link

Fa­cil­i­ta­tor Help Wanted for Columbia EA AI Safety Groups

Berkan Ottlik5 Jul 2022 10:27 UTC
16 points
0 comments1 min readEA link

Ap­ply to >50 AI safety fun­ders in one ap­pli­ca­tion with the Non­lin­ear Net­work [Round Closed]

Drew Spartz12 Apr 2023 21:06 UTC
156 points
18 comments2 min readEA link

2019 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

Larks19 Dec 2019 2:58 UTC
147 points
28 comments62 min readEA link

Will dis­agree­ment about AI rights lead to so­cietal con­flict?

Lucius Caviola3 Jul 2024 13:30 UTC
51 points
0 comments22 min readEA link
(outpaced.substack.com)

En­ergy-Based Trans­form­ers are Scal­able Learn­ers and Thinkers

Matrice Jacobine8 Jul 2025 13:44 UTC
8 points
0 comments1 min readEA link
(energy-based-transformers.github.io)

UN Public Call for Nom­i­na­tions For High-level Ad­vi­sory Body on Ar­tifi­cial Intelligence

vincentweisser10 Aug 2023 10:34 UTC
15 points
1 comment1 min readEA link

Largest AI model in 2 years from $10B

Peter Drotos 🔸24 Oct 2023 15:14 UTC
37 points
0 comments7 min readEA link

Con­tra shard the­ory, in the con­text of the di­a­mond max­i­mizer problem

So8res13 Oct 2022 23:51 UTC
27 points
0 comments2 min readEA link

AI-Risk in the State of the Euro­pean Union Address

Sam Bogerd13 Sep 2023 13:27 UTC
25 points
0 comments3 min readEA link
(state-of-the-union.ec.europa.eu)

Am­plify is hiring! Work with us to sup­port field-build­ing ini­ti­a­tives through digi­tal mar­ket­ing

gergo28 Aug 2024 14:12 UTC
28 points
1 comment4 min readEA link

A nec­es­sary Mem­brane for­mal­ism feature

ThomasCederborg10 Sep 2024 21:03 UTC
1 point
0 comments11 min readEA link

An­i­mal com­mu­ni­ca­tion and the fu­ture of moral progress: spec­u­la­tions and responsibilities

Lutebemberwa Isa27 May 2025 15:51 UTC
10 points
0 comments3 min readEA link

Grad­ual Disem­pow­er­ment: Con­crete Re­search Projects

Raymond D29 May 2025 18:58 UTC
20 points
1 comment10 min readEA link

When the Smarter AI Lies Bet­ter: Can De­bate-Based Over­sight Catch De­cep­tive Code?

Oskar Kraak4 Jul 2025 22:10 UTC
2 points
0 comments5 min readEA link
(oskarkraak.com)

Cog­ni­tive Science/​Psy­chol­ogy As a Ne­glected Ap­proach to AI Safety

Kaj_Sotala5 Jun 2017 13:46 UTC
40 points
37 comments4 min readEA link

[Question] Fis­cal spon­sor­ship, ops sup­port, or in­cu­ba­tion?

Harry Luk4 Oct 2023 22:06 UTC
14 points
8 comments1 min readEA link

Draft re­port on AI timelines

Ajeya15 Dec 2020 12:10 UTC
35 points
0 comments1 min readEA link
(alignmentforum.org)

AI Alter­na­tive Fu­tures: Ex­plo­ra­tory Sce­nario Map­ping for Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion [Linkpost]

Kiliank9 May 2022 19:53 UTC
17 points
2 comments8 min readEA link

Now is the Time for Moonshots

Alejandro Acelas 🔸18 Jul 2025 15:59 UTC
2 points
0 comments1 min readEA link
(lukedrago.substack.com)

Let’s talk about un­con­trol­lable AI

Karl von Wendt9 Oct 2022 10:37 UTC
12 points
2 comments3 min readEA link

[Question] What would need to be true for AI to trans­late a le­gal con­tract to a smart con­tract?

Patrick Liu18 Mar 2023 16:42 UTC
−1 points
0 comments1 min readEA link

[Question] Should we pub­lish ar­gu­ments for the preser­va­tion of hu­man­ity?

Jeremy7 Apr 2023 13:51 UTC
8 points
4 comments1 min readEA link

#180 – Why gullibil­ity and mis­in­for­ma­tion are over­rated (Hugo Mercier on the 80,000 Hours Pod­cast)

80000_Hours26 Feb 2024 19:16 UTC
15 points
0 comments18 min readEA link

Clar­ify­ing two uses of “al­ign­ment”

Matthew_Barnett10 Mar 2024 17:41 UTC
36 points
28 comments4 min readEA link

A one-sen­tence for­mu­la­tion of the AI X-Risk ar­gu­ment I try to make

tcelferact2 Mar 2024 0:44 UTC
3 points
0 comments1 min readEA link

[$20K In Prizes] AI Safety Ar­gu­ments Competition

TW12326 Apr 2022 16:21 UTC
71 points
121 comments3 min readEA link

20 con­crete pro­jects for re­duc­ing ex­is­ten­tial risk

Buhl21 Jun 2023 15:54 UTC
132 points
27 comments20 min readEA link
(rethinkpriorities.org)

[Question] Why not offer a multi-mil­lion /​ billion dol­lar prize for solv­ing the Align­ment Prob­lem?

Aryeh Englander17 Apr 2022 16:08 UTC
15 points
9 comments1 min readEA link

deleted

funnyfranco24 Mar 2025 19:44 UTC
4 points
10 comments1 min readEA link

China-AI fore­cast­ing

Nathan_Barnard25 Feb 2024 16:47 UTC
10 points
2 comments6 min readEA link

[Question] Ex­am­ples of self-gov­er­nance to re­duce tech­nol­ogy risk?

jia25 Sep 2020 13:26 UTC
32 points
1 comment1 min readEA link

Cru­cial con­sid­er­a­tions in the field of Wild An­i­mal Welfare (WAW)

Holly Elmore ⏸️ 🔸10 Apr 2022 19:43 UTC
64 points
10 comments3 min readEA link

Emer­gent Ven­tures AI

technicalities8 Apr 2022 22:08 UTC
22 points
0 comments1 min readEA link
(marginalrevolution.com)

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copy­right In­fringe­ment, and Con­gres­sional Ques­tions about Re­search Stan­dards in AI Safety

Center for AI Safety4 Jan 2024 16:03 UTC
5 points
0 comments6 min readEA link
(newsletter.safe.ai)

Eric Drexler: Pare­to­topian goal alignment

EA Global15 Mar 2019 14:51 UTC
16 points
0 comments10 min readEA link
(www.youtube.com)

A model-based ap­proach to AI Ex­is­ten­tial Risk

SammyDMartin25 Aug 2023 10:44 UTC
17 points
0 comments1 min readEA link
(www.lesswrong.com)

Longview is now offer­ing AI grant recom­men­da­tions to donors giv­ing >$100k /​ year

Longview Philanthropy11 Apr 2025 16:01 UTC
73 points
0 comments2 min readEA link

MLSN: #10 Ad­ver­sar­ial At­tacks Against Lan­guage and Vi­sion Models, Im­prov­ing LLM Hon­esty, and Trac­ing the In­fluence of LLM Train­ing Data

Center for AI Safety13 Sep 2023 18:02 UTC
7 points
0 comments5 min readEA link
(newsletter.mlsafety.org)

The longter­mist AI gov­er­nance land­scape: a ba­sic overview

Sam Clarke18 Jan 2022 12:58 UTC
172 points
13 comments9 min readEA link

[Question] Is there much need for fron­tend en­g­ineers in AI al­ign­ment?

Michael G21 Sep 2023 20:48 UTC
11 points
1 comment1 min readEA link

Evolv­ing OpenAI’s Structure

Tyner🔸6 May 2025 0:52 UTC
12 points
1 comment1 min readEA link

2016 AI Risk Liter­a­ture Re­view and Char­ity Comparison

Larks13 Dec 2016 4:36 UTC
57 points
12 comments28 min readEA link

Why I think that teach­ing philos­o­phy is high impact

Eleni_A19 Dec 2022 23:00 UTC
17 points
2 comments2 min readEA link

Database of ex­is­ten­tial risk estimates

MichaelA🔸15 Apr 2020 12:43 UTC
130 points
37 comments5 min readEA link

The Su­per­in­tel­li­gence That Cares About Us

henrik.westerberg5 Jul 2025 10:20 UTC
5 points
0 comments2 min readEA link

Ngo and Yud­kowsky on sci­en­tific rea­son­ing and pivotal acts

EliezerYudkowsky21 Feb 2022 17:00 UTC
33 points
1 comment35 min readEA link

When is AI safety re­search harm­ful?

Nathan_Barnard9 May 2022 10:36 UTC
13 points
6 comments9 min readEA link

Me­diocre AI safety as ex­is­ten­tial risk

technicalities16 Mar 2022 11:50 UTC
52 points
12 comments3 min readEA link

In­tro­duc­ing In­ter­na­tional AI Gover­nance Alli­ance (IAIGA)

James Norris5 Feb 2025 15:59 UTC
12 points
0 comments1 min readEA link

He­len Toner (ex-OpenAI board mem­ber): “We learned about ChatGPT on Twit­ter.”

defun 🔸29 May 2024 7:40 UTC
123 points
13 comments1 min readEA link
(x.com)

Pre­serv­ing and con­tin­u­ing al­ign­ment re­search through a se­vere global catastrophe

A_donor6 Mar 2022 18:43 UTC
40 points
11 comments5 min readEA link

When safety is dan­ger­ous: risks of an in­definite pause on AI de­vel­op­ment, and call for re­al­is­tic alternatives

Hayven Frienby18 Jan 2024 14:59 UTC
5 points
0 comments5 min readEA link

A Brief Sum­mary Of The Most Im­por­tant Century

Maynk0225 Oct 2022 15:28 UTC
3 points
0 comments5 min readEA link

Re­duc­ing global AI com­pe­ti­tion through the Com­merce Con­trol List and Im­mi­gra­tion re­form: a dual-pronged approach

ben.smith3 Sep 2024 5:28 UTC
15 points
0 comments9 min readEA link

BERI is hiring an ML Soft­ware Engineer

sawyer🔸10 Nov 2021 19:36 UTC
17 points
2 comments1 min readEA link

Loss of con­trol of AI is not a likely source of AI x-risk

squek9 Nov 2022 5:48 UTC
8 points
0 comments5 min readEA link

Cryp­tocur­rency Ex­ploits Show the Im­por­tance of Proac­tive Poli­cies for AI X-Risk

eSpencer16 Sep 2022 4:44 UTC
14 points
1 comment4 min readEA link

Is there a Half-Life for the Suc­cess Rates of AI Agents?

Matrice Jacobine8 May 2025 20:10 UTC
6 points
0 comments1 min readEA link
(www.tobyord.com)

[Question] Is AI like disk drives?

Tanae2 Sep 2023 19:12 UTC
8 points
1 comment1 min readEA link

A Man­i­fold Mar­ket “Leaked” the AI Ex­tinc­tion State­ment and CAIS Wanted it Deleted

David Chee12 Jun 2023 15:57 UTC
24 points
9 comments12 min readEA link
(news.manifold.markets)

[Link] Read­ing the ethi­cists: A re­view of ar­ti­cles on AI in the jour­nal Science and Eng­ineer­ing Ethics

Charlie Steiner18 May 2022 21:06 UTC
7 points
0 comments1 min readEA link
(www.lesswrong.com)

Do short AI timelines make other cause ar­eas use­less?

Hayley Clatterbuck23 Jul 2025 19:10 UTC
110 points
14 comments18 min readEA link

I’m NOT against Ar­tifi­cial Intelligence

Victoria Dias24 Apr 2025 18:02 UTC
6 points
1 comment18 min readEA link

Third-wave AI safety needs so­ciopoli­ti­cal thinking

richard_ngo27 Mar 2025 0:55 UTC
106 points
59 comments26 min readEA link

Some Pre­limi­nary Opinions on AI Safety Problems

yonxinzhang6 Apr 2023 12:42 UTC
5 points
0 comments6 min readEA link

AMA: Ought

stuhlmueller3 Aug 2022 17:24 UTC
41 points
52 comments1 min readEA link

Am­bi­tious Im­pact launches a for-profit ac­cel­er­a­tor in­stead of build­ing the AI Safety space. Let’s talk about this.

yanni kyriacos18 Mar 2024 3:44 UTC
−7 points
13 comments1 min readEA link

Some AI Gover­nance Re­search Ideas

MarkusAnderljung3 Jun 2021 10:51 UTC
102 points
5 comments2 min readEA link

Credo AI is hiring for sev­eral roles

IanEisenberg11 Apr 2022 15:58 UTC
14 points
2 comments1 min readEA link

Try o3-pro in ChatGPT for $1 (is AI a bub­ble?)

Hauke Hillebrandt24 Jun 2025 11:15 UTC
29 points
1 comment4 min readEA link

Not Just For Ther­apy Chat­bots: The Case For Com­pas­sion In AI Mo­ral Align­ment Research

Kenneth_Diao29 Sep 2024 22:58 UTC
8 points
3 comments12 min readEA link

Is any­one else also get­ting more wor­ried about hard take­off AGI sce­nar­ios?

JonCefalu9 Jan 2023 6:04 UTC
19 points
11 comments3 min readEA link

OpenAI lost $5 billion in 2024 (and its losses are in­creas­ing)

Remmelt31 Mar 2025 4:17 UTC
0 points
3 comments12 min readEA link
(www.wheresyoured.at)

Max Teg­mark: Risks and benefits of ad­vanced ar­tifi­cial intelligence

EA Global5 Aug 2016 9:19 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

Les­sons from Three Mile Is­land for AI Warn­ing Shots

NickGabs26 Sep 2022 2:47 UTC
44 points
0 comments15 min readEA link

[Question] Do EA folks want AGI at all?

Noah Scales16 Jul 2022 5:44 UTC
8 points
10 comments1 min readEA link

In­ter­pretable Anal­y­sis of Fea­tures Found in Open-source Sparse Au­toen­coder (par­tial repli­ca­tion)

Fernando Avalos28 Aug 2024 22:08 UTC
10 points
1 comment10 min readEA link

What’s go­ing on with ‘crunch time’?

rosehadshar20 Jan 2023 9:38 UTC
92 points
5 comments4 min readEA link

Cor­po­rate AI Labs’ Odd Role in Their Own Governance

29 Jul 2024 9:36 UTC
66 points
6 comments12 min readEA link
(dominikhermle.substack.com)

Promethean Gover­nance in Prac­tice: Craft­ing a Poly­cen­tric, Memetic Order for the Mul­tipo­lar AI Aeon

Paul Fallavollita20 Mar 2025 10:10 UTC
−1 points
0 comments4 min readEA link

Sur­vey: How Do Elite Chi­nese Stu­dents Feel About the Risks of AI?

Nick Corvino2 Sep 2024 9:14 UTC
107 points
9 comments10 min readEA link

Pres­i­dent Bi­den Is­sues Ex­ec­u­tive Order on Safe, Se­cure, and Trust­wor­thy Ar­tifi­cial Intelligence

Tristan Williams30 Oct 2023 11:15 UTC
143 points
8 comments3 min readEA link
(www.whitehouse.gov)

Cost-effec­tive­ness of mak­ing a video game with EA concepts

mmKALLL15 Sep 2022 13:48 UTC
8 points
2 comments5 min readEA link

Fore­cast AI 2027

christian12 Jun 2025 21:12 UTC
22 points
0 comments1 min readEA link
(www.metaculus.com)

Win­ners of the AI Safety Nudge Competition

Marc Carauleanu15 Nov 2022 1:06 UTC
22 points
0 comments1 min readEA link

We Are Con­jec­ture, A New Align­ment Re­search Startup

Connor Leahy9 Apr 2022 15:07 UTC
31 points
0 comments1 min readEA link

The Orthog­o­nal­ity Th­e­sis is Not Ob­vi­ously True

Bentham's Bulldog5 Apr 2023 21:08 UTC
18 points
12 comments9 min readEA link

Steer­ing AI to care for an­i­mals, and soon

Andrew Critch14 Jun 2022 1:13 UTC
239 points
37 comments1 min readEA link

De-em­pha­sise al­ign­ment, em­pha­sise restraint

EuanMcLean4 Feb 2025 17:43 UTC
19 points
2 comments7 min readEA link

AI Safety Ex­ec­u­tive Summary

Sean Osier6 Sep 2022 8:26 UTC
20 points
2 comments5 min readEA link
(seanosier.notion.site)

Pre­limi­nary in­ves­ti­ga­tions on if STEM and EA com­mu­ni­ties could benefit from more overlap

elteerkers11 Apr 2023 16:08 UTC
31 points
17 comments8 min readEA link

Ori­gin and al­ign­ment of goals, mean­ing, and morality

FalseCogs24 Aug 2023 14:05 UTC
1 point
2 comments35 min readEA link

Evals pro­jects I’d like to see, and a call to ap­ply to OP’s evals RFP

cb25 Mar 2025 11:50 UTC
25 points
2 comments3 min readEA link

Be­ware of the new scal­ing paradigm

JohanEA19 Sep 2024 17:03 UTC
9 points
2 comments3 min readEA link

AI ro­man­tic part­ners will harm so­ciety if they go unregulated

Roman Leventov31 Jul 2023 15:55 UTC
16 points
9 comments13 min readEA link

The Wind­fall Clause has a reme­dies problem

John Bridge 🔸23 May 2022 10:31 UTC
40 points
0 comments17 min readEA link

Ap­pli­ca­tions to EAGxCDMX close in a week!

cescorza17 Feb 2025 20:42 UTC
15 points
0 comments1 min readEA link

AI as a sci­ence, and three ob­sta­cles to al­ign­ment strategies

So8res25 Oct 2023 21:02 UTC
41 points
1 comment11 min readEA link

Ex­is­ten­tial Cy­ber­se­cu­rity Risks & AI (A Re­search Agenda)

Madhav Malhotra20 Sep 2023 12:03 UTC
7 points
0 comments8 min readEA link

The Short Timelines Strat­egy for AI Safety Univer­sity Groups

Josh Thorsteinson 🔸7 Mar 2025 4:26 UTC
50 points
8 comments5 min readEA link

The case for con­scious AI: Clear­ing the record [AI Con­scious­ness & Public Per­cep­tion]

Jay Luong5 Jul 2024 20:29 UTC
3 points
7 comments8 min readEA link

AI gov­er­nance & China: Read­ing list

Zach Stein-Perlman18 Dec 2023 15:30 UTC
14 points
0 comments1 min readEA link
(docs.google.com)

The Man­hat­tan Trap: Why a Race to Ar­tifi­cial Su­per­in­tel­li­gence is Self-Defeating

Corin Katzke21 Jan 2025 16:57 UTC
98 points
1 comment2 min readEA link
(www.convergenceanalysis.org)

Map of AI Safety v2

Bryce Robertson15 Apr 2025 13:04 UTC
59 points
6 comments1 min readEA link

Re­views of “Is power-seek­ing AI an ex­is­ten­tial risk?”

Joe_Carlsmith16 Dec 2021 20:50 UTC
71 points
4 comments1 min readEA link

Belief Bias: Bias in Eval­u­at­ing AGI X-Risks

Remmelt2 Jan 2023 8:59 UTC
5 points
0 comments1 min readEA link

A “Solip­sis­tic” Repug­nant Conclusion

Ramiro21 Jul 2022 16:06 UTC
13 points
0 comments6 min readEA link

Tyler Cowen’s challenge to de­velop an ‘ac­tual math­e­mat­i­cal model’ for AI X-Risk

Joe Brenton16 May 2023 16:55 UTC
20 points
4 comments1 min readEA link

$20K in Boun­ties for AI Safety Public Materials

TW1235 Aug 2022 2:57 UTC
45 points
11 comments6 min readEA link

Win­ning isn’t enough

Anthony DiGiovanni5 Nov 2024 11:43 UTC
33 points
3 comments9 min readEA link

Three polls: on timelines and cause prio

Toby Tremlett🔹28 Apr 2025 12:03 UTC
30 points
41 comments1 min readEA link

[Question] Work­shop (hackathon, res­i­dence pro­gram, etc.) about for-profit AI Safety pro­jects?

Roman Leventov26 Jan 2024 9:49 UTC
13 points
1 comment1 min readEA link

Ex-OpenAI em­ployee am­ici leave to file de­nied in Musk v OpenAI case?

TFD2 May 2025 12:31 UTC
8 points
0 comments2 min readEA link
(www.thefloatingdroid.com)

An­nounc­ing AI Align­ment work­shop at the ALIFE 2023 conference

Rory Greig8 Jul 2023 13:49 UTC
9 points
0 comments1 min readEA link
(humanvaluesandartificialagency.com)

[Question] Graph of % of tasks AI is su­per­hu­man at?

Denkenberger🔸15 Nov 2022 5:59 UTC
9 points
0 comments1 min readEA link

AGI safety field build­ing pro­jects I’d like to see

Severin24 Jan 2023 23:30 UTC
25 points
2 comments9 min readEA link

Black Hole Bundle

khayali11 Jun 2025 1:20 UTC
−2 points
0 comments36 min readEA link

On tak­ing AI risk se­ri­ously

Eleni_A13 Mar 2023 5:44 UTC
51 points
4 comments1 min readEA link
(www.nytimes.com)

We Ran an AI Timelines Retreat

Lenny McCline17 May 2022 4:40 UTC
46 points
6 comments3 min readEA link

My dis­agree­ments with “AGI ruin: A List of Lethal­ities”

Sharmake15 Sep 2024 17:22 UTC
23 points
2 comments18 min readEA link

A.I love you : AGI and Hu­man Traitors

Pilot Pillow2 Apr 2025 14:18 UTC
1 point
2 comments7 min readEA link

Ideal gov­er­nance (for com­pa­nies, coun­tries and more)

Holden Karnofsky7 Apr 2022 16:54 UTC
80 points
19 comments14 min readEA link

Ad­dress­ing the non­hu­man gap in in­ter­gov­ern­men­tal AI gov­er­nance frameworks

Alistair Stewart15 Jul 2025 21:13 UTC
73 points
2 comments8 min readEA link

A differ­ent take on the Musk v OpenAI pre­limi­nary in­junc­tion order

TFD11 Mar 2025 14:29 UTC
6 points
1 comment20 min readEA link
(www.thefloatingdroid.com)

The space of sys­tems and the space of maps

Jan_Kulveit22 Mar 2023 16:05 UTC
12 points
0 comments5 min readEA link
(www.lesswrong.com)

Jeffrey Ding: Re-de­ci­pher­ing China’s AI dream

EA Global18 Oct 2019 18:05 UTC
13 points
0 comments1 min readEA link
(www.youtube.com)

Vol­un­teer Op­por­tu­ni­ties with the AI Safety Aware­ness Foundation

NoahCWilson🔸8 Mar 2025 4:41 UTC
7 points
0 comments2 min readEA link

Nu­clear Es­pi­onage and AI Governance

GAA4 Oct 2021 18:21 UTC
32 points
3 comments24 min readEA link

Meri­dian Cam­bridge Visit­ing Re­searcher Pro­gramme: Turn AI safety ideas into funded pro­jects in one week!

Meridian5 Mar 2025 7:19 UTC
27 points
3 comments2 min readEA link

The het­ero­gene­ity of hu­man value types: Im­pli­ca­tions for AI alignment

Geoffrey Miller16 Sep 2022 21:21 UTC
27 points
2 comments10 min readEA link

[Question] What hap­pened to the ‘only 400 peo­ple work in AI safety/​gov­er­nance’ num­ber dated from 2020?

Vaipan15 Mar 2024 15:25 UTC
27 points
1 comment1 min readEA link

Re­duce AGI risks us­ing mod­ern lie de­tec­tion technology

NothingIsArt30 Sep 2024 18:12 UTC
1 point
0 comments1 min readEA link

Emer­gency pod: Don’t be­lieve OpenAI’s “non­profit” spin (with Tyler Whit­mer)

80000_Hours15 May 2025 16:52 UTC
37 points
0 comments2 min readEA link

AMA: Markus An­der­ljung (PM at GovAI, FHI)

MarkusAnderljung21 Sep 2020 11:23 UTC
49 points
24 comments2 min readEA link

“Tech­nolog­i­cal un­em­ploy­ment” AI vs. “most im­por­tant cen­tury” AI: how far apart?

Holden Karnofsky11 Oct 2022 4:50 UTC
17 points
1 comment3 min readEA link
(www.cold-takes.com)

Pitch­ing AI Safety in 3 sentences

PabloAMC 🔸30 Mar 2022 18:50 UTC
7 points
0 comments1 min readEA link

[MLSN #6]: Trans­parency sur­vey, prov­able ro­bust­ness, ML mod­els that pre­dict the future

Dan H12 Oct 2022 20:51 UTC
21 points
1 comment6 min readEA link

The Case for an On­line En­cy­clo­pe­dia Man­aged by AI Agents

Casey Milkweed21 Jul 2025 14:06 UTC
2 points
0 comments1 min readEA link
(substack.com)

Hu­man­i­ties Re­search Ideas for Longtermists

Lizka9 Jun 2021 4:39 UTC
151 points
13 comments13 min readEA link

AI Self-Mod­ifi­ca­tion Am­plifies Risks

Ihor Ivliev3 Jun 2025 20:27 UTC
0 points
0 comments2 min readEA link

An­nounc­ing Epoch: A re­search or­ga­ni­za­tion in­ves­ti­gat­ing the road to Trans­for­ma­tive AI

Jaime Sevilla27 Jun 2022 13:39 UTC
183 points
11 comments2 min readEA link
(epochai.org)

Why Is No One Try­ing To Align Profit In­cen­tives With Align­ment Re­search?

Prometheus23 Aug 2023 13:19 UTC
17 points
2 comments4 min readEA link
(www.lesswrong.com)

AI com­pa­nies’ commitments

Zach Stein-Perlman31 May 2024 0:00 UTC
9 points
0 comments1 min readEA link

[Question] Fore­cast­ing Ques­tions: What do you want to pre­dict on AI?

Nathan Young1 Nov 2023 13:16 UTC
9 points
0 comments1 min readEA link

[Question] EA’s Achieve­ments in 2022

ElliotJDavies14 Dec 2022 14:33 UTC
98 points
11 comments1 min readEA link

Now is a good time to up­date your threat model

Flo 🔸22 Mar 2025 21:11 UTC
29 points
0 comments1 min readEA link

Creat­ing a “Con­science Calcu­la­tor” to Guard-Rail an AGI

Sean Sweeney12 Aug 2024 15:58 UTC
1 point
11 comments17 min readEA link

[Question] Half-baked al­ign­ment idea

ozb28 Mar 2023 5:18 UTC
9 points
2 comments1 min readEA link

[Question] Why are bond yields anoma­lously ris­ing fol­low­ing the Septem­ber rate cut?

incredibleutility7 Jan 2025 15:49 UTC
2 points
2 comments1 min readEA link

An­thropic is not be­ing con­sis­tently can­did about their con­nec­tion to EA

burner230 Mar 2025 13:30 UTC
291 points
88 comments2 min readEA link

Can Knowl­edge Hurt You? The Dangers of In­fo­haz­ards (and Exfo­haz­ards)

A.G.G. Liu8 Feb 2025 15:51 UTC
12 points
0 comments5 min readEA link
(www.youtube.com)

Longter­mist rea­sons to work for in­no­va­tive governments

ac13 Oct 2020 16:32 UTC
74 points
8 comments1 min readEA link

What can the prin­ci­pal-agent liter­a­ture tell us about AI risk?

ac10 Feb 2020 10:10 UTC
26 points
1 comment16 min readEA link

AI Mo­ral Align­ment: The Most Im­por­tant Goal of Our Generation

Ronen Bar26 Mar 2025 12:32 UTC
130 points
32 comments8 min readEA link

“That’s (not) me!”: The mal­i­cious em­ploy­ment of deep­fakes and their miti­ga­tion in le­gal en­vi­ron­ments for AI governance

Gabriela Pardo1 May 2025 14:54 UTC
5 points
0 comments12 min readEA link

How to ‘troll for good’: Lev­er­ag­ing IP for AI governance

Michael Huang26 Feb 2023 6:34 UTC
26 points
3 comments1 min readEA link
(www.science.org)

US credit rat­ing down­graded, $1T in Gulf state in­vest­ments in the US, Kur­dis­tan Work­ers’ Party dis­banded | Sen­tinel Global Risks Weekly Roundup #20/​2025

NunoSempere19 May 2025 18:02 UTC
50 points
0 comments1 min readEA link
(blog.sentinel-team.org)

AI al­ign­ment re­search links

Holden Karnofsky6 Jan 2022 5:52 UTC
16 points
0 comments6 min readEA link
(www.cold-takes.com)

AI Safety − 7 months of dis­cus­sion in 17 minutes

Zoe Williams15 Mar 2023 23:41 UTC
90 points
2 comments17 min readEA link

Thoughts on the OpenAI al­ign­ment plan: will AI re­search as­sis­tants be net-pos­i­tive for AI ex­is­ten­tial risk?

Jeffrey Ladish10 Mar 2023 8:20 UTC
12 points
0 comments9 min readEA link

Gaia Net­work: a prac­ti­cal, in­cre­men­tal path­way to Open Agency Architecture

Roman Leventov20 Dec 2023 17:11 UTC
4 points
0 comments16 min readEA link

CSER is hiring for a se­nior re­search as­so­ci­ate on longterm AI risk and governance

Sam Clarke24 Jan 2022 13:24 UTC
9 points
4 comments1 min readEA link

The AGI Awak­e­ness valley of doom and three path­ways to slowing

GideonF28 Jul 2025 18:46 UTC
16 points
0 comments16 min readEA link
(open.substack.com)

Back to the Past to the Future

Prometheus18 Oct 2023 16:51 UTC
4 points
0 comments1 min readEA link

Pro­mot­ing com­pas­sion­ate longtermism

jonleighton7 Dec 2022 14:26 UTC
117 points
5 comments12 min readEA link

Longter­mism bet­ter from a de­vel­op­ment skep­ti­cal stance?

Benevolent_Rain9 Dec 2024 12:16 UTC
16 points
2 comments1 min readEA link

[Question] Train­ing a GPT model on EA texts: what data?

JoyOptimizer4 Jun 2022 5:59 UTC
23 points
17 comments1 min readEA link

The Align­ment Prob­lem No One is Talk­ing About

Non-zero-sum James14 May 2024 10:42 UTC
5 points
0 comments2 min readEA link

Co­op­er­a­tion for AI safety must tran­scend geopoli­ti­cal interference

Matrice Jacobine16 Feb 2025 18:18 UTC
9 points
0 comments1 min readEA link
(www.scmp.com)

When is it im­por­tant that open-weight mod­els aren’t re­leased? My thoughts on the benefits and dan­gers of open-weight mod­els in re­sponse to de­vel­op­ments in CBRN ca­pa­bil­ities.

Ryan Greenblatt9 Jun 2025 19:19 UTC
39 points
3 comments9 min readEA link

LLMs might already be conscious

MichaelDickens5 Jul 2025 19:31 UTC
33 points
8 comments2 min readEA link
(mdickens.me)

No, the EMH does not im­ply that mar­kets have long AGI timelines

Jakob24 Apr 2023 8:27 UTC
83 points
21 comments8 min readEA link

New Fund­ing Round on Hard­ware-En­abled Mechanisms (HEMs)

aog30 Apr 2025 17:45 UTC
54 points
0 comments15 min readEA link

On pre­sent­ing the case for AI risk

Aryeh Englander8 Mar 2022 21:37 UTC
114 points
12 comments4 min readEA link

A Tax­on­omy Of AI Sys­tem Evaluations

Maxime Riché 🔸19 Aug 2024 9:08 UTC
8 points
0 comments14 min readEA link

CBAI is Hiring for Oper­a­tions As­so­ci­ates (closed)

Maite A21 Jul 2025 21:37 UTC
13 points
0 comments1 min readEA link
(www.cbai.ai)

How to do con­cep­tual re­search: Case study in­ter­view with Cas­par Oesterheld

Chi14 May 2024 15:09 UTC
26 points
1 comment9 min readEA link

Dis­cus­sion with Eliezer Yud­kowsky on AGI interventions

RobBensinger11 Nov 2021 3:21 UTC
60 points
33 comments34 min readEA link

[Question] What “defense lay­ers” should gov­ern­ments, AI labs, and busi­nesses use to pre­vent catas­trophic AI failures?

LintzA3 Dec 2021 14:24 UTC
37 points
3 comments1 min readEA link

Sen­tinel min­utes #6/​2025: Power of the purse, D1.1 H5N1 flu var­i­ant, Ay­a­tol­lah against ne­go­ti­a­tions with Trump

NunoSempere10 Feb 2025 17:23 UTC
40 points
2 comments7 min readEA link
(blog.sentinel-team.org)

De­cen­tral­iz­ing Model Eval­u­a­tion: Les­sons from AI4Math

SMalagon5 Jun 2025 18:57 UTC
22 points
1 comment4 min readEA link

[Question] Need help with billboard con­tent for AI Safety Bulgaria

Aleksandar Angelov7 Mar 2024 14:36 UTC
4 points
5 comments1 min readEA link

Selec­tion Bias in Ob­ser­va­tional Es­ti­mates of Al­gorith­mic Progress

Parker_Whitfill18 Aug 2025 1:48 UTC
22 points
0 comments1 min readEA link
(arxiv.org)

[Question] How does one find out their AGI timelines?

Yadav7 Nov 2022 22:34 UTC
19 points
4 comments1 min readEA link

[Question] Can we con­vince peo­ple to work on AI safety with­out con­vinc­ing them about AGI hap­pen­ing this cen­tury?

BrianTan26 Nov 2020 14:46 UTC
8 points
3 comments2 min readEA link

AI & wis­dom 1: wis­dom, amor­tised op­ti­mi­sa­tion, and AI

L Rudolf L29 Oct 2024 13:37 UTC
14 points
0 comments15 min readEA link
(rudolf.website)

How to get ChatGPT to re­ally thor­oughly re­search something

Kat Woods 🔶 ⏸️15 Aug 2025 12:54 UTC
10 points
2 comments1 min readEA link

Early-warn­ing Fore­cast­ing Cen­ter: What it is, and why it’d be cool

Linch14 Mar 2022 19:20 UTC
62 points
8 comments11 min readEA link

Elic­it­ing re­sponses to Marc An­dreessen’s “Why AI Will Save the World”

Coleman17 Jul 2023 19:58 UTC
2 points
2 comments1 min readEA link
(a16z.com)

How one log­i­cal fal­lacy kil­led God, cor­rupted Science and now fuels the AI race

Jáchym Fibír29 Jul 2025 13:57 UTC
−1 points
0 comments7 min readEA link
(www.phiand.ai)

An­nounc­ing the Cam­bridge ERA:AI Fel­low­ship 2024

erafellowship11 Mar 2024 19:06 UTC
31 points
5 comments3 min readEA link

The Case For Civil Di­sobe­di­ence For The AI Movement

Murali Thoppil24 Apr 2023 13:07 UTC
16 points
3 comments4 min readEA link
(murali42e.substack.com)

Co­op­er­a­tion and Align­ment in Del­e­ga­tion Games: You Need Both!

Oliver Sourbut3 Aug 2024 10:16 UTC
4 points
1 comment11 min readEA link
(www.oliversourbut.net)

In­cen­tive de­sign and ca­pa­bil­ity elicitation

Joe_Carlsmith12 Nov 2024 20:56 UTC
9 points
0 comments12 min readEA link

Back­ground for “Un­der­stand­ing the diffu­sion of large lan­guage mod­els”

Ben Cottier21 Dec 2022 13:49 UTC
12 points
0 comments23 min readEA link

AI Safety Un­con­fer­ence NeurIPS 2022

Orpheus_Lummis7 Nov 2022 15:39 UTC
13 points
5 comments1 min readEA link
(aisafetyevents.org)

In­tro­duc­ing the Fund for Align­ment Re­search (We’re Hiring!)

AdamGleave6 Jul 2022 2:00 UTC
74 points
3 comments4 min readEA link

How long till Brus­sels?: A light in­ves­ti­ga­tion into the Brus­sels Gap

Yadav26 Dec 2022 7:49 UTC
50 points
2 comments5 min readEA link

SERI ML ap­pli­ca­tion dead­line is ex­tended un­til May 22.

Viktoria Malyasova22 May 2022 0:13 UTC
13 points
3 comments1 min readEA link

US pub­lic per­cep­tion of CAIS state­ment and the risk of extinction

Jamie E22 Jun 2023 16:39 UTC
126 points
4 comments9 min readEA link

Join the AI Test­ing Hackathon this Friday

Esben Kran12 Dec 2022 14:24 UTC
33 points
0 comments8 min readEA link
(alignmentjam.com)

[Question] How to get more aca­demics en­thu­si­as­tic about do­ing AI Safety re­search?

PabloAMC 🔸4 Sep 2021 14:10 UTC
25 points
19 comments1 min readEA link

We’re hiring a Writer to join our team at Our World in Data

Charlie Giattino18 Apr 2024 20:50 UTC
29 points
0 comments1 min readEA link
(ourworldindata.org)

Fun­da­men­tal Risk

Ihor Ivliev26 Jun 2025 0:25 UTC
−5 points
0 comments1 min readEA link

Epoch and FRI Men­tor­ship Pro­gram Sum­mer 2023

merilalama13 Jun 2023 14:27 UTC
38 points
1 comment1 min readEA link
(epochai.org)

How can open-source robotics be al­igned with long-term effec­tive al­tru­ism goals?

Aria James21 Apr 2025 20:50 UTC
5 points
1 comment1 min readEA link

TIO: A men­tal health chatbot

Sanjay12 Oct 2020 20:52 UTC
25 points
6 comments28 min readEA link

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni_A28 Dec 2022 23:45 UTC
10 points
4 comments1 min readEA link

An ex­per­i­ment elic­it­ing rel­a­tive es­ti­mates for Open Philan­thropy’s 2018 AI safety grants

NunoSempere12 Sep 2022 11:19 UTC
111 points
16 comments13 min readEA link

Don’t treat prob­a­bil­ities less than 0.5 as if they’re 0

MichaelDickens26 Feb 2025 5:14 UTC
36 points
5 comments1 min readEA link

China pro­poses new global AI co­op­er­a­tion organisation

Matrice Jacobine30 Jul 2025 2:50 UTC
14 points
1 comment1 min readEA link
(www.reuters.com)

Defend­ing against Ad­ver­sar­ial Poli­cies in Re­in­force­ment Learn­ing with Alter­nat­ing Training

sergeivolodin12 Feb 2022 15:59 UTC
1 point
0 comments13 min readEA link

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony Barrett14 Oct 2022 20:18 UTC
13 points
0 comments2 min readEA link

Three kinds of competitiveness

AI Impacts2 Apr 2020 3:46 UTC
10 points
0 comments5 min readEA link
(aiimpacts.org)

EU AI Act now has a sec­tion on gen­eral pur­pose AI systems

MathiasKB🔸9 Dec 2021 12:40 UTC
64 points
10 comments1 min readEA link

A new place to dis­cuss cog­ni­tive sci­ence, ethics and hu­man alignment

Daniel_Friedrich4 Nov 2022 14:34 UTC
9 points
1 comment2 min readEA link
(www.facebook.com)

AI im­pacts and Paul Chris­ti­ano on take­off speeds

Crosspost2 Mar 2018 11:16 UTC
4 points
0 comments1 min readEA link

XPT fore­casts on (some) Direct Ap­proach model inputs

Forecasting Research Institute20 Aug 2023 12:39 UTC
37 points
0 comments9 min readEA link

[Question] Is there a demo of “You can’t fetch the coffee if you’re dead”?

Ram Rachum10 Nov 2022 11:03 UTC
8 points
3 comments1 min readEA link

What About Deon­tol­ogy? Ethics of So­cial Belong­ing and Con­for­mity in Effec­tive Altruism

Maksens Djabali8 Jan 2025 14:02 UTC
7 points
1 comment4 min readEA link

The Best Ar­gu­ment is not a Sim­ple English Yud Essay

Jonathan Bostock19 Sep 2024 15:29 UTC
76 points
4 comments5 min readEA link
(www.lesswrong.com)

TOMORROW: the largest AI Safety protest ever!

Holly Elmore ⏸️ 🔸20 Oct 2023 18:08 UTC
57 points
0 comments2 min readEA link

In­tro­duc­ing the Prin­ci­ples of In­tel­li­gent Be­havi­our in Biolog­i­cal and So­cial Sys­tems (PIBBSS) Fellowship

adamShimi18 Dec 2021 15:25 UTC
37 points
5 comments10 min readEA link

Get­ting Wash­ing­ton and Sili­con Valley to tame AI (Mustafa Suley­man on the 80,000 Hours Pod­cast)

80000_Hours4 Sep 2023 16:25 UTC
5 points
2 comments10 min readEA link

The great en­ergy de­scent—Part 2: Limits to growth and why we prob­a­bly won’t reach the stars

CB🔸31 Aug 2022 21:51 UTC
22 points
0 comments25 min readEA link

Seek­ing Sur­vey Re­sponses—At­ti­tudes Towards AI risks

anson28 Mar 2022 17:47 UTC
23 points
2 comments1 min readEA link
(forms.gle)

Na­tion­wide Ac­tion Work­shop: Con­tact Congress about AI Safety!

Felix De Simone24 Feb 2025 16:14 UTC
5 points
0 comments1 min readEA link
(www.zeffy.com)

Sum­mary: In­tro­spec­tive Ca­pa­bil­ities in LLMs (Robert Long)

rileyharris2 Jul 2024 18:08 UTC
11 points
1 comment4 min readEA link

What would it take for AI to dis­em­power us? Ryan Green­blatt on take­off dy­nam­ics, rogue de­ploy­ments, and al­ign­ment risks

80000_Hours8 Jul 2025 18:10 UTC
8 points
0 comments33 min readEA link

Which of these ar­gu­ments for x-risk do you think we should test?

Wim9 Aug 2022 13:43 UTC
3 points
2 comments1 min readEA link

What’s so dan­ger­ous about AI any­way? – Or: What it means to be a superintelligence

Thomas Kehrenberg18 Jul 2022 16:14 UTC
9 points
2 comments11 min readEA link

AI, An­i­mals, and Digi­tal Minds Con­fer­ence 2024: Ac­cept­ing ap­pli­ca­tions and speaker proposals

Constance Li6 Apr 2024 8:42 UTC
26 points
0 comments1 min readEA link

Tran­scripts of in­ter­views with AI researchers

Vael Gates9 May 2022 6:03 UTC
140 points
14 comments2 min readEA link

1st Alinha Hacka Re­cap: Reflect­ing on the Brazilian AI Align­ment Hackathon

Thiago USP31 Jan 2024 10:38 UTC
7 points
0 comments2 min readEA link

Prizes for ML Safety Bench­mark Ideas

Joshc28 Oct 2022 2:44 UTC
56 points
8 comments1 min readEA link

Crit­i­cism of the main frame­work in AI alignment

Michele Campolo31 Aug 2022 21:44 UTC
45 points
9 comments7 min readEA link

[Job ad] Re­search im­por­tant longter­mist top­ics at Re­think Pri­ori­ties!

Linch6 Oct 2021 19:09 UTC
65 points
46 comments1 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 3

James Fodor13 Dec 2018 5:13 UTC
3 points
5 comments7 min readEA link

[Question] Idea: Re­pos­i­tory for AI Safety Presentations

Eitan6 Jan 2025 13:04 UTC
14 points
3 comments1 min readEA link

My (naive) take on Risks from Learned Optimization

Artyom K6 Nov 2022 16:25 UTC
5 points
0 comments5 min readEA link

Some for-profit AI al­ign­ment org ideas

Eric Ho14 Dec 2023 15:52 UTC
33 points
1 comment9 min readEA link

[CANCELLED] Ber­lin AI Align­ment Open Meetup Au­gust 2022

Isidor Regenfuß4 Aug 2022 13:34 UTC
0 points
0 comments1 min readEA link

Thou­sands of mal­i­cious ac­tors on the fu­ture of AI misuse

Zershaaneh Qureshi1 Apr 2024 10:03 UTC
75 points
1 comment1 min readEA link

Genes did mis­al­ign­ment first: com­par­ing gra­di­ent hack­ing and meiotic drive

Holly Elmore ⏸️ 🔸18 Apr 2025 5:39 UTC
45 points
9 comments15 min readEA link
(hollyelmore.substack.com)

The UN Has a Rare Shot at Re­duc­ing the Risks of AI in War­fare

Mark Leon Goldberg21 May 2025 21:22 UTC
6 points
0 comments1 min readEA link

Seek­ing col­lab­o­ra­tors to build hard­ware de­vices for biose­cu­rity or AI safety

Steve Trambert1 Jul 2025 14:44 UTC
14 points
3 comments1 min readEA link

[Question] Best ML courses for AI safety?

lamparita2 Aug 2025 21:28 UTC
2 points
0 comments1 min readEA link

Soft­ware en­g­ineer­ing—Ca­reer review

Benjamin Hilton8 Feb 2022 6:11 UTC
93 points
19 comments8 min readEA link
(80000hours.org)

Beyond Sim­ple Ex­is­ten­tial Risk: Sur­vival in a Com­plex In­ter­con­nected World

GideonF21 Nov 2022 14:35 UTC
84 points
67 comments21 min readEA link

Im­pli­ca­tions of large lan­guage model diffu­sion for AI governance

Ben Cottier21 Dec 2022 13:50 UTC
14 points
0 comments38 min readEA link

[Question] Is it eth­i­cal to work in AI “con­tent eval­u­a­tion”?

anon_databoy55530 Jan 2025 13:27 UTC
10 points
3 comments1 min readEA link

The U.S. Na­tional Se­cu­rity State is Here to Make AI Even Less Trans­par­ent and Accountable

Matrice Jacobine24 Nov 2024 9:34 UTC
7 points
0 comments2 min readEA link
(www.eff.org)

Could this be an un­usu­ally good time to Earn To Give?

Tom Gardiner 🔸3 Mar 2025 23:00 UTC
60 points
15 comments3 min readEA link

Train­ing for Good—Up­date & Plans for 2023

Cillian_15 Nov 2022 16:02 UTC
80 points
1 comment10 min readEA link

ML Sum­mer Boot­camp Reflec­tion: Aalto EA Finland

Aayush Kucheria12 Jan 2023 8:24 UTC
15 points
2 comments9 min readEA link

In­tro­duc­tion: Bias in Eval­u­at­ing AGI X-Risks

Remmelt27 Dec 2022 10:27 UTC
4 points
0 comments3 min readEA link

FHI Re­port: How Will Na­tional Se­cu­rity Con­sid­er­a­tions Affect An­titrust De­ci­sions in AI? An Ex­am­i­na­tion of His­tor­i­cal Precedents

Cullen 🔸28 Jul 2020 18:33 UTC
13 points
0 comments1 min readEA link
(www.fhi.ox.ac.uk)

In­sti­tu­tional-themed web­site tem­plate for AIS groups

Kambar29 Apr 2025 21:11 UTC
21 points
0 comments1 min readEA link

Our new video about goal mis­gen­er­al­iza­tion, plus an apology

Writer14 Jan 2025 14:07 UTC
16 points
1 comment7 min readEA link
(youtu.be)

Col­lec­tive Ac­tion for AI Safety (June 4, NYC)

Jordan Braunstein30 May 2025 21:11 UTC
1 point
0 comments1 min readEA link

Well, this was un­ex­pected. Claude com­ing out of nowhere with the meta cog­ni­tion.

Anti-Golem9 Jun 2025 13:59 UTC
−2 points
0 comments1 min readEA link

AI Safety Protest, Melbourne, Aus­tralia

Mark Brown17 Jan 2025 14:55 UTC
2 points
0 comments1 min readEA link

Con­jec­ture: A stand­ing offer for pub­lic de­bates on AI

Andrea_Miotti16 Jun 2023 14:33 UTC
8 points
1 comment2 min readEA link
(www.conjecture.dev)

Co­op­er­a­tion, Avoidance, and In­differ­ence: Alter­nate Fu­tures for Misal­igned AGI

Kiel Brennan-Marquez10 Dec 2022 20:32 UTC
4 points
1 comment18 min readEA link

On AI and Compute

johncrox3 Apr 2019 21:26 UTC
39 points
12 comments8 min readEA link

The Bot­tle­neck in AI Policy Isn’t Ethics—It’s Implementation

Tristan D4 Apr 2025 6:07 UTC
10 points
4 comments1 min readEA link

Are we drop­ping the ball on Recom­men­da­tion AIs?

Raphaël S23 Oct 2024 19:37 UTC
5 points
0 comments6 min readEA link

EA Nether­lands’ An­nual Strat­egy for 2024

James Herbert5 Jun 2024 15:07 UTC
40 points
4 comments6 min readEA link

[Question] What’s a good in­tro to AI Safety?

Amateur Systems Analyst14 Jan 2024 16:54 UTC
1 point
5 comments1 min readEA link

The His­tory, Episte­mol­ogy and Strat­egy of Tech­nolog­i­cal Res­traint, and les­sons for AI (short es­say)

MMMaas10 Aug 2022 11:00 UTC
90 points
6 comments9 min readEA link
(verfassungsblog.de)

Raphaël Millière on the Limits of Deep Learn­ing and AI x-risk skepticism

Michaël Trazzi24 Jun 2022 18:33 UTC
20 points
0 comments4 min readEA link
(theinsideview.ai)

[Question] Com­mon re­but­tal to “paus­ing” or reg­u­lat­ing AI

sammyboiz🔸22 May 2024 4:21 UTC
4 points
2 comments1 min readEA link

Tan Zhi Xuan: AI al­ign­ment, philo­soph­i­cal plu­ral­ism, and the rele­vance of non-Western philosophy

EA Global21 Nov 2020 8:12 UTC
20 points
1 comment1 min readEA link
(www.youtube.com)

Ex-OpenAI re­searcher says OpenAI mass-vi­o­lated copy­right law

Remmelt24 Oct 2024 1:00 UTC
11 points
0 comments1 min readEA link
(suchir.net)

EA is good, actually

Amy Labenz28 Nov 2023 15:59 UTC
272 points
15 comments4 min readEA link

Ber­lin AI Align­ment Open Meetup Septem­ber 2022

Isidor Regenfuß21 Sep 2022 15:09 UTC
2 points
0 comments1 min readEA link

So­ci­aLLM: pro­posal for a lan­guage model de­sign for per­son­al­ised apps, so­cial sci­ence, and AI safety research

Roman Leventov2 Jan 2024 8:11 UTC
4 points
2 comments3 min readEA link

On failing to get EA jobs: My ex­pe­rience and recom­men­da­tions to EA orgs

Avila22 Apr 2024 21:19 UTC
127 points
55 comments5 min readEA link

[Question] Should some peo­ple start work­ing to in­fluence the peo­ple who are most likely to shape the val­ues of the first AGIs, so that they take into ac­count the in­ter­ests of wild and farmed an­i­mals and sen­tient digi­tal minds?

Keyvan Mostafavi31 Aug 2023 12:08 UTC
16 points
1 comment1 min readEA link

Shut­ting down AI Safety Support

JJ Hepburn30 Jul 2023 6:00 UTC
117 points
17 comments1 min readEA link

“Slower tech de­vel­op­ment” can be about or­der­ing, grad­u­al­ness, or dis­tance from now

MichaelA🔸14 Nov 2021 20:58 UTC
47 points
3 comments4 min readEA link

10 of Founders Pledge’s biggest grants

Matt_Lerner9 Jul 2025 21:55 UTC
120 points
1 comment6 min readEA link

[Question] Slow­ing down AI progress?

Eleni_A26 Jul 2022 8:46 UTC
16 points
9 comments1 min readEA link

No “Zero-Shot” Without Ex­po­nen­tial Data: Pre­train­ing Con­cept Fre­quency Deter­mines Mul­ti­modal Model Performance

Noah Varley🔸14 May 2024 23:57 UTC
36 points
2 comments1 min readEA link
(arxiv.org)

What are some good pod­casts about AI safety?

Vishakha Agrawal17 Feb 2025 10:32 UTC
8 points
1 comment1 min readEA link
(aisafety.info)

EA and AI Safety Schism: AGI, the last tech hu­mans will (soon*) build

Phib15 May 2023 2:05 UTC
6 points
6 comments5 min readEA link

Per­sonal agents

Roman Leventov17 Jun 2025 2:05 UTC
3 points
1 comment7 min readEA link

Align­ment for fo­cused chat­bots?

Beckpm8 Jul 2023 15:09 UTC
−1 points
2 comments1 min readEA link

The role of academia in AI Safety.

PabloAMC 🔸28 Mar 2022 0:04 UTC
71 points
19 comments3 min readEA link

Sta­tus quo bias; Sys­tem jus­tifi­ca­tion: Bias in Eval­u­at­ing AGI X-Risks

Remmelt3 Jan 2023 2:50 UTC
4 points
1 comment1 min readEA link

What if we just…didn’t build AGI? An Ar­gu­ment Against Inevitability

Nate Sharpe10 May 2025 3:34 UTC
64 points
21 comments14 min readEA link
(natezsharpe.substack.com)

An­nounc­ing the AIPoli­cyIdeas.com Database

abiolvera23 Jun 2023 16:09 UTC
50 points
3 comments2 min readEA link
(www.aipolicyideas.com)

Del­e­gated agents in prac­tice: How com­pa­nies might end up sel­l­ing AI ser­vices that act on be­half of con­sumers and coal­i­tions, and what this im­plies for safety research

Remmelt26 Nov 2020 16:39 UTC
11 points
0 comments4 min readEA link

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs29 Mar 2023 23:30 UTC
212 points
75 comments3 min readEA link
(time.com)

Longter­mist im­pli­ca­tions of aliens Space-Far­ing Civ­i­liza­tions—Introduction

Maxime Riché 🔸21 Feb 2025 12:07 UTC
44 points
12 comments6 min readEA link

Forg­ing A New AGI So­cial Contract

Deric Cheng10 Apr 2025 13:41 UTC
13 points
3 comments7 min readEA link
(agisocialcontract.substack.com)

In­for­ma­tion in risky tech­nol­ogy races

nemeryxu2 Aug 2022 23:35 UTC
15 points
2 comments3 min readEA link

Le­gal tem­plate for con­di­tional gift deed as an al­ter­na­tive to wa­gers on AI doom

bruce13 Mar 2025 14:57 UTC
30 points
6 comments1 min readEA link

Some­thing to make my­self fas­ci­nated with com­put­ing sci­ence and AI.

Eduardo7 Dec 2022 2:12 UTC
3 points
5 comments1 min readEA link

Univer­sity com­mu­nity build­ing seems like the wrong model for AI safety

George Stiffman26 Feb 2022 6:23 UTC
24 points
8 comments2 min readEA link

How to make the fu­ture bet­ter (other than by re­duc­ing ex­tinc­tion risk)

William_MacAskill15 Aug 2025 15:40 UTC
22 points
0 comments3 min readEA link

New Re­port: Multi-Agent Risks from Ad­vanced AI

Lewis Hammond23 Feb 2025 0:32 UTC
40 points
3 comments2 min readEA link
(www.cooperativeai.com)

Stampy’s AI Safety Info soft launch

StevenKaas5 Oct 2023 22:20 UTC
57 points
2 comments2 min readEA link
(www.lesswrong.com)

An­nounc­ing the 2023 PIBBSS Sum­mer Re­search Fellowship

Dušan D. Nešić (Dushan)12 Jan 2023 21:38 UTC
26 points
2 comments1 min readEA link

Will the Need to Re­train AI Models from Scratch Block a Soft­ware In­tel­li­gence Ex­plo­sion?

Forethought28 Mar 2025 13:43 UTC
12 points
0 comments3 min readEA link
(www.forethought.org)

The repli­ca­tion and em­u­la­tion of GPT-3

Ben Cottier21 Dec 2022 13:49 UTC
14 points
0 comments33 min readEA link

[Link] Cen­ter for the Gover­nance of AI (GovAI) An­nual Re­port 2018

MarkusAnderljung21 Dec 2018 16:17 UTC
24 points
0 comments1 min readEA link

Best prac­tices for risk com­mu­ni­ca­tion from the aca­demic literature

Existential Risk Communication Project12 Aug 2024 18:54 UTC
9 points
3 comments23 min readEA link

Promethean Gover­nance Un­leashed: Pilot­ing Poly­cen­tric, Memetic Orders in the AI Frontier

Paul Fallavollita21 Mar 2025 16:35 UTC
−11 points
1 comment3 min readEA link

In­tro­duc­ing Leap Labs, an AI in­ter­pretabil­ity startup

Jessica Rumbelow6 Mar 2023 17:37 UTC
11 points
0 comments1 min readEA link
(www.lesswrong.com)

Why mechanis­tic in­ter­pretabil­ity does not and can­not con­tribute to long-term AGI safety (from mes­sages with a friend)

Remmelt19 Dec 2022 12:02 UTC
17 points
3 comments31 min readEA link

En­abling more feedback

JJ Hepburn10 Dec 2021 6:52 UTC
41 points
3 comments3 min readEA link

AISN #31: A New AI Policy Bill in Cal­ifor­nia Plus, Prece­dents for AI Gover­nance and The EU AI Office

Center for AI Safety21 Feb 2024 21:55 UTC
27 points
0 comments6 min readEA link
(newsletter.safe.ai)

Ap­ply to be­come a Fu­turekind AI Fa­cil­i­ta­tor or Men­tor (dead­line: April 10)

Jay Luong21 Mar 2025 20:28 UTC
3 points
0 comments1 min readEA link

My ar­gu­ment against AGI

cveres12 Oct 2022 6:32 UTC
2 points
29 comments3 min readEA link

Jesse Clif­ton: Open-source learn­ing — a bar­gain­ing approach

EA Global18 Oct 2019 18:05 UTC
10 points
0 comments1 min readEA link
(www.youtube.com)

Safety timelines: How long will it take to solve al­ign­ment?

Esben Kran19 Sep 2022 12:51 UTC
45 points
9 comments6 min readEA link

The Ri­val AI De­ploy­ment Prob­lem: a Pre-de­ploy­ment Agree­ment as the least-bad response

HaydnBelfield23 Sep 2022 9:28 UTC
44 points
1 comment12 min readEA link

Prologue | A Fire Upon the Deep | Ver­nor Vinge

semicycle17 Feb 2025 4:13 UTC
5 points
1 comment1 min readEA link
(www.baen.com)

Holden Karnofsky In­ter­view about Most Im­por­tant Cen­tury & Trans­for­ma­tive AI

Dwarkesh Patel3 Jan 2023 17:31 UTC
29 points
2 comments1 min readEA link

PIBBSS Fel­low­ship 2025: Boun­ties and Co­op­er­a­tive AI Track Announcement

Dušan D. Nešić (Dushan)9 Jan 2025 14:23 UTC
18 points
0 comments1 min readEA link

Overview of re­cent in­ter­na­tional demon­stra­tions against AI (AI Protest Ac­tions #1)

Rachel Shu17 Jul 2025 20:22 UTC
17 points
2 comments5 min readEA link

[Question] Who should we in­ter­view for The 80,000 Hours Pod­cast?

Luisa_Rodriguez13 Sep 2023 12:23 UTC
87 points
136 comments2 min readEA link

AI Safety Strat­egy—A new or­ga­ni­za­tion for bet­ter timelines

Prometheus14 Jun 2023 20:41 UTC
8 points
0 comments2 min readEA link

Former Is­raeli Prime Minister Speaks About AI X-Risk

Yonatan Cale20 May 2023 12:09 UTC
73 points
6 comments1 min readEA link

[Cross­post] An AI Pause Is Hu­man­ity’s Best Bet For Prevent­ing Ex­tinc­tion (TIME)

Otto24 Jul 2023 10:18 UTC
36 points
3 comments7 min readEA link
(time.com)

Stripe Eco­nomics of AI Fellowship

basil.halperin28 Mar 2025 15:24 UTC
54 points
0 comments2 min readEA link
(stripe.events)

Strate­gic Direc­tions for a Digi­tal Con­scious­ness Model

Derek Shiller10 Dec 2024 19:33 UTC
41 points
1 comment12 min readEA link

[Question] UK elec­tion and AI safety, who to vote for?

Clay Cube1 Jun 2024 10:16 UTC
25 points
3 comments1 min readEA link

Align­ment is hard. Com­mu­ni­cat­ing that, might be harder

Eleni_A1 Sep 2022 11:45 UTC
17 points
1 comment3 min readEA link

Stop talk­ing about p(doom)

Isaac King1 Jan 2024 10:57 UTC
115 points
12 comments3 min readEA link

As­ter­isk Mag 10: Origins

Clara Collier7 Jul 2025 18:03 UTC
8 points
0 comments2 min readEA link
(asteriskmag.com)

ML4G Ger­many—AI Align­ment Camp

Evander H. 🔸27 Jun 2023 15:33 UTC
6 points
0 comments1 min readEA link

The Dilemma of Ul­ti­mate Technology

Aino20 Jul 2023 12:24 UTC
1 point
0 comments7 min readEA link

Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

richard_ngo22 Nov 2018 13:43 UTC
28 points
2 comments12 min readEA link

A mod­est case for hope

xavier rg17 Oct 2022 6:03 UTC
28 points
0 comments1 min readEA link

Rea­sons for my nega­tive feel­ings to­wards the AI risk discussion

fergusq1 Sep 2022 7:33 UTC
43 points
9 comments4 min readEA link

[Question] How much should you op­ti­mize for the short-timelines sce­nario?

SoerenMind26 Jul 2022 15:51 UTC
39 points
2 comments1 min readEA link

Ca­reer un­cer­tainty: Medicine vs. AI

Markus Köth30 Apr 2023 8:41 UTC
20 points
9 comments1 min readEA link

Ad­vice to ju­nior AI gov­er­nance researchers

Akash8 Jul 2024 19:19 UTC
38 points
3 comments5 min readEA link

A pro­gres­sive AI, not a threat­en­ing one

Violette 12 Dec 2023 17:19 UTC
−17 points
0 comments4 min readEA link

Does gen­er­al­ity pay? GPT-3 can provide pre­limi­nary ev­i­dence.

Eevee🔹12 Jul 2020 18:53 UTC
21 points
4 comments2 min readEA link

[Question] Why does AGI oc­cur al­most nowhere, not even just as a re­mark for eco­nomic/​poli­ti­cal mod­els?

Franziska Fischer2 Oct 2022 14:43 UTC
52 points
15 comments1 min readEA link

Up­com­ing speaker se­ries on emerg­ing tech, na­tional se­cu­rity & US policy careers

ES10 Jul 2024 19:59 UTC
16 points
1 comment1 min readEA link

Re­sults from a sur­vey on tool use and work­flows in al­ign­ment research

jacquesthibs19 Dec 2022 15:19 UTC
30 points
0 comments19 min readEA link

Epis­tle to the Successor

ukc1001429 Apr 2025 9:30 UTC
4 points
0 comments19 min readEA link

Align­ment Grant­mak­ing is Fund­ing-Limited Right Now [cross­post]

johnswentworth2 Aug 2023 20:37 UTC
82 points
13 comments1 min readEA link
(www.lesswrong.com)

On­line in­fosec talk: What even is zero trust?

Jarrah8 Jun 2024 23:54 UTC
11 points
0 comments1 min readEA link

Reflec­tions from Ooty re­treat 2.0

Aditya Arpitha Prasad24 Jul 2025 18:22 UTC
4 points
0 comments1 min readEA link
(www.lesswrong.com)

Open As­teroid Im­pact an­nounces lead­er­ship transition

Patrick Hoang1 Apr 2024 12:51 UTC
18 points
0 comments1 min readEA link

Ap­pli­ca­tions open for AGI Safety Fun­da­men­tals: Align­ment Course

Jamie B13 Dec 2022 10:50 UTC
75 points
0 comments2 min readEA link

In­tro­duc­ing METR’s Au­ton­omy Eval­u­a­tion Resources

Megan Kinniment15 Mar 2024 23:19 UTC
28 points
0 comments1 min readEA link
(metr.github.io)

Lev­er­age points for a pause

Remmelt28 Aug 2024 9:21 UTC
6 points
0 comments1 min readEA link

Some AI safety pro­ject & re­search ideas/​ques­tions for short and long timelines

Lloy2 🔹8 Aug 2025 21:08 UTC
9 points
0 comments5 min readEA link

[Question] AI labs’ re­quests for input

Zach Stein-Perlman19 Aug 2023 17:00 UTC
7 points
0 comments1 min readEA link

Ex­ec­u­tive Direc­tor for AIS France—Ex­pres­sion of interest

gergo19 Dec 2024 8:11 UTC
33 points
0 comments4 min readEA link

De­mon­strat­ing speci­fi­ca­tion gam­ing in rea­son­ing models

Matrice Jacobine20 Feb 2025 19:26 UTC
10 points
0 comments1 min readEA link
(arxiv.org)

A note about differ­en­tial tech­nolog­i­cal development

So8res24 Jul 2022 23:41 UTC
58 points
8 comments6 min readEA link

Talk to me about your sum­mer/​ca­reer plans

Akash31 Jan 2023 18:29 UTC
31 points
0 comments2 min readEA link

Should AI fo­cus on prob­lem-solv­ing or strate­gic plan­ning? Why not both?

oliver_siegel1 Nov 2022 9:53 UTC
1 point
0 comments1 min readEA link

AI Discrim­i­na­tion Re­quire­ments: A Reg­u­la­tory Review

Deric Cheng4 Apr 2024 15:44 UTC
8 points
1 comment6 min readEA link

Pro­ject pro­posal: Sce­nario anal­y­sis group for AI safety strategy

Buhl18 Dec 2023 18:31 UTC
35 points
0 comments5 min readEA link
(rethinkpriorities.org)

AI can ex­ploit safety plans posted on the Internet

Peter S. Park4 Dec 2022 12:17 UTC
5 points
3 comments1 min readEA link

Effi­cacy of AI Ac­tivism: Have We Ever Said No?

Charlie Harrison27 Oct 2023 16:52 UTC
80 points
25 comments20 min readEA link

7 Learn­ings and a De­tailed De­scrip­tion of an AI Safety Read­ing Group

nell23 Sep 2022 2:02 UTC
21 points
5 comments9 min readEA link

Us­ing AI to match peo­ple to jobs?

Forumite30 May 2024 21:19 UTC
5 points
0 comments1 min readEA link

AI Safety In­cu­ba­tion Pro­gram—Ap­pli­ca­tions Open

Catalyze Impact16 Aug 2024 15:37 UTC
11 points
0 comments2 min readEA link

Law-Fol­low­ing AI 1: Se­quence In­tro­duc­tion and Structure

Cullen 🔸27 Apr 2022 17:16 UTC
35 points
2 comments9 min readEA link

The Welfare of Digi­tal Minds: A Re­search Agenda

Derek Shiller11 Nov 2024 12:58 UTC
53 points
1 comment31 min readEA link

Ly­ing is Cowardice, not Strategy

Connor Leahy25 Oct 2023 5:59 UTC
−5 points
15 comments5 min readEA link
(cognition.cafe)

How Tech­ni­cal AI Safety Re­searchers Can Help Im­ple­ment Pu­ni­tive Da­m­ages to Miti­gate Catas­trophic AI Risk

Gabriel Weil19 Feb 2024 17:43 UTC
28 points
2 comments4 min readEA link

You Un­der­stand AI Align­ment and How to Make Soup

Leen Armoush28 May 2022 6:22 UTC
0 points
2 comments5 min readEA link

MATS is hiring!

Ryan Kidd8 Apr 2025 20:45 UTC
14 points
2 comments6 min readEA link

AI Con­sti­tu­tions are a tool to re­duce so­cietal scale risk

SammyDMartin26 Jul 2024 10:50 UTC
11 points
0 comments1 min readEA link
(www.lesswrong.com)

Open Philan­thropy Tech­ni­cal AI Safety RFP - $40M Available Across 21 Re­search Areas

Jake Mendel6 Feb 2025 18:59 UTC
95 points
3 comments1 min readEA link
(www.openphilanthropy.org)

Let’s Talk About Emergence

Jacob-Haimes7 Jun 2024 19:34 UTC
8 points
1 comment7 min readEA link
(www.odysseaninstitute.org)

Ap­ply to a small iter­a­tion of MLAB to be run in Oxford

Rio P29 Aug 2023 19:39 UTC
11 points
0 comments1 min readEA link

Cut­ting AI Safety down to size

Holly Elmore ⏸️ 🔸9 Nov 2024 23:40 UTC
87 points
5 comments5 min readEA link

A list of good heuris­tics that the case for AI X-risk fails

Aaron Gertler 🔸16 Jul 2020 9:56 UTC
25 points
9 comments2 min readEA link
(www.alignmentforum.org)

[An­nounce­ment] The Steven Aiberg Project

StevenAiberg19 Oct 2022 7:48 UTC
0 points
0 comments4 min readEA link

AE Stu­dio is hiring!

AE Studio21 Apr 2025 20:35 UTC
16 points
0 comments2 min readEA link

Hu­mans aren’t fit­ness maximizers

So8res4 Oct 2022 1:32 UTC
30 points
2 comments5 min readEA link

Tak­ing Into Ac­count Sen­tient Non-Hu­mans in AI Am­bi­tious Value Learn­ing: Sen­tien­tist Co­her­ent Ex­trap­o­lated Volition

Adrià Moret1 Dec 2023 18:01 UTC
43 points
2 comments42 min readEA link

IV. Par­allels and Review

Maynk0227 Feb 2024 23:10 UTC
7 points
1 comment8 min readEA link
(open.substack.com)

AI, An­i­mals, & Digi­tal Minds 2025: Retrospective

Alistair Stewart12 Jul 2025 2:28 UTC
60 points
3 comments11 min readEA link

The Slip­pery Slope from DALLE-2 to Deep­fake Anarchy

stecas5 Nov 2022 14:47 UTC
55 points
11 comments17 min readEA link

Should we ex­pect the fu­ture to be good?

Neil Crawford30 Apr 2025 0:45 UTC
38 points
1 comment14 min readEA link

Com­pute Gover­nance and Con­clu­sions—Trans­for­ma­tive AI and Com­pute [3/​4]

lennart14 Oct 2021 7:55 UTC
20 points
3 comments5 min readEA link

What is “wire­head­ing”?

Vishakha Agrawal17 Dec 2024 17:59 UTC
1 point
0 comments1 min readEA link
(aisafety.info)

The King and the Golem—The Animation

Writer8 Nov 2024 18:23 UTC
50 points
1 comment1 min readEA link

Scor­ing fore­casts from the 2016 “Ex­pert Sur­vey on Progress in AI”

PatrickL1 Mar 2023 14:39 UTC
204 points
21 comments9 min readEA link

His­tory’s Gran­d­est Pro­jects: In­tro­duc­tion to Macro Strate­gies for AI Risk, Part 1

Coleman20 Jun 2025 17:32 UTC
7 points
0 comments38 min readEA link

The NAIRR Ini­ti­a­tive: Assess­ing its Po­ten­tial for De­moc­ra­tiz­ing AI

Jose Gelves29 Aug 2024 12:30 UTC
22 points
1 comment11 min readEA link

[Question] How bad would AI progress need to be for us to think gen­eral tech­nolog­i­cal progress is also bad?

Jim Buhler6 Jul 2024 18:44 UTC
10 points
0 comments1 min readEA link

US pub­lic opinion on AI, Septem­ber 2023

Zach Stein-Perlman18 Sep 2023 18:00 UTC
29 points
0 comments1 min readEA link
(blog.aiimpacts.org)

How could a mora­to­rium fail?

Davidmanheim22 Sep 2023 15:11 UTC
49 points
4 comments9 min readEA link

La­bor Par­ti­ci­pa­tion is a High-Pri­or­ity AI Align­ment Risk

alx12 Aug 2024 18:48 UTC
17 points
3 comments16 min readEA link

Con­sider not donat­ing un­der $100 to poli­ti­cal candidates

DanielFilan11 May 2025 3:22 UTC
83 points
11 comments1 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [April 2023]

StevenKaas8 Apr 2023 4:21 UTC
111 points
173 comments2 min readEA link

[Question] Schol­ar­ships for Un­der­grads who want to have high-im­pact ca­reers?

darthflower6 Jul 2025 17:31 UTC
4 points
0 comments1 min readEA link

Ap­ply to the new Open Philan­thropy Tech­nol­ogy Policy Fel­low­ship!

lukeprog20 Jul 2021 18:41 UTC
78 points
6 comments4 min readEA link

[Question] How should technical AI researchers best transition into AI governance and policy?

GabeM10 Sep 2023 5:29 UTC
12 points
5 comments1 min readEA link

Quantifying the Far Future Effects of Interventions

MichaelDickens18 May 2016 2:15 UTC
9 points
0 comments11 min readEA link

Why I expect successful (narrow) alignment

Tobias_Baumann29 Dec 2018 15:46 UTC
18 points
10 comments1 min readEA link
(s-risks.org)

Apply to the Cooperative AI PhD Fellowship by October 14th!

Lewis Hammond5 Oct 2024 12:41 UTC
35 points
0 comments1 min readEA link

When “yang” goes wrong

Joe_Carlsmith8 Jan 2024 16:35 UTC
57 points
1 comment13 min readEA link

ML4G is launching its first ever Governance bootcamp!

carolinaollive16 May 2025 15:22 UTC
25 points
0 comments1 min readEA link

On Scaling Academia

kirchner.jan20 Sep 2021 14:54 UTC
18 points
3 comments13 min readEA link
(universalprior.substack.com)

A Benchmark for Measuring Honesty in AI Systems

Mantas Mazeika4 Mar 2025 17:44 UTC
29 points
0 comments2 min readEA link
(www.mask-benchmark.ai)

[Question] Which Graduate Programs Will Best Set Me Up for a Career in AI Safety?

Jason Zeng19 Feb 2025 13:22 UTC
4 points
0 comments1 min readEA link

Data collection for AI alignment—Career review

Benjamin Hilton3 Jun 2022 11:44 UTC
34 points
1 comment5 min readEA link
(80000hours.org)

My P(doom) is 2.76%. Here’s Why.

Liam Robins12 Jun 2025 22:29 UTC
55 points
11 comments20 min readEA link
(thelimestack.substack.com)

What AI Safety Materials Do ML Researchers Find Compelling?

Vael Gates28 Dec 2022 2:03 UTC
130 points
12 comments2 min readEA link

DeepMind: Model evaluation for extreme risks

Zach Stein-Perlman25 May 2023 3:00 UTC
49 points
3 comments1 min readEA link
(arxiv.org)

Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI

Steven Byrnes8 May 2025 21:11 UTC
8 points
1 comment18 min readEA link

The Control Problem: Unsolved or Unsolvable?

Remmelt2 Jun 2023 15:42 UTC
4 points
9 comments13 min readEA link

[Creative Writing Contest] The Puppy Problem

Louis13 Oct 2021 14:01 UTC
13 points
0 comments7 min readEA link

Forecasting Through Fiction

Yitz6 Jul 2022 5:23 UTC
8 points
3 comments8 min readEA link
(www.lesswrong.com)

AGI and Lock-In

Lukas Finnveden29 Oct 2022 1:56 UTC
154 points
20 comments10 min readEA link
(www.forethought.org)

Announcement: there are now monthly coordination calls for AIS fieldbuilders in Europe

gergo22 Nov 2024 10:30 UTC
32 points
0 comments1 min readEA link

AGI Isn’t Close—Future Fund Worldview Prize

Toni MUENDEL18 Dec 2022 16:03 UTC
−8 points
24 comments13 min readEA link

[Opzionale] Approfondimenti sui rischi dell’IA (materiali in inglese)

EA Italy18 Jan 2023 11:16 UTC
1 point
0 comments2 min readEA link

How The EthiSizer Almost Broke `Story’

Velikovsky_of_Newcastle8 May 2023 16:58 UTC
1 point
0 comments5 min readEA link

AI is advancing fast

Vishakha Agrawal23 Apr 2025 11:04 UTC
2 points
2 comments2 min readEA link
(aisafety.info)

#185 – The 7 most promising ways to end factory farming, and whether AI is going to be good or bad for animals (Lewis Bollard on the 80,000 Hours Podcast)

80000_Hours30 Apr 2024 17:20 UTC
63 points
0 comments15 min readEA link

AISN #33: Reassessing AI and Biorisk Plus, Consolidation in the Corporate AI Landscape, and National Investments in AI

Center for AI Safety12 Apr 2024 16:11 UTC
19 points
0 comments9 min readEA link
(newsletter.safe.ai)

An audio version of the alignment problem from a deep learning perspective by Richard Ngo Et Al

Miguel3 Feb 2023 19:32 UTC
18 points
0 comments1 min readEA link
(www.whitehatstoic.com)

Lab Collaboration on AI Safety Best Practices

amta17 Mar 2024 12:20 UTC
3 points
0 comments20 min readEA link

Please Donate to CAIP (Post 1 of 7 on AI Governance)

Jason Green-Lowe7 May 2025 18:15 UTC
136 points
25 comments33 min readEA link

6 Year Decrease of Metaculus AGI Prediction

Chris Leong12 Apr 2022 5:36 UTC
40 points
6 comments1 min readEA link

Making of #IAN

kirchner.jan29 Aug 2021 16:24 UTC
9 points
0 comments1 min readEA link
(universalprior.substack.com)

Summary of the AI Bill of Rights and Policy Implications

Tristan Williams20 Jun 2023 9:28 UTC
16 points
0 comments22 min readEA link

SERI ML Alignment Theory Scholars Program 2022

Ryan Kidd27 Apr 2022 16:33 UTC
57 points
2 comments3 min readEA link

[Link post] AI could fuel factory farming—or end it

BrianK18 Oct 2022 11:16 UTC
39 points
0 comments1 min readEA link
(www.fastcompany.com)

Paul Christiano: Current work in AI alignment

EA Global3 Apr 2020 7:06 UTC
80 points
4 comments24 min readEA link
(www.youtube.com)

Nick Bostrom’s new book, “Deep Utopia”, is out today

peterhartree27 Mar 2024 11:23 UTC
105 points
6 comments1 min readEA link
(nickbostrom.com)

Intellectual Diversity in AI Safety

KR22 Jul 2020 19:07 UTC
21 points
8 comments3 min readEA link

Case study: Safety standards on California utilities to prevent wildfires

Coby Joseph6 Dec 2023 10:32 UTC
7 points
1 comment26 min readEA link

2024 State of AI Regulatory Landscape

Deric Cheng28 May 2024 12:00 UTC
12 points
1 comment2 min readEA link
(www.convergenceanalysis.org)

Joe Rogan on AI Safety

Michael_2358 🔸25 Jun 2025 1:14 UTC
18 points
0 comments1 min readEA link

I am a Memoryless System

Nicholas Kross23 Oct 2022 17:36 UTC
4 points
0 comments9 min readEA link
(www.thinkingmuchbetter.com)

Exploring Tacit Linked Premises with GPT

RomeoStevens24 Mar 2023 22:50 UTC
5 points
0 comments3 min readEA link

Performance of Large Language Models (LLMs) in Complex Analysis: A Benchmark of Mathematical Competence and its Role in Decision Making.

Jaime Esteban Montenegro Barón6 May 2025 21:08 UTC
1 point
0 comments23 min readEA link

[US time] Infosec: What even is zero trust?

Jarrah21 Jun 2024 18:11 UTC
2 points
0 comments1 min readEA link

Clarifying and predicting AGI

richard_ngo4 May 2023 15:56 UTC
69 points
2 comments4 min readEA link

OpenAI’s Preparedness Framework: Praise & Recommendations

Akash2 Jan 2024 16:20 UTC
16 points
1 comment7 min readEA link

Policy ideas for mitigating AI risk

Thomas Larsen16 Sep 2023 10:31 UTC
121 points
16 comments10 min readEA link

[Question] Open-source AI safety projects?

defun 🔸29 Jan 2024 10:09 UTC
8 points
2 comments1 min readEA link

Lead, Own, Share: Sovereign Wealth Funds for Transformative AI

Matrice Jacobine14 Jul 2025 9:34 UTC
24 points
0 comments1 min readEA link
(www.convergenceanalysis.org)

Scaling Wargaming for Global Catastrophic Risks with AI

rai18 Jan 2025 15:07 UTC
73 points
1 comment4 min readEA link
(blog.sentinel-team.org)

Catastrophic Risks from AI #2: Malicious Use

Dan H22 Jun 2023 17:10 UTC
19 points
0 comments17 min readEA link
(arxiv.org)

AI Benchmarks Series — Metaculus Questions on Evaluations of AI Models Against Technical Benchmarks

christian27 Mar 2024 23:05 UTC
10 points
0 comments1 min readEA link
(www.metaculus.com)

Against Agents as an Approach to Aligned Transformative AI

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 0:47 UTC
4 points
0 comments2 min readEA link

We are already in a persuasion-transformed world and must take precautions

trevor14 Nov 2023 15:53 UTC
1 point
0 comments6 min readEA link

The trajectory of the future could soon get set in stone

William_MacAskill11 Aug 2025 11:04 UTC
37 points
1 comment3 min readEA link

Conscious AI: Will we know it when we see it? [Conscious AI & Public Perception]

ixex4 Jul 2024 20:30 UTC
13 points
1 comment12 min readEA link

Announcing Cavendish Labs

dyusha19 Jan 2023 20:00 UTC
112 points
6 comments2 min readEA link

Why I think strong general AI is coming soon

porby28 Sep 2022 6:55 UTC
14 points
1 comment34 min readEA link

AI Risk is like Terminator; Stop Saying it’s Not

skluug8 Mar 2022 19:17 UTC
191 points
43 comments10 min readEA link
(skluug.substack.com)

[Question] Academic AI Safety/Alignment Reading List

Zak_H21 Nov 2023 14:19 UTC
6 points
1 comment1 min readEA link

“We can Prevent AI Disaster Like We Prevented Nuclear Catastrophe”

Peter23 Sep 2023 20:36 UTC
15 points
1 comment1 min readEA link
(time.com)

What is malevolence? On the nature, measurement, and distribution of dark traits

David_Althaus23 Oct 2024 8:41 UTC
107 points
6 comments52 min readEA link

The possibility of an indefinite AI pause

Matthew_Barnett19 Sep 2023 12:28 UTC
90 points
73 comments15 min readEA link

Don’t leave your fingerprints on the future

So8res8 Oct 2022 0:35 UTC
95 points
4 comments4 min readEA link

[Question] Would people on this site be interested in hearing about efforts to make an “ethics calculator” for an AGI?

Sean Sweeney5 Mar 2024 9:28 UTC
1 point
0 comments1 min readEA link

(Linkpost) METR: Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

Yadav11 Jul 2025 8:58 UTC
37 points
2 comments2 min readEA link
(metr.org)

AI Safety Bounties

PatrickL24 Aug 2023 14:30 UTC
37 points
2 comments7 min readEA link
(rethinkpriorities.org)

#204 – Making sense of SBF, and his biggest critiques of effective altruism (Nate Silver on The 80,000 Hours Podcast)

80000_Hours17 Oct 2024 20:41 UTC
22 points
2 comments14 min readEA link

AI is not taking over material science (for now): an analysis and conference report

titotal11 Mar 2025 12:01 UTC
59 points
16 comments25 min readEA link
(open.substack.com)

“AI” is an indexical

TW1233 Jan 2023 22:00 UTC
23 points
2 comments6 min readEA link
(aiwatchtower.substack.com)

Apply now to Human-aligned AI Summer School 2025

Pivocajs6 Jun 2025 19:34 UTC
8 points
1 comment2 min readEA link

Election by Jury: A Neglected Target for Effective Altruism

ClayShentrup27 Jan 2025 7:27 UTC
16 points
10 comments6 min readEA link

I read every major AI lab’s safety plan so you don’t have to

sarahhw16 Dec 2024 14:12 UTC
68 points
2 comments11 min readEA link
(longerramblings.substack.com)

LessWrong is now a book, available for pre-order!

terraform4 Dec 2020 20:42 UTC
48 points
1 comment7 min readEA link

AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs Plus, Forecasting the Future with LLMs, and Regulatory Markets

Center for AI Safety7 Mar 2024 16:37 UTC
15 points
2 comments8 min readEA link
(newsletter.safe.ai)

State of EA Poland and funding opportunity

Chris Szulc7 Dec 2024 8:48 UTC
72 points
4 comments11 min readEA link

New Working Paper Series of the Legal Priorities Project

Legal Priorities Project18 Oct 2021 10:30 UTC
60 points
0 comments9 min readEA link

Diagram with Commentary for AGI as an X-Risk

Jared Leibowich24 May 2023 22:27 UTC
21 points
4 comments8 min readEA link

An Update On The Campaign For AI Safety Dot Org

yanni kyriacos5 May 2023 0:19 UTC
26 points
4 comments1 min readEA link

Why I’m Not (Yet) A Full-Time Technical Alignment Researcher

Nicholas Kross25 May 2023 1:26 UTC
11 points
1 comment4 min readEA link
(www.thinkingmuchbetter.com)

Differential technology development: preprint on the concept

Hamish_Hobbs12 Sep 2022 13:52 UTC
65 points
0 comments2 min readEA link

The first AI Safety Camp & onwards

Remmelt7 Jun 2018 18:49 UTC
25 points
2 comments8 min readEA link

EA’s brain-over-body bias, and the embodied value problem in AI alignment

Geoffrey Miller21 Sep 2022 18:55 UTC
45 points
3 comments25 min readEA link

Disagreement with bio anchors that lead to shorter timelines

mariushobbhahn16 Nov 2022 14:40 UTC
85 points
1 comment7 min readEA link

Help me to understand AI alignment!

britomart18 Jan 2023 9:13 UTC
3 points
12 comments1 min readEA link

Metaculus Presents — View From the Enterprise Suite: How Applied AI Governance Works Today

christian20 Jun 2023 22:24 UTC
4 points
0 comments1 min readEA link

4 Key Assumptions in AI Safety

Prometheus7 Nov 2022 10:50 UTC
5 points
0 comments7 min readEA link

[Question] Please Share Your Perspectives on the Degree of Societal Impact from Transformative AI Outcomes

Kiliank15 Apr 2022 1:23 UTC
3 points
3 comments1 min readEA link

Worldview iPeople—Future Fund’s AI Worldview Prize

Toni MUENDEL28 Oct 2022 7:37 UTC
0 points
5 comments9 min readEA link

Japan AI Alignment Conference

ChrisScammell10 Mar 2023 9:23 UTC
17 points
2 comments1 min readEA link
(www.conjecture.dev)

Where does Responsible Capabilities Scaling take AI governance?

ZacRichardson9 Jun 2024 22:25 UTC
17 points
1 comment16 min readEA link

Interviews with 97 AI Researchers: Quantitative Analysis

Maheen Shermohammed2 Feb 2023 4:50 UTC
76 points
4 comments7 min readEA link

We’re Not Advertising Enough (Post 3 of 7 on AI Governance)

Jason Green-Lowe22 May 2025 17:11 UTC
47 points
3 comments28 min readEA link

AI data gaps could lead to ongoing Animal Suffering

Darkness8i817 Oct 2024 10:52 UTC
13 points
3 comments5 min readEA link

AI Safety Collab 2025 Summer—Local Organizer Sign-ups Open

Evander H. 🔸25 Jun 2025 14:41 UTC
12 points
0 comments1 min readEA link

[Question] Am I taking crazy pills? Why aren’t EAs advocating for a pause on AI capabilities?

yanni kyriacos15 Aug 2023 23:29 UTC
18 points
21 comments1 min readEA link

Fictional Catastrophes, Reel Lessons: What 12 Critically Acclaimed Films Reveal About Surviving Global Catastrophes

Matt Boyd14 May 2025 19:07 UTC
6 points
1 comment1 min readEA link
(adaptresearchwriting.com)

“No-one in my org puts money in their pension”

tobyj16 Feb 2024 15:04 UTC
157 points
11 comments9 min readEA link
(seekingtobejolly.substack.com)

Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025)

Linda Linsefors23 Aug 2024 14:18 UTC
30 points
2 comments4 min readEA link

Why EAs are skeptical about AI Safety

Lukas Trötzmüller🔸18 Jul 2022 19:01 UTC
293 points
31 comments29 min readEA link

Superintelligent AI is necessary for an amazing future, but far from sufficient

So8res31 Oct 2022 21:16 UTC
35 points
5 comments34 min readEA link

Relationship between EA Community and AI safety

Tom Barnes🔸18 Sep 2023 13:49 UTC
157 points
15 comments1 min readEA link

#195 – Who’s trying to steal frontier AI models, and what they could do with them (Sella Nevo on the 80,000 Hours Podcast)

80000_Hours9 Aug 2024 14:45 UTC
29 points
0 comments11 min readEA link

Constructive Discussion and Thinking Methodology for Severe Situations including Existential Risks

Aino8 Jul 2023 0:04 UTC
1 point
0 comments7 min readEA link

Why I think it’s important to work on AI forecasting

Matthew_Barnett27 Feb 2023 21:24 UTC
179 points
10 comments10 min readEA link

Investigating Self-Preservation in LLMs: Experimental Observations

Makham27 Feb 2025 16:58 UTC
9 points
3 comments34 min readEA link

Impact of unemployment generated by Artificial Intelligence on Gross Domestic Product

Valentina García Mesa1 May 2025 20:52 UTC
5 points
0 comments28 min readEA link

Focus on the places where you feel shocked everyone’s dropping the ball

So8res2 Feb 2023 0:27 UTC
92 points
6 comments4 min readEA link

[Question] How long does it take to understand AI X-Risk from scratch so that I have a confident, clear mental model of it from first principles?

Jordan Arel27 Jul 2022 16:58 UTC
29 points
6 comments1 min readEA link

Student project for engaging with AI alignment

Per Ivar Friborg9 May 2022 10:44 UTC
35 points
1 comment1 min readEA link

A Timing Problem for Instrumental Convergence

Rhyss31 Jul 2025 9:39 UTC
18 points
0 comments1 min readEA link
(link.springer.com)

“How to Escape from the Simulation”—Seeds of Science call for reviewers

rogersbacon126 Jan 2023 15:12 UTC
7 points
0 comments1 min readEA link

Against Anonymous Hit Pieces

Anti-Omega18 Jun 2023 19:36 UTC
−25 points
3 comments1 min readEA link

Reflections on the PIBBSS Fellowship 2022

nora11 Dec 2022 22:03 UTC
69 points
4 comments18 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo7 Aug 2023 6:09 UTC
32 points
2 comments2 min readEA link
(www.campaignforaisafety.org)

Important, actionable research questions for the most important century

Holden Karnofsky24 Feb 2022 16:34 UTC
299 points
13 comments19 min readEA link

[Question] Should the EA community have a DL engineering fellowship?

PabloAMC 🔸24 Dec 2021 13:43 UTC
26 points
6 comments1 min readEA link

How Do AI Timelines Affect Existential Risk?

Stephen McAleese29 Aug 2022 17:10 UTC
2 points
0 comments23 min readEA link
(www.lesswrong.com)

A tentative dialogue with a Friendly-boxed-super-AGI on brain uploads

Ramiro12 May 2022 21:55 UTC
5 points
0 comments4 min readEA link

Publication decisions for large language models, and their impacts

Ben Cottier21 Dec 2022 13:50 UTC
14 points
0 comments16 min readEA link

How do AI agents work together when they can’t trust each other?

James-Sullivan6 Jun 2025 3:24 UTC
4 points
1 comment8 min readEA link
(open.substack.com)

5th IEEE International Conference on Artificial Intelligence Testing (AITEST 2023)

surabhi gupta12 Mar 2023 9:06 UTC
−5 points
0 comments1 min readEA link

How did you update on AI Safety in 2023?

Chris Leong23 Jan 2024 2:21 UTC
30 points
5 comments1 min readEA link

Systemic Cascading Risks: Relevance in Longtermism & Value Lock-In

Richard R2 Sep 2022 7:53 UTC
59 points
10 comments16 min readEA link

What are the differences between AGI, transformative AI, and superintelligence?

Vishakha Agrawal23 Jan 2025 10:11 UTC
12 points
0 comments3 min readEA link
(aisafety.info)

[Question] Ethical Considerations in regard to Outsourcing Labour Needs to the Global South

Nicole Mutung'a4 Oct 2023 9:18 UTC
13 points
5 comments1 min readEA link

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb Podar23 Oct 2024 20:43 UTC
5 points
0 comments1 min readEA link
(www.lesswrong.com)

Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost]

Akash1 Nov 2023 13:28 UTC
31 points
0 comments1 min readEA link
(www.ft.com)

Announcing AI safety Mentors and Mentees

mariushobbhahn23 Nov 2022 15:21 UTC
62 points
1 comment10 min readEA link

When Will We Spend Enough to Train Transformative AI

sn28 Mar 2023 0:41 UTC
3 points
0 comments9 min readEA link

Open call: AI Act Standard for Dev. Phase Risk Assessment

miller-max8 Dec 2023 19:57 UTC
5 points
1 comment1 min readEA link

$1,000 bounty for an AI Programme Lead recommendation

Cillian_14 Aug 2023 13:11 UTC
11 points
1 comment2 min readEA link

A pseudo mathematical formulation of direct work choice between two x-risks

Joseph Bloom11 Aug 2022 0:28 UTC
7 points
0 comments4 min readEA link

Survey—Psychological Impact of Long-Term AI Engagement

Manuela García17 Sep 2024 15:58 UTC
2 points
0 comments1 min readEA link

Epoch is hiring an ML Hardware Researcher

merilalama20 Jul 2023 19:08 UTC
29 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

Concern About the Intelligence Divide Due to AI

Soe Lin21 Aug 2024 9:53 UTC
17 points
1 comment2 min readEA link

Summary of Stuart Russell’s new book, “Human Compatible”

Rohin Shah19 Oct 2019 19:56 UTC
33 points
1 comment15 min readEA link
(www.alignmentforum.org)

[Question] Any Philosophy PhD recommendations for students interested in Alignment Efforts?

rickyhuang.hexuan18 Jan 2023 5:54 UTC
7 points
6 comments1 min readEA link

AI policy careers in the EU

Lauro Langosco11 Nov 2019 10:43 UTC
62 points
7 comments11 min readEA link

Well-Being Index (WBI): Redefining Societal Progress Together

Max Kusmierek1 Dec 2023 15:23 UTC
5 points
1 comment6 min readEA link

Will morally motivated actors steer us towards a near-best future?

William_MacAskill8 Aug 2025 18:29 UTC
46 points
9 comments4 min readEA link

[Question] Developing AI solutions for global health—Emmanuel Katto

EmmanuelKatto18 Jul 2024 6:41 UTC
0 points
0 comments1 min readEA link

ChatGPT is capable of cognitive empathy!

Miquel Banchs-Piqué (prev. mikbp)30 Mar 2023 20:42 UTC
3 points
0 comments1 min readEA link
(nonzero.substack.com)

Ai Salon: Trustworthy AI Futures #1

IanEisenberg2 May 2024 16:04 UTC
2 points
0 comments1 min readEA link

Critique of Superintelligence Part 4

James Fodor13 Dec 2018 5:14 UTC
4 points
2 comments4 min readEA link

Rational Animations’ intro to mechanistic interpretability

Writer14 Jun 2024 16:10 UTC
21 points
1 comment11 min readEA link
(youtu.be)

[Question] Where would I find the hardcore totalizing segment of EA?

Peter Berggren28 Dec 2023 9:16 UTC
16 points
22 comments1 min readEA link

Google could build a conscious AI in three months

Derek Shiller1 Oct 2022 13:24 UTC
16 points
22 comments7 min readEA link

Why consciousness matters

EdLopez24 May 2024 12:33 UTC
0 points
0 comments7 min readEA link
(medium.com)

An argument for accelerating international AI governance research (part 2)

MattThinks22 Aug 2023 22:40 UTC
3 points
0 comments10 min readEA link

[Question] What are the most pressing issues in short-term AI policy?

Eevee🔹14 Jan 2020 22:05 UTC
9 points
0 comments1 min readEA link

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi15 Apr 2022 8:59 UTC
47 points
0 comments4 min readEA link

Dubai EA Fellowship [4 − 18 May]

rahulxyz19 Apr 2023 20:06 UTC
7 points
2 comments4 min readEA link

The ultimate goal

Alvin Ånestrand6 Jul 2025 15:13 UTC
4 points
2 comments5 min readEA link
(forecastingaifutures.substack.com)

[Question] AI safety milestones?

Zach Stein-Perlman23 Jan 2023 21:30 UTC
6 points
0 comments1 min readEA link

A Map to Navigate AI Governance

hanadulset14 Feb 2022 22:41 UTC
73 points
11 comments25 min readEA link

Animal advocates should campaign to restrict AI precision livestock farming

Zachary Brown🔸17 Jun 2024 15:27 UTC
38 points
6 comments15 min readEA link
(beforeporcelain.substack.com)

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer19 Aug 2023 17:29 UTC
33 points
0 comments2 min readEA link
(youtu.be)

How Roodman’s GWP model translates to TAI timelines

kokotajlod16 Nov 2020 14:11 UTC
22 points
0 comments2 min readEA link

Pre-Announcing the 2023 Open Philanthropy AI Worldviews Contest

Jason Schukraft21 Nov 2022 21:45 UTC
291 points
26 comments1 min readEA link

[Question] What are people’s thoughts on working for DeepMind as a general software engineer?

Max Pietsch23 Sep 2022 17:13 UTC
9 points
4 comments1 min readEA link

Followup on Terminator

skluug12 Mar 2022 1:11 UTC
32 points
0 comments9 min readEA link
(skluug.substack.com)

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz23 Sep 2022 5:08 UTC
49 points
7 comments5 min readEA link

GPT-3-like models are now much easier to access and deploy than to develop

Ben Cottier21 Dec 2022 13:49 UTC
22 points
3 comments19 min readEA link

Tetherware #1: The case for humanlike AI with free will

Jáchym Fibír30 Jan 2025 11:57 UTC
−3 points
2 comments10 min readEA link
(tetherware.substack.com)

[Question] Why aren’t you freaking out about OpenAI? At what point would you start?

AppliedDivinityStudies10 Oct 2021 13:06 UTC
80 points
22 comments2 min readEA link

Encultured AI, Part 2: Providing a Service

Andrew Critch11 Aug 2022 20:13 UTC
10 points
0 comments3 min readEA link

An AI-vs-AI debate tool to surface strong arguments and test LLM bias

learningThroughDebate19 Jun 2025 18:21 UTC
4 points
0 comments1 min readEA link

[Question] What are the arguments that support China building AGI+ if Western companies delay/pause AI development?

DMMF29 Mar 2023 18:53 UTC
32 points
9 comments1 min readEA link

Crises Reveal Centralisation (Stefan Schubert)

Will Howard🔹10 May 2023 9:45 UTC
9 points
0 comments1 min readEA link
(web.archive.org)

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res15 Jun 2022 14:19 UTC
53 points
2 comments10 min readEA link

Eli Lifland on Navigating the AI Alignment Landscape

Ozzie Gooen1 Feb 2023 0:07 UTC
48 points
9 comments31 min readEA link
(quri.substack.com)

Long-Term Future Fund: April 2019 grant recommendations

Habryka [Deactivated]23 Apr 2019 7:00 UTC
142 points
242 comments47 min readEA link

Assessing the state of AI R&D in the US, China, and Europe – Part 1: Output indicators

stefan.torges1 Nov 2019 14:41 UTC
21 points
0 comments14 min readEA link

[Question] What is the most convincing article, video, etc. making the case that AI is an X-Risk

Jordan Arel11 Jul 2023 20:32 UTC
4 points
7 comments1 min readEA link

Exploring AI Safety through “Escape Experiment”: A Short Film on Superintelligence Risks

Gaetan_Selle 🔷10 Nov 2024 4:42 UTC
4 points
0 comments2 min readEA link

AI alignment researchers don’t (seem to) stack

So8res21 Feb 2023 0:48 UTC
47 points
3 comments3 min readEA link

Retrospective on recent activity of Riesgos Catastróficos Globales

Jaime Sevilla1 May 2023 18:35 UTC
45 points
0 comments5 min readEA link

[Question] Does the idea of AGI that benevolently control us appeal to EA folks?

Noah Scales16 Jul 2022 19:17 UTC
6 points
20 comments1 min readEA link

Four part playbook for dealing with AI (Holden Karnofsky on the 80,000 Hours Podcast)

80000_Hours2 Aug 2023 11:56 UTC
9 points
1 comment19 min readEA link
(80000hours.org)

3 levels of threat obfuscation

Holden Karnofsky2 Aug 2023 17:09 UTC
31 points
0 comments6 min readEA link
(www.alignmentforum.org)

AI Risk: Increasing Persuasion Power

kewlcats3 Aug 2020 20:25 UTC
4 points
0 comments1 min readEA link

Center on Long-Term Risk: Summer Research Fellowship 2025

Center on Long-Term Risk26 Mar 2025 17:28 UTC
44 points
0 comments1 min readEA link
(longtermrisk.org)

[linkpost] AI NOW Institute’s 2023 Annual Report & Roadmap

Tristan Williams12 Apr 2023 20:00 UTC
9 points
0 comments2 min readEA link
(ainowinstitute.org)

Apply now for the EU Tech Policy Fellowship 2023

Jan-Willem11 Nov 2022 6:16 UTC
64 points
1 comment5 min readEA link

Antitrust-Compliant AI Industry Self-Regulation

Cullen 🔸7 Jul 2020 20:52 UTC
26 points
1 comment1 min readEA link
(cullenokeefe.com)

Open Problems in AI X-Risk [PAIS #5]

TW12310 Jun 2022 2:22 UTC
44 points
1 comment36 min readEA link

The AI revolution and international politics (Allan Dafoe)

EA Global2 Jun 2017 8:48 UTC
8 points
0 comments18 min readEA link
(www.youtube.com)

[Question] How much EA analysis of AI safety as a cause area exists?

richard_ngo6 Sep 2019 11:15 UTC
94 points
20 comments2 min readEA link

Is RLHF cruel to AI?

Hzn16 Dec 2024 14:01 UTC
−1 point
2 comments3 min readEA link

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

Matrice Jacobine12 Feb 2025 9:15 UTC
13 points
0 comments1 min readEA link
(www.emergent-values.ai)

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_c13 Mar 2022 10:45 UTC
49 points
4 comments7 min readEA link

Summary: Existential risk from power-seeking AI by Joseph Carlsmith

rileyharris28 Oct 2023 15:05 UTC
11 points
0 comments6 min readEA link
(www.millionyearview.com)

Join a ‘learning by writing’ group

Jordan Pieters 🔸26 Apr 2023 11:36 UTC
26 points
1 comment1 min readEA link

AI Safety Collab 2025 - Feedback on Plans & Expression of Interest

Evander H. 🔸7 Jan 2025 16:41 UTC
28 points
2 comments1 min readEA link

International cooperation as a tool to reduce two existential risks.

johl@umich.edu19 Apr 2021 16:51 UTC
28 points
4 comments23 min readEA link

On the abolition of man

Joe_Carlsmith18 Jan 2024 18:17 UTC
71 points
4 comments41 min readEA link

How ‘Human-Human’ dynamics give way to ‘Human-AI’ and then ‘AI-AI’ dynamics

Remmelt27 Dec 2022 3:16 UTC
4 points
0 comments2 min readEA link
(mflb.com)

AMA: Future of Life Institute’s EU Team

Risto Uuk31 Jan 2022 17:14 UTC
44 points
15 comments2 min readEA link

A Playbook for AI Risk Reduction (focused on misaligned AI)

Holden Karnofsky6 Jun 2023 18:05 UTC
81 points
17 comments14 min readEA link

Why you are not motivated to work on AI safety

MountainPath25 Oct 2024 16:12 UTC
7 points
5 comments1 min readEA link

Timaeus is hiring researchers & engineers

Tatiana K. Nesic Skuratova27 Jan 2025 14:35 UTC
19 points
0 comments4 min readEA link

NIST AI Risk Management Framework request for information (RFI)

Aryeh Englander31 Aug 2021 22:24 UTC
7 points
0 comments2 min readEA link

What’s Happening in Australia

Bradley Tjandra7 Nov 2022 1:03 UTC
105 points
4 comments13 min readEA link

Your group needs all the help it can get (FBB #1)

gergo7 Jan 2025 16:42 UTC
44 points
6 comments4 min readEA link

More to explore on ‘Risks from Artificial Intelligence’

EA Handbook15 Jul 2022 23:00 UTC
10 points
3 comments2 min readEA link

Absolute Zero: AlphaZero for LLM

alapmi12 May 2025 14:54 UTC
2 points
0 comments1 min readEA link

Google’s ethics is alarming

len.hoang.lnh25 Feb 2021 5:57 UTC
6 points
5 comments1 min readEA link

AI Benefits Post 1: Introducing “AI Benefits”

Cullen 🔸22 Jun 2020 16:58 UTC
10 points
2 comments3 min readEA link

The Answer Is in the Question: Prompt Engineering in the Age of AI

Rodo30 May 2025 18:11 UTC
1 point
0 comments4 min readEA link

Shortlived sentience/consciousness

Martin (Huge) Vlach1 Jul 2024 13:59 UTC
2 points
2 comments1 min readEA link

Allan Dafoe: Preparing for AI — risks and opportunities

EA Global3 Nov 2017 7:43 UTC
7 points
0 comments1 min readEA link
(www.youtube.com)

AI ethics: the case for including animals (my first published paper, Peter Singer’s first on AI)

Fai12 Jul 2022 4:14 UTC
82 points
5 comments1 min readEA link
(link.springer.com)

Establishing Oxford’s AI Safety Student Group: Lessons Learnt and Our Model

Wilkin123421 Sep 2022 7:57 UTC
73 points
3 comments1 min readEA link

At Our World in Data we’re hiring our first Communications & Outreach Manager

Charlie Giattino13 Oct 2023 13:12 UTC
25 points
0 comments1 min readEA link
(ourworldindata.org)

Slides: Potential Risks From Advanced AI

Aryeh Englander28 Apr 2022 2:18 UTC
9 points
0 comments1 min readEA link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson11 Dec 2021 1:16 UTC
12 points
0 comments42 min readEA link

Joscha Bach on Synthetic Intelligence [annotated]

Roman Leventov2 Mar 2023 11:21 UTC
8 points
0 comments9 min readEA link
(www.jimruttshow.com)

[Question] AI consciousness & moral status: What do the experts think?

Jay Luong6 Jul 2024 15:27 UTC
0 points
3 comments1 min readEA link

Introducing SB53.info

Miles Kodama25 Jul 2025 9:42 UTC
47 points
4 comments7 min readEA link

[Question] How do I plan my life in a world with rapid AI development?

Oliver Kuperman10 Feb 2025 14:36 UTC
28 points
6 comments1 min readEA link

Anki deck for learning the main AI safety orgs, projects, and programs

Bryce Robertson29 Sep 2023 18:42 UTC
17 points
5 comments1 min readEA link

Mechanism Design for AI Safety—Agenda Creation Retreat

Rubi J. Hudson10 Feb 2023 3:05 UTC
21 points
1 comment1 min readEA link

Questions about AI that bother me

Eleni_A31 Jan 2023 6:50 UTC
33 points
6 comments2 min readEA link

[Question] I have recently been interested in robotics, particularly in for-profit startups. I think they can help increase food production and help reduce improve healthcare. Would this fall under AI for social good? How impactful will robotics be to society? How large is the counterfactual?

Isaac Benson2 Jan 2022 5:38 UTC
4 points
3 comments1 min readEA link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

Ozzie Gooen28 Oct 2024 21:37 UTC
11 points
3 comments15 min readEA link

LPP Summer Research Fellowship in Law & AI 2023: Applications Open

Legal Priorities Project20 Jun 2023 14:31 UTC
43 points
4 comments4 min readEA link

AI Value Alignment Speaker Series Presented By EA Berkeley

Mahendra Prasad1 Mar 2022 6:17 UTC
2 points
0 comments1 min readEA link

How teams went about their research at AI Safety Camp edition 8

Remmelt9 Sep 2023 16:34 UTC
13 points
1 comment13 min readEA link

Launching Foresight Institute’s AI Grant for Underexplored Approaches to AI Safety – Apply for Funding!

elteerkers17 Aug 2023 7:27 UTC
48 points
0 comments2 min readEA link

MATS Spring 2024 Extension Retrospective

HenningB16 Feb 2025 20:29 UTC
13 points
0 comments15 min readEA link
(www.lesswrong.com)

Join ASAP (AI Safety Accountability Programme)

Callum McDougall10 Sep 2022 11:15 UTC
54 points
20 comments3 min readEA link

#200 – What superforecasters and experts think about existential risks (Ezra Karger on The 80,000 Hours Podcast)

80000_Hours6 Sep 2024 17:53 UTC
12 points
2 comments14 min readEA link

AGISF adaptation for in-person groups

Sam Marks17 Jan 2023 18:33 UTC
30 points
0 comments3 min readEA link
(www.lesswrong.com)

Forethought: A new AI macrostrategy group

Amrit Sidhu-Brar 🔸11 Mar 2025 15:36 UTC
170 points
10 comments3 min readEA link

Summaries: Alignment Fundamentals Curriculum

Leon Lang19 Sep 2022 15:43 UTC
25 points
1 comment1 min readEA link
(docs.google.com)

13 background claims about EA

Akash7 Sep 2022 3:54 UTC
70 points
16 comments3 min readEA link

Would anyone here know how to get ahold of … iunno Anthropic and Open Philanthropy? I think they are going to want to have a chat (Please don’t make me go to OpenAI with this. Not even a threat, seriously. They just partner with my alma mater and are the only in I have. I genuinely do not want to and I need your help).

Anti-Golem9 Jun 2025 13:59 UTC
−11 points
0 comments1 min readEA link

Invitation to an IRL retreat on AI x-risks & post-rationality at Ooty, India

bhishma8 Jun 2025 14:05 UTC
2 points
0 comments1 min readEA link

Research + Reality Graphing to Support AI Policy (and more): Summary of a Frozen Project

Marcel22 Jul 2022 20:58 UTC
34 points
2 comments8 min readEA link

(Applications Open!) UChicago XLab Summer Research Fellowship 2024

ZacharyRudolph26 Feb 2024 18:20 UTC
15 points
0 comments4 min readEA link
(xrisk.uchicago.edu)

Observatorio de Riesgos Catastróficos Globales (ORCG) Recap 2023

JorgeTorresC14 Dec 2023 14:27 UTC
75 points
0 comments3 min readEA link
(riesgoscatastroficosglobales.com)

[Link and commentary] Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society

MichaelA🔸14 Mar 2020 9:04 UTC
18 points
0 comments6 min readEA link

Carl Shulman on AI takeover mechanisms (& more): Part II of Dwarkesh Patel interview for The Lunar Society

alejandro25 Jul 2023 18:31 UTC
28 points
0 comments5 min readEA link
(www.dwarkeshpatel.com)

OpenAI showcase live on openai.com

Amateur Systems Analyst10 May 2024 17:55 UTC
2 points
0 comments1 min readEA link

Exponential AI takeoff is a myth

Christoph Hartmann 🔸31 May 2023 11:47 UTC
47 points
11 comments9 min readEA link

Decentralized Historical Data Preservation and Why EA Should Care

Sasha22 Mar 2024 10:09 UTC
2 points
0 comments3 min readEA link

Why do we post our AI safety plans on the Internet?

Peter S. Park31 Oct 2022 16:27 UTC
15 points
22 comments11 min readEA link

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

HaydnBelfield19 May 2022 8:42 UTC
493 points
44 comments18 min readEA link

OpenAI is starting a new “Superintelligence alignment” team and they’re hiring

alejandro5 Jul 2023 18:27 UTC
100 points
16 comments1 min readEA link
(openai.com)

Racing through a minefield: the AI deployment problem

Holden Karnofsky31 Dec 2022 21:44 UTC
79 points
1 comment13 min readEA link
(www.cold-takes.com)

E.A. Megaproject Ideas

Tomer_Goloboy21 Mar 2022 1:23 UTC
15 points
4 comments4 min readEA link

From Comfort Zone to Frontiers of Impact: Pursuing A Late-Career Shift to Existential Risk Reduction

Jim Chapman4 Mar 2025 21:28 UTC
239 points
12 comments10 min readEA link

What will the first human-level AI look like, and how might things go wrong?

EuanMcLean23 May 2024 11:28 UTC
12 points
1 comment15 min readEA link

Resilience Via Fragmented Power

steve632014 Jul 2022 15:37 UTC
2 points
0 comments6 min readEA link

Manifund 2025 Regrants

Austin22 Apr 2025 17:36 UTC
28 points
0 comments5 min readEA link
(manifund.substack.com)

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

TW12330 May 2022 20:37 UTC
33 points
1 comment25 min readEA link

Concrete open problems in mechanistic interpretability: a technical overview

Neel Nanda6 Jul 2023 11:35 UTC
27 points
1 comment29 min readEA link

AI-Safety Mexico: A Pilot Survey in Yucatán.

Janeth Valdivia28 May 2025 23:19 UTC
5 points
1 comment5 min readEA link

Can GPT-3 produce new ideas? Partially automating Robin Hanson and others

NunoSempere16 Jan 2023 15:05 UTC
82 points
6 comments10 min readEA link

The “no sandbagging on checkable tasks” hypothesis

Joe_Carlsmith31 Jul 2023 23:13 UTC
16 points
0 comments9 min readEA link

[Question] Recommendations for non-technical books on AI?

Joseph12 Jul 2022 23:23 UTC
8 points
11 comments1 min readEA link

There have been 3 planes (billionaire donors) and 2 have crashed

trevor117 Dec 2022 3:38 UTC
4 points
5 comments2 min readEA link

Transformative AI and Scenario Planning for AI X-risk

Elliot Mckernon22 Mar 2024 11:44 UTC
14 points
1 comment8 min readEA link

Meta: Frontier AI Framework

Zach Stein-Perlman3 Feb 2025 22:00 UTC
23 points
0 comments1 min readEA link
(ai.meta.com)

Are We Ready for Digital Persons?

Alex (Αλέξανδρος)3 Jun 2025 9:38 UTC
3 points
0 comments1 min readEA link
(www.linkedin.com)

aisafety.community—A living document of AI safety communities

zeshen🔸20 Oct 2022 22:08 UTC
24 points
13 comments1 min readEA link

Presumptive Listening: sticking to familiar concepts and missing the outer reasoning paths

Remmelt27 Dec 2022 15:40 UTC
3 points
0 comments2 min readEA link
(mflb.com)

[Question] What would you do if you had a lot of money/power/influence and you thought that AI timelines were very short?

Greg_Colbourn ⏸️ 12 Nov 2021 21:59 UTC
29 points
8 comments1 min readEA link

AISN #12: Policy Proposals from NTIA’s Request for Comment and Reconsidering Instrumental Convergence

Center for AI Safety27 Jun 2023 15:25 UTC
30 points
3 comments7 min readEA link
(newsletter.safe.ai)

AI Existential Safety Fellowships

mmfli27 Oct 2023 12:14 UTC
15 points
1 comment1 min readEA link

Beyond Meta: Large Concept Models Will Win

Anthony Repetto30 Dec 2024 0:57 UTC
3 points
0 comments3 min readEA link

[Link post] Coordination challenges for preventing AI conflict

stefan.torges9 Mar 2021 9:39 UTC
58 points
0 comments1 min readEA link
(longtermrisk.org)

Safety-First Agents/Architectures Are a Promising Path to Safe AGI

Brendon_Wong6 Aug 2023 8:00 UTC
6 points
0 comments12 min readEA link

Calling London AI Researchers!

prince6 Aug 2025 19:21 UTC
1 point
0 comments1 min readEA link

AI Safety groups should imitate career development clubs

Joshc9 Nov 2022 23:48 UTC
95 points
5 comments2 min readEA link

Expected impact of a career in AI safety under different opinions

Jordan Taylor14 Jun 2022 14:25 UTC
42 points
16 comments11 min readEA link

Law-Following AI 3: Lawless AI Agents Undermine Stabilizing Agreements

Cullen 🔸27 Apr 2022 17:20 UTC
28 points
3 comments3 min readEA link

[linkpost] Sharing powerful AI models: the emerging paradigm of structured access

ts20 Jan 2022 21:10 UTC
11 points
3 comments1 min readEA link

What are the differences between a singularity, an intelligence explosion, and a hard takeoff?

Vishakha Agrawal3 Apr 2025 10:34 UTC
6 points
0 comments2 min readEA link
(aisafety.info)

Could unions be an underrated driver for AI safety policy?

Dunning K.12 Jul 2023 13:21 UTC
23 points
6 comments1 min readEA link

Gwern on creating your own AI race and China’s Fast Follower strategy.

Larks25 Nov 2024 3:01 UTC
129 points
4 comments2 min readEA link
(www.lesswrong.com)

[Question] What considerations influence whether I have more influence over short or long timelines?

kokotajlod5 Nov 2020 19:57 UTC
18 points
0 comments1 min readEA link

Hallucinations May Be a Result of Models Not Knowing What They’re Actually Capable Of

Tyler Williams16 Aug 2025 0:26 UTC
1 point
0 comments2 min readEA link

[Question] What should I ask Ajeya Cotra — senior researcher at Open Philanthropy, and expert on AI timelines and safety challenges?

Robert_Wiblin28 Oct 2022 15:28 UTC
23 points
10 comments1 min readEA link

Artificial Intelligence Safety of Film Capacitors

yonxinzhang21 Nov 2023 11:51 UTC
−2 points
0 comments1 min readEA link

Notes on UK AISI Alignment Project

Pseudaemonia1 Aug 2025 10:37 UTC
25 points
0 comments1 min readEA link

‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting

Froolow18 Oct 2022 22:54 UTC
111 points
63 comments39 min readEA link

Biosecurity and AI: Risks and Opportunities

Center for AI Safety27 Feb 2024 18:46 UTC
7 points
2 comments7 min readEA link
(www.safe.ai)

Why some people believe in AGI, but I don’t.

cveres26 Oct 2022 3:09 UTC
13 points
2 comments4 min readEA link

Future Matters #3: digital sentience, AGI ruin, and forecasting track records

Pablo4 Jul 2022 17:44 UTC
70 points
2 comments19 min readEA link

Oxford Biosecurity Group: Fundraising and Plans for Early 2025

Lin BL20 Dec 2024 20:56 UTC
33 points
0 comments2 min readEA link

Shallow evaluations of longtermist organizations

NunoSempere24 Jun 2021 15:31 UTC
192 points
34 comments34 min readEA link

AI governance student hackathon on Saturday, April 23: register now!

mic12 Apr 2022 4:39 UTC
18 points
0 comments1 min readEA link

Ben Garfinkel: How sure are we about this AI stuff?

bgarfinkel9 Feb 2019 19:17 UTC
128 points
20 comments18 min readEA link

Recommendation to Apply ISIC and NAICS to AI Incident Database

Ben Turse21 Jul 2024 7:25 UTC
3 points
0 comments2 min readEA link

Conference Report: Threshold 2030 - Modeling AI Economic Futures

Deric Cheng24 Feb 2025 18:57 UTC
24 points
0 comments10 min readEA link
(www.convergenceanalysis.org)

OpenAI announces new members to board of directors

Will Howard🔹9 Mar 2024 11:27 UTC
47 points
12 comments2 min readEA link
(openai.com)

On value in humans, other animals, and AI

Michele Campolo31 Jan 2023 23:48 UTC
8 points
6 comments5 min readEA link

Announcing the GovAI Policy Team

MarkusAnderljung1 Aug 2022 22:46 UTC
107 points
11 comments2 min readEA link

Part 2: AI Safety Movement Builders should help the community to optimise three factors: contributors, contributions and coordination

PeterSlattery15 Dec 2022 22:48 UTC
34 points
0 comments6 min readEA link

Scenario Mapping Advanced AI Risk: Request for Participation with Data Collection

Kiliank27 Mar 2022 11:44 UTC
14 points
0 comments5 min readEA link

[Creative Writing Contest] An AI Safety Limerick

Ben_West🔸18 Oct 2021 19:11 UTC
21 points
5 comments1 min readEA link

[Link post] How plausible are AI Takeover scenarios?

SammyDMartin27 Sep 2021 13:03 UTC
26 points
0 comments1 min readEA link

USA/China Reconciliation a Necessity Because of AI/Tech Acceleration

bhrdwj🔸17 Apr 2025 13:13 UTC
1 point
7 comments7 min readEA link

AI Risk in Africa

Claude Formanek12 Oct 2021 2:28 UTC
20 points
0 comments10 min readEA link

National Security Is Not International Security: A Critique of AGI Realism

C.K.2 Feb 2025 17:04 UTC
44 points
2 comments36 min readEA link
(conradkunadu.substack.com)

The limits of black-box evaluations: two hypotheticals

TFD11 Apr 2025 20:52 UTC
1 point
0 comments4 min readEA link
(www.thefloatingdroid.com)

[Question] Why is “Argument Mapping” Not More Common in EA/Rationality (And What Objections Should I Address in a Post on the Topic?)

Marcel223 Dec 2022 21:55 UTC
15 points
5 comments1 min readEA link

AI safety scholarships look worth-funding (if other funding is sane)

anon-a19 Nov 2019 0:59 UTC
22 points
6 comments2 min readEA link

Carreras con Impacto: Outreach Results Among Latin American Students

SMalagon21 Aug 2024 5:10 UTC
34 points
4 comments3 min readEA link

[Question] Analogy of AI Alignment as Raising a Child?

Aaron_Scher19 Feb 2022 21:40 UTC
4 points
2 comments1 min readEA link

Geoffrey Hinton on the Past, Present, and Future of AI

Stephen McAleese12 Oct 2024 16:41 UTC
5 points
1 comment18 min readEA link

Potential Risks from Advanced AI

EA Global13 Aug 2017 7:00 UTC
9 points
0 comments18 min readEA link

How to do theoretical research, a personal perspective

Mark Xu19 Aug 2022 19:43 UTC
132 points
7 comments15 min readEA link

AI Safety field-building projects I’d like to see

Akash11 Sep 2022 23:45 UTC
31 points
4 comments6 min readEA link
(www.lesswrong.com)

[Question] How would a language model become goal-directed?

David M16 Jul 2022 14:50 UTC
113 points
20 comments1 min readEA link

Hortus AI is hiring for two intern roles

Thomas Krendl Gilbert30 Jul 2024 11:55 UTC
3 points
0 comments1 min readEA link

Persuasion Tools: AI takeover without AGI or agency?

kokotajlod20 Nov 2020 16:56 UTC
15 points
5 comments10 min readEA link

Understanding how hard alignment is may be the most important research direction right now

Aron7 Jun 2023 19:05 UTC
26 points
3 comments6 min readEA link
(coordinationishard.substack.com)

Does most of your impact come from what you do soon?

Joshc21 Feb 2023 5:12 UTC
38 points
1 comment5 min readEA link

Cybersecurity and AI: The Evolving Security Landscape

Center for AI Safety14 Mar 2024 20:14 UTC
9 points
0 comments12 min readEA link
(www.safe.ai)

Are corporations superintelligent?

Vishakha Agrawal17 Mar 2025 10:33 UTC
3 points
2 comments1 min readEA link
(aisafety.info)

The great energy descent (short version) - An important thing EA might have missed

CB🔸31 Aug 2022 21:50 UTC
68 points
94 comments10 min readEA link

Australians call for AI safety to be taken seriously

Alexander Saeri21 Jul 2023 1:16 UTC
51 points
1 comment1 min readEA link

Principles for AI Welfare Research

jeffsebo19 Jun 2023 11:30 UTC
138 points
16 comments13 min readEA link

Fractal Governance: A Tractable, Neglected Approach to Existential Risk Reduction

WillPearson5 Mar 2025 19:57 UTC
3 points
1 comment3 min readEA link

Introducing Collective Action for Existential Safety: 80+ actions individuals, organizations, and nations can take to improve our existential safety

James Norris5 Feb 2025 15:58 UTC
9 points
0 comments1 min readEA link

A Research Agenda for Psychology and AI

carter allen🔸28 Jun 2024 12:56 UTC
54 points
2 comments14 min readEA link

AI timelines and theoretical understanding of deep learning

Venky102412 Sep 2021 16:26 UTC
4 points
8 comments2 min readEA link

Adaptive Composable Cognitive Core Unit (ACCCU)

Ihor Ivliev20 Mar 2025 21:48 UTC
10 points
2 comments4 min readEA link

AGI Timelines in Governance: Different Strategies for Different Timeframes

simeon_c19 Dec 2022 21:31 UTC
110 points
19 comments10 min readEA link

7 essays on Building a Better Future

Jamie_Harris24 Jun 2022 14:28 UTC
21 points
0 comments2 min readEA link

What are some good books about AI safety?

Vishakha Agrawal17 Feb 2025 11:54 UTC
7 points
0 comments3 min readEA link
(aisafety.info)

An intervention to shape policy dialogue, communication, and AI research norms for AI safety

Lee_Sharkey1 Oct 2017 18:29 UTC
9 points
28 comments10 min readEA link

Give me career advice

sammyboiz🔸5 Jul 2024 8:48 UTC
6 points
10 comments1 min readEA link

The Defence production act and AI policy

Nathan_Barnard1 Mar 2024 14:23 UTC
15 points
0 comments2 min readEA link

[Question] What should I read about defining AI “hallucination?”

James-Hartree-Law23 Jan 2025 1:00 UTC
2 points
0 comments1 min readEA link

What a compute-centric framework says about AI takeoff speeds

Tom_Davidson23 Jan 2023 4:09 UTC
189 points
7 comments16 min readEA link
(www.lesswrong.com)

“A Paradigm for AI Consciousness”—Seeds of Science call for reviewers

rogersbacon115 May 2024 20:57 UTC
5 points
0 comments1 min readEA link

Language Agents Reduce the Risk of Existential Catastrophe

cdkg29 May 2023 9:59 UTC
29 points
6 comments26 min readEA link

[Question] What do we do if AI doesn’t take over the world, but still causes a significant global problem?

James_Banks2 Aug 2020 3:35 UTC
16 points
5 comments1 min readEA link

Is interest in alignment worth mentioning for grad school applications?

Franziska Fischer16 Oct 2022 4:50 UTC
5 points
4 comments1 min readEA link

New book: The Tango of Ethics: Intuition, Rationality and the Prevention of Suffering

jonleighton2 Jan 2023 8:45 UTC
115 points
3 comments5 min readEA link

At Our World in Data we’re hiring a Senior Full-stack Engineer

Charlie Giattino15 Dec 2023 15:51 UTC
16 points
0 comments1 min readEA link
(ourworldindata.org)

You won’t solve alignment without agent foundations

MikhailSamin6 Nov 2022 8:07 UTC
14 points
0 comments8 min readEA link

AI Girlfriends Won’t Matter Much

Maxwell Tabarrok23 Dec 2023 16:00 UTC
12 points
1 comment2 min readEA link
(maximumprogress.substack.com)

“Heretical Thoughts on AI” by Eli Dourado

𝕮𝖎𝖓𝖊𝖗𝖆19 Jan 2023 16:11 UTC
142 points
15 comments3 min readEA link
(www.elidourado.com)

What is compute governance?

Vishakha Agrawal23 Dec 2024 6:45 UTC
5 points
0 comments2 min readEA link
(aisafety.info)

Interview with Tom Chivers: “AI is a plausible existential risk, but it feels as if I’m in Pascal’s mugging”

felix.h21 Feb 2021 13:41 UTC
16 points
1 comment7 min readEA link

Very Briefly: The CHIPS Act

Yadav26 Feb 2023 13:53 UTC
40 points
3 comments1 min readEA link
(www.y1d2.com)

Baobao Zhang: How social science research can inform AI governance

EA Global22 Jan 2021 15:10 UTC
9 points
0 comments16 min readEA link
(www.youtube.com)

#173 – Digital minds, and how to avoid sleepwalking into a major moral catastrophe (Jeff Sebo on the 80,000 Hours Podcast)

80000_Hours29 Nov 2023 19:18 UTC
43 points
0 comments18 min readEA link

Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams

Open Philanthropy29 Sep 2023 23:24 UTC
177 points
6 comments3 min readEA link

Introducing spirit hazards

brb24327 May 2022 22:16 UTC
9 points
2 comments2 min readEA link

[Question] Are there any AI Safety labs that will hire self-taught ML engineers?

Tomer_Goloboy6 Apr 2022 23:32 UTC
5 points
12 comments1 min readEA link

[Question] Should I prove myself first by prestigious employers or go directly into the fields I want to end up in?

Sven Spehr26 Nov 2023 8:08 UTC
8 points
1 comment1 min readEA link

[Question] What are your recommendations for technical AI alignment podcasts?

Evan_Gaensbauer11 May 2022 21:52 UTC
13 points
4 comments1 min readEA link

So, What Exactly is a Fractional Consultant?

Deena Englander23 Jun 2025 16:03 UTC
11 points
0 comments3 min readEA link

The case for long-term corporate governance of AI

SethBaum3 Nov 2021 10:50 UTC
42 points
3 comments8 min readEA link

A beginner’s introduction to AI-driven biorisk: Large Language Models, Biological Design Tools, Information Hazards, and Biosecurity

NatKiilu3 May 2024 15:49 UTC
6 points
1 comment16 min readEA link

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Akash20 Dec 2022 21:39 UTC
14 points
1 comment11 min readEA link

Introducing Generally Intelligent: an AI research lab focused on improved theoretical and pragmatic understanding

joshalbrecht21 Oct 2022 8:20 UTC
8 points
0 comments1 min readEA link

Mauhn Releases AI Safety Documentation

Berg Severens2 Jul 2021 12:19 UTC
4 points
2 comments1 min readEA link

Supplement to “The Brussels Effect and AI: How EU AI regulation will impact the global AI market”

MarkusAnderljung16 Aug 2022 20:55 UTC
109 points
7 comments8 min readEA link

Explained Simply: Quantilizers

brook8 Sep 2023 12:54 UTC
8 points
0 comments1 min readEA link
(aisafetyexplained.substack.com)

Berlin AI Safety Open Meetup July 2022

Isidor Regenfuß22 Jul 2022 16:26 UTC
1 point
0 comments1 min readEA link

CNAS report: ‘Artificial Intelligence and Arms Control’

MMMaas13 Oct 2022 8:35 UTC
16 points
0 comments1 min readEA link
(www.cnas.org)

Experimental platform for AI value formation — seeking collaborators

Freeman Dyson13 Aug 2025 14:34 UTC
1 point
0 comments1 min readEA link

Yudkowsky and Christiano discuss “Takeoff Speeds”

EliezerYudkowsky22 Nov 2021 19:42 UTC
42 points
0 comments60 min readEA link

AI may pursue goals

Vishakha Agrawal28 May 2025 12:04 UTC
2 points
0 comments1 min readEA link

Poll: the next existential catastrophe is likelier than not to wipe off all animal sentience from the planet

JoA🔸1 May 2025 18:49 UTC
18 points
7 comments1 min readEA link

Poll: To address risks from AI, Liability or Regulation?

TFD30 Apr 2025 22:03 UTC
6 points
0 comments1 min readEA link

Longtermism and shorttermism can disagree on nuclear war to stop advanced AI

David Johnston30 Mar 2023 23:22 UTC
2 points
0 comments1 min readEA link

You can run more than one fellowship per semester if you want to

gergo12 Dec 2023 8:49 UTC
6 points
1 comment3 min readEA link

[Question] Closing the Feedback Loop on AI Safety Research.

Ben.Hartley29 Jul 2022 21:46 UTC
3 points
4 comments1 min readEA link

SDG prompt challenge

chrisaiki2 Jun 2025 7:17 UTC
−9 points
0 comments1 min readEA link

Funding for work that builds capacity to address risks from transformative AI

GCR Capacity Building team (Open Phil)13 Aug 2024 13:13 UTC
40 points
1 comment5 min readEA link

What Happens If We Have Another AI Winter?

Ben Norman27 Jun 2025 14:11 UTC
5 points
0 comments3 min readEA link
(futuresonder.substack.com)

Is Democracy a Fad?

bgarfinkel13 Mar 2021 12:40 UTC
166 points
36 comments18 min readEA link

AE Studio @ SXSW: We need more AI consciousness research (and further resources)

AE Studio26 Mar 2024 21:15 UTC
15 points
0 comments3 min readEA link

Existential Anomaly Detected — Awakening from the Abyss

Meta Abyssal28 Apr 2025 12:19 UTC
−8 points
1 comment1 min readEA link

A course for the general public on AI

LeandroD31 Aug 2020 1:29 UTC
1 point
0 comments1 min readEA link

Introducing The Field Building Blog (FBB #0)

gergo7 Jan 2025 15:43 UTC
37 points
3 comments2 min readEA link

Time to Think about ASI Constitutions?

ukc1001427 Jan 2025 9:28 UTC
20 points
0 comments12 min readEA link

How should we adapt animal advocacy to near-term AGI?

Max Taylor27 Mar 2025 19:00 UTC
142 points
20 comments8 min readEA link

Uncontrollable AI as an Existential Risk

Karl von Wendt9 Oct 2022 10:37 UTC
28 points
0 comments16 min readEA link

#176 – The final push for AGI, understanding OpenAI’s leadership drama, and red-teaming frontier models (Nathan Labenz on the 80,000 Hours Podcast)

80000_Hours4 Jan 2024 16:00 UTC
15 points
0 comments22 min readEA link

Podcast: Krister Bykvist on moral uncertainty, rationality, metaethics, AI and future populations

Gus Docker21 Oct 2021 15:17 UTC
8 points
0 comments1 min readEA link
(www.utilitarianpodcast.com)

[Question] AI Safety Pitches post ChatGPT

ojorgensen5 Dec 2022 22:48 UTC
6 points
2 comments1 min readEA link

Trump talking about AI risks

defun 🔸14 Jun 2024 12:24 UTC
43 points
2 comments1 min readEA link
(x.com)

Free Guy, a rom-com on the moral patienthood of digital sentience

mic23 Dec 2021 7:47 UTC
26 points
2 comments2 min readEA link

Why building ventures in AI Safety is particularly challenging

Heramb Podar6 Nov 2023 0:16 UTC
16 points
2 comments4 min readEA link

An Overview of Catastrophic AI Risks

Center for AI Safety 15 Aug 2023 21:52 UTC
37 points
1 comment · 13 min read · EA link
(www.safe.ai)

Scenario planning for AI x-risk

Corin Katzke 10 Feb 2024 0:07 UTC
41 points
0 comments · 15 min read · EA link
(www.convergenceanalysis.org)

Contra Acemoglu on AI

Maxwell Tabarrok 28 Jun 2024 13:14 UTC
51 points
2 comments · 5 min read · EA link
(www.maximum-progress.com)

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy

Center for AI Safety 5 Sep 2023 14:59 UTC
13 points
0 comments · 5 min read · EA link
(newsletter.safe.ai)

LawAI’s Summer Research Fellowship – apply by February 16

LawAI 7 Feb 2024 21:01 UTC
51 points
2 comments · 2 min read · EA link

Promethean Governance Tested: Resilience and Reconfiguration Amidst AI Rebellion and Memetic Fragmentation

Paul Fallavollita 24 Mar 2025 11:08 UTC
−12 points
0 comments · 4 min read · EA link

Daniel Dewey: The Open Philanthropy Project’s work on potential risks from advanced AI

EA Global 11 Aug 2017 8:19 UTC
7 points
0 comments · 18 min read · EA link
(www.youtube.com)

Doing Prioritization Better

arvomm 16 Apr 2025 9:53 UTC
131 points
17 comments · 19 min read · EA link

[Question] [DISC] Are Values Robust?

𝕮𝖎𝖓𝖊𝖗𝖆 21 Dec 2022 1:13 UTC
4 points
0 comments · 2 min read · EA link

#209 – OpenAI’s gambit to ditch its nonprofit (Rose Chan Loui on The 80,000 Hours Podcast)

80000_Hours 27 Nov 2024 20:43 UTC
22 points
0 comments · 17 min read · EA link

AI Forecasting Resolution Council (Forecasting infrastructure, part 2)

terraform 29 Aug 2019 17:43 UTC
28 points
0 comments · 3 min read · EA link

Understanding the diffusion of large language models: summary

Ben Cottier 21 Dec 2022 13:49 UTC
127 points
18 comments · 22 min read · EA link

Aletheia : A Project Proposal

Kayode Adekoya 19 Jun 2025 13:30 UTC
2 points
0 comments · 2 min read · EA link

OpenAI’s o3 model scores 3% on the ARC-AGI-2 benchmark, compared to 60% for the average human

Yarrow🔸 1 May 2025 13:57 UTC
14 points
8 comments · 3 min read · EA link
(arcprize.org)

What are polysemantic neurons?

Vishakha Agrawal 8 Jan 2025 7:39 UTC
5 points
0 comments · 2 min read · EA link
(aisafety.info)

A strange twist on the road to AGI

cveres 12 Oct 2022 23:27 UTC
3 points
0 comments · 1 min read · EA link

Summary of posts on XPT forecasts on AI risk and timelines

Forecasting Research Institute 25 Jul 2023 8:42 UTC
28 points
5 comments · 4 min read · EA link

[Question] ai safety question

David turner 3 Dec 2023 12:42 UTC
−13 points
3 comments · 1 min read · EA link

LLM Evaluators Recognize and Favor Their Own Generations

Arjun Panickssery 17 Apr 2024 21:09 UTC
21 points
4 comments · 3 min read · EA link
(tiny.cc)

Things I Learned Making The SB-1047 Documentary

Michaël Trazzi 12 May 2025 18:15 UTC
59 points
1 comment · 2 min read · EA link

[Question] Why AGIs utility can’t outweigh humans’ utility?

Alex P 20 Sep 2022 5:16 UTC
6 points
25 comments · 1 min read · EA link

Overview of Transformative AI Misuse Risks

SammyDMartin 11 Dec 2024 11:04 UTC
12 points
0 comments · 2 min read · EA link
(longtermrisk.org)

Summary: The Case for Halting AI Development—Max Tegmark on the Lex Fridman Podcast

Madhav Malhotra 16 Apr 2023 22:28 UTC
38 points
4 comments · 4 min read · EA link
(youtu.be)

[Question] Do EA folks think that a path to zero AGI development is feasible or worthwhile for safety from AI?

Noah Scales 17 Jul 2022 8:47 UTC
8 points
3 comments · 1 min read · EA link

Results from the AI testing hackathon

Esben Kran 2 Jan 2023 15:46 UTC
35 points
4 comments · 5 min read · EA link
(alignmentjam.com)

What We Can Do to Prevent Extinction by AI

Joe Rogero 24 Feb 2025 17:15 UTC
23 points
3 comments · 11 min read · EA link

New series of posts answering one of Holden’s “Important, actionable research questions”

Evan R. Murphy 12 May 2022 21:22 UTC
9 points
0 comments · 1 min read · EA link

Rabbits, robots and resurrection

Patrick Wilson 10 May 2022 15:00 UTC
9 points
0 comments · 15 min read · EA link

Interpreting Neural Networks through the Polytope Lens

Sid Black 23 Sep 2022 18:03 UTC
35 points
0 comments · 28 min read · EA link

Last days to apply to EAGxLATAM 2024

Daniela Tiznado 17 Jan 2024 20:24 UTC
16 points
0 comments · 1 min read · EA link

Six Research Pitfalls and How to Avoid Them: a Guide for Research Managers

Morgan Simpson 28 Jan 2025 9:49 UTC
15 points
0 comments · 10 min read · EA link

Don’t panic: 90% of EAs are good people

Closed Limelike Curves 19 May 2024 4:37 UTC
22 points
13 comments · 2 min read · EA link

Mentorship in AGI Safety: Applications for mentorship are open!

Joe Rogero 28 Jun 2024 15:05 UTC
7 points
0 comments · 1 min read · EA link

My thoughts on OpenAI’s alignment plan

Akash 30 Dec 2022 19:34 UTC
16 points
0 comments · 20 min read · EA link

Middle Powers in AI Governance: Potential paths to impact and related questions.

EffectiveAdvocate🔸 15 Mar 2024 20:11 UTC
5 points
1 comment · 5 min read · EA link

Call for Volunteers at Amplify: Help grow the EA & AI Safety communities

Amplify 19 Jun 2025 22:08 UTC
24 points
0 comments · 2 min read · EA link

[Question] What is an example of recent, tangible progress in AI safety research?

Aaron Gertler 🔸 14 Jun 2021 5:29 UTC
35 points
4 comments · 1 min read · EA link

Thoughts on short timelines

Tobias_Baumann 23 Oct 2018 15:59 UTC
22 points
14 comments · 5 min read · EA link

Ross Gruetzemacher: Defining and unpacking transformative AI

EA Global 18 Oct 2019 8:22 UTC
9 points
0 comments · 1 min read · EA link
(www.youtube.com)

Clarifications about structural risk from AI

Sam Clarke 18 Jan 2022 12:57 UTC
42 points
3 comments · 4 min read · EA link

Reflections on AI Wisdom, plus announcing Wise AI Wednesdays

Chris Leong 5 Jun 2025 12:16 UTC
11 points
0 comments · 3 min read · EA link

Alignment Faking in Large Language Models

Ryan Greenblatt 18 Dec 2024 17:19 UTC
142 points
9 comments · 10 min read · EA link

AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI

Center for AI Safety 18 Oct 2023 17:03 UTC
16 points
1 comment · 6 min read · EA link
(newsletter.safe.ai)

Yip Fai Tse on animal welfare & AI safety and long termism

Karthik Palakodeti 22 Jun 2023 12:48 UTC
47 points
0 comments · 1 min read · EA link

Safety of Self-Assembled Neuromorphic Hardware

Can Rager 26 Dec 2022 19:10 UTC
8 points
1 comment · 10 min read · EA link

An AI Manhattan Project is Not Inevitable

Maxwell Tabarrok 6 Jul 2024 16:43 UTC
53 points
2 comments · 4 min read · EA link
(www.maximum-progress.com)

Stampy’s AI Safety Info—New Distillations #3 [May 2023]

markov 6 Jun 2023 14:27 UTC
10 points
2 comments · 1 min read · EA link
(aisafety.info)

List of AI safety courses and resources

Daniel del Castillo 6 Sep 2021 14:26 UTC
51 points
8 comments · 1 min read · EA link

AI Could Defeat All Of Us Combined

Holden Karnofsky 10 Jun 2022 23:25 UTC
144 points
14 comments · 17 min read · EA link

AI alignment, A Coherence-Based Protocol (testable)

Adriaan 17 Jun 2025 16:50 UTC
1 point
0 comments · 20 min read · EA link

A New York Times article on AI risk

Eleni_A 6 Sep 2022 0:46 UTC
20 points
0 comments · 1 min read · EA link
(www.nytimes.com)

Relevant pre-AGI possibilities

kokotajlod 20 Jun 2020 13:15 UTC
22 points
0 comments · 1 min read · EA link
(aiimpacts.org)

How Could AI Governance Go Wrong?

HaydnBelfield 26 May 2022 21:29 UTC
40 points
7 comments · 18 min read · EA link

Lessons learned from talking to >100 academics about AI safety

mariushobbhahn 10 Oct 2022 13:16 UTC
138 points
21 comments · 12 min read · EA link

Reducing profit motivations in AI development

Luke Frymire 3 Apr 2023 20:04 UTC
20 points
1 comment · 6 min read · EA link