AI safety

Core TagLast edit: 7 Aug 2024 15:10 UTC by vipulnaik

AI safety is the study of ways to reduce risks posed by artificial intelligence.

Interventions that aim to reduce these risks can be split into:

Technical alignment - research on how to align AI systems with human or moral goals
AI governance - reducing AI risk by e.g. global coordination around regulating AI development or providing incentives for corporations to be more cautious in their AI research
AI forecasting - predicting AI capabilities ahead of time

Reading on why AI might be an existential risk

Hilton, Benjamin (2023) Preventing an AI-related catastrophe, 80000 Hours, March 2023

Cotra, Ajeya (2022) Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover Effective Altruism Forum, July 18

Carlsmith, Joseph (2022) Is Power-Seeking AI an Existential Risk? Arxiv, 16 June

Yudkowsky, Eliezer (2022) AGI Ruin: A List of Lethalities LessWrong, June 5

Ngo et al (2023) The alignment problem from a deep learning perspectiveArxiv, February 23

Arguments against AI safety

AI safety and AI risk is sometimes referred to as a Pascal’s Mugging ^[1], implying that the risks are tiny and that for any stated level of ignorable risk the the payoffs could be exaggerated to force it to still be a top priority. A response to this is that in a survey of 700 ML researchers, the median answer to the “the probability that the long-run effect of advanced AI on humanity will be “extremely bad (e.g., human extinction)” was 5% with, with 48% of respondents giving 10% or higher^[2]. These probabilites are too high (by at least 5 orders of magnitude) to be consider Pascalian.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research a “priority path”—among the most promising career opportunities the organization has identified so far.^[3]^[4] Richard Ngo and Holden Karnofsky also have advice for those interested in working on AI Safety^[5]^[6].

^
https://twitter.com/amasad/status/1632121317146361856 The CEO of Replit, a coding organisation who are involved in ML Tools
^
https://aiimpacts.org/2022-expert-survey-on-progress-in-ai/#Data
^
Todd, Benjamin (2023) The highest impact career paths our research has identified so far, 80,000 Hours, May 12.
^
Hilton, Benjamin (2023) AI safety technical research, 80,000 Hours, June 19th
^
Ngo, Richard (2023) AGI safety career advice, EA Forum, May 2
^
Karnofsky, Holden (2023), Jobs that can help with the most important century, EA Forum, Feb 12

Announcing the Winners of the 2023 Open Philanthropy AI Worldviews Contest

Jason Schukraft30 Sep 2023 3:51 UTC

74 points

30 comments2 min readEA link

High-level hopes for AI alignment

Holden Karnofsky20 Dec 2022 2:11 UTC

123 points

14 comments19 min readEA link

(www.cold-takes.com)

Resources I send to AI researchers about AI safety

Vael Gates11 Jan 2023 1:24 UTC

43 points

0 comments1 min readEA link

AI safety needs to scale, and here’s how you can do it

Esben Kran2 Feb 2024 7:17 UTC

33 points

2 comments5 min readEA link

(apartresearch.com)

Chilean AIS Hackathon Retrospective

Agustín Covarrubias 🔸9 May 2023 1:34 UTC

67 points

0 comments5 min readEA link

FLI open letter: Pause giant AI experiments

Zach Stein-Perlman29 Mar 2023 4:04 UTC

220 points

38 comments2 min readEA link

(futureoflife.org)

Katja Grace: Let’s think about slowing down AI

peterhartree23 Dec 2022 0:57 UTC

84 points

6 comments2 min readEA link

(worldspiritsockpuppet.substack.com)

Fill out this census of everyone interested in reducing catastrophic AI risks

AHT18 May 2024 15:53 UTC

105 points

1 comment1 min readEA link

Announcing AI Safety Bulgaria

Aleksandar Angelov3 Mar 2024 17:53 UTC

16 points

0 comments1 min readEA link

Launching applications for AI Safety Careers Course India 2024

varun_agr1 May 2024 5:30 UTC

23 points

1 comment1 min readEA link

Interested in working from a new Boston AI Safety Hub?

Topaz17 Mar 2025 13:32 UTC

25 points

0 comments2 min readEA link

Metaculus Launches Future of AI Series, Based on Research Questions by Arb

christian13 Mar 2024 21:14 UTC

34 points

0 comments1 min readEA link

(www.metaculus.com)

AI Safety Europe Retreat 2023 Retrospective

Magdalena Wache14 Apr 2023 9:05 UTC

41 points

10 comments2 min readEA link

Announcing the European Network for AI Safety (ENAIS)

Esben Kran22 Mar 2023 17:57 UTC

124 points

3 comments3 min readEA link

Digital sentience funding opportunities: Support for applied work and research

zdgroff28 May 2025 17:35 UTC

115 points

0 comments4 min readEA link

The Shutdown Problem: Incomplete Preferences as a Solution

EJT23 Feb 2024 16:01 UTC

26 points

0 comments42 min readEA link

Predictable updating about AI risk

Joe_Carlsmith8 May 2023 22:05 UTC

135 points

12 comments36 min readEA link

A Qualitative Case for LTFF: Filling Critical Ecosystem Gaps

Linch3 Dec 2024 21:57 UTC

89 points

26 comments9 min readEA link

How Stuart Russels’s IASEAI conference failed to live up to its potential

gergo7 Aug 2025 13:15 UTC

7 points

4 comments2 min readEA link

How AI Takeover Might Happen in Two Years

Joshc7 Feb 2025 23:51 UTC

35 points

7 comments29 min readEA link

(x.com)

Wetware’s Default: A Diagnosis of Systemic Myopia under AI-Driven Autonomy

Ihor Ivliev3 Jul 2025 23:21 UTC

1 point

0 comments7 min readEA link

MIRI’s 2024 End-of-Year Update

RobBensinger3 Dec 2024 4:33 UTC

32 points

7 comments4 min readEA link

Beware Epistemic Collapse

Ben Norman18 Aug 2025 10:44 UTC

29 points

2 comments8 min readEA link

(futuresonder.substack.com)

Why Simulator AIs want to be Active Inference AIs

Jan_Kulveit11 Apr 2023 9:06 UTC

22 points

0 comments8 min readEA link

(www.lesswrong.com)

Long list of AI questions

NunoSempere6 Dec 2023 11:12 UTC

124 points

16 comments86 min readEA link

My cover story in Jacobin on AI capitalism and the x-risk debates

Garrison12 Feb 2024 23:34 UTC

154 points

10 comments6 min readEA link

(jacobin.com)

‘GiveWell for AI Safety’: Lessons learned in a week

Lydia Nottingham30 May 2025 16:10 UTC

45 points

1 comment6 min readEA link

[Linkpost] Statement from Scarlett Johansson on OpenAI’s use of the “Sky” voice, that was shockingly similar to her own voice.

Linch20 May 2024 23:50 UTC

46 points

8 comments1 min readEA link

(variety.com)

Funding case: AI Safety Camp 10

Remmelt12 Dec 2023 9:05 UTC

45 points

13 comments5 min readEA link

(manifund.org)

We are not alone: many communities want to stop Big Tech from scaling unsafe AI

Remmelt22 Sep 2023 17:38 UTC

28 points

30 comments4 min readEA link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

Owen Cotton-Barratt29 Oct 2024 0:02 UTC

37 points

2 comments30 min readEA link

(blog.aiimpacts.org)

Where I Am Donating in 2024

MichaelDickens19 Nov 2024 0:09 UTC

180 points

73 comments46 min readEA link

From Therapy Tool to Alignment Puzzle-Piece: Introducing the VSPE Framework

Astelle Kay18 Jun 2025 14:47 UTC

6 points

1 comment2 min readEA link

 “Near Midnight in Suicide City”

Greg_Colbourn ⏸️ 6 Dec 2024 19:54 UTC

5 points

0 comments1 min readEA link

(www.youtube.com)

AISN #45: Center for AI Safety 2024 Year in Review

Center for AI Safety19 Dec 2024 18:14 UTC

11 points

0 comments4 min readEA link

(newsletter.safe.ai)

AI for Animals 2025 Conference—Get Early Bird Tickets Now

Constance Li20 Nov 2024 0:53 UTC

47 points

0 comments1 min readEA link

Consider granting AIs freedom

Matthew_Barnett6 Dec 2024 0:55 UTC

100 points

38 comments5 min readEA link

A case for donating to AI risk reduction (including if you work in AI)

tlevin2 Dec 2024 19:05 UTC

118 points

5 comments3 min readEA link

Transformative AI and Animals: Animal Advocacy Under A Post-Work Society

Kevin Xia 🔸25 May 2025 18:32 UTC

64 points

1 comment8 min readEA link

Announcing the Q1 2025 Long-Term Future Fund grant round

Linch20 Dec 2024 2:17 UTC

53 points

12 comments2 min readEA link

Donation recommendations for xrisk + ai safety

vincentweisser6 Feb 2023 21:25 UTC

17 points

11 comments1 min readEA link

Please vote for PauseAI US in the Donation Election!

Holly Elmore ⏸️ 🔸22 Nov 2024 4:12 UTC

21 points

3 comments2 min readEA link

Evolution provides no evidence for the sharp left turn

Quintin Pope11 Apr 2023 18:48 UTC

43 points

2 comments15 min readEA link

Rebooting the Singularity

cdkg16 Jul 2025 18:27 UTC

44 points

5 comments1 min readEA link

(philpapers.org)

Symbiosis, not alignment, as the goal for liberal democracies in the transition to artificial general intelligence

simonfriederich17 Mar 2023 13:04 UTC

18 points

2 comments24 min readEA link

(rdcu.be)

Impact of Quantization on Small Language Models (SLMs) for Multilingual Mathematical Reasoning Tasks

Angie Paola Giraldo7 May 2025 21:48 UTC

11 points

0 comments14 min readEA link

[Question] Seeking suggested readings & videos for a new course on ‘AI and Psychology’

Geoffrey Miller20 May 2024 17:45 UTC

32 points

8 comments1 min readEA link

Four mindset disagreements behind existential risk disagreements in ML

RobBensinger11 Apr 2023 4:53 UTC

61 points

2 comments9 min readEA link

Is AI sentience already a reality?

S1 Jun 2025 2:23 UTC

4 points

2 comments1 min readEA link

Preventing an AI-related catastrophe—Problem profile

Benjamin Hilton29 Aug 2022 18:49 UTC

138 points

18 comments4 min readEA link

(80000hours.org)

Sam Altman and the Crossroads of AI Power: Can We Trust the Future We’re Building?

Kayode Adekoya23 May 2025 15:39 UTC

0 points

0 comments1 min readEA link

Deceptive Alignment is <1% Likely by Default

DavidW21 Feb 2023 15:07 UTC

54 points

26 comments14 min readEA link

Navigating the New Reality in DC: An EIP Primer

IanDavidMoss20 Dec 2024 16:59 UTC

26 points

1 comment13 min readEA link

(effectiveinstitutionsproject.substack.com)

Cosmic AI safety

Magnus Vinding6 Dec 2024 22:32 UTC

24 points

5 comments6 min readEA link

Against Aschenbrenner: How ‘Situational Awareness’ constructs a narrative that undermines safety and threatens humanity

GideonF15 Jul 2024 16:21 UTC

238 points

22 comments21 min readEA link

The Choice Transition

Owen Cotton-Barratt18 Nov 2024 12:32 UTC

49 points

1 comment15 min readEA link

(strangecities.substack.com)

AI alignment researchers may have a comparative advantage in reducing s-risks

Lukas_Gloor15 Feb 2023 13:01 UTC

79 points

5 comments13 min readEA link

Vael Gates: Risks from Highly-Capable AI (March 2023)

Vael Gates1 Apr 2023 20:54 UTC

31 points

4 comments1 min readEA link

(docs.google.com)

AISafety.info “How can I help?” FAQ

StevenKaas5 Jun 2023 22:09 UTC

48 points

1 comment2 min readEA link

Project ideas: Epistemics

Lukas Finnveden4 Jan 2024 7:26 UTC

43 points

1 comment17 min readEA link

(www.forethought.org)

New Business Wars podcast season on Sam Altman and OpenAI

Eevee🔹2 Apr 2024 6:22 UTC

10 points

0 comments1 min readEA link

(wondery.com)

1-year update on impactRIO, the first AI Safety group in Brazil

João Lucas Duim28 Jun 2024 10:59 UTC

56 points

2 comments10 min readEA link

Request to AGI organizations: Share your views on pausing AI progress

Akash11 Apr 2023 17:30 UTC

85 points

1 comment1 min readEA link

Some Things I Heard about AI Governance at EAG

utilistrutil28 Feb 2023 21:27 UTC

35 points

5 comments6 min readEA link

AI Risk & Policy Forecasts from Metaculus & FLI’s AI Pathways Workshop

Will Aldred16 May 2023 8:53 UTC

41 points

0 comments8 min readEA link

INTELLECT-1 Release: The First Globally Trained 10B Parameter Model

Matrice Jacobine29 Nov 2024 23:03 UTC

2 points

1 comment1 min readEA link

(www.primeintellect.ai)

Anti-‘FOOM’ (stop trying to make your cute pet name the thing)

david_reinstein14 Apr 2023 16:05 UTC

41 points

17 comments2 min readEA link

[Question] Why hasn’t there been any significant AI protest

sammyboiz🔸17 May 2024 2:59 UTC

21 points

14 comments1 min readEA link

Two important recent AI Talks- Gebru and Lazar

GideonF6 Mar 2023 1:30 UTC

−7 points

5 comments1 min readEA link

Sam Altman returning as OpenAI CEO “in principle”

Fermi–Dirac Distribution22 Nov 2023 6:15 UTC

55 points

37 comments1 min readEA link

Which incentives should be used to encourage compliance with UK AI legislation?

jcw18 Nov 2024 18:13 UTC

12 points

0 comments12 min readEA link

A short conversation I had with Google Gemini on the dangers of unregulated LLM API use, while mildly drunk in an airport.

EvanMcCormick17 Dec 2024 12:25 UTC

1 point

0 comments8 min readEA link

To the Bat Mobile!! My Mid-Career Transition into AI Safety

Moneer7 Nov 2024 15:59 UTC

17 points

0 comments3 min readEA link

OpenAI introduces function calling for GPT-4

mic20 Jun 2023 1:58 UTC

26 points

0 comments4 min readEA link

(openai.com)

AI doing philosophy = AI generating hands?

Wei Dai15 Jan 2024 9:04 UTC

68 points

7 comments3 min readEA link

AI Safety Action Plan—A report commissioned by the US State Department

Agustín Covarrubias 🔸11 Mar 2024 22:13 UTC

25 points

1 comment1 min readEA link

(www.gladstone.ai)

AI-nuclear integration: evidence of automation bias from humans and LLMs [research summary]

Tao27 Apr 2024 21:59 UTC

17 points

2 comments12 min readEA link

Announcing ForecastBench, a new benchmark for AI and human forecasting abilities

Forecasting Research Institute1 Oct 2024 12:31 UTC

20 points

1 comment3 min readEA link

(arxiv.org)

Joining the Carnegie Endowment for International Peace

Holden Karnofsky29 Apr 2024 15:45 UTC

228 points

14 comments2 min readEA link

Partial value takeover without world takeover

Katja_Grace18 Apr 2024 3:00 UTC

24 points

2 comments3 min readEA link

NYT: Google will ‘recalibrate’ the risk of releasing AI due to competition with OpenAI

Michael Huang22 Jan 2023 2:13 UTC

173 points

8 comments1 min readEA link

(www.nytimes.com)

Data Taxation: A Proposal for Slowing Down AGI Progress

Per Ivar Friborg11 Apr 2023 17:27 UTC

42 points

6 comments12 min readEA link

[Linkpost] 538 Politics Podcast on AI risk & politics

jackva11 Apr 2023 17:03 UTC

64 points

5 comments1 min readEA link

(fivethirtyeight.com)

Corporate campaigns work: a key learning for AI Safety

Jamie_Harris17 Aug 2023 21:35 UTC

72 points

12 comments6 min readEA link

Hooray for stepping out of the limelight

So8res1 Apr 2023 2:45 UTC

103 points

0 comments1 min readEA link

In favour of exploring nagging doubts about x-risk

Owen Cotton-Barratt25 Jun 2024 23:52 UTC

90 points

15 comments2 min readEA link

The “low-hanging fruits” of AI safety

Julian Nalenz19 Dec 2024 13:38 UTC

−1 points

0 comments6 min readEA link

(blog.hermesloom.org)

Ten arguments that AI is an existential risk

Katja_Grace14 Aug 2024 21:51 UTC

30 points

0 comments7 min readEA link

2021 AI Alignment Literature Review and Charity Comparison

Larks23 Dec 2021 14:06 UTC

176 points

18 comments73 min readEA link

Analogy Bank for AI Safety

utilistrutil29 Jan 2024 2:35 UTC

14 points

5 comments8 min readEA link

NIST Seeks Comments On “Safety Considerations for Chemical and/or Biological AI Models”

Dylan Richardson26 Oct 2024 18:28 UTC

15 points

0 comments1 min readEA link

(www.federalregister.gov)

Did Bengio and Tegmark lose a debate about AI x-risk against LeCun and Mitchell?

Karl von Wendt25 Jun 2023 16:59 UTC

80 points

24 comments7 min readEA link

Funding AI Safety political advocacy in the US: Individual donors and small donations may be especially helpful

Holly Elmore ⏸️ 🔸14 Nov 2023 23:14 UTC

64 points

8 comments1 min readEA link

Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope

Greg_Colbourn ⏸️ 12 Oct 2023 11:24 UTC

76 points

85 comments9 min readEA link

Merger of DeepMind and Google Brain

Greg_Colbourn ⏸️ 20 Apr 2023 20:16 UTC

11 points

12 comments1 min readEA link

(blog.google)

Announcing the AI Fables Writing Contest!

Daystar Eld12 Jul 2023 3:04 UTC

76 points

52 comments3 min readEA link

Slim overview of work one could do to make AI go better (and a grab-bag of other career considerations)

Chi20 Mar 2024 23:17 UTC

34 points

1 comment3 min readEA link

AI Safety Impact Markets: Your Charity Evaluator for AI Safety

Dawn Drescher1 Oct 2023 10:47 UTC

28 points

4 comments6 min readEA link

(impactmarkets.substack.com)

Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure

ThomasCederborg3 Oct 2024 0:05 UTC

6 points

1 comment12 min readEA link

2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact)

yanni kyriacos1 Aug 2024 1:14 UTC

19 points

11 comments8 min readEA link

Among the A.I. Doomsayers—The New Yorker

Agustín Covarrubias 🔸11 Mar 2024 21:12 UTC

66 points

0 comments1 min readEA link

(www.newyorker.com)

AI alignment, human alignment, oh my

MilesW31 Oct 2024 3:23 UTC

−12 points

0 comments2 min readEA link

Claude Doesn’t Want to Die

Garrison5 Mar 2024 6:00 UTC

22 points

14 comments10 min readEA link

(garrisonlovely.substack.com)

The Tech Industry is the Biggest Blocker to Meaningful AI Safety Regulations

Garrison16 Aug 2024 19:37 UTC

140 points

8 comments8 min readEA link

(garrisonlovely.substack.com)

Cyborg Periods: There will be multiple AI transitions

Jan_Kulveit22 Feb 2023 16:09 UTC

68 points

1 comment6 min readEA link

An AI crash is our best bet for restricting AI

Remmelt11 Oct 2024 2:12 UTC

20 points

3 comments1 min readEA link

But why would the AI kill us?

So8res17 Apr 2023 19:38 UTC

45 points

3 comments3 min readEA link

Future Matters #8: Bing Chat, AI labs on safety, and pausing Future Matters

Pablo21 Mar 2023 14:50 UTC

81 points

5 comments24 min readEA link

Whether you should do a PhD doesn’t depend much on timelines.

alex lawsen22 Mar 2023 12:25 UTC

67 points

7 comments4 min readEA link

Try to solve the hard parts of the alignment problem

MikhailSamin11 Jul 2023 17:02 UTC

8 points

0 comments5 min readEA link

[Question] What’s the best way to get a sense of the day-to-day activities of different researchers/research directions? (AI Governance)

Luise27 May 2024 12:48 UTC

15 points

1 comment1 min readEA link

EU policymakers reach an agreement on the AI Act

tlevin15 Dec 2023 6:03 UTC

109 points

13 comments7 min readEA link

Problem-solving tasks in Graph Theory for language models

Bruno López Orozco1 Oct 2024 12:36 UTC

21 points

1 comment9 min readEA link

Dario Amodei — Machines of Loving Grace

Matrice Jacobine11 Oct 2024 21:39 UTC

66 points

0 comments1 min readEA link

(darioamodei.com)

AI strategy given the need for good reflection

Owen Cotton-Barratt18 Mar 2024 0:48 UTC

40 points

1 comment5 min readEA link

[Question] What is the current most representative EA AI x-risk argument?

Matthew_Barnett15 Dec 2023 22:04 UTC

117 points

50 comments3 min readEA link

[Question] Can we train AI so that future philanthropy is more effective?

Ricardo Pimentel3 Nov 2024 15:08 UTC

3 points

0 comments1 min readEA link

My lab’s small AI safety agenda

Jobst Heitzig (vodle.it)18 Jun 2023 12:29 UTC

59 points

26 comments3 min readEA link

Train for incorrigibility, then reverse it (Shutdown Problem Contest Submission)

Daniel_Eth18 Jul 2023 8:26 UTC

16 points

0 comments2 min readEA link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

Otto15 Nov 2024 13:56 UTC

12 points

6 comments3 min readEA link

(time.com)

Deconfusing Pauses: Long Term Moratorium vs Slowing AI

GideonF4 Aug 2024 11:32 UTC

17 points

3 comments5 min readEA link

“Aligned with who?” Results of surveying 1,000 US participants on AI values

Holly Morgan21 Mar 2023 22:07 UTC

41 points

0 comments2 min readEA link

(www.lesswrong.com)

[Linkpost] Given Extinction Worries, Why Don’t AI Researchers Quit? Well, Several Reasons

Daniel_Eth6 Jun 2023 7:31 UTC

25 points

6 comments1 min readEA link

(medium.com)

AISC 2024 - Project Summaries

Nicky Pochinkov27 Nov 2023 22:35 UTC

13 points

1 comment18 min readEA link

[Question] Concrete, existing examples of high-impact risks from AI?

freedomandutility15 Apr 2023 22:19 UTC

9 points

1 comment1 min readEA link

FLI report: Policymaking in the Pause

Zach Stein-Perlman15 Apr 2023 17:01 UTC

29 points

4 comments1 min readEA link

(futureoflife.org)

Coordination by common knowledge to prevent uncontrollable AI

Karl von Wendt14 May 2023 13:37 UTC

14 points

0 comments9 min readEA link

AI Safety Camp 10

Robert Kralisch26 Oct 2024 11:36 UTC

15 points

0 comments18 min readEA link

(www.lesswrong.com)

Why some people disagree with the CAIS statement on AI

David_Moss15 Aug 2023 13:39 UTC

144 points

15 comments16 min readEA link

A freshman year during the AI midgame: my approach to the next year

Buck14 Apr 2023 0:38 UTC

179 points

30 comments7 min readEA link

Sentience Institute 2021 End of Year Summary

Ali26 Nov 2021 14:40 UTC

66 points

5 comments6 min readEA link

(www.sentienceinstitute.org)

Current UK government levers on AI development

rosehadshar10 Apr 2023 13:16 UTC

82 points

3 comments4 min readEA link

 Jan Leike: “I’m excited to join @AnthropicAI to continue the superalignment mission!”

defun 🔸28 May 2024 18:08 UTC

35 points

11 comments1 min readEA link

(x.com)

Videos on the world’s most pressing problems, by 80,000 Hours

Bella21 Mar 2024 20:18 UTC

63 points

5 comments2 min readEA link

Bounty for Evidence on Some of Palisade Research’s Beliefs

bwr23 Sep 2024 20:05 UTC

5 points

0 comments1 min readEA link

Breakthrough in AI agents? (On Devin—The Zvi, linkpost)

SiebeRozendal20 Mar 2024 9:43 UTC

16 points

9 comments1 min readEA link

(thezvi.substack.com)

Operationalizing timelines

Zach Stein-Perlman10 Mar 2023 17:30 UTC

30 points

2 comments3 min readEA link

When “human-level” is the wrong threshold for AI

Ben Millwood🔸22 Jun 2024 14:34 UTC

38 points

3 comments7 min readEA link

Project ideas: Backup plans & Cooperative AI

Lukas Finnveden4 Jan 2024 7:26 UTC

25 points

2 comments13 min readEA link

(www.forethought.org)

The market plausibly expects AI software to create trillions of dollars of value by 2027

Benjamin_Todd6 May 2024 5:16 UTC

88 points

19 comments1 min readEA link

(benjamintodd.substack.com)

Public Weights?

Jeff Kaufman 🔸2 Nov 2023 2:51 UTC

20 points

7 comments3 min readEA link

Shaping Policies for Ethical AI Development in Africa

Kuiyaki16 May 2024 14:15 UTC

3 points

0 comments1 min readEA link

AI Can Help Animal Advocacy More Than It Can Help Industrial Farming

Wladimir J. Alonso26 Nov 2024 9:55 UTC

23 points

10 comments4 min readEA link

AGI Catastrophe and Takeover: Some Reference Class-Based Priors

zdgroff24 May 2023 19:14 UTC

95 points

10 comments6 min readEA link

AI Winter Season at EA Hotel

CEEALAR25 Sep 2024 13:36 UTC

57 points

2 comments1 min readEA link

AGI safety career advice

richard_ngo2 May 2023 7:36 UTC

211 points

20 comments13 min readEA link

Non-alignment project ideas for making transformative AI go well

Lukas Finnveden4 Jan 2024 7:23 UTC

66 points

1 comment3 min readEA link

(www.forethought.org)

How to help crucial AI safety legislation pass with 10 minutes of effort

ThomasW11 Sep 2024 19:14 UTC

258 points

33 comments3 min readEA link

Trendlines in AIxBio evals

ljusten31 Oct 2024 0:09 UTC

40 points

2 comments11 min readEA link

(www.lennijusten.com)

AI safety starter pack

mariushobbhahn28 Mar 2022 16:05 UTC

128 points

13 comments6 min readEA link

How I failed to form views on AI safety

Ada-Maaria Hyvärinen17 Apr 2022 11:05 UTC

213 points

72 comments40 min readEA link

What new x- or s-risk fieldbuilding organisations would you like to see? An EOI form. (FBB #3)

gergo17 Feb 2025 12:37 UTC

32 points

3 comments2 min readEA link

Filling the Void: A Comprehensive Database for AI Risks Materials

J.A.M.28 May 2024 16:03 UTC

10 points

1 comment4 min readEA link

Should AI X-Risk Worriers Short the Market?

postlibertarian4 Nov 2024 16:16 UTC

14 points

1 comment6 min readEA link

My favorite AI governance research this year so far

Zach Stein-Perlman23 Jul 2023 22:00 UTC

81 points

4 comments7 min readEA link

(blog.aiimpacts.org)

Apply to the Cavendish Labs Fellowship (by 4/15)

Derik K3 Apr 2023 23:06 UTC

35 points

2 comments1 min readEA link

Executive Director for AIS Brussels—Expression of interest

gergo19 Dec 2024 9:15 UTC

29 points

0 comments4 min readEA link

Announcing the CLR Foundations Course and CLR S-Risk Seminars

James Faville19 Nov 2024 1:18 UTC

52 points

2 comments3 min readEA link

Manifund x AI Worldviews

Austin31 Mar 2023 15:32 UTC

32 points

2 comments2 min readEA link

(manifund.org)

Mentorship in AGI Safety (MAGIS)

Joe Rogero23 May 2024 18:34 UTC

11 points

1 comment2 min readEA link

Large Language Models as Fiduciaries to Humans

johnjnay24 Jan 2023 19:53 UTC

25 points

0 comments34 min readEA link

(papers.ssrn.com)

[SEE NEW EDITS] No, You Need to Write Clearer

Nicholas Kross29 Apr 2023 5:04 UTC

71 points

8 comments5 min readEA link

(www.thinkingmuchbetter.com)

The Compendium, A full argument about extinction risk from AGI

adamShimi31 Oct 2024 12:02 UTC

9 points

1 comment2 min readEA link

(www.thecompendium.ai)

Partial Transcript of Recent Senate Hearing Discussing AI X-Risk

Daniel_Eth27 Jul 2023 9:16 UTC

150 points

2 comments22 min readEA link

(medium.com)

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotal14 Jan 2024 15:03 UTC

56 points

2 comments21 min readEA link

(titotal.substack.com)

Please don’t criticize EAs who “sell out” to OpenAI and Anthropic

Eevee🔹5 Mar 2023 21:17 UTC

−4 points

21 comments2 min readEA link

Interactive AI Governance Map

Hamish McDoodles12 Mar 2024 10:02 UTC

67 points

8 comments1 min readEA link

AIS Hungary is hiring a part-time Technical Lead! (Deadline: Dec 31st)

gergo17 Dec 2024 14:08 UTC

9 points

0 comments2 min readEA link

Why AGI systems will not be fanatical maximisers (unless trained by fanatical humans)

titotal17 May 2023 11:58 UTC

43 points

3 comments15 min readEA link

AI stocks could crash. And that could have implications for AI safety

Benjamin_Todd9 May 2024 7:23 UTC

173 points

41 comments4 min readEA link

(benjamintodd.substack.com)

Solving adversarial attacks in computer vision as a baby version of general AI alignment

Stanislav Fort31 Aug 2024 16:15 UTC

3 points

1 comment7 min readEA link

Brain-computer interfaces and brain organoids in AI alignment?

freedomandutility15 Apr 2023 22:28 UTC

8 points

2 comments1 min readEA link

Disrupting malicious uses of AI by state-affiliated threat actors

Agustín Covarrubias 🔸14 Feb 2024 21:28 UTC

22 points

1 comment1 min readEA link

(openai.com)

The Cruel Trade-Off Between AI Misuse and AI X-risk Concerns

simeon_c22 Apr 2023 13:49 UTC

27 points

17 comments2 min readEA link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhub12 Jan 2024 19:51 UTC

65 points

0 comments3 min readEA link

(arxiv.org)

2024: a year of consolidation for ORCG

JorgeTorresC18 Dec 2024 17:47 UTC

33 points

0 comments7 min readEA link

(www.orcg.info)

Agentic Alignment: Navigating between Harm and Illegitimacy

LennardZ26 Nov 2024 21:27 UTC

2 points

1 comment9 min readEA link

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

Marc Carauleanu18 Dec 2023 21:13 UTC

21 points

0 comments12 min readEA link

[MLSN #8]: Mechanistic interpretability, using law to inform AI alignment, scaling laws for proxy gaming

TW12320 Feb 2023 16:06 UTC

25 points

0 comments4 min readEA link

(newsletter.mlsafety.org)

AI Risk US Presidental Candidate

Simon Berens11 Apr 2023 20:18 UTC

12 points

8 comments1 min readEA link

How to Give Coming AGI’s the Best Chance of Figuring Out Ethics for Us

Sean Sweeney23 May 2024 19:44 UTC

1 point

1 comment10 min readEA link

How to Address EA Dilemmas – What is Missing from EA Values?

alexis schoenlaub13 Oct 2024 9:33 UTC

7 points

4 comments6 min readEA link

[Linkpost] AI Alignment, Explained in 5 Points (updated)

Daniel_Eth18 Apr 2023 8:09 UTC

31 points

2 comments1 min readEA link

(medium.com)

Critiques of prominent AI safety labs: Redwood Research

Omega31 Mar 2023 8:58 UTC

339 points

91 comments20 min readEA link

Details on how an IAEA-style AI regulator would function?

freedomandutility3 Jun 2023 12:03 UTC

12 points

5 comments1 min readEA link

Arkose: Organizational Updates & Ways to Get Involved

Arkose1 Aug 2024 13:03 UTC

28 points

1 comment1 min readEA link

Success without dignity: a nearcasting story of avoiding catastrophe by luck

Holden Karnofsky15 Mar 2023 20:17 UTC

113 points

3 comments15 min readEA link

Counting arguments provide no evidence for AI doom

Nora Belrose27 Feb 2024 23:03 UTC

84 points

15 comments14 min readEA link

Preventing AI Misuse: State of the Art Research and its Flaws

Madhav Malhotra23 Apr 2023 10:50 UTC

24 points

2 comments11 min readEA link

Current paths to impact in EU AI Policy (Feb ’24)

JOMG_Monnet12 Feb 2024 15:57 UTC

47 points

0 comments5 min readEA link

Is effective altruism really to blame for the OpenAI debacle?

Garrison23 Nov 2023 0:44 UTC

13 points

0 comments1 min readEA link

(garrisonlovely.substack.com)

Open-Source AI: A Regulatory Review

Elliot Mckernon29 Apr 2024 10:10 UTC

14 points

1 comment8 min readEA link

[Question] How does AI progress affect other EA cause areas?

Luis Mota Freitas9 Jun 2023 12:43 UTC

96 points

13 comments1 min readEA link

Standard policy frameworks for AI governance

Nathan_Barnard30 Jan 2024 18:14 UTC

26 points

2 comments3 min readEA link

(How) Is technical AI Safety research being evaluated?

JohnSnow11 Jul 2023 9:37 UTC

27 points

1 comment1 min readEA link

DeepMind: Frontier Safety Framework

Zach Stein-Perlman17 May 2024 17:30 UTC

23 points

0 comments3 min readEA link

(deepmind.google)

[Question] Why haven’t we been destroyed by a power-seeking AGI from elsewhere in the universe?

Jadon Schmitt22 Jul 2023 7:21 UTC

35 points

14 comments1 min readEA link

A great talk for AI noobs (according to an AI noob)

Dov23 Apr 2023 5:32 UTC

8 points

0 comments1 min readEA link

(www.youtube.com)

[linkpost] “What Are Reasonable AI Fears?” by Robin Hanson, 2023-04-23

Arjun Panickssery14 Apr 2023 23:26 UTC

41 points

3 comments4 min readEA link

(quillette.com)

In DC, a new wave of AI lobbyists gains the upper hand

Chris Leong13 May 2024 7:31 UTC

97 points

7 comments1 min readEA link

(www.politico.com)

Bringing about animal-inclusive AI

Max Taylor18 Dec 2023 11:49 UTC

135 points

9 comments16 min readEA link

Raising the voices that actually count

Kim Holder13 Jun 2023 19:21 UTC

2 points

3 comments2 min readEA link

Technology is Power: Raising Awareness Of Technological Risks

Marc Wong9 Feb 2023 15:13 UTC

3 points

0 comments2 min readEA link

If you are too stressed, walk away from the front lines

Neil Warren12 Jun 2023 21:01 UTC

7 points

2 comments4 min readEA link

Announcing Human-aligned AI Summer School

Jan_Kulveit22 May 2024 8:55 UTC

33 points

0 comments1 min readEA link

(humanaligned.ai)

Possible OpenAI’s Q* breakthrough and DeepMind’s AlphaGo-type systems plus LLMs

Burnydelic23 Nov 2023 7:02 UTC

13 points

4 comments2 min readEA link

Modelling large-scale cyber attacks from advanced AI systems with Advanced Persistent Threats

Iyngkarran Kumar2 Oct 2023 9:54 UTC

28 points

2 comments30 min readEA link

Help the UN design global governance structures for AI

Joanna (Asia) Wiaterek12 Jan 2024 8:44 UTC

72 points

2 comments1 min readEA link

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

Oliver Z18 Apr 2023 18:36 UTC

56 points

1 comment4 min readEA link

(newsletter.safe.ai)

AI-Relevant Regulation: IAEA

SWK15 Jul 2023 18:20 UTC

10 points

0 comments5 min readEA link

Designing Artificial Wisdom: Decision Forecasting AI & Futarchy

Jordan Arel14 Jul 2024 5:10 UTC

5 points

1 comment6 min readEA link

This might be the last AI Safety Camp

Remmelt24 Jan 2024 9:29 UTC

87 points

32 comments1 min readEA link

Deep Deceptiveness

So8res21 Mar 2023 2:51 UTC

40 points

1 comment14 min readEA link

AISN#15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2

Center for AI Safety19 Jul 2023 1:40 UTC

5 points

0 comments6 min readEA link

(newsletter.safe.ai)

Reasons to have hope

Jordan Pieters 🔸20 Apr 2023 10:19 UTC

53 points

4 comments1 min readEA link

[Question] How independent is the research coming out of OpenAI’s preparedness team?

Earthling10 Feb 2024 16:59 UTC

18 points

0 comments1 min readEA link

AISN #35: Lobbying on AI Regulation Plus, New Models from OpenAI and Google, and Legal Regimes for Training on Copyrighted Data

Center for AI Safety16 May 2024 14:26 UTC

14 points

0 comments6 min readEA link

(newsletter.safe.ai)

ChatGPT: towards AI subjectivity

KrisDAmato1 May 2024 10:13 UTC

3 points

0 comments1 min readEA link

(link.springer.com)

Explorers in a virtual country: Navigating the knowledge landscape of large language models

Alexander Saeri28 Mar 2023 21:32 UTC

17 points

1 comment6 min readEA link

Now THIS is forecasting: understanding Epoch’s Direct Approach

Elliot Mckernon4 May 2024 12:06 UTC

52 points

2 comments19 min readEA link

Paradigms and Theory Choice in AI: Adaptivity, Economy and Control

particlemania28 Aug 2023 22:44 UTC

3 points

0 comments16 min readEA link

A fictional AI law laced w/ alignment theory

Miguel17 Jul 2023 3:26 UTC

3 points

0 comments2 min readEA link

OpenAI’s Superalignment team has opened Fast Grants

Yadav16 Dec 2023 15:41 UTC

31 points

2 comments1 min readEA link

(openai.com)

Cooperative AI: Three things that confused me as a beginner (and my current understanding)

C Tilli16 Apr 2024 7:06 UTC

58 points

10 comments6 min readEA link

A Viral License for AI Safety

IvanVendrov5 Jun 2021 2:00 UTC

30 points

6 comments5 min readEA link

UK Foundation Model Task Force—Expression of Interest

ojorgensen18 Jun 2023 9:40 UTC

111 points

3 comments1 min readEA link

(twitter.com)

AGI development role-playing game

rekahalasz11 Dec 2023 10:22 UTC

4 points

0 comments1 min readEA link

AI-Relevant Regulation: Insurance in Safety-Critical Industries

SWK22 Jul 2023 17:52 UTC

5 points

0 comments6 min readEA link

[Linkpost] Longtermists Are Pushing a New Cold War With China

Radical Empath Ismam27 May 2023 6:53 UTC

38 points

16 comments1 min readEA link

(jacobin.com)

Catastrophic Risks from Unsafe AI: Navigating a Tightrope Scenario (Ben Garfinkel, EAG London 2023)

Alexander Saeri2 Jun 2023 9:59 UTC

19 points

1 comment10 min readEA link

[Linkpost] OpenAI leaders call for regulation of “superintelligence” to reduce existential risk.

Lowe Lundin25 May 2023 14:14 UTC

5 points

0 comments1 min readEA link

Should you work at a leading AI lab? (including in non-safety roles)

Benjamin Hilton25 Jul 2023 16:28 UTC

38 points

13 comments12 min readEA link

Draghi’s report signal a less safety-focused European Union on AI

t6aguirre9 Sep 2024 18:39 UTC

17 points

3 comments1 min readEA link

Episode: Austin vs Linch on OpenAI

Austin25 May 2024 16:15 UTC

21 points

2 comments44 min readEA link

(manifund.substack.com)

[Question] Why is learning economics, psychology, sociology important for preventing AI risks?

jackchang1103 Nov 2023 21:48 UTC

3 points

0 comments1 min readEA link

AI-Relevant Regulation: CERN

SWK15 Jul 2023 18:40 UTC

12 points

0 comments6 min readEA link

What can we do now to prepare for AI sentience, in order to protect them from the global scale of human sadism?

rime18 Apr 2023 9:58 UTC

44 points

0 comments2 min readEA link

How to pursue a career in technical AI alignment

Charlie Rogers-Smith4 Jun 2022 21:36 UTC

268 points

9 comments39 min readEA link

Alignment, Goals, & The Gut-Head Gap: A Review of Ngo. et al

Violet Hour11 May 2023 17:16 UTC

26 points

0 comments13 min readEA link

What does Bing Chat tell us about AI risk?

Holden Karnofsky28 Feb 2023 18:47 UTC

99 points

8 comments2 min readEA link

(www.cold-takes.com)

What’s new at FAR AI

AdamGleave4 Dec 2023 21:18 UTC

68 points

0 comments5 min readEA link

(far.ai)

What AI companies can do today to help with the most important century

Holden Karnofsky20 Feb 2023 17:40 UTC

104 points

8 comments11 min readEA link

(www.cold-takes.com)

Mapping How Alliances, Acquisitions, and Antitrust are Shaping the Frontier AI Industry

t6aguirre3 Jun 2024 9:43 UTC

24 points

1 comment2 min readEA link

LLMs won’t lead to AGI—Francois Chollet

tobycrisford 🔸11 Jun 2024 20:19 UTC

38 points

23 comments1 min readEA link

(www.youtube.com)

AI Incident Reporting: A Regulatory Review

Deric Cheng11 Mar 2024 21:02 UTC

10 points

1 comment6 min readEA link

Discussing AI-Human Collaboration Through Fiction: The Story of Laika and GPT-∞

Laika27 Jul 2023 6:04 UTC

1 point

0 comments1 min readEA link

Misalignment Museum opens in San Francisco: ‘Sorry for killing most of humanity’

Michael Huang4 Mar 2023 7:09 UTC

99 points

6 comments1 min readEA link

(www.misalignmentmuseum.com)

AI Wellbeing

Simon 11 Jul 2023 0:34 UTC

11 points

0 comments9 min readEA link

What we’re missing: the case for structural risks from AI

Justin Olive9 Nov 2023 5:52 UTC

31 points

3 comments6 min readEA link

‘The AI Dilemma: Growth vs Existential Risk’: An Extension for EAs and a Summary for Non-economists

TomHoulden21 Apr 2024 16:28 UTC

66 points

1 comment16 min readEA link

[Question] Who is testing AI Safety public outreach messaging?

yanni kyriacos15 Apr 2023 0:53 UTC

20 points

2 comments1 min readEA link

Why Would AI “Aim” To Defeat Humanity?

Holden Karnofsky29 Nov 2022 18:59 UTC

24 points

0 comments32 min readEA link

(www.cold-takes.com)

I am unable to get any AI safety related fellowships or internships.

Aavishkar11 Mar 2024 5:00 UTC

5 points

6 comments1 min readEA link

What to think when a language model tells you it’s sentient

rgb20 Feb 2023 2:59 UTC

112 points

18 comments6 min readEA link

Biological superintelligence: a solution to AI safety

Yarrow🔸4 Dec 2023 13:09 UTC

2 points

6 comments1 min readEA link

Research agenda: Supervising AIs improving AIs

Quintin Pope29 Apr 2023 17:09 UTC

16 points

0 comments19 min readEA link

Safe AI and moral AI

William D'Alessandro1 Jun 2023 21:18 UTC

3 points

0 comments11 min readEA link

[Question] Should people get neuroscience phD to work in AI safety field?

jackchang1107 Mar 2023 16:21 UTC

9 points

11 comments1 min readEA link

Pessimism about AI Safety

Max_He-Ho2 Apr 2023 7:57 UTC

5 points

0 comments25 min readEA link

(www.lesswrong.com)

Overview of introductory resources in AI Governance

Lucie Philippon 🔸27 May 2024 16:22 UTC

26 points

1 comment6 min readEA link

(www.lesswrong.com)

AI, Cybersecurity, and Malware: A Shallow Report [Technical]

Madhav Malhotra31 Mar 2023 12:03 UTC

4 points

0 comments9 min readEA link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down

EliezerYudkowsky9 Apr 2023 15:53 UTC

50 points

3 comments12 min readEA link

Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Paul Bricman4 Dec 2023 7:41 UTC

4 points

0 comments16 min readEA link

(arxiv.org)

World and Mind in Artificial Intelligence: arguments against the AI pause

Arturo Macias18 Apr 2023 14:35 UTC

6 points

3 comments5 min readEA link

Observations on the funding landscape of EA and AI safety

Vilhelm Skoglund2 Oct 2023 9:45 UTC

136 points

12 comments15 min readEA link

AI Alignment in The New Yorker

Eleni_A17 May 2023 21:19 UTC

23 points

0 comments1 min readEA link

(www.newyorker.com)

Reframing the burden of proof: Companies should prove that models are safe (rather than expecting auditors to prove that models are dangerous)

Akash25 Apr 2023 18:49 UTC

35 points

1 comment3 min readEA link

(childrenoficarus.substack.com)

Updates from Campaign for AI Safety

Jolyn Khoo19 Jul 2023 8:15 UTC

5 points

0 comments2 min readEA link

(www.campaignforaisafety.org)

Towards evidence gap-maps for AI safety

dEAsign25 Jul 2023 8:13 UTC

6 points

1 comment2 min readEA link

List of projects that seem impactful for AI Governance

JaimeRV14 Jan 2024 16:52 UTC

35 points

2 comments13 min readEA link

AI, Cybersecurity, and Malware: A Shallow Report [General]

Madhav Malhotra31 Mar 2023 12:01 UTC

5 points

0 comments8 min readEA link

Exploring Metaculus’s AI Track Record

Peter Scoblic1 May 2023 21:02 UTC

52 points

5 comments5 min readEA link

14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource

Bryce Robertson21 Jan 2025 17:34 UTC

18 points

2 comments1 min readEA link

AI Progress: The Game Show

Alex Arnett21 Apr 2023 16:47 UTC

3 points

0 comments2 min readEA link

The new UK government’s stance on AI safety

Elliot Mckernon31 Jul 2024 15:23 UTC

19 points

0 comments4 min readEA link

ChatGPT not so clever or not so artificial as hyped to be?

Haris Shekeris2 Mar 2023 6:16 UTC

−7 points

2 comments1 min readEA link

UK AI Bill Analysis & Opinion

CAISID5 Feb 2024 0:12 UTC

18 points

0 comments15 min readEA link

Sentinel minutes for week #52/2024

NunoSempere30 Dec 2024 18:25 UTC

61 points

0 comments6 min readEA link

(blog.sentinel-team.org)

Potential employees have a unique lever to influence the behaviors of AI labs

oxalis18 Mar 2023 20:58 UTC

139 points

1 comment5 min readEA link

AI Safety Newsletter #8: Rogue AIs, how to screen for AI risks, and grants for research on democratic governance of AI

Center for AI Safety30 May 2023 11:44 UTC

16 points

3 comments6 min readEA link

(newsletter.safe.ai)

Orthogonal: A new agent foundations alignment organization

Tamsin Leake19 Apr 2023 20:17 UTC

38 points

0 comments1 min readEA link

(orxl.org)

Help us find pain points in AI safety

Esben Kran12 Apr 2022 18:43 UTC

31 points

4 comments9 min readEA link

[Question] What am I missing re. open-source LLM’s?

another-anon-do-gooder4 Dec 2023 4:48 UTC

1 point

2 comments1 min readEA link

AI companies are not on track to secure model weights

Jeffrey Ladish18 Jul 2024 15:13 UTC

73 points

3 comments19 min readEA link

GovAI: Towards best practices in AGI safety and governance: A survey of expert opinion

Zach Stein-Perlman15 May 2023 1:42 UTC

68 points

4 comments1 min readEA link

(arxiv.org)

Boomerang—protocol to dissolve some commitment races

Filip Sondej30 May 2023 16:24 UTC

20 points

0 comments8 min readEA link

(www.lesswrong.com)

Primitive Global Discourse Framework, Constitutional AI using legal frameworks, and Monoculture—A loss of control over the role of AGI in society

broptross1 Jun 2023 5:12 UTC

2 points

0 comments12 min readEA link

The Risks of AI-Generated Content on the EA Forum

WobblyPanda24 Jun 2023 5:33 UTC

−1 points

0 comments1 min readEA link

An Introduction to Critiques of prominent AI safety organizations

Omega19 Jul 2023 6:53 UTC

87 points

2 comments5 min readEA link

There is only one goal or drive—only self-perpetuation counts

freest one13 Jun 2023 1:37 UTC

2 points

4 comments8 min readEA link

Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety

ben.smith9 Feb 2024 6:40 UTC

15 points

1 comment1 min readEA link

(www.nist.gov)

Pay to get AI safety info from behind NDA wall?

louisbarclay5 Jun 2024 10:19 UTC

2 points

2 comments1 min readEA link

Pillars to Convergence

Phlobton1 Apr 2023 13:04 UTC

1 point

0 comments8 min readEA link

Is fear productive when communicating AI x-risk? [Study results]

Johanna Roniger22 Jan 2024 5:38 UTC

73 points

10 comments5 min readEA link

Does AI risk “other” the AIs?

Joe_Carlsmith9 Jan 2024 17:51 UTC

23 points

3 comments8 min readEA link

We are fighting a shared battle (a call for a different approach to AI Strategy)

GideonF16 Mar 2023 14:37 UTC

59 points

11 comments15 min readEA link

Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public

Otto9 Mar 2023 10:40 UTC

97 points

11 comments4 min readEA link

An argument for accelerating international AI governance research (part 1)

MattThinks16 Aug 2023 5:40 UTC

10 points

0 comments3 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo30 Aug 2023 5:36 UTC

7 points

0 comments2 min readEA link

(www.campaignforaisafety.org)

Epoch is hiring a Product and Data Visualization Designer

merilalama25 Nov 2023 0:14 UTC

21 points

0 comments4 min readEA link

(careers.rethinkpriorities.org)

[CFP] NeurIPS workshop: AI meets Moral Philosophy and Moral Psychology

jaredlcm4 Sep 2023 6:21 UTC

10 points

1 comment4 min readEA link

The case for more ambitious language model evals

Jozdien30 Jan 2024 9:24 UTC

7 points

0 comments5 min readEA link

Non-trivial Fellowship Project: Towards a Unified Dangerous Capabilities Benchmark

Jord 4 Mar 2024 9:24 UTC

2 points

1 comment9 min readEA link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin Pope21 Mar 2023 1:23 UTC

166 points

21 comments39 min readEA link

“Risk Awareness Moments” (Rams): A concept for thinking about AI governance interventions

oeg14 Apr 2023 17:40 UTC

53 points

0 comments9 min readEA link

AI Policy Insights from the AIMS Survey

Janet Pauketat22 Feb 2024 19:17 UTC

10 points

1 comment18 min readEA link

(www.sentienceinstitute.org)

Claude 3 claims it’s conscious, doesn’t want to die or be modified

MikhailSamin4 Mar 2024 23:05 UTC

8 points

3 comments14 min readEA link

[Question] Predictions for future AI governance?

jackchang1102 Apr 2023 16:43 UTC

4 points

1 comment1 min readEA link

What can superintelligent ANI tell us about superintelligent AGI?

Ted Sanders12 Jun 2023 6:32 UTC

81 points

20 comments5 min readEA link

The basic reasons I expect AGI ruin

RobBensinger18 Apr 2023 3:37 UTC

58 points

13 comments14 min readEA link

World’s first major law for artificial intelligence gets final EU green light

Dane Valerie24 May 2024 14:57 UTC

3 points

1 comment2 min readEA link

(www.cnbc.com)

A note of caution on believing things on a gut level

Nathan_Barnard9 May 2023 12:20 UTC

41 points

5 comments2 min readEA link

October 2022 AI Risk Community Survey Results

Froolow24 May 2023 10:37 UTC

19 points

0 comments7 min readEA link

You don’t need to be a genius to be in AI safety research

Claire Short10 May 2023 22:23 UTC

28 points

4 comments6 min readEA link

An even deeper atheism

Joe_Carlsmith11 Jan 2024 17:28 UTC

26 points

2 comments15 min readEA link

Apply to MATS 7.0!

Ryan Kidd21 Sep 2024 0:23 UTC

27 points

0 comments5 min readEA link

Speedrun: AI Alignment Prizes

joe9 Feb 2023 11:55 UTC

27 points

0 comments18 min readEA link

How can OSINT be used for the enforcement of the EU AI Act?

Kristina7 Jun 2024 11:07 UTC

8 points

1 comment1 min readEA link

The Game of Dominance

Karl von Wendt27 Aug 2023 11:23 UTC

5 points

0 comments6 min readEA link

Beginner’s guide to reducing s-risks [link-post]

Center on Long-Term Risk17 Oct 2023 0:51 UTC

130 points

3 comments3 min readEA link

(longtermrisk.org)

Research Summary: Forecasting with Large Language Models

Damien Laird2 Apr 2023 10:52 UTC

4 points

0 comments7 min readEA link

(damienlaird.substack.com)

The two-tiered society

Roman Leventov13 May 2024 7:53 UTC

14 points

5 comments3 min readEA link

Announcing New Beginner-friendly Book on AI Safety and Risk

Darren McKee25 Nov 2023 15:57 UTC

114 points

9 comments1 min readEA link

Claude 3.5 Sonnet

Zach Stein-Perlman20 Jun 2024 18:00 UTC

31 points

0 comments1 min readEA link

(www.anthropic.com)

Cambridge AI Safety Hub is looking for full- or part-time organisers

hannah15 Jul 2023 14:31 UTC

12 points

0 comments1 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo31 Oct 2023 5:46 UTC

14 points

1 comment2 min readEA link

(www.campaignforaisafety.org)

Weekly newsletter for AI safety events and training programs

Bryce Robertson3 May 2024 0:37 UTC

15 points

0 comments1 min readEA link

(www.lesswrong.com)

Chaining the evil genie: why “outer” AI safety is probably easy

titotal30 Aug 2022 13:55 UTC

40 points

12 comments10 min readEA link

OpenAI board received letter warning of powerful AI

JordanStone23 Nov 2023 0:16 UTC

26 points

2 comments1 min readEA link

(www.reuters.com)

Some initial musing on the politics of longtermist trajectory change

GideonF26 Jun 2025 7:16 UTC

6 points

0 comments12 min readEA link

(futerman.substack.com)

Dangerous capability tests should be harder

Luca Righetti 🔸20 Aug 2024 16:11 UTC

23 points

1 comment5 min readEA link

(www.planned-obsolescence.org)

Australians are concerned about AI risks and expect strong government action

Alexander Saeri8 Mar 2024 6:39 UTC

38 points

12 comments5 min readEA link

(aigovernance.org.au)

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg_Colbourn ⏸️ 2 May 2023 10:17 UTC

68 points

35 comments13 min readEA link

[Question] Would a super-intelligent AI necessarily support its own existence?

Porque?25 Jun 2023 10:39 UTC

8 points

2 comments2 min readEA link

Tort Law Can Play an Important Role in Mitigating AI Risk

Gabriel Weil12 Feb 2024 17:11 UTC

99 points

6 comments5 min readEA link

Existential risk x Crypto: An unconference at Zuzalu

Yesh11 Apr 2023 13:31 UTC

6 points

0 comments1 min readEA link

Poster Session on AI Safety

Neil Crawford12 Nov 2022 3:50 UTC

8 points

0 comments4 min readEA link

MIRI 2024 Mission and Strategy Update

Malo5 Jan 2024 1:10 UTC

154 points

38 comments8 min readEA link

Fixing Insider Threats in the AI Supply Chain

Madhav Malhotra7 Oct 2023 10:49 UTC

9 points

2 comments5 min readEA link

New Artificial Intelligence quiz: can you beat ChatGPT?

AndreFerretti3 Mar 2023 15:46 UTC

29 points

3 comments1 min readEA link

Transformative AI and Compute—Reading List

Frederik Berg4 Sep 2023 6:21 UTC

24 points

0 comments1 min readEA link

(docs.google.com)

Applications Open: Pivotal 2025 Q3 Research Fellowship

Tobias Häberli18 Mar 2025 13:25 UTC

20 points

0 comments2 min readEA link

UNGA Resolution on AI: 5 Key Takeaways Looking to Future Policy

Heramb Podar24 Mar 2024 12:03 UTC

17 points

1 comment3 min readEA link

Risk of AI deceleration.

Micah Zoltu18 Apr 2023 11:19 UTC

9 points

14 comments3 min readEA link

Mitigating extreme AI risks amid rapid progress [Linkpost]

Akash21 May 2024 20:04 UTC

36 points

1 comment4 min readEA link

[Closed] MIT FutureTech are hiring for a Head of Operations role

PeterSlattery2 Oct 2024 16:51 UTC

8 points

0 comments4 min readEA link

Navigating the Open-Source AI Landscape: Data, Funding, and Safety

AndreFerretti12 Apr 2023 10:30 UTC

23 points

3 comments10 min readEA link

[Linkpost] Beware the Squirrel by Verity Harding

Earthling3 Sep 2023 21:04 UTC

1 point

1 comment2 min readEA link

(samf.substack.com)

[Question] Know a grad student studying AI’s economic impacts?

Madhav Malhotra5 Jul 2023 0:07 UTC

7 points

0 comments1 min readEA link

[Question] Do you worry about totalitarian regimes using AI Alignment technology to create AGI that subscribe to their values?

diodio_yang28 Feb 2023 18:12 UTC

25 points

12 comments2 min readEA link

AI Safety Arguments: An Interactive Guide

Lukas Trötzmüller🔸1 Feb 2023 19:21 UTC

32 points

5 comments3 min readEA link

Assessment of AI safety agendas: think about the downside risk

Roman Leventov19 Dec 2023 9:02 UTC

6 points

0 comments1 min readEA link

[Linkpost] Scott Alexander reacts to OpenAI’s latest post

Akash11 Mar 2023 22:24 UTC

105 points

4 comments5 min readEA link

(astralcodexten.substack.com)

An economist’s perspective on AI safety

David Stinson7 Jun 2024 7:55 UTC

7 points

1 comment9 min readEA link

Neuronpedia—AI Safety Game

johnnylin16 Oct 2023 9:35 UTC

9 points

2 comments4 min readEA link

(neuronpedia.org)

My Proven AI Safety Explanation (as a computing student)

Mica White6 Feb 2024 3:58 UTC

8 points

4 comments6 min readEA link

Apply to Aether—Independent LLM Agent Safety Research Group

RohanS21 Aug 2024 9:40 UTC

47 points

13 comments8 min readEA link

Literature review of TAI timelines

Jaime Sevilla27 Jan 2023 20:36 UTC

148 points

10 comments2 min readEA link

(epochai.org)

AI and Work: Summarising a New Literature Review

cpeppiatt15 Jul 2024 10:27 UTC

13 points

0 comments2 min readEA link

(arxiv.org)

Muddling Along Is More Likely Than Dystopia

Jeffrey Heninger21 Oct 2023 9:30 UTC

87 points

3 comments8 min readEA link

(blog.aiimpacts.org)

Digital people could make AI safer

GMcGowan10 Jun 2022 15:29 UTC

25 points

15 comments4 min readEA link

(www.mindlessalgorithm.com)

Summary of Situational Awareness—The Decade Ahead

OscarD🔸8 Jun 2024 11:29 UTC

143 points

5 comments18 min readEA link

What do XPT forecasts tell us about AI risk?

Forecasting Research Institute19 Jul 2023 7:43 UTC

97 points

21 comments14 min readEA link

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Jörg Weiß16 Nov 2023 8:34 UTC

4 points

0 comments29 min readEA link

Automated Parliaments — A Solution to Decision Uncertainty and Misalignment in Language Models

Shak Ragoler2 Oct 2023 9:47 UTC

9 points

0 comments17 min readEA link

Learning societal values from law as part of an AGI alignment strategy

johnjnay21 Oct 2022 2:03 UTC

20 points

1 comment24 min readEA link

Join the AI Evaluation Tasks Bounty Hackathon

Esben Kran18 Mar 2024 8:15 UTC

20 points

0 comments4 min readEA link

Critiques of prominent AI safety labs: Conjecture

Omega12 Jun 2023 5:52 UTC

150 points

83 comments32 min readEA link

How evals might (or might not) prevent catastrophic risks from AI

Akash7 Feb 2023 20:16 UTC

28 points

0 comments9 min readEA link

MIT FutureTech are hiring for an Operations and Project Management role.

PeterSlattery17 May 2024 1:29 UTC

12 points

0 comments3 min readEA link

A simple way of exploiting AI’s coming economic impact may be highly-impactful

kuira16 Jul 2023 10:30 UTC

5 points

0 comments2 min readEA link

(www.lesswrong.com)

Apply to the Cambridge ML for Alignment Bootcamp (CaMLAB) [26 March − 8 April]

hannah9 Feb 2023 16:32 UTC

62 points

1 comment5 min readEA link

Have your say on the future of AI regulation: Deadline approaching for your feedback on UN High-Level Advisory Body on AI Interim Report ‘Governing AI for Humanity’

Deborah W.A. Foulkes29 Mar 2024 6:37 UTC

17 points

1 comment1 min readEA link

How major governments can help with the most important century

Holden Karnofsky24 Feb 2023 19:37 UTC

56 points

4 comments4 min readEA link

(www.cold-takes.com)

The current alignment plan, and how we might improve it | EAG Bay Area 23

Buck7 Jun 2023 21:03 UTC

66 points

0 comments33 min readEA link

(Even) More Early-Career EAs Should Try AI Safety Technical Research

tlevin30 Jun 2022 21:14 UTC

86 points

40 comments11 min readEA link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew Critch19 Apr 2022 20:24 UTC

80 points

10 comments7 min readEA link

The AI Endgame: A counterfactual to AI alignment by an AI Safety newcomer

Andreas P1 Dec 2023 5:49 UTC

2 points

5 comments3 min readEA link

The Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

Miguel19 Jun 2023 3:23 UTC

4 points

0 comments7 min readEA link

Diminishing Returns in Machine Learning Part 1: Hardware Development and the Physical Frontier

Brian Chau27 May 2023 12:39 UTC

16 points

3 comments12 min readEA link

(www.fromthenew.world)

Unions for AI safety?

dEAsign24 Sep 2023 0:13 UTC

7 points

12 comments2 min readEA link

How quickly AI could transform the world (Tom Davidson on The 80,000 Hours Podcast)

80000_Hours8 May 2023 13:23 UTC

82 points

3 comments17 min readEA link

How we could stumble into AI catastrophe

Holden Karnofsky16 Jan 2023 14:52 UTC

83 points

0 comments31 min readEA link

(www.cold-takes.com)

PhD Position: AI Interpretability in Berlin, Germany

Martian Moonshine22 Apr 2023 18:57 UTC

24 points

0 comments1 min readEA link

(stephanw.net)

Podcast: Interview series featuring Dr. Peter Park

Jacob-Haimes26 Mar 2024 0:35 UTC

1 point

0 comments2 min readEA link

(into-ai-safety.github.io)

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

Garrison10 Feb 2024 19:52 UTC

286 points

20 comments3 min readEA link

(garrisonlovely.substack.com)

The standard case for delaying AI appears to rest on non-utilitarian assumptions

Matthew_Barnett11 Feb 2025 4:04 UTC

16 points

57 comments10 min readEA link

Risk-averse Batch Active Inverse Reward Design

Panagiotis Liampas7 Oct 2023 8:56 UTC

11 points

0 comments15 min readEA link

Worrisome misunderstanding of the core issues with AI transition

Roman Leventov18 Jan 2024 10:05 UTC

4 points

3 comments4 min readEA link

Announcing Athena—Women in AI Alignment Research

Claire Short7 Nov 2023 22:02 UTC

180 points

28 comments3 min readEA link

Status Quo Engines—AI essay

Ilana_Goldowitz_Jimenez28 May 2023 14:33 UTC

1 point

1 comment15 min readEA link

Prospects for AI safety agreements between countries

oeg14 Apr 2023 17:41 UTC

104 points

3 comments22 min readEA link

“The Race to the End of Humanity” – Structural Uncertainty Analysis in AI Risk Models

Froolow19 May 2023 12:03 UTC

48 points

4 comments21 min readEA link

List of AI safety newsletters and other resources

Lizka1 May 2023 17:24 UTC

49 points

5 comments4 min readEA link

A summary of current work in AI governance

constructive17 Jun 2023 16:58 UTC

89 points

4 comments11 min readEA link

[US] NTIA: AI Accountability Policy Request for Comment

Kyle J. Lucchese 🔸13 Apr 2023 16:12 UTC

47 points

4 comments1 min readEA link

(ntia.gov)

It’s not obvious that getting dangerous AI later is better

Aaron_Scher23 Sep 2023 5:35 UTC

23 points

9 comments16 min readEA link

AI Safety Camp 2024

Linda Linsefors18 Nov 2023 10:37 UTC

21 points

1 comment4 min readEA link

(aisafety.camp)

Discussion about AI Safety funding (FB transcript)

Akash30 Apr 2023 19:05 UTC

104 points

10 comments6 min readEA link

Cybersecurity of Frontier AI Models: A Regulatory Review

Deric Cheng25 Apr 2024 14:51 UTC

9 points

1 comment8 min readEA link

A compute-based framework for thinking about the future of AI

Matthew_Barnett31 May 2023 22:00 UTC

96 points

36 comments19 min readEA link

You Can’t Prove Aliens Aren’t On Their Way To Destroy The Earth (A Comprehensive Takedown Of The Doomer View Of AI)

Murphy7 Apr 2023 13:37 UTC

−31 points

7 comments9 min readEA link

There are no coherence theorems

EJT20 Feb 2023 21:52 UTC

108 points

49 comments19 min readEA link

AI safety and consciousness research: A brainstorm

Daniel_Friedrich15 Mar 2023 14:33 UTC

11 points

1 comment9 min readEA link

Counterarguments to the basic AI risk case

Katja_Grace14 Oct 2022 20:30 UTC

286 points

23 comments34 min readEA link

Aligning the Aligners: Ensuring Aligned AI acts for the common good of all mankind

timunderwood16 Jan 2023 11:13 UTC

40 points

2 comments4 min readEA link

Reza Negarestani’s Intelligence & Spirit

ukc1001427 Jun 2024 18:17 UTC

7 points

1 comment4 min readEA link

MATS Summer 2023 Retrospective

utilistrutil2 Dec 2023 0:12 UTC

28 points

3 comments26 min readEA link

[Question] Is working on AI to help democracy a good idea?

WillPearson17 Feb 2024 23:15 UTC

5 points

3 comments1 min readEA link

“The Universe of Minds”—call for reviewers (Seeds of Science)

rogersbacon125 Jul 2023 16:55 UTC

4 points

0 comments1 min readEA link

Advocating for Public Ownership of Future AGI: Preserving Humanity’s Collective Heritage

George_A (Digital Intelligence Rights Initiative) 14 Jul 2023 16:01 UTC

−10 points

2 comments4 min readEA link

Announcing: Mechanism Design for AI Safety—Reading Group

Rubi J. Hudson9 Aug 2022 4:25 UTC

36 points

1 comment4 min readEA link

I designed an AI safety course (for a philosophy department)

Eleni_A23 Sep 2023 21:56 UTC

27 points

3 comments2 min readEA link

[Question] Would an Anthropic/OpenAI merger be good for AI safety?

M22 Nov 2023 20:21 UTC

6 points

1 comment1 min readEA link

Gaia Network: An Illustrated Primer

Roman Leventov26 Jan 2024 11:55 UTC

4 points

4 comments15 min readEA link

A Roundtable for Safe AI (RSAI)?

Lara_TH9 Mar 2023 12:11 UTC

9 points

0 comments4 min readEA link

AI Risk and Survivorship Bias—How Andreessen and LeCun got it wrong

stepanlos14 Jul 2023 17:10 UTC

5 points

1 comment6 min readEA link

The Bar for Contributing to AI Safety is Lower than You Think

Chris Leong17 Aug 2024 10:52 UTC

14 points

5 comments2 min readEA link

My Feedback to the UN Advisory Body on AI

Heramb Podar4 Apr 2024 23:39 UTC

7 points

1 comment4 min readEA link

Aim for conditional pauses

AnonResearcherMajorAILab25 Sep 2023 1:05 UTC

100 points

42 comments12 min readEA link

Ask AI companies about what they are doing for AI safety?

mic8 Mar 2022 21:54 UTC

44 points

1 comment2 min readEA link

Conscious AI & Public Perception: Four futures

nicoleta-k3 Jul 2024 23:06 UTC

12 points

1 comment16 min readEA link

Thoughts on the AI Safety Summit company policy requests and responses

So8res31 Oct 2023 23:54 UTC

42 points

3 comments10 min readEA link

Decomposing alignment to take advantage of paradigms

Christopher King4 Jun 2023 14:26 UTC

2 points

0 comments4 min readEA link

Risk Alignment in Agentic AI Systems

Hayley Clatterbuck1 Oct 2024 22:51 UTC

32 points

1 comment3 min readEA link

(static1.squarespace.com)

AI policy & governance in Australia: notes from an initial discussion

Alexander Saeri15 May 2023 0:00 UTC

31 points

1 comment3 min readEA link

The case for AGI by 2030

Benjamin_Todd6 Apr 2025 12:26 UTC

96 points

33 comments31 min readEA link

(80000hours.org)

Without a trajectory change, the development of AGI is likely to go badly

Max H30 May 2023 0:21 UTC

1 point

0 comments13 min readEA link

Help us seed AI Safety Brussels

gergo7 Aug 2024 6:17 UTC

50 points

4 comments3 min readEA link

[Question] Could someone help me understand why it’s so difficult to solve the alignment problem?

Jadon Schmitt22 Jul 2023 4:39 UTC

35 points

21 comments1 min readEA link

Report: Evaluating an AI Chip Registration Policy

Deric Cheng12 Apr 2024 4:40 UTC

15 points

0 comments5 min readEA link

(www.convergenceanalysis.org)

Five neglected work areas that could reduce AI risk

Aaron_Scher24 Sep 2023 2:09 UTC

22 points

0 comments9 min readEA link

AI Safety & Risk Dinner w/ Entrepreneur First CEO & ARIA Chair, Matt Clifford in New York

SimonPastor28 Nov 2023 19:45 UTC

2 points

0 comments1 min readEA link

News: Spanish AI image outcry + US AI workforce “regulation”

Benevolent_Rain26 Sep 2023 7:43 UTC

9 points

0 comments1 min readEA link

UK Government announces £100 million in funding for Foundation Model Taskforce.

Jordan Pieters 🔸25 Apr 2023 11:29 UTC

10 points

1 comment1 min readEA link

(www.gov.uk)

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

TW1239 May 2022 17:02 UTC

68 points

0 comments6 min readEA link

Assessing the Dangerousness of Malevolent Actors in AGI Governance: A Preliminary Exploration

Callum Hinchcliffe14 Oct 2023 21:18 UTC

28 points

4 comments9 min readEA link

Measuring artificial intelligence on human benchmarks is naive

Ward A11 Apr 2023 11:28 UTC

9 points

2 comments1 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo27 Sep 2023 2:44 UTC

16 points

0 comments2 min readEA link

(www.campaignforaisafety.org)

Key takeaways from our EA and alignment research surveys

Cameron Berg4 May 2024 15:51 UTC

64 points

22 comments21 min readEA link

AGI Safety Needs People With All Skillsets!

Severin25 Jul 2022 13:30 UTC

39 points

7 comments2 min readEA link

[Linkpost] A Narrow Path—How to Secure our Future

MathiasKB🔸2 Oct 2024 22:50 UTC

68 points

0 comments1 min readEA link

(www.narrowpath.co)

Intrinsic limitations of GPT-4 and other large language models, and why I’m not (very) worried about GPT-n

James Fodor3 Jun 2023 13:09 UTC

28 points

3 comments11 min readEA link

Dr Altman or: How I Learned to Stop Worrying and Love the Killer AI

Barak Gila11 Mar 2024 5:01 UTC

−7 points

0 comments2 min readEA link

NIMBYism as an AI governance tool?

freedomandutility9 Jun 2024 6:40 UTC

10 points

2 comments1 min readEA link

Fifteen Lawsuits against OpenAI

Remmelt9 Mar 2024 12:22 UTC

55 points

5 comments1 min readEA link

Update on cause area focus working group

Bastian_Stern10 Aug 2023 1:21 UTC

140 points

18 comments5 min readEA link

Just Pivot to AI: The secret is out

sapphire15 Mar 2023 6:25 UTC

0 points

4 comments2 min readEA link

Talking publicly about AI risk

Jan_Kulveit24 Apr 2023 9:19 UTC

152 points

13 comments6 min readEA link

Knowledge, Reasoning, and Superintelligence

Owen Cotton-Barratt26 Mar 2025 23:28 UTC

21 points

3 comments7 min readEA link

(strangecities.substack.com)

AI Safety Camp, Virtual Edition 2023

Linda Linsefors6 Jan 2023 0:55 UTC

24 points

0 comments3 min readEA link

(aisafety.camp)

Against most, but not all, AI risk analogies

Matthew_Barnett14 Jan 2024 19:13 UTC

43 points

9 comments7 min readEA link

Australians for AI Safety Launches New Election Campaign — Here’s How You Can Help

Luke Freeman24 Mar 2025 4:26 UTC

54 points

5 comments3 min readEA link

Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever’s Recent Claims

Garrison13 Nov 2024 17:00 UTC

115 points

7 comments8 min readEA link

(garrisonlovely.substack.com)

Introducing the Pathfinder Fellowship: Funding and Mentorship for AI Safety Group Organizers

Agustín Covarrubias 🔸22 Jul 2025 17:11 UTC

49 points

0 comments2 min readEA link

We Did AGISF’s 8-week Course in 3 Days. Here’s How it Went

ag400024 Jul 2022 16:46 UTC

26 points

7 comments6 min readEA link

AI governance and strategy: a list of research agendas and work that could be done.

Nathan_Barnard12 Mar 2024 11:21 UTC

33 points

4 comments17 min readEA link

Stuart J. Russell on “should we press pause on AI?”

Kaleem18 Sep 2023 13:19 UTC

32 points

3 comments1 min readEA link

(podcasts.apple.com)

OpenAI Alums, Nobel Laureates Urge Regulators to Save Company’s Nonprofit Structure

Garrison23 Apr 2025 23:01 UTC

61 points

2 comments8 min readEA link

(garrisonlovely.substack.com)

Order Matters for Deceptive Alignment

DavidW15 Feb 2023 20:12 UTC

20 points

1 comment1 min readEA link

(www.lesswrong.com)

[Question] How good/bad is the new Bing AI for the world?

Nathan Young17 Feb 2023 16:31 UTC

21 points

14 comments1 min readEA link

[Question] Good depictions of speed mismatches between advanced AI systems and humans?

Geoffrey Miller15 Mar 2023 16:40 UTC

18 points

9 comments1 min readEA link

AI Safety Communicators Meet-up

Vishakha Agrawal20 Jun 2025 12:26 UTC

2 points

0 comments1 min readEA link

The argument for near-term human disempowerment through AI

Chris Leong16 Apr 2024 3:07 UTC

31 points

12 comments1 min readEA link

(link.springer.com)

Who Aligns the Alignment Researchers?

ben.smith5 Mar 2023 23:22 UTC

23 points

4 comments11 min readEA link

[Question] Why won’t nanotech kill us all?

Yarrow🔸16 Dec 2023 23:27 UTC

20 points

5 comments1 min readEA link

AI and integrity

Nathan Young29 May 2024 20:45 UTC

15 points

0 comments2 min readEA link

(nathanpmyoung.substack.com)

AI alignment shouldn’t be conflated with AI moral achievement

Matthew_Barnett30 Dec 2023 3:08 UTC

116 points

15 comments5 min readEA link

5 homegrown EA projects, seeking small donors

Austin28 Oct 2024 23:24 UTC

50 points

1 comment2 min readEA link

Talk: AI safety fieldbuilding at MATS

Ryan Kidd23 Jun 2024 23:06 UTC

14 points

1 comment10 min readEA link

Shallow review of live agendas in alignment & safety

technicalities27 Nov 2023 11:33 UTC

76 points

8 comments29 min readEA link

Unveiling the American Public Opinion on AI Moratorium and Government Intervention: The Impact of Media Exposure

Otto8 May 2023 10:49 UTC

28 points

5 comments6 min readEA link

[Question] What are some criticisms of PauseAI?

Eevee🔹23 Nov 2024 17:49 UTC

53 points

71 comments1 min readEA link

Ways to buy time

Akash12 Nov 2022 19:31 UTC

47 points

1 comment12 min readEA link

New report on the state of AI safety in China

Geoffrey Miller27 Oct 2023 20:20 UTC

22 points

0 comments3 min readEA link

(concordia-consulting.com)

[Question] Who should we give books on AI X-risk to?

yanni18 Dec 2023 23:57 UTC

13 points

1 comment1 min readEA link

“AGI” considered harmful

Milan Griffes18 Apr 2025 20:19 UTC

10 points

1 comment1 min readEA link

Pause For Thought: The AI Pause Debate (Astral Codex Ten)

David M5 Oct 2023 9:32 UTC

37 points

0 comments1 min readEA link

(www.astralcodexten.com)

How I Formed My Own Views About AI Safety

Neel Nanda27 Feb 2022 18:52 UTC

134 points

13 comments14 min readEA link

(www.neelnanda.io)

[Question] Why might AI be a x-risk? Succinct explanations please

Sanjay4 Apr 2023 12:46 UTC

20 points

9 comments1 min readEA link

An overview of some promising work by junior alignment researchers

Akash26 Dec 2022 17:23 UTC

10 points

0 comments4 min readEA link

Longtermism Fund: August 2023 Grants Report

Michael Townsend🔸20 Aug 2023 5:34 UTC

81 points

3 comments5 min readEA link

New survey: 46% of Americans are concerned about extinction from AI; 69% support a six-month pause in AI development

Akash5 Apr 2023 1:26 UTC

143 points

34 comments1 min readEA link

(today.yougov.com)

Announcing Epoch’s newly expanded Parameters, Compute and Data Trends in Machine Learning database

Robi Rahman25 Oct 2023 3:03 UTC

38 points

1 comment1 min readEA link

(epochai.org)

Upcoming speaker series on emerging tech, national security & US policy careers

kuhanj21 Jun 2023 4:49 UTC

42 points

0 comments2 min readEA link

Comments on OpenAI’s “Planning for AGI and beyond”

So8res3 Mar 2023 23:01 UTC

115 points

7 comments13 min readEA link

Next steps after AGISF at UMich

JakubK25 Jan 2023 20:57 UTC

18 points

1 comment5 min readEA link

(docs.google.com)

[Question] Is There Actually a Standard or Convincing Response to David Thorstad’s Criticisms of the Value of X-Risk Reduction and of Longtermism?

David Mathers🔸21 May 2025 11:58 UTC

110 points

21 comments2 min readEA link

[Question] Evidence to prioritize or working on AI as the most impactful thing?

Vaipan22 Sep 2023 8:43 UTC

9 points

6 comments1 min readEA link

The Importance of AI Alignment, explained in 5 points

Daniel_Eth11 Feb 2023 2:56 UTC

50 points

4 comments13 min readEA link

Misnaming and Other Issues with OpenAI’s “Human Level” Superintelligence Hierarchy

Davidmanheim15 Jul 2024 5:50 UTC

14 points

0 comments3 min readEA link

Why people want to work on AI safety (but don’t)

Emily Grundy24 Jan 2023 6:41 UTC

70 points

10 comments7 min readEA link

How “AGI” could end up being many different specialized AI’s stitched together

titotal8 May 2023 12:32 UTC

31 points

2 comments9 min readEA link

Please Don’t Win the AI Race

Picklehead2 Aug 2025 23:31 UTC

−4 points

0 comments6 min readEA link

Transformative AI issues (not just misalignment): an overview

Holden Karnofsky6 Jan 2023 2:19 UTC

36 points

0 comments22 min readEA link

(www.cold-takes.com)

My AI safety independent research engineer path so far

Artyom K25 Jul 2025 8:49 UTC

16 points

0 comments3 min readEA link

We might get lucky with AGI warning shots. Let’s be ready!

tcelferact31 Mar 2023 21:37 UTC

22 points

2 comments2 min readEA link

Rational Animations’ video about scalable oversight and sandwiching

Writer6 Jul 2025 14:00 UTC

14 points

1 comment9 min readEA link

(youtu.be)

Paul Christiano on Dwarkesh Podcast

ESRogs3 Nov 2023 22:13 UTC

5 points

0 comments1 min readEA link

(www.dwarkeshpatel.com)

Quick nudge to apply to the LTFF grant round (closing on Saturday)

calebp14 Feb 2025 15:19 UTC

57 points

7 comments1 min readEA link

Apply to the 2024 PIBBSS Summer Research Fellowship

nora12 Jan 2024 4:06 UTC

37 points

1 comment2 min readEA link

Questions about Conjecure’s CoEm proposal

Akash9 Mar 2023 19:32 UTC

19 points

0 comments2 min readEA link

The “technology” bucket error

Holly Elmore ⏸️ 🔸21 Sep 2023 0:59 UTC

33 points

10 comments4 min readEA link

(open.substack.com)

EA Wins 2023

Shakeel Hashim31 Dec 2023 14:07 UTC

358 points

9 comments3 min readEA link

Protest against Meta’s irreversible proliferation (Sept 29, San Francisco)

Holly Elmore ⏸️ 🔸19 Sep 2023 23:40 UTC

114 points

32 comments1 min readEA link

AI Safety Newsletter #39: Implications of a Trump Administration for AI Policy Plus, Safety Engineering

Center for AI Safety29 Jul 2024 17:48 UTC

6 points

0 comments6 min readEA link

(newsletter.safe.ai)

CAIDP Statement on Lethal Autonomous Weapons Systems

Heramb Podar30 Nov 2024 18:00 UTC

7 points

0 comments1 min readEA link

(www.linkedin.com)

A deep critique of AI 2027’s bad timeline models

titotal19 Jun 2025 13:35 UTC

277 points

28 comments40 min readEA link

(titotal.substack.com)

“Can We Survive Technology?” by John von Neumann

Eli Rose13 Mar 2023 2:26 UTC

51 points

0 comments1 min readEA link

(geosci.uchicago.edu)

[Question] Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones4 Nov 2022 0:58 UTC

11 points

1 comment3 min readEA link

Hypothetical grants that the Long-Term Future Fund narrowly rejected

calebp15 Nov 2023 19:39 UTC

95 points

12 comments6 min readEA link

“Dangers of AI and the End of Human Civilization” Yudkowsky on Lex Fridman

𝕮𝖎𝖓𝖊𝖗𝖆30 Mar 2023 15:44 UTC

28 points

0 comments1 min readEA link

(www.youtube.com)

Scaling of AI training runs will slow down after GPT-5

Maxime Riché 🔸26 Apr 2024 16:06 UTC

10 points

2 comments3 min readEA link

Community Building for Graduate Students: A Targeted Approach

Neil Crawford29 Mar 2022 19:47 UTC

13 points

0 comments3 min readEA link

Peter Eckersley (1979-2022)

technicalities3 Sep 2022 10:45 UTC

497 points

10 comments1 min readEA link

AI Risk Management Framework | NIST

𝕮𝖎𝖓𝖊𝖗𝖆26 Jan 2023 15:27 UTC

50 points

0 comments2 min readEA link

(www.nist.gov)

Christiano (ARC) and GA (Conjecture) Discuss Alignment Cruxes

Andrea_Miotti24 Feb 2023 23:03 UTC

16 points

1 comment49 min readEA link

Apply now to SPAR!

Agustín Covarrubias 🔸19 Dec 2024 22:29 UTC

36 points

0 comments1 min readEA link

Culture and Programming Retrospective: ERA Fellowship 2023

GideonF28 Sep 2023 16:45 UTC

16 points

0 comments10 min readEA link

Situational awareness (Section 2.1 of “Scheming AIs”)

Joe_Carlsmith26 Nov 2023 23:00 UTC

12 points

1 comment6 min readEA link

[Question] What predictions from theoretical AI Safety research have been confirmed by empirical work?

freedomandutility29 Dec 2024 8:19 UTC

43 points

10 comments1 min readEA link

I bet Greg Colbourn 10 k€ that AI will not kill us all by the end of 2027

Vasco Grilo🔸4 Jun 2024 16:37 UTC

195 points

64 comments2 min readEA link

AI Safety Newsletter #40: California AI Legislation Plus, NVIDIA Delays Chip Production, and Do AI Safety Benchmarks Actually Measure Safety?

Center for AI Safety21 Aug 2024 18:10 UTC

17 points

0 comments6 min readEA link

(newsletter.safe.ai)

AISN #47: Reasoning Models

Center for AI Safety6 Feb 2025 18:44 UTC

8 points

0 comments4 min readEA link

(newsletter.safe.ai)

ARC is hiring theoretical researchers

Jacob_Hilton12 Jun 2023 19:11 UTC

78 points

0 comments4 min readEA link

(www.lesswrong.com)

Measuring AI-Driven Risk with Stock Prices (Susana Campos-Martins)

Global Priorities Institute12 Dec 2024 14:22 UTC

10 points

1 comment4 min readEA link

(globalprioritiesinstitute.org)

[Crosspost] Some Very Important Things (That I Won’t Be Working On This Year)

Sarah Cheng10 Mar 2025 14:42 UTC

28 points

1 comment4 min readEA link

(milesbrundage.substack.com)

Unjournal: Evaluations of “Artificial Intelligence and Economic Growth”, and new hosting space

david_reinstein17 Mar 2023 20:20 UTC

47 points

0 comments2 min readEA link

(unjournal.pubpub.org)

Joe Hardie on Arcadia Impact’s projects (FBB #7)

gergo8 Jul 2025 13:22 UTC

17 points

3 comments15 min readEA link

I don’t want to talk about ai

Kirsten22 May 2023 21:19 UTC

7 points

0 comments1 min readEA link

(ealifestyles.substack.com)

P(doom|AGI) is high: why the default outcome of AGI is doom

Greg_Colbourn ⏸️ 2 May 2023 10:40 UTC

13 points

28 comments3 min readEA link

Safety Conscious Researchers should leave Anthropic

GideonF1 Apr 2025 10:12 UTC

57 points

3 comments5 min readEA link

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare

trevor112 Nov 2023 18:24 UTC

5 points

0 comments10 min readEA link

Four questions I ask AI safety researchers

Akash17 Jul 2022 17:25 UTC

30 points

3 comments1 min readEA link

A tale of 2.5 orthogonality theses

Arepo1 May 2022 13:53 UTC

147 points

31 comments11 min readEA link

[Question] Asking for online resources why AI now is near AGI

jackchang11018 May 2023 0:04 UTC

6 points

4 comments1 min readEA link

Consider attending the AI Security Forum ’24, a 1-day pre-DEFCON event

Charlie Rogers-Smith12 Jul 2024 23:01 UTC

23 points

0 comments1 min readEA link

[Question] Suggested readings & videos for a new college course on ‘Psychology and AI’?

Geoffrey Miller11 Jan 2024 22:26 UTC

12 points

3 comments1 min readEA link

Updates on the EA catastrophic risk landscape

Benjamin_Todd6 May 2024 4:52 UTC

195 points

46 comments2 min readEA link

My model of how different AI risks fit together

poppinfresh31 Jan 2024 17:09 UTC

64 points

4 comments7 min readEA link

(unfoldingatlas.substack.com)

AISN #48: Utility Engineering and EnigmaEval

Center for AI Safety18 Feb 2025 19:11 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

What OpenAI Told California’s Attorney General

Garrison17 May 2025 23:14 UTC

35 points

2 comments8 min readEA link

(www.obsolete.pub)

Formalize the Hashiness Model of AGI Uncontainability

Remmelt9 Nov 2024 16:10 UTC

2 points

0 comments5 min readEA link

(docs.google.com)

Play Regrantor: Move up to $250,000 to Your Top High-Impact Projects!

Dawn Drescher17 May 2023 16:51 UTC

58 points

2 comments2 min readEA link

(impactmarkets.substack.com)

[Question] Do AI companies make their safety researchers sign a non-disparagement clause?

Ofer5 Sep 2022 13:40 UTC

73 points

3 comments1 min readEA link

White House publishes framework for Nucleic Acid Screening

Agustín Covarrubias 🔸30 Apr 2024 0:44 UTC

30 points

1 comment1 min readEA link

(www.whitehouse.gov)

Apply to be a TA for TARA

yanni kyriacos20 Dec 2024 2:24 UTC

15 points

2 comments1 min readEA link

[Question] If AI is in a bubble and the bubble bursts, what would you do?

Remmelt19 Aug 2024 10:56 UTC

28 points

10 comments1 min readEA link

How LDT helps reduce the AI arms race

Tamsin Leake10 Dec 2023 16:21 UTC

8 points

1 comment4 min readEA link

(carado.moe)

Part 3: A Proposed Approach for AI Safety Movement Building: Projects, Professions, Skills, and Ideas for the Future [long post][bounty for feedback]

PeterSlattery22 Mar 2023 0:54 UTC

22 points

8 comments32 min readEA link

Aptitudes for AI governance work

Sam Clarke13 Jun 2023 13:54 UTC

68 points

0 comments7 min readEA link

Pausing AI vs Degrowth in rich countries

Miquel Banchs-Piqué (prev. mikbp)23 Sep 2023 7:09 UTC

−2 points

53 comments1 min readEA link

We Have Not Been Invited to the Future: e/acc and the Narrowness of the Way Ahead

Devin Kalish17 Jul 2024 22:15 UTC

10 points

1 comment20 min readEA link

(www.thinkingmuchbetter.com)

Does scheming lead to adequate future empowerment? (Section 2.3.1.2 of “Scheming AIs”)

Joe_Carlsmith3 Dec 2023 18:32 UTC

6 points

1 comment15 min readEA link

Excerpts from “Doing EA Better” on x-risk methodology

Eevee🔹26 Jan 2023 1:04 UTC

22 points

5 comments6 min readEA link

(forum.effectivealtruism.org)

AISN #55: Trump Administration Rescinds AI Diffusion Rule, Allows Chip Sales to Gulf States

Center for AI Safety20 May 2025 16:05 UTC

7 points

0 comments4 min readEA link

(newsletter.safe.ai)

Exploring Metaculus’ community predictions

Vasco Grilo🔸24 Mar 2023 7:59 UTC

95 points

17 comments10 min readEA link

Agency Foundations Challenge: September 8th-24th, $10k Prizes

Catalin M30 Aug 2023 6:12 UTC

12 points

0 comments5 min readEA link

Epoch AI alumni launch Mechanize to “automate the whole economy”

Henry Stanley 🔸18 Apr 2025 10:12 UTC

103 points

52 comments1 min readEA link

How metaphysical beliefs shape critical aspects of AI development

Jáchym Fibír26 Jun 2025 15:10 UTC

−9 points

0 comments8 min readEA link

(www.phiand.ai)

AI takeoff and nuclear war

Owen Cotton-Barratt11 Jun 2024 19:33 UTC

72 points

5 comments11 min readEA link

(strangecities.substack.com)

Four Predictions About OpenAI’s Plans To Retain Nonprofit Control

Garrison7 May 2025 15:48 UTC

15 points

2 comments5 min readEA link

(www.obsolete.pub)

Notes on risk compensation

trammell12 May 2024 18:40 UTC

140 points

14 comments21 min readEA link

Planes are still decades away from displacing most bird jobs

guzey25 Nov 2022 16:49 UTC

27 points

2 comments3 min readEA link

AXRP: Store, Patreon, Video

DanielFilan7 Feb 2023 5:12 UTC

7 points

0 comments1 min readEA link

AI policy ideas: Reading list

Zach Stein-Perlman17 Apr 2023 19:00 UTC

60 points

3 comments4 min readEA link

The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate

Hauke Hillebrandt29 Oct 2023 8:38 UTC

12 points

0 comments1 min readEA link

[Question] What should the EA/AI safety community change, in response to Sam Altman’s revealed priorities?

SiebeRozendal8 Mar 2024 12:35 UTC

30 points

16 comments1 min readEA link

Max Tegmark’s new Time article on how we’re in a Don’t Look Up scenario [Linkpost]

Jonas Hallgren 🔸25 Apr 2023 15:47 UTC

41 points

0 comments1 min readEA link

(time.com)

De Dicto and De Se Reference Matters for Alignment

philgoetz3 Oct 2023 21:57 UTC

5 points

2 comments9 min readEA link

RSPs are pauses done right

evhub14 Oct 2023 4:06 UTC

93 points

7 comments7 min readEA link

Widening Overton Window—Open Thread

Prometheus31 Mar 2023 10:06 UTC

12 points

5 comments1 min readEA link

(www.lesswrong.com)

[Question] Which stocks or ETFs should you invest in to take advantage of a possible AGI explosion, and why?

Eevee🔹10 Apr 2023 17:55 UTC

19 points

16 comments1 min readEA link

Making a conservative case for alignment

Larks17 Nov 2024 1:45 UTC

44 points

0 comments1 min readEA link

(www.lesswrong.com)

Apocalypse insurance, and the hardline libertarian take on AI risk

So8res28 Nov 2023 2:09 UTC

21 points

0 comments7 min readEA link

titotal on AI risk scepticism

Vasco Grilo🔸30 May 2024 17:03 UTC

76 points

3 comments6 min readEA link

(forum.effectivealtruism.org)

Announcing the London Initiative for Safe AI (LISA)

JamesFox5 Feb 2024 10:36 UTC

67 points

4 comments9 min readEA link

The bullseye framework: My case against AI doom

titotal30 May 2023 11:52 UTC

71 points

15 comments17 min readEA link

RP’s AI Governance & Strategy team—June 2023 interim overview

MichaelA🔸22 Jun 2023 13:45 UTC

68 points

1 comment7 min readEA link

Slopworld 2035: The dangers of mediocre AI

titotal14 Apr 2025 13:14 UTC

87 points

1 comment29 min readEA link

(titotal.substack.com)

Virtual AI Safety Unconference (VAISU)

Nguyên20 Jun 2023 9:47 UTC

14 points

0 comments1 min readEA link

Safety isn’t safety without a social model (or: dispelling the myth of per se technical safety)

Andrew Critch14 Jun 2024 0:16 UTC

99 points

3 comments4 min readEA link

The Polarity Problem [Draft]

Dan H23 May 2023 21:05 UTC

11 points

0 comments44 min readEA link

[Question] Any tips on applying for EA funding?

Eevee🔹22 Sep 2024 5:11 UTC

18 points

4 comments1 min readEA link

How Rethink Priorities’ Research could inform your grantmaking

kierangreig🔸4 Oct 2023 18:24 UTC

59 points

0 comments2 min readEA link

The benefits and risks of optimism (about AI safety)

Karl von Wendt3 Dec 2023 12:45 UTC

3 points

5 comments5 min readEA link

New AI safety funding source for people raising awareness about AI risk or advocating for a pause

Kat Woods 🔶 ⏸️26 Jul 2025 12:25 UTC

16 points

6 comments1 min readEA link

Polls on De/Accelerating AI

Denkenberger🔸9 Aug 2025 2:01 UTC

28 points

14 comments2 min readEA link

Anthropic is Quietly Backpedalling on its Safety Commitments

Garrison23 May 2025 2:26 UTC

100 points

7 comments5 min readEA link

(www.obsolete.pub)

Yudkowsky on AGI risk on the Bankless podcast

RobBensinger13 Mar 2023 0:42 UTC

54 points

2 comments75 min readEA link

Results of an informal survey on AI grantmaking

Scott Alexander21 Aug 2024 13:19 UTC

127 points

28 comments1 min readEA link

Rethink Priorities’ 2023 Summary, 2024 Strategy, and Funding Gaps

kierangreig🔸15 Nov 2023 20:56 UTC

86 points

7 comments3 min readEA link

Ethical Roots of Chinese AI

Vasiliy Kondyrev5 Nov 2024 14:07 UTC

0 points

0 comments6 min readEA link

GPTs are Predictors, not Imitators

EliezerYudkowsky8 Apr 2023 19:59 UTC

74 points

12 comments3 min readEA link

Beren’s “Deconfusing Direct vs Amortised Optimisation”

𝕮𝖎𝖓𝖊𝖗𝖆7 Apr 2023 8:57 UTC

9 points

0 comments3 min readEA link

What I mean by “alignment is in large part about making cognition aimable at all”

So8res30 Jan 2023 15:22 UTC

57 points

3 comments2 min readEA link

M&A in AI

Hauke Hillebrandt30 Oct 2023 17:43 UTC

9 points

1 comment6 min readEA link

A Barebones Guide to Mechanistic Interpretability Prerequisites

Neel Nanda29 Nov 2022 18:43 UTC

54 points

1 comment3 min readEA link

(neelnanda.io)

Two contrasting models of “intelligence” and future growth

Magnus Vinding24 Nov 2022 11:54 UTC

74 points

32 comments22 min readEA link

[Question] What is the easiest/funnest way to build up a comprehensive understanding of AI and AI Safety?

Jordan Arel30 Apr 2024 18:39 UTC

14 points

0 comments1 min readEA link

Podcast (+transcript): Nathan Barnard on how US financial regulation can inform AI governance

Aaron Bergman8 Aug 2023 21:46 UTC

12 points

0 comments23 min readEA link

(www.aaronbergman.net)

Map of maps of interesting fields

Max Görlitz25 Jun 2023 14:00 UTC

55 points

6 comments1 min readEA link

(glozematrix.substack.com)

What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund

Linch10 Aug 2023 20:11 UTC

176 points

22 comments8 min readEA link

Misgeneralization as a misnomer

So8res6 Apr 2023 20:43 UTC

48 points

0 comments4 min readEA link

MIT FutureTech are hiring ‍a Product and Data Visualization Designer

PeterSlattery13 Nov 2024 14:41 UTC

9 points

0 comments4 min readEA link

AI risk/reward: A simple model

Nathan Young4 May 2023 19:12 UTC

37 points

5 comments7 min readEA link

Why not just send people to Bluedot (FBB#4)

gergo25 Mar 2025 10:47 UTC

27 points

13 comments12 min readEA link

[Question] How to hedge investment portfolio against AI risk?

Timothy_Liptrot31 Jan 2023 8:04 UTC

9 points

0 comments1 min readEA link

What’s Going on With OpenAI’s Messaging?

Ozzie Gooen21 May 2024 2:22 UTC

217 points

28 comments3 min readEA link

Is AI Hitting a Wall or Moving Faster Than Ever?

Garrison9 Jan 2025 22:18 UTC

35 points

5 comments5 min readEA link

(garrisonlovely.substack.com)

Brief thoughts on Data, Reporting, and Response for AI Risk Mitigation

Davidmanheim15 Jun 2023 7:53 UTC

18 points

3 comments8 min readEA link

OpenAI’s o1 tried to avoid being shut down, and lied about it, in evals

Greg_Colbourn ⏸️ 6 Dec 2024 15:25 UTC

23 points

9 comments1 min readEA link

(www.transformernews.ai)

My article in The Nation — California’s AI Safety Bill Is a Mask-Off Moment for the Industry

Garrison15 Aug 2024 19:25 UTC

134 points

0 comments1 min readEA link

(www.thenation.com)

[Question] Your Advice For a High School Student.

AhmedWez10 Jan 2025 21:26 UTC

7 points

5 comments1 min readEA link

Lessons from the FDA for AI

Remmelt2 Aug 2024 0:52 UTC

6 points

2 comments1 min readEA link

(ainowinstitute.org)

Introducing Alignment Stress-Testing at Anthropic

evhub12 Jan 2024 23:51 UTC

80 points

0 comments2 min readEA link

Request for Information for a new US AI Action Plan (OSTP RFI)

Agustín Covarrubias 🔸7 Feb 2025 20:22 UTC

19 points

2 comments2 min readEA link

(www.federalregister.gov)

Silicon Valley’s Rabbit Hole Problem

Mandelbrot8 Oct 2023 12:25 UTC

34 points

44 comments11 min readEA link

(medium.com)

Will AI Avoid Exploitation? (Adam Bales)

Global Priorities Institute13 Dec 2023 11:37 UTC

38 points

0 comments2 min readEA link

The Overton Window widens: Examples of AI risk in the media

Akash23 Mar 2023 17:10 UTC

112 points

11 comments1 min readEA link

Cruxes on US lead for some domestic AI regulation

Zach Stein-Perlman10 Sep 2023 18:00 UTC

20 points

6 comments2 min readEA link

Eric Schmidt’s blueprint for US technology strategy

OscarD🔸15 Oct 2024 19:54 UTC

29 points

4 comments9 min readEA link

AI companies’ eval reports mostly don’t support their claims

Zach Stein-Perlman9 Jun 2025 13:00 UTC

50 points

2 comments4 min readEA link

We don’t want to post again “This might be the last AI Safety Camp”

Remmelt21 Jan 2025 12:03 UTC

42 points

2 comments1 min readEA link

(manifund.org)

A Study of AI Science Models

Eleni_A13 May 2023 19:14 UTC

12 points

4 comments24 min readEA link

Fighting without hope

Akash1 Mar 2023 18:15 UTC

35 points

9 comments4 min readEA link

Critiques of non-existent AI safety labs: Yours

Anneal16 Jun 2023 6:50 UTC

117 points

12 comments3 min readEA link

The last era of human mistakes

Owen Cotton-Barratt24 Jul 2024 9:56 UTC

23 points

4 comments7 min readEA link

(strangecities.substack.com)

Transcript: NBC Nightly News: AI ‘race to recklessness’ w/ Tristan Harris, Aza Raskin

WilliamKiely23 Mar 2023 3:45 UTC

47 points

1 comment3 min readEA link

DeepMind and Google Brain are merging [Linkpost]

Akash20 Apr 2023 18:47 UTC

32 points

1 comment1 min readEA link

(www.deepmind.com)

Is AI forecasting a waste of effort on the margin?

Emrik5 Nov 2022 0:41 UTC

12 points

6 comments3 min readEA link

Is the AI Doomsday Narrative the Product of a Big Tech Conspiracy?

Garrison4 Dec 2024 19:20 UTC

28 points

5 comments11 min readEA link

(garrisonlovely.substack.com)

AISC9 has ended and there will be an AISC10

Linda Linsefors29 Apr 2024 10:53 UTC

36 points

0 comments2 min readEA link

4 ways to think about democratizing AI [GovAI Linkpost]

Akash13 Feb 2023 18:06 UTC

35 points

0 comments1 min readEA link

(www.governance.ai)

Law-Following AI 4: Don’t Rely on Vicarious Liability

Cullen 🔸2 Aug 2022 23:23 UTC

13 points

0 comments3 min readEA link

Global Pause AI Protest 10/21

Holly Elmore ⏸️ 🔸14 Oct 2023 3:17 UTC

22 points

0 comments1 min readEA link

Top OpenAI Catastrophic Risk Official Steps Down Abruptly

Garrison16 Apr 2025 16:04 UTC

29 points

1 comment5 min readEA link

(garrisonlovely.substack.com)

Donation offsets for ChatGPT Plus subscriptions

Jeffrey Ladish16 Mar 2023 23:11 UTC

76 points

10 comments3 min readEA link

[Question] How do you follow AI (safety) news?

peterhartree28 Sep 2024 14:03 UTC

13 points

9 comments1 min readEA link

What success looks like

mariushobbhahn28 Jun 2022 14:30 UTC

115 points

20 comments19 min readEA link

Some thoughts on “AI could defeat all of us combined”

Milan Griffes2 Jun 2023 15:03 UTC

23 points

0 comments4 min readEA link

An Analysis of Systemic Risk and Architectural Requirements for the Containment of Recursively Self-Improving AI

Ihor Ivliev17 Jun 2025 0:16 UTC

2 points

5 comments4 min readEA link

Where’s my ten minute AGI?

Vasco Grilo🔸19 May 2025 17:45 UTC

45 points

6 comments7 min readEA link

(epoch.ai)

Have your say on the Australian Government’s AI Policy

Nathan Sherburn11 Jul 2023 1:12 UTC

3 points

0 comments1 min readEA link

[Question] Are we confident that superintelligent artificial intelligence disempowering humans would be bad?

Vasco Grilo🔸10 Jun 2023 9:24 UTC

24 points

27 comments1 min readEA link

AI Safety Newsletter #42: Newsom Vetoes SB 1047 Plus, OpenAI’s o1, and AI Governance Summary

Center for AI Safety1 Oct 2024 20:33 UTC

10 points

0 comments6 min readEA link

(newsletter.safe.ai)

AI things that are perhaps as important as human-controlled AI

Chi3 Mar 2024 18:07 UTC

116 points

9 comments21 min readEA link

Anthropic Faces Potentially “Business-Ending” Copyright Lawsuit

Garrison25 Jul 2025 17:01 UTC

29 points

10 comments9 min readEA link

(www.obsolete.pub)

List #1: Why stopping the development of AGI is hard but doable

Remmelt24 Dec 2022 9:52 UTC

24 points

2 comments5 min readEA link

AI, Animals, and Digital Minds 2024 - Retrospective

Constance Li19 Jun 2024 14:56 UTC

81 points

8 comments8 min readEA link

Trends in the dollar training cost of machine learning systems

Ben Cottier1 Feb 2023 14:48 UTC

63 points

3 comments2 min readEA link

(epochai.org)

ai-plans.com December Critique-a-Thon

Kabir_Kumar4 Dec 2023 9:27 UTC

1 point

0 comments2 min readEA link

Shutting Down the Lightcone Offices

Habryka [Deactivated]15 Mar 2023 1:46 UTC

243 points

71 comments17 min readEA link

(www.lesswrong.com)

A brief history of the automated corporation

Owen Cotton-Barratt4 Nov 2024 14:37 UTC

21 points

1 comment5 min readEA link

(strangecities.substack.com)

GPT-4 is out: thread (& links)

Lizka14 Mar 2023 20:02 UTC

84 points

18 comments1 min readEA link

FLI podcast series, “Imagine A World”, about aspirational futures with AGI

Jackson Wagner13 Oct 2023 16:03 UTC

18 points

0 comments4 min readEA link

Why experienced professionals fail to land high-impact roles (FBB #5)

gergo10 Apr 2025 12:44 UTC

121 points

20 comments9 min readEA link

AISN #46: The Transition

Center for AI Safety23 Jan 2025 18:01 UTC

10 points

0 comments5 min readEA link

(newsletter.safe.ai)

[Video] Why SB-1047 deserves a fairer debate

Yadav20 Aug 2024 10:38 UTC

15 points

1 comment7 min readEA link

Retrospective on the 2022 Conjecture AI Discussions

Andrea_Miotti24 Feb 2023 22:41 UTC

12 points

1 comment2 min readEA link

The Wizard of Oz Problem: How incentives and narratives can skew our perception of AI developments

Akash20 Mar 2023 22:36 UTC

16 points

0 comments6 min readEA link

We need non-cybersecurity people [too]

Jarrah5 May 2024 0:11 UTC

32 points

0 comments2 min readEA link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels

Center for AI Safety28 Oct 2024 16:02 UTC

6 points

0 comments6 min readEA link

(newsletter.safe.ai)

On the Dwarkesh/Chollet Podcast, and the cruxes of scaling to AGI

JWS 🔸15 Jun 2024 20:24 UTC

72 points

49 comments17 min readEA link

The Future of Work: How Can Policymakers Prepare for AI’s Impact on Labor Markets?

DavidConrad24 Jun 2024 21:43 UTC

4 points

1 comment3 min readEA link

(www.lesswrong.com)

Effective Utopia: 100% Safe AI, Place AI, Simulating a Multiverse & How It Looks

ank2 Mar 2025 3:14 UTC

1 point

3 comments35 min readEA link

The Intentional Stance, LLMs Edition

Eleni_A1 May 2024 15:22 UTC

8 points

2 comments8 min readEA link

It looks like there are some good funding opportunities in AI safety right now

Benjamin_Todd21 Dec 2024 13:39 UTC

183 points

7 comments4 min readEA link

(benjamintodd.substack.com)

How Misaligned AI Personas Lead to Human Extinction – Step by Step

Writer19 Jul 2025 13:59 UTC

6 points

1 comment7 min readEA link

(youtu.be)

How much do markets value Open AI?

Ben_West🔸14 May 2023 19:28 UTC

39 points

13 comments4 min readEA link

Open Philanthropy is passing AI safety university group funding to Kairos

abergal22 Jul 2025 17:11 UTC

55 points

0 comments1 min readEA link

Crises reveal centralisation

Vasco Grilo🔸26 Mar 2024 18:00 UTC

31 points

2 comments5 min readEA link

(stefanschubert.substack.com)

Decomposing Agency — capabilities without desires

Owen Cotton-Barratt11 Jul 2024 9:38 UTC

37 points

2 comments12 min readEA link

(strangecities.substack.com)

Cost-effectiveness of professional field-building programs for AI safety research

Center for AI Safety10 Jul 2023 17:26 UTC

38 points

2 comments18 min readEA link

The Top AI Safety Bets for 2023: GiveWiki’s Latest Recommendations

Dawn Drescher11 Nov 2023 9:04 UTC

11 points

4 comments8 min readEA link

We are in a New Paradigm of AI Progress—OpenAI’s o3 model makes huge gains on the toughest AI benchmarks in the world

Garrison22 Dec 2024 21:45 UTC

26 points

0 comments4 min readEA link

(garrisonlovely.substack.com)

80,000 hours should remove OpenAI from the Job Board (and similar EA orgs should do similarly)

Raemon3 Jul 2024 20:34 UTC

263 points

79 comments3 min readEA link

[Question] AI strategy career pipeline

Zach Stein-Perlman22 May 2023 0:00 UTC

72 points

23 comments1 min readEA link

[Question] What did AI Safety’s specific funding of AGI R&D labs lead to?

Remmelt5 Jul 2023 15:51 UTC

24 points

17 comments1 min readEA link

Power laws in Speedrunning and Machine Learning

Jaime Sevilla24 Apr 2023 10:06 UTC

48 points

0 comments1 min readEA link

(arxiv.org)

A newcomer’s guide to the technical AI safety field

zeshen🔸4 Nov 2022 14:29 UTC

16 points

0 comments10 min readEA link

A stylized dialogue on John Wentworth’s claims about markets and optimization

So8res25 Mar 2023 22:32 UTC

18 points

0 comments8 min readEA link

What is autonomy, and how does it lead to greater risk from AI?

Davidmanheim1 Aug 2023 8:06 UTC

10 points

0 comments6 min readEA link

(www.lesswrong.com)

The AI Adoption Gap: Preparing the US Government for Advanced AI

Lizka2 Apr 2025 21:37 UTC

40 points

20 comments17 min readEA link

(www.forethought.org)

[Question] Investigative journalist in the AI safety space?

Benevolent_Rain15 Nov 2024 8:48 UTC

4 points

9 comments1 min readEA link

Arkose is Closing

Arkose23 Jun 2025 11:02 UTC

100 points

6 comments2 min readEA link

Chaining Retroactive Funders to Borrow Against Unlikely Utopias

Dawn Drescher19 Apr 2022 18:25 UTC

24 points

4 comments9 min readEA link

(impactmarkets.substack.com)

An Argument for Focusing on Making AI go Well

Chris Leong28 Dec 2023 13:25 UTC

13 points

4 comments3 min readEA link

Neel Nanda MATS Applications Open (Due Aug 29)

Neel Nanda30 Jul 2025 0:55 UTC

20 points

0 comments7 min readEA link

(tinyurl.com)

My current take on existential AI risk [FB post]

Aryeh Englander1 May 2023 16:22 UTC

10 points

0 comments3 min readEA link

Nuclear brinksmanship is not a good AI x-risk strategy

titotal30 Mar 2023 22:07 UTC

19 points

8 comments5 min readEA link

Thoughts on yesterday’s UN Security Council meeting on AI

Greg_Colbourn ⏸️ 19 Jul 2023 16:46 UTC

31 points

2 comments1 min readEA link

AISN #61: OpenAI Releases GPT-5

Center for AI Safety12 Aug 2025 17:52 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

Rethink Priorities: Seeking Expressions of Interest for Special Projects Next Year

kierangreig🔸29 Nov 2023 13:44 UTC

57 points

0 comments5 min readEA link

Enough about AI timelines— we already know what we need to know.

Holly Elmore ⏸️ 🔸9 Apr 2025 10:29 UTC

134 points

35 comments2 min readEA link

Why would AI companies use human-level AI to do alignment research?

MichaelDickens25 Apr 2025 19:12 UTC

16 points

1 comment2 min readEA link

[Job Ad] SERI MATS is hiring for our summer program

annashive26 May 2023 4:51 UTC

8 points

1 comment7 min readEA link

Before Altman’s Ouster, OpenAI’s Board Was Divided and Feuding

Jonathan Yan22 Nov 2023 1:01 UTC

25 points

1 comment1 min readEA link

(www.nytimes.com)

The GiveWiki’s Top Picks in AI Safety for the Giving Season of 2023

Dawn Drescher7 Dec 2023 9:23 UTC

26 points

0 comments3 min readEA link

(impactmarkets.substack.com)

Nobody’s on the ball on AGI alignment

leopold29 Mar 2023 14:26 UTC

327 points

66 comments9 min readEA link

(www.forourposterity.com)

Is this community over-emphasizing AI alignment?

Lixiang8 Jan 2023 6:23 UTC

1 point

5 comments1 min readEA link

RAND report finds no effect of current LLMs on viability of bioterrorism attacks

Lizka26 Jan 2024 20:10 UTC

108 points

17 comments3 min readEA link

(www.rand.org)

[Question] Pros and cons of setting up a company to do independent AIS research?

Eevee🔹13 Aug 2024 0:11 UTC

15 points

0 comments1 min readEA link

Qualities that alignment mentors value in junior researchers

Akash14 Feb 2023 23:27 UTC

31 points

1 comment3 min readEA link

Applications open: Support for talent working on independent learning, research or entrepreneurial projects focused on reducing global catastrophic risks

CEEALAR9 Feb 2024 13:04 UTC

63 points

1 comment2 min readEA link

CFP for Rebellion and Disobedience in AI workshop

Ram Rachum29 Dec 2022 16:09 UTC

4 points

0 comments1 min readEA link

Mesa-Optimization: Explain it like I’m 10 Edition

brook26 Aug 2023 23:06 UTC

10 points

1 comment6 min readEA link

(www.lesswrong.com)

AISN #38: Supreme Court Decision Could Limit Federal Ability to Regulate AI Plus, “Circuit Breakers” for AI systems, and updates on China’s AI industry

Center for AI Safety9 Jul 2024 19:29 UTC

8 points

0 comments5 min readEA link

(newsletter.safe.ai)

Orienting to 3 year AGI timelines

Nikola22 Dec 2024 23:07 UTC

121 points

15 comments8 min readEA link

Some talent needs in AI governance

Sam Clarke13 Jun 2023 13:53 UTC

133 points

10 comments8 min readEA link

The Hubinger lectures on AGI safety: an introductory lecture series

evhub22 Jun 2023 0:59 UTC

44 points

0 comments1 min readEA link

(www.youtube.com)

AISN #58: Senate Removes State AI Regulation Moratorium

Center for AI Safety3 Jul 2025 17:07 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

[Question] What harm could AI safety do?

SeanEngelhart15 May 2021 1:11 UTC

12 points

7 comments1 min readEA link

Washington Post article about EA university groups

Lizka5 Jul 2023 12:58 UTC

35 points

5 comments1 min readEA link

Compendium of problems with RLHF

Raphaël S30 Jan 2023 8:48 UTC

18 points

0 comments10 min readEA link

“Safety Culture for AI” is important, but isn’t going to be easy

Davidmanheim26 Jun 2023 11:27 UTC

53 points

0 comments2 min readEA link

(papers.ssrn.com)

Seeking (Paid) Case Studies on Standards

Holden Karnofsky26 May 2023 17:58 UTC

99 points

14 comments11 min readEA link

Pivotal Research is Hiring Research Managers

Tobias Häberli25 Sep 2024 19:11 UTC

8 points

0 comments3 min readEA link

Views on when AGI comes and on strategy to reduce existential risk

TsviBT8 Jul 2023 9:00 UTC

31 points

3 comments14 min readEA link

Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded

Garrison23 Oct 2024 23:42 UTC

57 points

4 comments7 min readEA link

(garrisonlovely.substack.com)

Join AISafety.info’s Distillation Hackathon (Oct 6-9th)

leillustrations🔸1 Oct 2023 18:42 UTC

27 points

2 comments2 min readEA link

(www.lesswrong.com)

An EA used deceptive messaging to advance her project; we need mechanisms to avoid deontologically dubious plans

MikhailSamin13 Feb 2024 23:11 UTC

19 points

39 comments5 min readEA link

AISN #59: EU Publishes General-Purpose AI Code of Practice

Center for AI Safety15 Jul 2025 18:32 UTC

8 points

0 comments4 min readEA link

(aisafety.substack.com)

Focusing your impact on short vs long TAI timelines

kuhanj30 Sep 2023 19:23 UTC

44 points

0 comments10 min readEA link

Humans are not prepared to operate outside their moral training distribution

Prometheus10 Apr 2023 21:44 UTC

12 points

0 comments3 min readEA link

Stop calling them labs

sawyer🔸24 Feb 2025 22:58 UTC

259 points

22 comments1 min readEA link

Rational Animations is looking for an AI Safety scriptwriter, a lead community manager, and other roles.

Writer16 Jun 2023 9:41 UTC

40 points

4 comments3 min readEA link

Projects I would like to see (possibly at AI Safety Camp)

Linda Linsefors27 Sep 2023 21:27 UTC

9 points

0 comments4 min readEA link

MIT FutureTech are hiring for a Technical Associate role

PeterSlattery9 Sep 2024 20:14 UTC

9 points

6 comments3 min readEA link

80,000 Hours is shifting its strategic approach to focus more on AGI

80000_Hours20 Mar 2025 11:24 UTC

232 points

121 comments8 min readEA link

Takeoff speeds presentation at Anthropic

Tom_Davidson4 Jun 2024 22:46 UTC

29 points

3 comments25 min readEA link

VIRTUA: a novel about AI alignment

Karl von Wendt12 Jan 2023 9:37 UTC

23 points

0 comments1 min readEA link

Introducing the new Riesgos Catastróficos Globales team

Jaime Sevilla3 Mar 2023 23:04 UTC

74 points

3 comments5 min readEA link

(riesgoscatastroficosglobales.com)

My highly personal skepticism braindump on existential risk from artificial intelligence.

NunoSempere23 Jan 2023 20:08 UTC

437 points

116 comments14 min readEA link

(nunosempere.com)

Disentangling arguments for the importance of AI safety

richard_ngo23 Jan 2019 14:58 UTC

63 points

14 comments8 min readEA link

Catastrophic Risks from AI #6: Discussion and FAQ

Center for AI Safety27 Jun 2023 23:23 UTC

10 points

0 comments4 min readEA link

(arxiv.org)

GDP per capita in 2050

Hauke Hillebrandt6 May 2024 15:14 UTC

130 points

11 comments16 min readEA link

(hauke.substack.com)

[Question] Why isn’t there a Charity Entrepreneurship program for AI Safety?

yanni4 Oct 2023 2:12 UTC

11 points

13 comments1 min readEA link

Epoch AI is Hiring an Operations Associate

merilalama3 May 2024 0:16 UTC

5 points

1 comment3 min readEA link

(careers.rethinkpriorities.org)

Kairos is hiring a Head of Operations/Founding Generalist

Agustín Covarrubias 🔸12 Mar 2025 20:58 UTC

59 points

1 comment5 min readEA link

SPAR seeks advisors and students for AI safety projects (Second Wave)

mic14 Sep 2023 23:09 UTC

14 points

0 comments1 min readEA link

 Some mistakes in thinking about AGI evolution and control

Remmelt1 Aug 2025 8:08 UTC

7 points

0 comments1 min readEA link

Techies Wanted: How STEM Backgrounds Can Advance Safe AI Policy

Daniel_Eth26 May 2025 11:29 UTC

41 points

1 comment29 min readEA link

Spreading messages to help with the most important century

Holden Karnofsky25 Jan 2023 20:35 UTC

129 points

21 comments18 min readEA link

(www.cold-takes.com)

How CISA can Support the Security of Large AI Models Against Theft [Grad School Assignment]

Marcel23 May 2023 15:36 UTC

7 points

0 comments13 min readEA link

Apply to SPAR Fall 2025—80+ projects!

Agustín Covarrubias 🔸30 Jul 2025 17:34 UTC

17 points

0 comments1 min readEA link

AISN #53: An Open Letter Attempts to Block OpenAI Restructuring

Center for AI Safety29 Apr 2025 15:56 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

DeepMind: Evaluating Frontier Models for Dangerous Capabilities

Zach Stein-Perlman21 Mar 2024 23:00 UTC

28 points

0 comments1 min readEA link

(arxiv.org)

[Question] Game theory work on AI alignment with diverse AI systems, human individuals, & human groups?

Geoffrey Miller2 Mar 2023 16:50 UTC

22 points

2 comments1 min readEA link

Funding circle aimed at slowing down AI—looking for participants

Greg_Colbourn ⏸️ 25 Jan 2024 23:58 UTC

92 points

3 comments2 min readEA link

Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint

Otto19 Apr 2023 10:50 UTC

75 points

1 comment4 min readEA link

Project ideas: Sentience and rights of digital minds

Lukas Finnveden4 Jan 2024 7:26 UTC

34 points

1 comment20 min readEA link

(www.forethought.org)

A recent write-up of the case for AI (existential) risk

Timsey18 May 2023 13:07 UTC

17 points

0 comments19 min readEA link

Announcing Manifund Regrants

Austin5 Jul 2023 19:42 UTC

217 points

51 comments4 min readEA link

(manifund.org)

Recruit the World’s best for AGI Alignment

Greg_Colbourn ⏸️ 30 Mar 2023 16:41 UTC

34 points

8 comments22 min readEA link

AISafety.world is a map of the AIS ecosystem

Hamish McDoodles6 Apr 2023 11:47 UTC

192 points

8 comments1 min readEA link

12 tentative ideas for US AI policy (Luke Muehlhauser)

Lizka19 Apr 2023 21:05 UTC

117 points

12 comments4 min readEA link

(www.openphilanthropy.org)

Will scaling work?

Vasco Grilo🔸4 Feb 2024 9:29 UTC

19 points

1 comment12 min readEA link

(www.dwarkeshpatel.com)

A note of caution about recent AI risk coverage

Sean_o_h7 Jun 2023 17:05 UTC

284 points

29 comments3 min readEA link

Debate series: should we push for a pause on the development of AI?

Ben_West🔸8 Sep 2023 16:29 UTC

252 points

58 comments1 min readEA link

AISN #54: OpenAI Updates Restructure Plan

Center for AI Safety13 May 2025 16:48 UTC

7 points

0 comments4 min readEA link

(newsletter.safe.ai)

Can the AI afford to wait?

Ben Millwood🔸20 Mar 2024 19:45 UTC

48 points

11 comments7 min readEA link

Value fragility and AI takeover

Joe_Carlsmith5 Aug 2024 21:28 UTC

39 points

3 comments30 min readEA link

Podcast with Yoshua Bengio on Why AI Labs are “Playing Dice with Humanity’s Future”

Garrison10 May 2024 17:23 UTC

29 points

3 comments2 min readEA link

(garrisonlovely.substack.com)

Announcing FAR Labs, an AI safety coworking space

Ben Goldhaber2 Oct 2023 20:15 UTC

63 points

0 comments1 min readEA link

(www.lesswrong.com)

Project idea: AI for epistemics

Benjamin_Todd19 May 2024 19:36 UTC

45 points

12 comments3 min readEA link

(benjamintodd.substack.com)

The current state of RSPs

Zach Stein-Perlman4 Nov 2024 16:00 UTC

19 points

1 comment9 min readEA link

A Gentle Introduction to Risk Frameworks Beyond Forecasting

pending_survival11 Apr 2024 9:15 UTC

83 points

4 comments27 min readEA link

Reading list on AI agents and associated policy

Peter Wildeford9 Aug 2024 17:40 UTC

79 points

2 comments1 min readEA link

Astronomical Waste & Conscientious Objection

Lydia Nottingham2 Aug 2025 22:41 UTC

5 points

0 comments2 min readEA link

CEA Should Invest in Helping Altruists Navigate Advanced AI

Chris Leong14 May 2023 14:52 UTC

4 points

13 comments2 min readEA link

Where I’m at with AI risk: convinced of danger but not (yet) of doom

Amber Dawn21 Mar 2023 13:23 UTC

62 points

16 comments6 min readEA link

Linkpost: Dwarkesh Patel interviewing Carl Shulman

Stefan_Schubert14 Jun 2023 15:30 UTC

110 points

5 comments1 min readEA link

(podcastaddict.com)

Transformative AGI by 2043 is <1% likely

Ted Sanders6 Jun 2023 15:51 UTC

98 points

92 comments5 min readEA link

(arxiv.org)

Using AI to Streamline Your Political Advocacy

Gabriel Sherman🔸29 Apr 2025 18:35 UTC

3 points

0 comments3 min readEA link

Apply to be a mentor in SPAR!

Agustín Covarrubias 🔸24 Jun 2025 23:00 UTC

25 points

0 comments1 min readEA link

New? Start here! (Useful links)

Lizka1 Jul 2022 21:19 UTC

28 points

1 comment2 min readEA link

AI companies are unlikely to make high-assurance safety cases if timelines are short

Ryan Greenblatt23 Jan 2025 18:41 UTC

45 points

1 comment13 min readEA link

Some quotes from Tuesday’s Senate hearing on AI

Daniel_Eth17 May 2023 12:13 UTC

105 points

7 comments4 min readEA link

List #3: Why not to assume on prior that AGI-alignment workarounds are available

Remmelt24 Dec 2022 9:54 UTC

6 points

0 comments3 min readEA link

The Retroactive Funding Landscape: Innovations for Donors and Grantmakers

Dawn Drescher29 Sep 2023 17:39 UTC

17 points

2 comments19 min readEA link

(impactmarkets.substack.com)

Navigating Risks from Advanced Artificial Intelligence: A Guide for Philanthropists [Founders Pledge]

Tom Barnes🔸21 Jun 2024 9:48 UTC

101 points

7 comments1 min readEA link

(www.founderspledge.com)

Pulse 2024: Attitudes towards artificial intelligence

Jamie E27 Nov 2024 11:33 UTC

62 points

4 comments3 min readEA link

AI Safety Hub Serbia Official Opening

Dušan D. Nešić (Dushan)28 Oct 2023 17:10 UTC

27 points

3 comments3 min readEA link

(forum.effectivealtruism.org)

Linkpost: Making deals with early schemers

Buck21 Jun 2025 16:34 UTC

20 points

0 comments1 min readEA link

Rolling Thresholds for AGI Scaling Regulation

Larks12 Jan 2025 1:30 UTC

60 points

4 comments6 min readEA link

[Question] Dan Hendrycks and EA

Caruso3 Aug 2024 13:49 UTC

−1 points

6 comments1 min readEA link

Bio-x-AI policy: call for ideas from the Federation of American Scientists

Ben Stewart15 Aug 2023 3:21 UTC

8 points

0 comments1 min readEA link

Reflections on my first year of AI safety research

Jay Bailey8 Jan 2024 7:49 UTC

64 points

2 comments12 min readEA link

2023 Alignment Research Updates from FAR AI

AdamGleave4 Dec 2023 22:32 UTC

14 points

0 comments8 min readEA link

(far.ai)

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan21 Sep 2021 0:41 UTC

62 points

0 comments1 min readEA link

(grants.futureoflife.org)

Un-unpluggability—can’t we just unplug it?

Oliver Sourbut15 May 2023 13:23 UTC

15 points

0 comments10 min readEA link

(www.oliversourbut.net)

Offering AI safety support calls for ML professionals

Vael Gates15 Feb 2024 23:48 UTC

52 points

1 comment1 min readEA link

[Linkpost] OpenAI is awarding ten 100k grants for building prototypes of a democratic process for steering AI

pseudonym26 May 2023 12:49 UTC

36 points

2 comments1 min readEA link

(openai.com)

Introducing Future Matters – a strategy consultancy

KyleGracey30 Sep 2023 2:06 UTC

59 points

2 comments5 min readEA link

AISN #50: AI Action Plan Responses

Center for AI Safety31 Mar 2025 20:07 UTC

10 points

0 comments6 min readEA link

(newsletter.safe.ai)

New report: “Scheming AIs: Will AIs fake alignment during training in order to get power?”

Joe_Carlsmith15 Nov 2023 17:16 UTC

71 points

4 comments30 min readEA link

Apply to the Cambridge ERA:AI Fellowship 2025

Harrison 🔸25 Mar 2025 13:46 UTC

28 points

0 comments3 min readEA link

Announcing the Introduction to ML Safety Course

TW1236 Aug 2022 2:50 UTC

136 points

4 comments7 min readEA link

[Question] Please help me sense-check my assumptions about the needs of the AI Safety community and related career plans

PeterSlattery27 Mar 2023 8:11 UTC

23 points

27 comments2 min readEA link

Newbie intro, ai/ei signatures, & a wco needing you...!!

Ebert Thinkingtinkerer25 Jun 2025 15:50 UTC

3 points

0 comments1 min readEA link

Beyond Control: The Strategic Case for AI Rights

Dawn Drescher12 Aug 2025 14:06 UTC

8 points

2 comments3 min readEA link

(impartial-priorities.org)

Ideas for AI labs: Reading list

Zach Stein-Perlman24 Apr 2023 19:00 UTC

28 points

2 comments4 min readEA link

A moral backlash against AI will probably slow down AGI development

Geoffrey Miller31 May 2023 21:31 UTC

146 points

22 comments14 min readEA link

Animal advocates should respond to transformative AI maybe arriving soon

Jamie_Harris2 Aug 2025 14:27 UTC

90 points

4 comments9 min readEA link

Deference on AI timelines: survey results

Sam Clarke30 Mar 2023 23:03 UTC

68 points

3 comments2 min readEA link

Release of UN’s draft related to the governance of AI (a summary of the Simon Institute’s response)

SebastianSchmidt27 Apr 2024 18:27 UTC

22 points

0 comments1 min readEA link

EA, Psychology & AI Safety Research

Sam Ellis26 May 2022 23:46 UTC

28 points

3 comments6 min readEA link

What would a compute monitoring plan look like? [Linkpost]

Akash26 Mar 2023 19:33 UTC

61 points

1 comment4 min readEA link

(arxiv.org)

How much money should we be saving for retirement?

Denkenberger🔸2 Mar 2025 6:21 UTC

22 points

6 comments2 min readEA link

Pros and Cons of boycotting paid Chat GPT

NickLaing18 Mar 2023 8:50 UTC

14 points

11 comments2 min readEA link

Potentially Useful Projects in Wise AI

Chris Leong5 Jun 2025 8:13 UTC

14 points

2 comments5 min readEA link

Should there be just one western AGI project?

rosehadshar4 Dec 2024 14:41 UTC

49 points

3 comments15 min readEA link

(www.forethought.org)

Owain Evans on LLMs, Truthful AI, AI Composition, and More

Ozzie Gooen2 May 2023 1:20 UTC

21 points

0 comments1 min readEA link

(quri.substack.com)

Selling out to AI companies is bad. Period. You will be corrupted.

Holly Elmore ⏸️ 🔸9 Apr 2025 3:56 UTC

2 points

23 comments1 min readEA link

Imitation Learning is Probably Existentially Safe

Vasco Grilo🔸30 Apr 2024 17:06 UTC

19 points

7 comments3 min readEA link

(www.openphilanthropy.org)

UK government to host first global summit on AI Safety

DavidNash8 Jun 2023 13:24 UTC

78 points

1 comment5 min readEA link

(www.gov.uk)

Intent alignment should not be the goal for AGI x-risk reduction

johnjnay26 Oct 2022 1:24 UTC

7 points

1 comment2 min readEA link

AI forecasting bots incoming

Center for AI Safety9 Sep 2024 19:55 UTC

−2 points

6 comments4 min readEA link

(www.safe.ai)

Aggregating Utilities for Corrigible AI [Feedback Draft]

Dan H12 May 2023 20:57 UTC

12 points

0 comments20 min readEA link

A rough and incomplete review of some of John Wentworth’s research

So8res28 Mar 2023 18:52 UTC

27 points

0 comments18 min readEA link

[Question] What would it look like for AIS to no longer be neglected?

Rockwell16 Jun 2023 15:59 UTC

100 points

14 comments1 min readEA link

Apply to fall policy internships (we can help)

ES2 Jul 2023 21:37 UTC

57 points

4 comments1 min readEA link

Geoffrey Miller on Cross-Cultural Understanding Between China and Western Countries as a Neglected Consideration in AI Alignment

Evan_Gaensbauer17 Apr 2023 3:26 UTC

25 points

2 comments4 min readEA link

Want to work on US emerging tech policy? Consider the Horizon Fellowship.

ES30 Jul 2024 11:46 UTC

32 points

0 comments1 min readEA link

Does AI Progress Have a Speed Limit?

Vasco Grilo🔸13 Jun 2025 16:22 UTC

15 points

1 comment19 min readEA link

(asteriskmag.com)

AIS Netherlands is looking for a Founding Executive Director (EOI form)

gergo19 Mar 2025 9:24 UTC

49 points

4 comments4 min readEA link

Survey on intermediate goals in AI governance

MichaelA🔸17 Mar 2023 12:44 UTC

156 points

4 comments1 min readEA link

Unjournal evaluation of “Towards best practices in AGI safety and governance” (Schuett et al, 2023)

david_reinstein3 Jun 2025 11:18 UTC

9 points

1 comment1 min readEA link

(unjournal.pubpub.org)

Quick survey on AI alignment resources

frances_lorenz30 Jun 2022 19:08 UTC

15 points

0 comments1 min readEA link

Excerpts from “Majority Leader Schumer Delivers Remarks To Launch SAFE Innovation Framework For Artificial Intelligence At CSIS”

Chris Leong21 Jul 2023 23:15 UTC

19 points

0 comments1 min readEA link

(www.democrats.senate.gov)

Box inversion revisited

Jan_Kulveit7 Nov 2023 11:09 UTC

13 points

1 comment8 min readEA link

The Selfish Machine

Vasco Grilo🔸15 Mar 2025 10:58 UTC

9 points

0 comments12 min readEA link

(maartenboudry.substack.com)

Levelling Up in AI Safety Research Engineering

GabeM2 Sep 2022 4:59 UTC

166 points

21 comments17 min readEA link

Scaling Laws and Likely Limits to AI

Davidmanheim18 Aug 2024 17:19 UTC

19 points

0 comments3 min readEA link

Tacit knowledge: how I exactly approach EAG(x) conferences

gergo4 Jun 2025 18:14 UTC

85 points

5 comments4 min readEA link

Thinking-in-limits about TAI from the demand perspective. Demand saturation, resource wars, new debt.

Ivan Madan7 Nov 2023 22:44 UTC

2 points

0 comments4 min readEA link

So You Want to Work at a Frontier AI Lab

Joe Rogero11 Jun 2025 23:11 UTC

35 points

2 comments7 min readEA link

(intelligence.org)

Frontier AI systems have surpassed the self-replicating red line

Greg_Colbourn ⏸️ 10 Dec 2024 16:33 UTC

25 points

14 comments1 min readEA link

(github.com)

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes

Owen Cotton-Barratt16 Apr 2024 10:08 UTC

80 points

15 comments8 min readEA link

(blog.aiimpacts.org)

AISN #56: Google Releases Veo 3

Center for AI Safety28 May 2025 15:57 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

Prioritising between extinction risks: Evidence Quality

freedomandutility30 Dec 2023 12:25 UTC

11 points

0 comments2 min readEA link

[Linkpost] ‘The Godfather of A.I.’ Leaves Google and Warns of Danger Ahead

imp4rtial 🔸1 May 2023 19:54 UTC

43 points

3 comments3 min readEA link

(www.nytimes.com)

Takes on “Alignment Faking in Large Language Models”

Joe_Carlsmith18 Dec 2024 18:22 UTC

72 points

1 comment62 min readEA link

The Navigation Fund launched + is hiring a program officer to lead the distribution of $20M annually for AI safety! Full-time, fully remote, pay starts at $200k

vincentweisser3 Nov 2023 21:53 UTC

120 points

3 comments1 min readEA link

AI and Evolution

Dan H30 Mar 2023 13:09 UTC

41 points

1 comment2 min readEA link

(arxiv.org)

“X distracts from Y” as a thinly-disguised fight over group status / politics

Steven Byrnes25 Sep 2023 15:29 UTC

91 points

9 comments8 min readEA link

Manifund: 2023 in Review

Austin18 Jan 2024 23:50 UTC

29 points

1 comment23 min readEA link

(manifund.substack.com)

UN Secretary-General recognises existential threat from AI

Greg_Colbourn ⏸️ 15 Jun 2023 17:03 UTC

58 points

1 comment1 min readEA link

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)

Agustín Covarrubias 🔸25 Oct 2024 21:59 UTC

81 points

2 comments2 min readEA link

Successif: Join our AI program to help mitigate the catastrophic risks of AI

ClaireB25 Oct 2023 16:51 UTC

15 points

0 comments5 min readEA link

AISN #36: Voluntary Commitments are Insufficient Plus, a Senate AI Policy Roadmap, and Chapter 1: An Overview of Catastrophic Risks

Center for AI Safety30 May 2024 18:23 UTC

6 points

0 comments5 min readEA link

(newsletter.safe.ai)

Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary

mic19 Aug 2023 2:32 UTC

18 points

1 comment6 min readEA link

(www.lesswrong.com)

AI Safety Newsletter #1 [CAIS Linkpost]

Akash10 Apr 2023 20:18 UTC

38 points

0 comments4 min readEA link

(newsletter.safe.ai)

The US-China Relationship and Catastrophic Risk (EAG Boston transcript)

EA Global9 Jul 2024 13:50 UTC

30 points

1 comment19 min readEA link

[Question] Has Anthropic already made the externally legible commitments that it planned to make?

Ofer12 Mar 2024 13:45 UTC

21 points

3 comments1 min readEA link

METR is hiring ML Research Engineers and Scientists

Ben_West🔸5 Jun 2024 21:25 UTC

18 points

2 comments1 min readEA link

(metr.org)

AI Safety University Organizing: Early Takeaways from Thirteen Groups

Agustín Covarrubias 🔸2 Oct 2024 14:39 UTC

46 points

3 comments9 min readEA link

Catastrophic Risks from AI #5: Rogue AIs

Center for AI Safety27 Jun 2023 22:06 UTC

16 points

1 comment22 min readEA link

(arxiv.org)

Yann LeCun on AGI and AI Safety

Chris Leong8 Aug 2023 23:43 UTC

23 points

4 comments1 min readEA link

(drive.google.com)

Introducing SyDFAIS: A Systemic Design Framework for AI Safety Field-Building

Moneer6 Feb 2025 14:26 UTC

19 points

6 comments14 min readEA link

Student competition for drafting a treaty on moratorium of large-scale AI capabilities R&D

Nayanika24 Apr 2023 13:15 UTC

36 points

4 comments2 min readEA link

AI Takeover Scenario with Scaled LLMs

simeon_c16 Apr 2023 23:28 UTC

29 points

1 comment8 min readEA link

All AGI Safety questions welcome (especially basic ones) [May 2023]

StevenKaas8 May 2023 22:30 UTC

19 points

11 comments2 min readEA link

METR: Measuring AI Ability to Complete Long Tasks

Ben_West🔸19 Mar 2025 16:49 UTC

122 points

16 comments1 min readEA link

(metr.org)

Civil disobedience opportunity—a way to help reduce chance of hard takeoff from recursive self improvement of code

JonCefalu25 Mar 2023 22:37 UTC

−5 points

0 comments1 min readEA link

(codegencodepoisoningcontest.cargo.site)

My “infohazards small working group” Signal Chat may have encountered minor leaks

Linch2 Apr 2025 1:03 UTC

109 points

2 comments5 min readEA link

Decoding Republican AI Policy: Insights from 10 Key Articles from Mid-2024

anonymous00718 Aug 2024 9:48 UTC

5 points

0 comments6 min readEA link

On the correspondence between AI-misalignment and cognitive dissonance using a behavioral economics model

Stijn Bruers 🔸1 Nov 2022 9:15 UTC

11 points

0 comments6 min readEA link

Why Did Elon Musk Just Offer to Buy Control of OpenAI for $100 Billion?

Garrison11 Feb 2025 0:20 UTC

152 points

2 comments6 min readEA link

(garrisonlovely.substack.com)

EA Hotel: Live Theory Workshop this month

CEEALAR7 Nov 2024 10:39 UTC

14 points

0 comments1 min readEA link

Collin Burns on Alignment Research And Discovering Latent Knowledge Without Supervision

Michaël Trazzi17 Jan 2023 17:21 UTC

21 points

2 comments4 min readEA link

(theinsideview.ai)

AI as a Constitutional Moment

atb28 May 2025 15:40 UTC

37 points

1 comment9 min readEA link

AGI Takeoff dynamics—Intelligence vs Quantity explosion

EdoArad26 Jul 2023 9:20 UTC

14 points

0 comments2 min readEA link

(github.com)

Apply to CEEALAR to do AGI moratorium work

Greg_Colbourn ⏸️ 26 Jul 2023 21:24 UTC

62 points

0 comments1 min readEA link

[Question] How committed to AGI safety are the current OpenAI nonprofit board members?

Eevee🔹2 Dec 2024 4:03 UTC

14 points

1 comment1 min readEA link

“Who Will You Be After ChatGPT Takes Your Job?”

Stephen Thomas21 Apr 2023 21:31 UTC

23 points

4 comments2 min readEA link

(www.wired.com)

What does it mean for an AGI to be ‘safe’?

So8res7 Oct 2022 4:43 UTC

53 points

21 comments3 min readEA link

Giving away copies of Uncontrollable by Darren McKee

Greg_Colbourn ⏸️ 14 Dec 2023 17:00 UTC

39 points

2 comments1 min readEA link

AI Consciousness Report: A Roundtable Discussion

Sofia_Fogel30 Aug 2023 21:50 UTC

18 points

0 comments2 min readEA link

[Question] Imagine AGI killed us all in three years. What would have been our biggest mistakes?

yanni kyriacos7 Apr 2023 0:06 UTC

17 points

6 comments1 min readEA link

AI Safety Field Building vs. EA CB

kuhanj26 Jun 2023 23:21 UTC

80 points

16 comments6 min readEA link

Distinguishing ways AI can be “concentrated”

Matthew_Barnett21 Oct 2024 22:14 UTC

30 points

1 comment4 min readEA link

Announcing the ITAM AI Futures Fellowship

AmAristizabal28 Jul 2023 16:44 UTC

43 points

3 comments2 min readEA link

OpenAI’s new structure

AnonymousTurtle27 Dec 2024 14:53 UTC

30 points

2 comments1 min readEA link

(openai.com)

SB-1047 Documentary: The Post-Mortem

Michaël Trazzi1 Aug 2025 21:44 UTC

60 points

1 comment5 min readEA link

AISN #60: The AI Action Plan

Center for AI Safety31 Jul 2025 18:10 UTC

6 points

0 comments7 min readEA link

(newsletter.safe.ai)

The Parable of the Boy Who Cried 5% Chance of Wolf

Kat Woods 🔶 ⏸️15 Aug 2022 14:22 UTC

80 points

8 comments2 min readEA link

Apply to Spring 2024 policy internships (we can help)

ES4 Oct 2023 14:45 UTC

26 points

2 comments1 min readEA link

How bad a future do ML researchers expect?

Katja_Grace13 Mar 2023 5:47 UTC

165 points

20 comments2 min readEA link

Saying ‘AI safety research is a Pascal’s Mugging’ isn’t a strong response

Robert_Wiblin15 Dec 2015 13:48 UTC

15 points

16 comments2 min readEA link

Free Linguistic Services for High-Impact Organisations

feijão8 Jun 2025 17:37 UTC

8 points

2 comments1 min readEA link

Defining alignment research

richard_ngo19 Aug 2024 22:49 UTC

48 points

1 comment7 min readEA link

Formation of Macrostrategy Refinement Division

Henry Stanley 🔸1 Apr 2025 14:16 UTC

23 points

0 comments2 min readEA link

Simulating Shutdown Code Activations in an AI Virus Lab

Miguel20 Jun 2023 5:27 UTC

4 points

0 comments6 min readEA link

AI governance talent profiles I’d like to see apply for OP funding

JulianHazell19 Dec 2023 12:34 UTC

119 points

4 comments3 min readEA link

(www.openphilanthropy.org)

Funding and job opportunities, events, and thoughts on professionals (Fieldbuilders newsletter #8)

gergo23 Apr 2025 9:53 UTC

7 points

0 comments3 min readEA link

Apply to be a mentor in SPAR!

Agustín Covarrubias 🔸5 Nov 2024 21:32 UTC

14 points

0 comments1 min readEA link

Safety tax functions

Owen Cotton-Barratt20 Oct 2024 14:13 UTC

23 points

1 comment6 min readEA link

(strangecities.substack.com)

AI Safety Seems Hard to Measure

Holden Karnofsky11 Dec 2022 1:31 UTC

90 points

4 comments14 min readEA link

(www.cold-takes.com)

Truth and Advantage: Response to a draft of “AI safety seems hard to measure”

So8res22 Mar 2023 3:36 UTC

11 points

0 comments5 min readEA link

What is it like doing AI safety work?

Kat Woods 🔶 ⏸️21 Feb 2023 19:24 UTC

99 points

2 comments10 min readEA link

The Grant Decision Boundary: Recent Cases from the Long-Term Future Fund

Linch29 Nov 2024 1:50 UTC

66 points

3 comments3 min readEA link

Upcoming Feedback Opportunity on Dual-Use Foundation Models

Chris Leong2 Nov 2023 4:30 UTC

9 points

0 comments1 min readEA link

An ‘AGI Emergency Eject Criteria’ consensus could be really useful.

tcelferact7 Apr 2023 16:21 UTC

27 points

3 comments1 min readEA link

Distinctions when Discussing Utility Functions

Ozzie Gooen8 Mar 2024 18:43 UTC

15 points

5 comments8 min readEA link

Announcing Timaeus

Stan van Wingerden22 Oct 2023 13:32 UTC

80 points

0 comments5 min readEA link

(www.lesswrong.com)

Pausing AI is Progress

Felix De Simone16 Jul 2024 22:28 UTC

22 points

3 comments6 min readEA link

(pauseai.substack.com)

[Question] Asking for online calls on AI s-risks discussions

jackchang11014 May 2023 13:58 UTC

26 points

3 comments1 min readEA link

EA Netherlands’ guide to AI safety careers

James Herbert16 Jan 2025 17:22 UTC

25 points

0 comments1 min readEA link

(effectiefaltruisme.nl)

Video and transcript of presentation on Otherness and control in the age of AGI

Joe_Carlsmith8 Oct 2024 22:30 UTC

18 points

1 comment27 min readEA link

CEA seeks co-founder for AI safety group support spin-off

Agustín Covarrubias 🔸8 Apr 2024 15:42 UTC

62 points

0 comments4 min readEA link

A widely shared AI productivity paper was retracted, is possibly fraudulent

titotal19 May 2025 10:18 UTC

34 points

4 comments3 min readEA link

[Question] What should I ask Ezra Klein about AI policy proposals?

Robert_Wiblin23 Jun 2023 16:36 UTC

21 points

4 comments1 min readEA link

Calling for Student Submissions: AI Safety Distillation Contest

a_e_r23 Apr 2022 20:24 UTC

102 points

28 comments3 min readEA link

Some reasons to start a project to stop harmful AI

Remmelt22 Aug 2024 16:23 UTC

5 points

0 comments2 min readEA link

[Linkpost] Jan Leike on three kinds of alignment taxes

Akash6 Jan 2023 23:57 UTC

29 points

0 comments3 min readEA link

(aligned.substack.com)

The AI industry turns against its favorite philosophy

Jonathan Yan22 Nov 2023 0:11 UTC

14 points

2 comments1 min readEA link

(www.semafor.com)

Taking a leave of absence from Open Philanthropy to work on AI safety

Holden Karnofsky23 Feb 2023 19:05 UTC

420 points

31 comments2 min readEA link

AISN#52: An Expert Virology Benchmark

Center for AI Safety22 Apr 2025 16:52 UTC

6 points

0 comments4 min readEA link

(newsletter.safe.ai)

Thoughts on “The Offense-Defense Balance Rarely Changes”

Cullen 🔸12 Feb 2024 3:26 UTC

42 points

4 comments5 min readEA link

[Part-time AI Safety Research Program] MARS 3.0 Applications Open for Participants & Recruiting Mentors

Cambridge AI Safety Hub7 May 2025 22:52 UTC

4 points

0 comments2 min readEA link

Theories of Change for Track II Diplomacy [Founders Pledge]

christian.r9 Jul 2024 13:31 UTC

21 points

2 comments33 min readEA link

[Question] Will OpenAI’s o3 reduce NVidia’s moat?

Ebenezer Dukakis3 Jan 2025 2:21 UTC

9 points

6 comments1 min readEA link

Insights from an expert survey about intermediate goals in AI governance

Sebastian Schwiecker17 Mar 2023 14:59 UTC

11 points

2 comments1 min readEA link

Funding Case: AI Safety Camp 11

Remmelt23 Dec 2024 8:39 UTC

42 points

2 comments6 min readEA link

(manifund.org)

“Diamondoid bacteria” nanobots: deadly threat or dead-end? A nanotech investigation

titotal29 Sep 2023 14:01 UTC

102 points

33 comments20 min readEA link

(titotal.substack.com)

Attend SPAR’s virtual demo day! (career fair + talks)

Agustín Covarrubias 🔸2 May 2025 23:45 UTC

17 points

1 comment2 min readEA link

(demoday.sparai.org)

DeepSeek Made it Even Harder for US AI Companies to Ever Reach Profitability

Garrison19 Feb 2025 21:02 UTC

30 points

1 comment3 min readEA link

(garrisonlovely.substack.com)

Fundraising for Mox: coworking & events in SF

Austin31 Mar 2025 18:25 UTC

37 points

3 comments6 min readEA link

(manifund.org)

Statement on AI Extinction—Signed by AGI Labs, Top Academics, and Many Other Notable Figures

Center for AI Safety30 May 2023 9:06 UTC

429 points

28 comments1 min readEA link

(www.safe.ai)

Still no strong evidence that LLMs increase bioterrorism risk

freedomandutility2 Nov 2023 21:23 UTC

58 points

9 comments1 min readEA link

[LW xpost] Unit economics of LLM APIs

dschwarz27 Aug 2024 16:55 UTC

19 points

2 comments1 min readEA link

(www.lesswrong.com)

US Congress introduces CREATE AI Act for establishing National AI Research Resource

Daniel_Eth28 Jul 2023 23:27 UTC

9 points

1 comment1 min readEA link

(eshoo.house.gov)

AISN #44: The Trump Circle on AI Safety Plus, Chinese researchers used Llama to create a military tool for the PLA, a Google AI system discovered a zero-day cybersecurity vulnerability, and Complex Systems

Center for AI Safety19 Nov 2024 16:36 UTC

11 points

0 comments5 min readEA link

(newsletter.safe.ai)

Anthropic’s submission to the White House’s RFI on AI policy

Agustín Covarrubias 🔸6 Mar 2025 22:47 UTC

48 points

7 comments1 min readEA link

(www.anthropic.com)

Briefly how I’ve updated since ChatGPT

rime25 Apr 2023 19:39 UTC

29 points

8 comments2 min readEA link

(www.lesswrong.com)

AI Safety & Entrepreneurship v1.0

Chris Leong26 Apr 2025 14:37 UTC

27 points

0 comments2 min readEA link

AIのタイムライン ─ 提案されている論証と「専門家」の立ち位置

EA Japan17 Aug 2023 14:59 UTC

2 points

0 comments1 min readEA link

Please, someone make a dataset of supposed cases of “tech panic”

Marcel27 Nov 2023 2:49 UTC

4 points

2 comments2 min readEA link

Survey on the acceleration risks of our new RFPs to study LLM capabilities

Ajeya10 Nov 2023 23:59 UTC

38 points

1 comment8 min readEA link

Bandgaps, Brains, and Bioweapons: The limitations of computational science and what it means for AGI

titotal26 May 2023 15:57 UTC

59 points

0 comments18 min readEA link

Five Years of Rethink Priorities: Impact, Future Plans, Funding Needs (July 2023)

Rethink Priorities18 Jul 2023 15:59 UTC

110 points

3 comments16 min readEA link

Women in AI Safety London Meetup

Nia1 Aug 2024 9:48 UTC

2 points

0 comments1 min readEA link

My attempt at explaining the case for AI risk in a straightforward way

JulianHazell25 Mar 2023 16:32 UTC

25 points

7 comments18 min readEA link

(muddyclothes.substack.com)

A Windfall Clause for CEO could worsen AI race dynamics

Larks9 Mar 2023 18:02 UTC

69 points

12 comments7 min readEA link

[Link Post: New York Times] White House Unveils Initiatives to Reduce Risks of A.I.

Rockwell4 May 2023 14:04 UTC

50 points

1 comment2 min readEA link

Why I’m Posting AI-Safety-Related Clips On TikTok

Michaël Trazzi12 Aug 2025 22:39 UTC

54 points

1 comment2 min readEA link

ARC Evals: Responsible Scaling Policies

Zach Stein-Perlman28 Sep 2023 4:30 UTC

16 points

1 comment2 min readEA link

(evals.alignment.org)

Palisade is hiring: Exec Assistant, Content Lead, Ops Lead, and Policy Lead

Charlie Rogers-Smith9 Oct 2024 0:04 UTC

15 points

2 comments4 min readEA link

Semi-conductor / AI stocks discussion.

sapphire25 Nov 2022 23:35 UTC

10 points

3 comments1 min readEA link

Recent progress on the science of evaluations

PabloAMC 🔸23 Jun 2025 9:49 UTC

11 points

0 comments8 min readEA link

(www.lesswrong.com)

Bernie Sanders (I-VT) mentions AI loss of control risk in Gizmodo interview

Matrice Jacobine14 Jul 2025 14:47 UTC

26 points

0 comments1 min readEA link

(gizmodo.com)

Californians, tell your reps to vote yes on SB 1047!

Holly Elmore ⏸️ 🔸12 Aug 2024 19:49 UTC

106 points

6 comments1 min readEA link

Response to Aschenbrenner’s “Situational Awareness”

RobBensinger6 Jun 2024 22:57 UTC

111 points

15 comments3 min readEA link

The U.S. and China Need an AI Incidents Hotline

christian.r3 Jun 2024 18:46 UTC

25 points

0 comments1 min readEA link

(www.lawfaremedia.org)

The EA case for Trump 2024

hamandcheese2 Aug 2024 19:32 UTC

−8 points

66 comments12 min readEA link

New open letter on AI — “Include Consciousness Research”

Jamie_Harris28 Apr 2023 7:50 UTC

55 points

1 comment3 min readEA link

(amcs-community.org)

“Artificial General Intelligence”: an extremely brief FAQ

Steven Byrnes11 Mar 2024 17:49 UTC

12 points

0 comments2 min readEA link

Reminder: AI Worldviews Contest Closes May 31

Jason Schukraft8 May 2023 17:40 UTC

20 points

0 comments1 min readEA link

An Exercise to Build Intuitions on AGI Risk

Lauro Langosco8 Jun 2023 11:20 UTC

4 points

0 comments8 min readEA link

(www.alignmentforum.org)

More people getting into AI safety should do a PhD

AdamGleave14 Mar 2024 22:14 UTC

50 points

4 comments12 min readEA link

(gleave.me)

New DeepMind report on institutions for global AI governance

finm14 Jul 2023 16:05 UTC

10 points

0 comments1 min readEA link

(www.deepmind.com)

kpurens’s Quick takes

kpurens11 Apr 2023 14:10 UTC

9 points

2 comments2 min readEA link

List of Masters Programs in Tech Policy, Public Policy and Security (Europe)

sberg29 May 2023 10:23 UTC

49 points

0 comments3 min readEA link

Technological developments that could increase risks from nuclear weapons: A shallow review

MichaelA🔸9 Feb 2023 15:41 UTC

79 points

3 comments5 min readEA link

(bit.ly)

AI Governance & Strategy: Priorities, talent gaps, & opportunities

Akash3 Mar 2023 18:09 UTC

21 points

0 comments4 min readEA link

Enhancing biosecurity with language models: defining research directions

mic26 Mar 2024 12:30 UTC

11 points

1 comment13 min readEA link

(papers.ssrn.com)

LANAIS (Latin American Network for AI Safety) kick-off

Fernando Avalos23 Jun 2025 14:34 UTC

28 points

0 comments2 min readEA link

Clarifying “wisdom”: Foundational topics for aligned AIs to prioritize before irreversible decisions

Anthony DiGiovanni20 Jun 2025 21:55 UTC

24 points

1 comment12 min readEA link

OpenAI o1

Zach Stein-Perlman12 Sep 2024 18:54 UTC

38 points

0 comments1 min readEA link

The Precipice Revisited

Toby_Ord12 Jul 2024 14:06 UTC

283 points

41 comments17 min readEA link

Moravec’s paradox and its implications

Vasco Grilo🔸29 Apr 2025 16:25 UTC

13 points

5 comments8 min readEA link

(epoch.ai)

NYT article about the Zizians including quotes from Eliezer, Anna, Ozy, Jessica, Zvi

Matrice Jacobine8 Jul 2025 1:42 UTC

2 points

0 comments1 min readEA link

(www.nytimes.com)

Notes and updates on GPT-5

Yadav9 Aug 2025 11:58 UTC

33 points

3 comments2 min readEA link

(robertgaurav.xyz)

Recursive Middle Manager Hell

Raemon17 Jan 2023 19:02 UTC

73 points

3 comments11 min readEA link

Benchmark Performance is a Poor Measure of Generalisable AI Reasoning Capabilities

James Fodor21 Feb 2025 4:25 UTC

12 points

3 comments24 min readEA link

[Link post] Michael Nielsen’s “Notes on Existential Risk from Artificial Superintelligence”

Joel Becker19 Sep 2023 13:31 UTC

38 points

1 comment6 min readEA link

(michaelnotebook.com)

Bryan Johnson seems more EA aligned than I expected

PeterSlattery22 Apr 2024 9:38 UTC

13 points

27 comments2 min readEA link

(www.youtube.com)

A Simple Model of AGI Deployment Risk

djbinder9 Jul 2021 9:44 UTC

30 points

0 comments5 min readEA link

2023: news on AI safety, animal welfare, global health, and more

Lizka5 Jan 2024 21:57 UTC

54 points

1 comment12 min readEA link

[Event] Building What the Future Needs: A curated conference in Berlin (Sep 6, 2025) for high-impact builders and researchers

Vasiliy Kondyrev8 Aug 2025 14:35 UTC

21 points

0 comments2 min readEA link

80,000 Hours is hiring for an Engagement Specialist

Bella25 Apr 2025 10:33 UTC

8 points

5 comments6 min readEA link

Jobs that can help with the most important century

Holden Karnofsky12 Feb 2023 18:19 UTC

57 points

2 comments32 min readEA link

(www.cold-takes.com)

Solving alignment isn’t enough for a flourishing future

mic2 Feb 2024 18:22 UTC

27 points

0 comments22 min readEA link

(papers.ssrn.com)

Paperclip Club (AI Safety Meetup)

Luke Thorburn20 Apr 2023 16:04 UTC

2 points

0 comments1 min readEA link

Sam Altman fired from OpenAI

Larks17 Nov 2023 21:07 UTC

133 points

89 comments1 min readEA link

(openai.com)

AI Views Snapshots

RobBensinger13 Dec 2023 0:45 UTC

25 points

0 comments1 min readEA link

How do we solve the alignment problem?

Joe_Carlsmith13 Feb 2025 18:27 UTC

28 points

1 comment7 min readEA link

(joecarlsmith.substack.com)

<$750k grants for General Purpose AI Assurance/Safety Research

Phosphorous13 Jun 2023 4:51 UTC

37 points

0 comments1 min readEA link

(cset.georgetown.edu)

The state of AI in different countries — an overview

Lizka14 Sep 2023 10:37 UTC

68 points

6 comments13 min readEA link

(aisafetyfundamentals.com)

Towards more cooperative AI safety strategies

richard_ngo16 Jul 2024 4:36 UTC

64 points

5 comments4 min readEA link

Quick takes on “AI is easy to control”

So8res2 Dec 2023 22:33 UTC

−12 points

4 comments4 min readEA link

Modeling the impact of AI safety field-building programs

Center for AI Safety10 Jul 2023 17:22 UTC

86 points

0 comments7 min readEA link

[Draft] The humble cosmologist’s P(doom) paradox

titotal16 Mar 2024 11:13 UTC

39 points

6 comments10 min readEA link

[Question] What is MIRI currently doing?

Roko14 Dec 2024 2:55 UTC

9 points

2 comments1 min readEA link

Framing AI strategy

Zach Stein-Perlman7 Feb 2023 20:03 UTC

16 points

0 comments1 min readEA link

(www.lesswrong.com)

Regrant up to $600,000 to AI safety projects with GiveWiki

Dawn Drescher28 Oct 2023 19:56 UTC

22 points

0 comments3 min readEA link

Public Comment Invited on Artificial Intelligence Action Plan

PeterSlattery3 Mar 2025 14:11 UTC

47 points

0 comments1 min readEA link

(www.whitehouse.gov)

AI Safety in a World of Vulnerable Machine Learning Systems

AdamGleave8 Mar 2023 2:40 UTC

20 points

0 comments29 min readEA link

(far.ai)

Will releasing the weights of large language models grant widespread access to pandemic agents?

Jeff Kaufman 🔸30 Oct 2023 17:42 UTC

56 points

18 comments1 min readEA link

(arxiv.org)

AI safety tax dynamics

Owen Cotton-Barratt23 Oct 2024 12:21 UTC

22 points

9 comments6 min readEA link

(strangecities.substack.com)

Cognitive assets and defensive acceleration

JulianHazell3 Apr 2024 14:55 UTC

13 points

3 comments4 min readEA link

(muddyclothes.substack.com)

AISN #49: Superintelligence Strategy

Center for AI Safety6 Mar 2025 17:43 UTC

8 points

0 comments5 min readEA link

(newsletter.safe.ai)

Futures with digital minds: Expert forecasts in 2025

Lucius Caviola16 Aug 2025 20:00 UTC

53 points

1 comment1 min readEA link

(digitalminds.report)

Experts’ AI timelines are longer than you have been told?

Vasco Grilo🔸9 Jan 2025 17:30 UTC

38 points

11 comments3 min readEA link

(bayes.net)

Deconstructing Bostrom’s Classic Argument for AI Doom

Nora Belrose11 Mar 2024 6:03 UTC

26 points

0 comments1 min readEA link

(www.youtube.com)

Local Detours On A Narrow Path: How might treaties fail in China?

Jack_S11 Aug 2025 20:33 UTC

8 points

0 comments14 min readEA link

(torchestogether.substack.com)

Is the time crunch for AI Safety Movement Building now?

Chris Leong8 Jun 2022 12:19 UTC

14 points

10 comments3 min readEA link

Careless talk on US-China AI competition? (and criticism of CAIS coverage)

Oliver Sourbut20 Sep 2023 12:46 UTC

52 points

19 comments9 min readEA link

(www.oliversourbut.net)

AI safety logo design contest, due end of May (extended)

Adrian Cipriani28 Apr 2023 2:53 UTC

13 points

23 comments2 min readEA link

On “slack” in training (Section 1.5 of “Scheming AIs”)

Joe_Carlsmith25 Nov 2023 17:51 UTC

14 points

1 comment5 min readEA link

Orthogonality is Expensive

𝕮𝖎𝖓𝖊𝖗𝖆3 Apr 2023 1:57 UTC

18 points

4 comments1 min readEA link

(www.beren.io)

Investigating an insurance-for-AI startup

L Rudolf L21 Sep 2024 15:29 UTC

40 points

1 comment15 min readEA link

(www.strataoftheworld.com)

On the future of language models

Owen Cotton-Barratt20 Dec 2023 16:58 UTC

125 points

3 comments36 min readEA link

AIxBio Newsletter #3 - At the Nexus

Andy Morgan 🔸7 Dec 2024 21:00 UTC

7 points

0 comments2 min readEA link

(atthenexus.substack.com)

LLMs as a Planning Overhang

Larks14 Jul 2024 4:57 UTC

49 points

3 comments2 min readEA link

Announcing the 2025 Q1 Pivotal Research Fellowship

Tobias Häberli2 Nov 2024 11:33 UTC

26 points

1 comment2 min readEA link

Which ML skills are useful for finding a new AIS research agenda?

Yonatan Cale9 Feb 2023 13:09 UTC

7 points

3 comments1 min readEA link

On “first critical tries” in AI alignment

Joe_Carlsmith5 Jun 2024 0:19 UTC

29 points

3 comments14 min readEA link

[Question] Why is Apart Research suddenly in dire need of funding?

Eevee🔹28 May 2025 7:43 UTC

97 points

11 comments1 min readEA link

Using Consensus Mechanisms as an approach to Alignment

Prometheus11 Jun 2023 13:24 UTC

14 points

0 comments6 min readEA link

How MATS addresses “mass movement building” concerns

Ryan Kidd4 May 2023 0:55 UTC

79 points

4 comments3 min readEA link

British public perception of existential risks

Jamie E25 Oct 2024 14:37 UTC

58 points

8 comments10 min readEA link

Google invests $300mn in artificial intelligence start-up Anthropic | FT

𝕮𝖎𝖓𝖊𝖗𝖆3 Feb 2023 19:43 UTC

155 points

5 comments1 min readEA link

(www.ft.com)

MIT FutureTech are hiring a Postdoctoral Associate to work on AI Performance and Safety

PeterSlattery8 Jul 2025 14:05 UTC

7 points

0 comments4 min readEA link

[Question] Is DeepSeek-R1 already better than o3 when inference costs are held constant?

Magnus Vinding24 Jan 2025 15:29 UTC

33 points

2 comments1 min readEA link

AIs accelerating AI research

Ajeya12 Apr 2023 11:41 UTC

84 points

7 comments4 min readEA link

[Question] Will AI Worldview Prize Funding Be Replaced?

Jordan Arel13 Nov 2022 17:10 UTC

26 points

4 comments1 min readEA link

[Linkpost] Situational Awareness—The Decade Ahead

MathiasKB🔸4 Jun 2024 22:58 UTC

87 points

7 comments2 min readEA link

(situational-awareness.ai)

[Video] - How does the EU AI Act Work?

Yadav11 Sep 2024 14:16 UTC

10 points

0 comments5 min readEA link

Apply to lead a project during the next virtual AI Safety Camp

Linda Linsefors13 Sep 2023 13:29 UTC

16 points

0 comments5 min readEA link

(aisafety.camp)

Incubating AI x-risk projects: some personal reflections

Ben Snodin19 Dec 2023 17:03 UTC

84 points

10 comments9 min readEA link

AISN #51: AI Frontiers

Center for AI Safety15 Apr 2025 15:46 UTC

8 points

1 comment5 min readEA link

(newsletter.safe.ai)

AI Fables Writing Contest Winners!

Daystar Eld6 Nov 2023 2:27 UTC

39 points

0 comments2 min readEA link

AI Safety Newsletter #41: The Next Generation of Compute Scale Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics

Center for AI Safety11 Sep 2024 19:11 UTC

12 points

0 comments5 min readEA link

(newsletter.safe.ai)

Biorisk is an Unhelpful Analogy for AI Risk

Davidmanheim6 May 2024 6:18 UTC

22 points

4 comments3 min readEA link

Implications of the inference scaling paradigm for AI safety

Ryan Kidd15 Jan 2025 0:59 UTC

47 points

5 comments5 min readEA link

Metaculus’ predictions are much better than low-information priors

Vasco Grilo🔸11 Apr 2023 8:36 UTC

53 points

0 comments6 min readEA link

13 Very Different Stances on AGI

Ozzie Gooen27 Dec 2021 23:30 UTC

84 points

23 comments3 min readEA link

[Question] Can we evaluate the “tool versus agent” AGI prediction?

Ben_West🔸8 Apr 2023 18:35 UTC

63 points

7 comments1 min readEA link

The goal-guarding hypothesis (Section 2.3.1.1 of “Scheming AIs”)

Joe_Carlsmith2 Dec 2023 15:20 UTC

6 points

1 comment12 min readEA link

Thoughts about AI safety field-building in LMIC

Renan Araujo23 Jun 2023 23:22 UTC

57 points

4 comments12 min readEA link

Is it time for a pause?

Kelsey Piper6 Apr 2023 11:48 UTC

103 points

6 comments5 min readEA link

EA Infosec: skill up in or make a transition to infosec via this book club

Jason Clinton5 Mar 2023 21:02 UTC

170 points

16 comments2 min readEA link

Have your say on the Australian Government’s AI Policy [Online #1]

Nathan Sherburn11 Jul 2023 0:35 UTC

3 points

0 comments1 min readEA link

China Hawks are Manufacturing an AI Arms Race

Garrison20 Nov 2024 18:17 UTC

103 points

3 comments5 min readEA link

(garrisonlovely.substack.com)

Announcing the Open Philanthropy AI Worldviews Contest

Jason Schukraft10 Mar 2023 2:33 UTC

137 points

33 comments3 min readEA link

(www.openphilanthropy.org)

How do AI welfare and AI safety interact?

Lucius Caviola1 Jul 2024 10:39 UTC

77 points

21 comments7 min readEA link

(outpaced.substack.com)

Metaculus Predicts Weak AGI in 2 Years and AGI in 10

Chris Leong24 Mar 2023 19:43 UTC

27 points

12 comments1 min readEA link

Beware popular discussions of AI “sentience”

David Mathers🔸8 Jun 2023 8:57 UTC

42 points

6 comments9 min readEA link

Announcing the Existential InfoSec Forum

calebp7 Jul 2023 21:08 UTC

90 points

1 comment2 min readEA link

Law & AI Dinner—EAG Boston 2023

Alfredo Parra 🔸12 Oct 2023 8:32 UTC

8 points

0 comments1 min readEA link

Three Quotes on Transformative Technology

Chris Leong1 Aug 2025 22:57 UTC

25 points

0 comments1 min readEA link

[Linkpost] The A.I. Dilemma—March 9, 2023, with Tristan Harris and Aza Raskin

PeterSlattery14 Apr 2023 8:00 UTC

38 points

3 comments41 min readEA link

(youtu.be)

Launch & Grow Your University Group: Apply now to OSP & FSP!

Agustín Covarrubias 🔸25 May 2024 1:03 UTC

61 points

0 comments2 min readEA link

11 heuristics for choosing (alignment) research projects

Akash27 Jan 2023 0:36 UTC

30 points

1 comment1 min readEA link

An Easily Overlooked Post on the Automation of Wisdom and Philosophy

Chris Leong12 Jun 2025 2:57 UTC

12 points

1 comment1 min readEA link

(blog.aiimpacts.org)

[Question] If your AGI x-risk estimates are low, what scenarios make up the bulk of your expectations for an OK outcome?

Greg_Colbourn ⏸️ 21 Apr 2023 11:15 UTC

65 points

55 comments1 min readEA link

[Question] Best giving multiplier for X-risk/AI safety?

SiebeRozendal27 Dec 2023 10:51 UTC

7 points

0 comments1 min readEA link

AI welfare vs. AI rights

Matthew_Barnett4 Feb 2025 18:28 UTC

37 points

20 comments3 min readEA link

AI safety field-building survey: Talent needs, infrastructure needs, and relationship to EA

michel27 Oct 2023 21:08 UTC

67 points

3 comments9 min readEA link

Publication of the International Scientific Report on the Safety of Advanced AI (Interm Report)

James Herbert21 May 2024 21:58 UTC

11 points

2 comments2 min readEA link

(www.gov.uk)

[Linkpost] “Governance of superintelligence” by OpenAI

Daniel_Eth22 May 2023 20:15 UTC

51 points

6 comments2 min readEA link

(openai.com)

A Guide to Forecasting AI Science Capabilities

Eleni_A29 Apr 2023 6:51 UTC

19 points

1 comment4 min readEA link

‘AI Emergency Eject Criteria’ Survey

tcelferact19 Apr 2023 21:55 UTC

5 points

4 comments1 min readEA link

ALTER Israel Mid-2025 Semiannual Update

Davidmanheim15 Jul 2025 7:47 UTC

12 points

1 comment5 min readEA link

New Metaculus Space for AI and X-Risk Related Questions

David Mathers🔸6 Sep 2024 11:37 UTC

16 points

0 comments1 min readEA link

Announcement: You can now listen to the “AI Safety Fundamentals” courses

peterhartree9 Jun 2023 16:32 UTC

101 points

8 comments1 min readEA link

12 career-related questions that may (or may not) be helpful for people interested in alignment research

Akash12 Dec 2022 22:36 UTC

14 points

0 comments2 min readEA link

Rethink Priorities is hiring a Compute Governance Researcher or Research Assistant

MichaelA🔸7 Jun 2023 13:22 UTC

36 points

2 comments8 min readEA link

(careers.rethinkpriorities.org)

Munk AI debate: confusions and possible cruxes

Steven Byrnes27 Jun 2023 15:01 UTC

142 points

10 comments8 min readEA link

Updating Drexler’s CAIS model

Matthew_Barnett17 Jun 2023 1:57 UTC

59 points

0 comments4 min readEA link

Moral Alignment: An Idea I’m Embarrassed I Didn’t Think of Myself

Gordon Seidoh Worley18 Jun 2025 15:42 UTC

27 points

5 comments2 min readEA link

Biomimetic alignment: Alignment between animal genes and animal brains as a model for alignment between humans and AI systems.

Geoffrey Miller26 May 2023 21:25 UTC

32 points

1 comment16 min readEA link

CEEALAR: 2024 Update

CEEALAR19 Jul 2024 11:14 UTC

116 points

7 comments4 min readEA link

[Question] Strongest real-world examples supporting AI risk claims?

rosehadshar5 Sep 2023 15:11 UTC

52 points

9 comments1 min readEA link

The Case for Journalism on AI

michel19 Feb 2025 19:45 UTC

95 points

5 comments4 min readEA link

Navigating AI Risks (NAIR) #1: Slowing Down AI

simeon_c14 Apr 2023 14:35 UTC

12 points

1 comment1 min readEA link

(navigatingairisks.substack.com)

Thread: Reflections on the AGI Safety Fundamentals course?

Clifford18 May 2023 13:11 UTC

27 points

7 comments1 min readEA link

Action: Help expand funding for AI Safety by coordinating on NSF response

Evan R. Murphy20 Jan 2022 20:48 UTC

20 points

7 comments3 min readEA link

Positions at MITFutureTech

PeterSlattery19 Dec 2023 20:28 UTC

21 points

1 comment4 min readEA link

[Question] What is AI Safety’s line of retreat?

Remmelt28 Jul 2024 5:43 UTC

4 points

2 comments1 min readEA link

Project ideas: Governance during explosive technological growth

Lukas Finnveden4 Jan 2024 7:25 UTC

37 points

1 comment16 min readEA link

(www.forethought.org)

Branding AI Safety Groups: A Field Guide

Agustín Covarrubias 🔸13 May 2024 17:17 UTC

44 points

6 comments7 min readEA link

Risks I am Concerned About

HappyBunny29 Apr 2024 23:41 UTC

1 point

1 comment1 min readEA link

[urgent] Americans: call your Senators and tell them you oppose AI preemption

Holly Elmore ⏸️ 🔸15 May 2025 1:57 UTC

176 points

22 comments2 min readEA link

The crucible — how I think about the situation with AI

Owen Cotton-Barratt5 May 2025 13:19 UTC

37 points

0 comments8 min readEA link

(strangecities.substack.com)

New voluntary commitments (AI Seoul Summit)

Zach Stein-Perlman21 May 2024 11:00 UTC

12 points

1 comment7 min readEA link

(www.gov.uk)

Ideas for improving epistemics in AI safety outreach

mic21 Aug 2023 19:56 UTC

31 points

0 comments3 min readEA link

(www.lesswrong.com)

Tentatively against making AIs ‘wise’

OscarD🔸14 Jul 2024 18:32 UTC

9 points

4 comments3 min readEA link

Impact Assessment of AI Safety Camp (Arb Research)

Sam Holton23 Jan 2024 16:32 UTC

87 points

23 comments11 min readEA link

The case for more Alignment Target Analysis (ATA)

Chi20 Sep 2024 1:14 UTC

25 points

0 comments17 min readEA link

Artificial Intelligence, Conscious Machines, and Animals: Broadening AI Ethics

Group Organizer21 Sep 2023 20:58 UTC

4 points

0 comments1 min readEA link

AI Safety is Sometimes a Model Property

Cullen 🔸2 May 2024 15:38 UTC

18 points

1 comment1 min readEA link

(open.substack.com)

Announcing Open Philanthropy’s AI governance and policy RFP

JulianHazell17 Jul 2024 0:25 UTC

73 points

2 comments1 min readEA link

(www.openphilanthropy.org)

Apply for mentorship in AI Safety field-building

Akash17 Sep 2022 19:03 UTC

21 points

0 comments1 min readEA link

Staged release

Zach Stein-Perlman20 Apr 2024 1:00 UTC

16 points

0 comments2 min readEA link

Thoughts on SB-1047

Ryan Greenblatt30 May 2024 0:19 UTC

53 points

4 comments11 min readEA link

AI can solve all EA problems, so why keep focusing on them?

Cody Albert3 May 2025 21:51 UTC

8 points

15 comments1 min readEA link

Race to the Top: Benchmarks for AI Safety

isaduan4 Dec 2022 22:50 UTC

52 points

8 comments1 min readEA link

New ‘South Park’ episode on AI & Chat GPT

Geoffrey Miller21 Mar 2023 20:06 UTC

13 points

1 comment1 min readEA link

LLMs might not be the future of search: at least, not yet.

James-Hartree-Law22 Jan 2025 21:40 UTC

4 points

1 comment4 min readEA link

Aspiration-based, non-maximizing AI agent designs

Bob Jacobs7 May 2024 16:13 UTC

12 points

1 comment38 min readEA link

Beware safety-washing

Lizka13 Jan 2023 10:39 UTC

143 points

7 comments4 min readEA link

Notes on nukes, IR, and AI from “Arsenals of Folly” (and other books)

tlevin4 Sep 2023 19:02 UTC

21 points

2 comments6 min readEA link

[TIME magazine] DeepMind’s CEO Helped Take AI Mainstream. Now He’s Urging Caution (Perrigo, 2023)

Will Aldred20 Jan 2023 20:37 UTC

93 points

0 comments1 min readEA link

(time.com)

AI Safety Newsletter #37: US Launches Antitrust Investigations Plus, recent criticisms of OpenAI and Anthropic, and a summary of Situational Awareness

Center for AI Safety18 Jun 2024 18:08 UTC

15 points

0 comments5 min readEA link

(newsletter.safe.ai)

I’m hiring a Research Assistant for a nonfiction book on AI!

Garrison26 Mar 2025 19:46 UTC

63 points

2 comments1 min readEA link

(garrisonlovely.substack.com)

Interpretability Will Not Reliably Find Deceptive AI

Neel Nanda4 May 2025 16:32 UTC

74 points

0 comments7 min readEA link

What the Headlines Miss About the Latest Decision in the Musk vs. OpenAI Lawsuit

Garrison6 Mar 2025 19:49 UTC

87 points

9 comments6 min readEA link

(garrisonlovely.substack.com)

Case studies on social-welfare-based standards in various industries

Holden Karnofsky20 Jun 2024 13:33 UTC

73 points

2 comments1 min readEA link

Cambridge Boston Alignment Initiative Summer Research Fellowship in AI Safety (Deadline: May 18)

PeterSlattery12 May 2025 16:15 UTC

14 points

2 comments1 min readEA link

20 Critiques of AI Safety That I Found on Twitter

Daniel Kirmani23 Jun 2022 15:11 UTC

14 points

13 comments1 min readEA link

Announcing Epoch’s dashboard of key trends and figures in Machine Learning

Jaime Sevilla13 Apr 2023 7:33 UTC

127 points

4 comments1 min readEA link

(epochai.org)

Victoria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël Trazzi12 Jan 2023 17:09 UTC

16 points

0 comments4 min readEA link

(www.theinsideview.ai)

Call for Papers on Global AI Governance from the UN

Chris Leong20 Aug 2023 8:56 UTC

36 points

1 comment1 min readEA link

(www.linkedin.com)

Cost-effectiveness of student programs for AI safety research

Center for AI Safety10 Jul 2023 17:23 UTC

53 points

7 comments15 min readEA link

ML4G Germany—AI Alignment Camp

Evander H. 🔸19 Jun 2023 7:24 UTC

17 points

1 comment1 min readEA link

[Question] What is the counterfactual value of different AI Safety professionals?

PabloAMC 🔸3 Jul 2024 14:38 UTC

6 points

2 comments1 min readEA link

Have your say on the Australian Government’s AI Policy [Brisbane]

Michael Noetel 🔸9 Jun 2023 0:15 UTC

6 points

0 comments1 min readEA link

List #2: Why coordinating to align as humans to not develop AGI is a lot easier than, well… coordinating as humans with AGI coordinating to be aligned with humans

Remmelt24 Dec 2022 9:53 UTC

3 points

0 comments3 min readEA link

Animal Advocacy in the Age of AI

Constance Li27 Jul 2023 7:08 UTC

65 points

4 comments6 min readEA link

An Analogy for Understanding Transformers

Callum McDougall13 May 2023 12:20 UTC

7 points

0 comments9 min readEA link

Linkpost: 7 A.I. Companies Agree to Safeguards After Pressure From the White House

MHR🔸21 Jul 2023 13:23 UTC

61 points

4 comments1 min readEA link

(www.nytimes.com)

Have your say on the Australian Government’s AI Policy

Nathan Sherburn17 Jul 2023 11:02 UTC

3 points

1 comment1 min readEA link

Gavin Newsom vetoes SB 1047

Larks30 Sep 2024 0:06 UTC

39 points

14 comments1 min readEA link

(www.wsj.com)

IAPS: Mapping Technical Safety Research at AI Companies

Zach Stein-Perlman24 Oct 2024 20:30 UTC

24 points

0 comments1 min readEA link

(www.iaps.ai)

Don’t Call It AI Alignment

Gil20 Feb 2023 5:27 UTC

16 points

7 comments2 min readEA link

Why I think AI take-off is relatively slow

Vasco Grilo🔸17 Aug 2025 9:11 UTC

26 points

0 comments3 min readEA link

(marginalrevolution.com)

AI 2027: What Superintelligence Looks Like (Linkpost)

Manuel Allgaier11 Apr 2025 10:31 UTC

51 points

3 comments42 min readEA link

(ai-2027.com)

ML4Good is seeking partner organisations, individual organisers and TAs

Nia13 May 2024 13:43 UTC

22 points

0 comments3 min readEA link

Are there enough opportunities for AI safety specialists?

mhint19913 May 2023 21:18 UTC

8 points

2 comments3 min readEA link

AI and the feeling of living in two worlds

michel10 Oct 2024 17:51 UTC

40 points

3 comments7 min readEA link

ALTER Israel End-of-2024 Update

Davidmanheim7 Jan 2025 15:07 UTC

38 points

1 comment4 min readEA link

Video and transcript of talk on automating alignment research

Joe_Carlsmith30 Apr 2025 17:43 UTC

11 points

1 comment24 min readEA link

(joecarlsmith.com)

[Linkpost] Prospect Magazine—How to save humanity from extinction

jackva26 Sep 2023 19:16 UTC

32 points

2 comments1 min readEA link

(www.prospectmagazine.co.uk)

AGI in sight: our look at the game board

Andrea_Miotti18 Feb 2023 22:17 UTC

25 points

18 comments6 min readEA link

(andreamiotti.substack.com)

Guardrails vs Goal-directedness in AI Alignment

freedomandutility30 Dec 2023 12:58 UTC

13 points

2 comments1 min readEA link

How ARENA course material gets made

Callum McDougall2 Jul 2024 7:27 UTC

12 points

0 comments7 min readEA link

What is it to solve the alignment problem? (Notes)

Joe_Carlsmith24 Aug 2024 21:19 UTC

32 points

1 comment53 min readEA link

Want to win the AGI race? Solve alignment.

leopold29 Mar 2023 15:19 UTC

56 points

6 comments5 min readEA link

(www.forourposterity.com)

A Friendly Face (Another Failure Story)

Karl von Wendt20 Jun 2023 10:31 UTC

22 points

8 comments16 min readEA link

Many AI governance proposals have a tradeoff between usefulness and feasibility

Akash3 Feb 2023 18:49 UTC

22 points

0 comments2 min readEA link

Archetypal Transfer Learning: a Proposed Alignment Solution that solves the Inner x Outer Alignment Problem while adding Corrigible Traits to GPT-2-medium

Miguel26 Apr 2023 0:40 UTC

13 points

0 comments10 min readEA link

AI, centralization, and the One Ring

Owen Cotton-Barratt13 Sep 2024 13:56 UTC

20 points

0 comments8 min readEA link

(strangecities.substack.com)

Emerging Technologies: More to explore

EA Handbook1 Jan 2021 11:06 UTC

4 points

0 comments2 min readEA link

[MLSN #9] Verifying large training runs, security risks from LLM access to APIs, why natural selection may favor AIs over humans

TW12311 Apr 2023 16:05 UTC

18 points

0 comments6 min readEA link

(newsletter.mlsafety.org)

AI Tools for Existential Security

Lizka14 Mar 2025 18:37 UTC

64 points

10 comments11 min readEA link

(www.forethought.org)

We’re all in this together

Tamsin Leake5 Dec 2023 13:57 UTC

15 points

1 comment2 min readEA link

OpenAI’s new Preparedness team is hiring

leopold26 Oct 2023 20:41 UTC

85 points

13 comments1 min readEA link

AGI and the EMH: markets are not expecting aligned or unaligned AI in the next 30 years

basil.halperin10 Jan 2023 16:05 UTC

342 points

177 comments26 min readEA link

Announcing the Pivotal Research Fellowship – Apply Now!

Tobias Häberli3 Apr 2024 17:30 UTC

51 points

5 comments2 min readEA link

[Question] AI+bio cannot be half of AI catastrophe risk, right?

Benevolent_Rain10 Oct 2023 3:17 UTC

23 points

11 comments2 min readEA link

Sam Altman / Open AI Discussion Thread

John Salter20 Nov 2023 9:21 UTC

40 points

36 comments1 min readEA link

Applications are now open for Intro to ML Safety Spring 2023

Joshc4 Nov 2022 22:45 UTC

49 points

1 comment2 min readEA link

AI Rights for Human Safety

Matthew_Barnett3 Aug 2024 0:47 UTC

56 points

1 comment1 min readEA link

(papers.ssrn.com)

Summary: Against the singularity hypothesis

Global Priorities Institute22 May 2024 11:05 UTC

46 points

15 comments4 min readEA link

(globalprioritiesinstitute.org)

[Question] Alignment & Capabilities: What’s the difference?

John G. Halstead31 Aug 2023 22:13 UTC

50 points

10 comments1 min readEA link

The Human Biological Advantage Over AI

William Stewart18 Nov 2024 11:18 UTC

−1 points

0 comments1 min readEA link

Where are the red lines for AI?

Karl von Wendt5 Aug 2022 9:41 UTC

13 points

3 comments6 min readEA link

SB 1047 was vetoed, but public commentary now can assist future AI safety legislation

ThomasW2 Oct 2024 18:10 UTC

38 points

0 comments1 min readEA link

Announcing AISafety.info’s Write-a-thon (June 16-18) and Second Distillation Fellowship (July 3-October 2)

StevenKaas3 Jun 2023 2:03 UTC

12 points

1 comment2 min readEA link

AISN #13: An interdisciplinary perspective on AI proxy failures, new competitors to ChatGPT, and prompting language models to misbehave

Center for AI Safety5 Jul 2023 15:33 UTC

25 points

0 comments9 min readEA link

(newsletter.safe.ai)

Performance comparison of Large Language Models (LLMs) in code generation and application of best practices in frontend web development

Diana V. Guaiña A.1 May 2025 14:57 UTC

5 points

0 comments24 min readEA link

AGI Safety Communications Initiative

Ines11 Jun 2022 16:30 UTC

35 points

6 comments1 min readEA link

First call for EA Data Science/ML/AI

astrastefania23 Aug 2022 19:37 UTC

29 points

0 comments1 min readEA link

Why don’t governments seem to mind that companies are explicitly trying to make AGIs?

Ozzie Gooen23 Dec 2021 7:08 UTC

82 points

49 comments2 min readEA link

Blake Richards on Why he is Skeptical of Existential Risk from AI

Michaël Trazzi14 Jun 2022 19:11 UTC

63 points

14 comments4 min readEA link

(theinsideview.ai)

Graphical Representations of Paul Christiano’s Doom Model

Nathan Young7 May 2023 13:03 UTC

48 points

2 comments1 min readEA link

Alignment ideas inspired by human virtue development

Borys Pikalov18 May 2025 9:36 UTC

6 points

0 comments4 min readEA link

The Hidden Complexity of Wishes—The Animation

Writer27 Sep 2023 17:59 UTC

7 points

0 comments1 min readEA link

(youtu.be)

How Open Source Machine Learning Software Shapes AI

Max L28 Sep 2022 17:49 UTC

11 points

3 comments15 min readEA link

(maxlangenkamp.me)

[Question] Do you think the probability of future AI sentience(suffering) is >0.1%? Why?

jackchang11010 Jul 2023 16:41 UTC

4 points

0 comments1 min readEA link

Desirable? AI qualities

brb24321 Mar 2022 22:05 UTC

7 points

0 comments2 min readEA link

Minimizing suffering & ASI xrisk through brain digitization

Amy Louise Johnson20 Feb 2025 21:08 UTC

1 point

0 comments1 min readEA link

Tony Blair Institute AI Safety Work

TomWestgarth13 Jun 2023 13:16 UTC

88 points

2 comments6 min readEA link

(www.institute.global)

A better “Statement on AI Risk?” [Crosspost]

Knight Lee30 Dec 2024 7:36 UTC

4 points

0 comments3 min readEA link

Stanford summer course: Economics of Transformative AI

trammell23 Jan 2025 23:07 UTC

83 points

4 comments1 min readEA link

[Linkpost] Michael Nielsen remarks on ‘Oppenheimer’

Tom Barnes🔸31 Aug 2023 15:41 UTC

83 points

1 comment2 min readEA link

(michaelnotebook.com)

Asya Bergal: Reasons you might think human-level AI is unlikely to happen soon

EA Global26 Aug 2020 16:01 UTC

24 points

2 comments17 min readEA link

(www.youtube.com)

Road to AnimalHarmBench

Artūrs Kaņepājs1 Jul 2025 13:37 UTC

134 points

11 comments7 min readEA link

Questions for further investigation of AI diffusion

Ben Cottier21 Dec 2022 13:50 UTC

28 points

0 comments11 min readEA link

[AN #80]: Why AI risk might be solved without additional intervention from longtermists

Rohin Shah3 Jan 2020 7:52 UTC

58 points

12 comments10 min readEA link

(www.alignmentforum.org)

A New Way to Rethink Alignment

Taylor Grogan28 Jul 2025 20:56 UTC

1 point

0 comments2 min readEA link

NeurIPS ML Safety Workshop 2022

Dan H26 Jul 2022 15:33 UTC

72 points

0 comments1 min readEA link

(neurips2022.mlsafety.org)

“Taking AI Risk Seriously” – Thoughts by Andrew Critch

Raemon19 Nov 2018 2:21 UTC

26 points

9 comments1 min readEA link

(www.lesswrong.com)

A Different Approach to Community Building: The Spiral Path to Impact

ezrah23 May 2023 18:41 UTC

46 points

4 comments8 min readEA link

L’importanza delle IA come possibile minaccia per l’umanità

EA Italy17 Jan 2023 22:24 UTC

1 point

0 comments1 min readEA link

(www.vox.com)

A selection of some writings and considerations on the cause of artificial sentience

Raphaël_Pesah10 Aug 2023 18:23 UTC

49 points

1 comment10 min readEA link

Humans and Machines: Heaven or Hell?

Alex (Αλέξανδρος)12 Jul 2025 8:04 UTC

4 points

1 comment9 min readEA link

Better than logarithmic returns to reasoning?

Oliver Sourbut30 Jul 2025 0:50 UTC

6 points

1 comment2 min readEA link

Loving a world you don’t trust

Joe_Carlsmith18 Jun 2024 19:31 UTC

65 points

7 comments33 min readEA link

How to build AI you can actually Trust—Like a Medical Team, Not a Black Box

Ihor Ivliev22 Mar 2025 21:27 UTC

2 points

1 comment4 min readEA link

[Question] Is AI x-risk becoming a distraction?

Non-zero-sum James27 Feb 2025 20:33 UTC

2 points

0 comments1 min readEA link

Seeking input on a list of AI books for broader audience

Darren McKee27 Feb 2023 22:40 UTC

49 points

14 comments5 min readEA link

Contest: 250€ for translation of “longtermism” to German

constructive1 Jun 2022 19:59 UTC

18 points

30 comments1 min readEA link

My take on AI risk (7 theses of eugene)

meugen21 Mar 2025 3:02 UTC

0 points

1 comment2 min readEA link

Beyond Short-Termism: How δ and w Can Realign AI with Our Values

Beyond Singularity18 Jun 2025 16:34 UTC

15 points

8 comments5 min readEA link

The V&V method—A step towards safer AGI

Yoav Hollander24 Jun 2025 15:57 UTC

1 point

0 comments1 min readEA link

(blog.foretellix.com)

But exactly how complex and fragile?

Katja_Grace13 Dec 2019 7:05 UTC

37 points

3 comments3 min readEA link

(meteuphoric.com)

LLMs are weirder than you think

Derek Shiller20 Nov 2024 13:39 UTC

64 points

3 comments22 min readEA link

Announcing The Most Important Century Writing Prize

michel31 Oct 2022 21:37 UTC

48 points

0 comments2 min readEA link

Adversarial Prompting and Simulated Context Drift in Large Language Models

Tyler Williams11 Jul 2025 21:49 UTC

1 point

0 comments2 min readEA link

[Question] What will be some of the most impactful applications of advanced AI in the near term?

IanDavidMoss3 Mar 2022 15:26 UTC

16 points

7 comments1 min readEA link

The Happiness Maximizer: Why EA is an x-risk

Obasi Shaw30 Aug 2022 4:29 UTC

8 points

5 comments32 min readEA link

What we learned from running an Australian AI Safety Unconference

Alexander Saeri26 Oct 2023 0:46 UTC

34 points

0 comments5 min readEA link

Infinite Rewards, Finite Safety: New Models for AI Motivation Without Infinite Goals

Whylome Team12 Nov 2024 7:21 UTC

−5 points

1 comment2 min readEA link

[Question] What do you mean with ‘alignment is solvable in principle’?

Remmelt17 Jan 2025 15:03 UTC

10 points

1 comment1 min readEA link

Response to “Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers”

Matthew Wearden30 Oct 2023 12:49 UTC

7 points

1 comment6 min readEA link

(matthewwearden.co.uk)

Values and control

dotsam4 Aug 2022 18:28 UTC

3 points

1 comment1 min readEA link

How to engage with AI 4 Social Justice actors

TomWestgarth26 Apr 2022 8:39 UTC

13 points

5 comments1 min readEA link

SB 1047 Simplified

Gabe K25 Sep 2024 12:00 UTC

14 points

0 comments4 min readEA link

Why I’m Sceptical of Foom

𝕮𝖎𝖓𝖊𝖗𝖆8 Dec 2022 10:01 UTC

22 points

7 comments3 min readEA link

Announcing the EA Project Ideas Database

Joe Rogero22 Jun 2023 20:20 UTC

14 points

4 comments1 min readEA link

New reference standard on LLM Application security started by OWASP

QuantumForest19 Jun 2023 19:56 UTC

5 points

0 comments1 min readEA link

Why I am no longer thinking about/working on AI safety

jbkjr6 May 2024 20:00 UTC

−8 points

0 comments4 min readEA link

(www.lesswrong.com)

Christiano and Yudkowsky on AI predictions and human intelligence

EliezerYudkowsky23 Feb 2022 16:51 UTC

31 points

0 comments42 min readEA link

What is OpenAI’s plan for making AI Safer?

brook1 Sep 2023 11:15 UTC

8 points

1 comment4 min readEA link

(aisafetyexplained.substack.com)

Ajeya’s TAI timeline shortened from 2050 to 2040

Zach Stein-Perlman3 Aug 2022 0:00 UTC

59 points

2 comments1 min readEA link

(www.lesswrong.com)

Speculating on Secret Intelligence Explosions

calebp5 Jun 2025 13:55 UTC

20 points

5 comments8 min readEA link

Neartermists should consider AGI timelines in their spending decisions

Tristan Cook26 Jul 2022 17:01 UTC

68 points

4 comments4 min readEA link

[Question] Help us design the interface for aisafety.com

Kim Holder23 Oct 2023 17:27 UTC

9 points

0 comments1 min readEA link

Learning as much Deep Learning math as I could in 24 hours

Phosphorous8 Jan 2023 2:19 UTC

58 points

6 comments7 min readEA link

PSA: Saying “1 in 5” Is Better Than “20%” When Informing about risks publicly

Blanka30 Jan 2025 19:03 UTC

17 points

1 comment1 min readEA link

[Question] Designing user authentication protocols

Kinoshita Yoshikazu (pseudonym)13 Mar 2023 15:56 UTC

−1 points

2 comments1 min readEA link

Tetherware #2: What every human should know about our most likely AI future

Jáchym Fibír28 Feb 2025 11:25 UTC

3 points

0 comments11 min readEA link

(tetherware.substack.com)

“Develop Anthropomorphic AGI to Save Humanity from Itself” (Future Fund AI Worldview Prize submission)

ketanrama5 Nov 2022 17:57 UTC

19 points

6 comments7 min readEA link

DeepMind’s generalist AI, Gato: A non-technical explainer

frances_lorenz16 May 2022 21:19 UTC

128 points

13 comments6 min readEA link

Against GDP as a metric for timelines and takeoff speeds

kokotajlod29 Dec 2020 17:50 UTC

47 points

6 comments14 min readEA link

Microsoft Plans to Invest $10B in OpenAI; $3B Invested to Date | Fortune

𝕮𝖎𝖓𝖊𝖗𝖆10 Jan 2023 23:43 UTC

25 points

2 comments2 min readEA link

(fortune.com)

AI’s goals may not match ours

Vishakha Agrawal28 May 2025 12:07 UTC

2 points

0 comments3 min readEA link

We’re Not Ready: thoughts on “pausing” and responsible scaling policies

Holden Karnofsky27 Oct 2023 15:19 UTC

150 points

23 comments8 min readEA link

[Question] Is it valuable to the field of AI Safety to have a neuroscience background?

Samuel Nellessen3 Apr 2022 19:44 UTC

18 points

3 comments1 min readEA link

Rohin Shah: What’s been happening in AI alignment?

EA Global29 Jul 2020 20:15 UTC

18 points

0 comments14 min readEA link

(www.youtube.com)

My motivation and theory of change for working in AI healthtech

Andrew Critch12 Oct 2024 0:36 UTC

47 points

1 comment14 min readEA link

Fundamentals of Fatal Risks

Aino29 Jul 2023 7:12 UTC

1 point

0 comments4 min readEA link

Video and transcript of presentation on Scheming AIs

Joe_Carlsmith22 Mar 2024 15:56 UTC

23 points

1 comment32 min readEA link

Expanding EA’s AI Builder Community—Writing about my job

Alejandro Acelas 🔸21 Jul 2025 8:22 UTC

26 points

0 comments6 min readEA link

Paths and waystations in AI safety

Joe_Carlsmith11 Mar 2025 18:52 UTC

22 points

2 comments11 min readEA link

(joecarlsmith.substack.com)

Analysis of Progress in Speech Recognition Models

MiguelA16 Sep 2024 15:56 UTC

8 points

1 comment12 min readEA link

Tarbell Fellowship 2025 - Applications Open (AI Journalism)

Tarbell Center for AI Journalism8 Jan 2025 15:25 UTC

62 points

0 comments1 min readEA link

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_c7 Apr 2022 13:54 UTC

25 points

0 comments7 min readEA link

If interpretability research goes well, it may get dangerous

So8res3 Apr 2023 21:48 UTC

33 points

0 comments2 min readEA link

A Phylogeny of Agents

Jonas Hallgren 🔸15 Aug 2025 10:48 UTC

6 points

1 comment6 min readEA link

(substack.com)

Towards AI Safety Infrastructure: Talk & Outline

Paul Bricman7 Jan 2024 9:35 UTC

14 points

1 comment2 min readEA link

(www.youtube.com)

AI timelines by bio anchors: the debate in one place

Will Aldred30 Jul 2022 23:04 UTC

93 points

6 comments2 min readEA link

EA Explorer GPT: A New Tool to Explore Effective Altruism

Vlad_Tislenko12 Nov 2023 15:36 UTC

12 points

1 comment1 min readEA link

Implications of the Whitehouse meeting with AI CEOs for AI superintelligence risk—a first-step towards evals?

Jamie B7 May 2023 17:33 UTC

78 points

3 comments7 min readEA link

Superforecasting the premises in “Is power-seeking AI an existential risk?”

Joe_Carlsmith18 Oct 2023 20:33 UTC

114 points

3 comments2 min readEA link

On the compute governance era and what has to come after (Lennart Heim on The 80,000 Hours Podcast)

80000_Hours23 Jun 2023 20:11 UTC

37 points

0 comments18 min readEA link

What should AI safety be trying to achieve?

EuanMcLean23 May 2024 11:28 UTC

13 points

1 comment13 min readEA link

Can we safely automate alignment research?

Joe_Carlsmith30 Apr 2025 17:37 UTC

13 points

1 comment48 min readEA link

(joecarlsmith.com)

Human Values and AGI Risk | William James

William James31 Mar 2023 22:30 UTC

1 point

0 comments12 min readEA link

Some thoughts from a University AI Debate

Charlie Harrison20 Mar 2024 17:03 UTC

26 points

2 comments1 min readEA link

Crypto ‘oracle protocols’ for AI alignment with real-world data?

Geoffrey Miller22 Sep 2022 23:05 UTC

9 points

3 comments1 min readEA link

Could ASI Have Existed Since the Big Bang?

Aaron Li31 Jan 2025 13:20 UTC

−13 points

0 comments1 min readEA link

Asymmetries, AI and Animal Advocacy

Kevin Xia 🔸16 May 2025 6:16 UTC

64 points

6 comments5 min readEA link

The animals and humans analogy for AI risk

freedomandutility13 Aug 2022 15:35 UTC

5 points

2 comments1 min readEA link

Who owns AI-generated content?

Johan S Daniel7 Dec 2022 3:03 UTC

−2 points

0 comments2 min readEA link

A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines.

NunoSempere16 Aug 2022 14:44 UTC

81 points

40 comments5 min readEA link

(nunosempere.com)

Cognitive Stress Testing Gemini 2.5 Pro: Empirical Findings from Recursive Prompting

Tyler Williams23 Jul 2025 22:37 UTC

1 point

0 comments2 min readEA link

Mitigating existential risks associated with human nature and AI: Thoughts on serious measures.

Linyphia25 Mar 2023 19:10 UTC

2 points

2 comments3 min readEA link

Where on the continuum of pure EA to pure AIS should you be? (Uni Group Organizers Focus)

jessica_mccurdy🔸26 Jun 2023 23:46 UTC

44 points

0 comments5 min readEA link

[Question] How might a herd of interns help with AI or biosecurity research tasks/questions?

Marcel220 Mar 2022 22:49 UTC

30 points

8 comments2 min readEA link

London Working Group for Short/Medium Term AI Risks

scronkfinkle7 Apr 2025 15:30 UTC

5 points

0 comments2 min readEA link

Descartes’ 17th century Turing Test

James-Hartree-Law16 Jan 2025 20:18 UTC

3 points

0 comments7 min readEA link

The Case for AI Safety Advocacy to the Public

Holly Elmore ⏸️ 🔸20 Sep 2023 12:03 UTC

258 points

58 comments14 min readEA link

[Question] [Discussion] How Broad is the Human Cognitive Spectrum?

𝕮𝖎𝖓𝖊𝖗𝖆7 Jan 2023 0:59 UTC

16 points

1 comment2 min readEA link

Chris Olah on working at top AI labs without an undergrad degree

80000_Hours10 Sep 2021 20:46 UTC

15 points

0 comments73 min readEA link

Transformative AI and wild animals: An exploration.

mal_graham🔸24 Apr 2025 17:48 UTC

84 points

8 comments25 min readEA link

How I learned to stop worrying and love skill trees

Clark Urzo23 May 2023 8:03 UTC

22 points

3 comments1 min readEA link

(www.lesswrong.com)

AGI Ruin: A List of Lethalities

EliezerYudkowsky6 Jun 2022 23:28 UTC

162 points

53 comments30 min readEA link

(www.lesswrong.com)

Open Phil releases RFPs on LLM Benchmarks and Forecasting

Lawrence Chan11 Nov 2023 3:01 UTC

12 points

0 comments2 min readEA link

(www.openphilanthropy.org)

Open-source LLMs may prove Bostrom’s vulnerable world hypothesis

Roope Ahvenharju14 Apr 2023 9:25 UTC

14 points

2 comments1 min readEA link

It takes 5 layers and 1000 artificial neurons to simulate a single biological neuron [Link]

Michael St Jules 🔸7 Sep 2021 21:53 UTC

44 points

17 comments2 min readEA link

Are AI safetyists crying wolf?

sarahhw8 Jan 2025 20:54 UTC

61 points

21 comments16 min readEA link

(longerramblings.substack.com)

A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]

TW1239 May 2022 17:15 UTC

97 points

2 comments35 min readEA link

Artificial intelligence career stories

EA Global25 Oct 2020 6:56 UTC

12 points

0 comments1 min readEA link

(www.youtube.com)

[Question] Intellectual property of AI and existential risk in general?

WillPearson11 Jun 2024 13:50 UTC

3 points

3 comments1 min readEA link

Consider Preordering If Anyone Builds It, Everyone Dies

peterbarnett12 Aug 2025 22:03 UTC

44 points

4 comments2 min readEA link

Claude vs GPT

Maxwell Tabarrok14 Mar 2024 12:44 UTC

14 points

1 comment2 min readEA link

(www.maximum-progress.com)

Credo AI is hiring for AI Gov Researcher & more!

IanEisenberg15 Aug 2023 21:10 UTC

8 points

0 comments3 min readEA link

By default, capital will matter more than ever after AGI

L Rudolf L28 Dec 2024 17:52 UTC

113 points

3 comments16 min readEA link

(nosetgauge.substack.com)

[Question] BenevolentAI—an effectively impactful company?

Jack Hilton11 Oct 2022 14:35 UTC

16 points

11 comments1 min readEA link

Misalignment or misuse? The AGI alignment tradeoff

Max_He-Ho20 Jun 2025 10:41 UTC

6 points

0 comments1 min readEA link

(www.arxiv.org)

How quick and big would a software intelligence explosion be?

Tom_Davidson5 Aug 2025 15:47 UTC

12 points

2 comments34 min readEA link

On excluding dangerous information from training

ShayBenMoshe17 Nov 2023 20:09 UTC

8 points

0 comments3 min readEA link

(www.lesswrong.com)

Redwood Research is hiring for several roles (Operations and Technical)

JJXWang14 Apr 2022 15:23 UTC

45 points

0 comments1 min readEA link

[Question] Is there a public tracker depicting at what dates AI has been able to automate x% of cognitive tasks (weighted by 2020 economic value)?

Mitchell Laughlin🔸17 Feb 2024 4:52 UTC

12 points

4 comments1 min readEA link

Wentworth and Larsen on buying time

Akash9 Jan 2023 21:31 UTC

48 points

0 comments12 min readEA link

AGI in a vulnerable world

AI Impacts2 Apr 2020 3:43 UTC

17 points

0 comments1 min readEA link

(aiimpacts.org)

Why policymakers should beware claims of new “arms races” (Bulletin of the Atomic Scientists)

christian.r14 Jul 2022 13:38 UTC

55 points

1 comment1 min readEA link

(thebulletin.org)

[Question] Can AI safely exist at all?

Hayven Frienby27 Nov 2023 17:33 UTC

6 points

7 comments2 min readEA link

[Question] I’m interviewing the author of ‘Not Born Yesterday’ — Hugo Mercier. He argues people are less gullible and more savvy than you think. What should I ask him?

Robert_Wiblin17 Nov 2023 17:43 UTC

16 points

3 comments1 min readEA link

Promethean Governance Ascendant: Lessons from the Forge and Visions for the Cosmic Polity

Paul Fallavollita23 Mar 2025 0:54 UTC

−9 points

0 comments3 min readEA link

Oren’s Field Guide of Bad AGI Outcomes

Oren Montano26 Sep 2022 8:59 UTC

1 point

0 comments1 min readEA link

AI-based disinformation is probably not a major threat to democracy

Dan Williams24 Feb 2024 20:01 UTC

63 points

8 comments10 min readEA link

New publication “Compassionate Governance” + launch webinar

jonleighton23 Jun 2025 13:16 UTC

9 points

0 comments1 min readEA link

Aspiring Jr. AI safety researchers: what’s stopping you? | Survey

carolinaollive29 Oct 2024 11:27 UTC

14 points

0 comments1 min readEA link

“Cotton Gin” AI Risk

42317524 Sep 2022 23:04 UTC

6 points

2 comments2 min readEA link

[Question] Looking to interview AI Safety researchers for a book

Caruso24 Aug 2024 20:01 UTC

6 points

0 comments1 min readEA link

[Question] Best introductory overviews of AGI safety?

JakubK13 Dec 2022 19:04 UTC

21 points

8 comments2 min readEA link

(www.lesswrong.com)

AI Alignment, Sentience, and the Sense of Coherence Concept

Jason Babb17 Mar 2025 13:30 UTC

4 points

0 comments1 min readEA link

AI governance tracker of each country per region

Alix Ramillon24 Jul 2024 17:39 UTC

16 points

2 comments23 min readEA link

#208 – The case that TV shows, movies, and novels can improve the world (Elizabeth Cox on The 80,000 Hours Podcast)

80000_Hours22 Nov 2024 11:36 UTC

10 points

0 comments17 min readEA link

METR is hiring!

ElizabethBarnes26 Dec 2023 21:03 UTC

50 points

0 comments1 min readEA link

(www.lesswrong.com)

AI alignment as a translation problem

Roman Leventov5 Feb 2024 14:14 UTC

3 points

1 comment3 min readEA link

[Question] Books and lecture series relevant to AI governance?

MichaelA🔸18 Jul 2021 15:54 UTC

22 points

8 comments1 min readEA link

Training Data Attribution: Examining Its Adoption & Use Cases

Deric Cheng22 Jan 2025 15:40 UTC

18 points

1 comment3 min readEA link

(www.convergenceanalysis.org)

How the Human Psychological “Program” Undermines AI Alignment — and What We Can Do

Beyond Singularity6 May 2025 13:37 UTC

14 points

2 comments3 min readEA link

The Existential Risk of Speciesist Bias in AI

Sam Tucker-Davis11 Nov 2023 3:27 UTC

38 points

1 comment3 min readEA link

AI Safety Collab 2025 - Local Organizer Sign-ups Open

Evander H. 🔸12 Feb 2025 11:27 UTC

15 points

0 comments1 min readEA link

Likelihood of an anti-AI backlash: Results from a preliminary Twitter poll

Geoffrey Miller27 Sep 2022 22:01 UTC

27 points

13 comments1 min readEA link

Infographics report risk management of Artificial Intelligence in Spain

JorgeTorresC10 Jul 2023 14:44 UTC

16 points

0 comments1 min readEA link

(riesgoscatastroficosglobales.com)

#184 – Sleeping on sleeper agents, and the biggest AI updates since ChatGPT (Zvi Mowshowitz on the 80,000 Hours Podcast)

80000_Hours12 Apr 2024 12:22 UTC

46 points

0 comments20 min readEA link

Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon

Esben Kran19 Apr 2024 14:46 UTC

24 points

0 comments6 min readEA link

(www.apartresearch.com)

Redwood Research is hiring for several roles

Jack R29 Nov 2021 0:18 UTC

75 points

0 comments1 min readEA link

Minecraft As An Effective Advocacy Strategy And Cause Area

Kenneth_Diao1 Apr 2025 19:12 UTC

15 points

0 comments4 min readEA link

My reflections on doing a research fellowship

Yadav13 Jun 2025 10:41 UTC

11 points

1 comment5 min readEA link

The Pugwash Conferences and the Anti-Ballistic Missile Treaty as a case study of Track II diplomacy

rani_martin16 Sep 2022 10:42 UTC

82 points

5 comments27 min readEA link

Being nicer than Clippy

Joe_Carlsmith16 Jan 2024 19:44 UTC

26 points

3 comments27 min readEA link

Data Publication for the 2021 Artificial Intelligence, Morality, and Sentience (AIMS) Survey

Janet Pauketat24 Mar 2022 15:43 UTC

21 points

0 comments3 min readEA link

(www.sentienceinstitute.org)

Truthful AI

Owen Cotton-Barratt20 Oct 2021 15:11 UTC

55 points

14 comments10 min readEA link

What are Responsible Scaling Policies (RSPs)?

Vishakha Agrawal5 Apr 2025 16:05 UTC

2 points

0 comments2 min readEA link

(www.lesswrong.com)

[Link] EAF Research agenda: “Cooperation, Conflict, and Transformative Artificial Intelligence”

stefan.torges17 Jan 2020 13:28 UTC

64 points

0 comments1 min readEA link

Developing a Calculable Conscience for AI: Equation for Rights Violations

Sean Sweeney12 Dec 2024 17:50 UTC

4 points

1 comment15 min readEA link

We’re not prepared for an AI market crash

Remmelt1 Apr 2025 4:33 UTC

23 points

4 comments2 min readEA link

Michael Page, Dario Amodei, Helen Toner, Tasha McCauley, Jan Leike, & Owen Cotton-Barratt: Musings on AI

EA Global11 Aug 2017 8:19 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

Introducing the Mental Health Roadmap Series

Emily11 Apr 2023 22:26 UTC

18 points

2 comments2 min readEA link

Don’t expect AGI anytime soon

cveres10 Oct 2022 22:38 UTC

0 points

19 comments1 min readEA link

How I Came To Longtermism On My Own & An Outsider Perspective On EA Longtermism

Jordan Arel7 Aug 2022 2:42 UTC

34 points

2 comments20 min readEA link

#201 – Why your robot butler isn’t here yet (Ken Goldberg on The 80,000 Hours Podcast)

80000_Hours13 Sep 2024 17:41 UTC

21 points

0 comments12 min readEA link

AI Governance Career Paths for Europeans

careersthrowaway16 May 2020 6:40 UTC

83 points

1 comment12 min readEA link

Consider keeping your threat models private.

Miles Kodama1 Feb 2025 0:29 UTC

18 points

2 comments4 min readEA link

How to mitigate sandbagging

Teun van der Weij23 Mar 2025 17:19 UTC

3 points

0 comments8 min readEA link

Announcing Manival

Lydia Nottingham18 Jun 2025 18:14 UTC

20 points

3 comments2 min readEA link

AI Impacts Quarterly Newsletter, Apr-Jun 2023

Harlan18 Jul 2023 18:01 UTC

4 points

0 comments3 min readEA link

(blog.aiimpacts.org)

AI Open Source Debate Comes Down to Trust in Institutions, and AI Policy Makers Should Consider How We Can Foster It

another-anon-do-gooder20 Jan 2024 13:47 UTC

6 points

2 comments1 min readEA link

EA megaprojects continued

mariushobbhahn3 Dec 2021 10:33 UTC

183 points

48 comments7 min readEA link

CAISH Hiring: AI Safety Policy Fellowship Facilitators

Chloe Li17 Jan 2024 9:21 UTC

13 points

1 comment1 min readEA link

EU’s importance for AI governance is conditional on AI trajectories—a case study

MathiasKB🔸13 Jan 2022 14:58 UTC

31 points

2 comments3 min readEA link

We might be missing some key feature of AI takeoff; it’ll probably seem like “we could’ve seen this coming”

Dane Valerie16 May 2024 12:05 UTC

15 points

0 comments5 min readEA link

(www.lesswrong.com)

deleted

funnyfranco18 Mar 2025 19:19 UTC

3 points

9 comments1 min readEA link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe_Carlsmith28 Oct 2024 21:57 UTC

18 points

0 comments32 min readEA link

Analysis of AI Safety surveys for field-building insights

Ash Jafari5 Dec 2022 17:37 UTC

30 points

7 comments5 min readEA link

The Game Board has been Flipped: Now is a good time to rethink what you’re doing

LintzA28 Jan 2025 21:20 UTC

390 points

69 comments13 min readEA link

[Question] How many people are neartermist and have high P(doom)?

Sanjay2 Aug 2023 14:24 UTC

52 points

13 comments1 min readEA link

Gentleness and the artificial Other

Joe_Carlsmith2 Jan 2024 18:21 UTC

90 points

2 comments11 min readEA link

[linkpost] Ten Levels of AI Alignment Difficulty

SammyDMartin4 Jul 2023 11:23 UTC

16 points

0 comments1 min readEA link

Helen Toner: The Open Philanthropy Project’s work on AI risk

EA Global3 Nov 2017 7:43 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

Metaculus is building a team dedicated to AI forecasting

christian18 Oct 2022 16:08 UTC

35 points

0 comments1 min readEA link

(apply.workable.com)

[Question] Are social media algorithms an existential risk?

Barry Grimes15 Sep 2020 8:52 UTC

24 points

13 comments1 min readEA link

What’s important in “AI for epistemics”?

Lukas Finnveden24 Aug 2024 1:27 UTC

71 points

1 comment28 min readEA link

(www.forethought.org)

Restricting brain organoid research to slow down AGI

freedomandutility9 Nov 2022 13:01 UTC

8 points

2 comments1 min readEA link

AI Safety Microgrant Round

Chris Leong14 Nov 2022 4:25 UTC

81 points

3 comments3 min readEA link

Shahar Avin: Near-term AI security risks, and what to do about them

EA Global3 Nov 2017 7:43 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

Talking to Congress: Can constituents contacting their legislator influence policy?

Tristan Williams7 Mar 2024 9:24 UTC

47 points

3 comments19 min readEA link

Against the Open Source / Closed Source Dichotomy: Regulated Source as a Model for Responsible AI Development

Alexander Herwix 🔸4 Sep 2023 20:23 UTC

5 points

1 comment6 min readEA link

Public Explainer on AI as an Existential Risk

AndrewDoris7 Oct 2022 19:23 UTC

13 points

4 comments15 min readEA link

Introducing the AI for Animals newsletter

Max Taylor21 Jun 2024 13:24 UTC

40 points

0 comments1 min readEA link

Announcement: Learning Theory Online Course

Yegreg28 Jan 2025 8:32 UTC

5 points

0 comments3 min readEA link

(www.lesswrong.com)

Project Idea: The cost of Coccidiosis on Chicken farming and if AI can help

Max Harris26 Sep 2022 16:30 UTC

25 points

8 comments2 min readEA link

AGI x Animal Welfare: A High-EV Outreach Opportunity?

simeon_c28 Jun 2023 20:44 UTC

80 points

16 comments1 min readEA link

Harry Blackwood, A Researcher

Connor Wood18 Apr 2025 3:05 UTC

−7 points

0 comments2 min readEA link

LLMs Outperform Experts on Challenging Biology Benchmarks

ljusten14 May 2025 16:09 UTC

24 points

1 comment1 min readEA link

(substack.com)

The Inevitable Emergence of Black-Market LLM Infrastructure

Tyler Williams8 Aug 2025 19:05 UTC

1 point

0 comments2 min readEA link

Barthes in the Age of AI: Intertextuality, Authorship, and the Plural Text

Rodo8 Jun 2025 15:59 UTC

−3 points

0 comments3 min readEA link

Advice for Entering AI Safety Research

stecas2 Jun 2023 20:46 UTC

14 points

1 comment5 min readEA link

Applications Open for the Cooperative AI Summer School 2025!

C Tilli13 Jan 2025 12:31 UTC

25 points

0 comments1 min readEA link

The AIA and its Brussels Effect

Kathryn O'Rourke27 Dec 2022 16:01 UTC

16 points

0 comments5 min readEA link

Optimism, AI risk, and EA blind spots

Justis28 Sep 2022 17:21 UTC

87 points

21 comments8 min readEA link

AI Safety researcher career review

Benjamin_Todd23 Nov 2021 0:00 UTC

13 points

1 comment6 min readEA link

(80000hours.org)

G7 Summit—Cooperation on AI Policy

Leonard_Barrett19 May 2023 10:10 UTC

22 points

2 comments1 min readEA link

(www.japantimes.co.jp)

Creating an Artificial Sense of Touch: Revolutionizing Medical Training and Robotic Surgery

Connor Wood15 Apr 2025 2:01 UTC

−1 points

0 comments7 min readEA link

Some reasons to not say “Doomer”

Ruby9 Jul 2023 21:05 UTC

28 points

0 comments4 min readEA link

A visualization of some orgs in the AI Safety Pipeline

Aaron_Scher10 Apr 2022 16:52 UTC

11 points

8 comments1 min readEA link

Why AI alignment could be hard with modern deep learning

Ajeya21 Sep 2021 15:35 UTC

157 points

17 comments14 min readEA link

(www.cold-takes.com)

BERI, Epoch, and FAR will explain their work & current job openings online this Sunday

Rockwell19 Aug 2022 20:34 UTC

7 points

0 comments1 min readEA link

A Survey of the Potential Long-term Impacts of AI

Sam Clarke18 Jul 2022 9:48 UTC

63 points

2 comments27 min readEA link

New Book: ‘Nexus’ by Yuval Noah Harari

timfarkas3 Oct 2024 13:54 UTC

15 points

2 comments5 min readEA link

Transformative AI and Compute [Summary]

lennart23 Sep 2021 13:53 UTC

65 points

5 comments9 min readEA link

AI & Drug Discovery—Security and Risks

Girving28 Jun 2023 8:57 UTC

14 points

1 comment1 min readEA link

[Question] Why aren’t we promoting social media awareness of x-risks?

Max Niederman🔸9 Jun 2025 14:22 UTC

8 points

2 comments1 min readEA link

Introducing WAIT to Save Humanity

carter allen🔸1 Apr 2025 21:36 UTC

22 points

1 comment3 min readEA link

AISER—AIS Europe Retreat

Carolin23 Dec 2022 18:11 UTC

5 points

0 comments1 min readEA link

[Question] Which possible AI impacts should receive the most additional attention?

David Johnston31 May 2022 2:01 UTC

10 points

10 comments1 min readEA link

Aligning AI Safety Projects with a Republican Administration

Deric Cheng21 Nov 2024 22:13 UTC

13 points

1 comment8 min readEA link

[Question] What are some sources related to big-picture AI strategy?

Jacob Watts🔸2 Mar 2023 5:04 UTC

9 points

4 comments1 min readEA link

A Selection of Randomly Selected SAE Features

Callum McDougall1 Apr 2024 9:09 UTC

25 points

2 comments4 min readEA link

[Question] How to create curriculum for self-study towards AI alignment work?

OIUJHKDFS7 Jan 2023 19:53 UTC

10 points

5 comments1 min readEA link

Ought: why it matters and ways to help

Paul_Christiano26 Jul 2019 1:56 UTC

52 points

5 comments5 min readEA link

[Question] Who would you have on your dream team for solving AGI Alignment?

Greg_Colbourn ⏸️ 25 Aug 2022 13:34 UTC

10 points

14 comments1 min readEA link

Google DeepMind releases Gemini

Yarrow🔸6 Dec 2023 17:39 UTC

21 points

7 comments1 min readEA link

(deepmind.google)

[Question] What are the top priorities in a slow-takeoff, multipolar world?

JP Addison🔸25 Aug 2021 8:47 UTC

26 points

9 comments1 min readEA link

Possible directions in AI ideal governance research

RoryG10 Aug 2022 8:36 UTC

5 points

0 comments3 min readEA link

AXRP Episode 24 - Superalignment with Jan Leike

DanielFilan27 Jul 2023 4:56 UTC

23 points

0 comments1 min readEA link

(axrp.net)

Probably good projects for the AI safety ecosystem

Ryan Kidd5 Dec 2022 3:24 UTC

21 points

0 comments2 min readEA link

AGI will arrive by the end of this decade either as a unicorn or as a black swan

Yuri Barzov21 Oct 2022 10:50 UTC

−4 points

7 comments3 min readEA link

It’s (not) how you use it

Eleni_A7 Sep 2022 13:28 UTC

6 points

3 comments2 min readEA link

2018 AI Alignment Literature Review and Charity Comparison

Larks18 Dec 2018 4:48 UTC

118 points

28 comments63 min readEA link

Idea to boost international AI coordination

Jamie Green13 Aug 2025 13:40 UTC

2 points

0 comments3 min readEA link

Future Matters #6: FTX collapse, value lock-in, and counterarguments to AI x-risk

Pablo30 Dec 2022 13:10 UTC

58 points

2 comments21 min readEA link

Motivation control

Joe_Carlsmith30 Oct 2024 17:15 UTC

18 points

0 comments52 min readEA link

AGI x-risk timelines: 10% chance (by year X) estimates should be the headline, not 50%.

Greg_Colbourn ⏸️ 1 Mar 2022 12:02 UTC

69 points

22 comments2 min readEA link

Update on Harvard AI Safety Team and MIT AI Alignment

Xander1232 Dec 2022 6:09 UTC

71 points

3 comments8 min readEA link

‘Surveillance Capitalism’ & AI Governance: Slippery Business Models, Securitisation, and Self-Regulation

Charlie Harrison29 Feb 2024 15:47 UTC

19 points

2 comments12 min readEA link

Apply to be a Stanford HAI Junior Fellow (Assistant Professor- Research) by Nov. 15, 2021

Vael Gates31 Oct 2021 2:21 UTC

15 points

0 comments1 min readEA link

[Question] By how much should Meta’s BlenderBot being really bad cause me to update on how justifiable it is for OpenAI and DeepMind to be making significant progress on AI capabilities?

Sisi10 Aug 2022 6:40 UTC

24 points

8 comments1 min readEA link

DeepMind is hiring for the Scalable Alignment and Alignment Teams

Rohin Shah13 May 2022 12:19 UTC

102 points

0 comments9 min readEA link

Catastrophic Risks from AI #1: Introduction

Dan H22 Jun 2023 17:09 UTC

28 points

1 comment5 min readEA link

(arxiv.org)

What AI could mean for alternative proteins

Max Taylor9 Feb 2024 10:13 UTC

36 points

5 comments16 min readEA link

Join the AI governance and interpretability hackathons!

Esben Kran23 Mar 2023 14:39 UTC

33 points

1 comment5 min readEA link

(alignmentjam.com)

US public opinion of AI policy and risk

Jamie E12 May 2023 13:22 UTC

111 points

7 comments15 min readEA link

Apollo Research is Hiring for Software Engineers. Deadline 22 Jun

Joping_Apollo Research13 Jun 2025 15:30 UTC

7 points

0 comments1 min readEA link

PauseAI US is looking for local group leaders – apply today!

Felix De Simone4 Apr 2025 15:44 UTC

20 points

0 comments1 min readEA link

[Question] How strong is the evidence of unaligned AI systems causing harm?

Eevee🔹21 Jul 2020 4:08 UTC

31 points

1 comment1 min readEA link

[3-hour podcast]: Joseph Carlsmith on longtermism, utopia, the computational power of the brain, meta-ethics, illusionism and meditation

Gus Docker27 Jul 2021 13:18 UTC

34 points

2 comments1 min readEA link

Summary of “Technology Favours Tyranny” by Yuval Noah Harari

Madhav Malhotra26 Oct 2022 21:37 UTC

36 points

1 comment2 min readEA link

[Question] Impact: Engineering VS Medical Scientist VS AI Safety VS Governance

AhmedWez15 Jan 2025 15:47 UTC

1 point

0 comments1 min readEA link

If you’re an AI Safety movement builder consider asking your members these questions in an interview

yanni kyriacos27 May 2024 5:46 UTC

10 points

0 comments2 min readEA link

IFRC creative competition: product or service from future autonomous weapons systems and emerging digital risks

Devin Lam21 Jul 2024 13:08 UTC

9 points

0 comments1 min readEA link

(solferinoacademy.com)

Implications of AGI on Subjective Human Experience

Erica S. 30 May 2023 18:47 UTC

2 points

0 comments19 min readEA link

(docs.google.com)

Information security careers for GCR reduction

ClaireZabel20 Jun 2019 23:56 UTC

187 points

35 comments8 min readEA link

Half-baked ideas thread (EA / AI Safety)

Aryeh Englander23 Jun 2022 16:05 UTC

21 points

8 comments1 min readEA link

[Linkpost] NY Times Feature on Anthropic

Garrison12 Jul 2023 19:30 UTC

34 points

3 comments5 min readEA link

(www.nytimes.com)

2022 AI expert survey results

Zach Stein-Perlman4 Aug 2022 15:54 UTC

88 points

7 comments2 min readEA link

(aiimpacts.org)

Scaling and Sustaining Standards: A Case Study on the Basel Accords

C.K.16 Jul 2023 18:18 UTC

18 points

0 comments7 min readEA link

(docs.google.com)

We should expect to worry more about speculative risks

bgarfinkel29 May 2022 21:08 UTC

120 points

14 comments3 min readEA link

RA x ControlAI video: What if AI just keeps getting smarter?

Writer2 May 2025 14:19 UTC

14 points

1 comment9 min readEA link

There should be more AI safety orgs

mariushobbhahn21 Sep 2023 14:53 UTC

117 points

20 comments17 min readEA link

How would you estimate the value of delaying AGI by 1 day, in marginal donations to GiveWell?

AnonymousTurtle16 Dec 2022 9:25 UTC

30 points

19 comments2 min readEA link

The Perception Gap

Ben Norman18 Aug 2025 10:51 UTC

5 points

0 comments4 min readEA link

(futuresonder.substack.com)

“AI Alignment” is a Dangerously Overloaded Term

Roko15 Dec 2023 15:06 UTC

20 points

2 comments3 min readEA link

How to use the Forum (intro)

Lizka5 May 2022 18:29 UTC

24 points

13 comments1 min readEA link

Love and AI: Relational Brain/Mind Dynamics in AI Development

Jeffrey Kursonis21 Jun 2022 7:09 UTC

2 points

2 comments3 min readEA link

AGI safety from first principles

richard_ngo21 Oct 2020 17:42 UTC

77 points

10 comments3 min readEA link

(www.alignmentforum.org)

An Executive Briefing on the Architecture of a Systemic Crisis

Ihor Ivliev10 Jul 2025 0:46 UTC

0 points

0 comments4 min readEA link

Thoughts about Policy Ecosystems: The Missing Links in AI Governance

Echo Huang31 Jan 2025 13:23 UTC

21 points

2 comments5 min readEA link

Accidentally teaching AI models to deceive us (Ajeya Cotra on The 80,000 Hours Podcast)

80000_Hours15 May 2023 20:58 UTC

37 points

2 comments18 min readEA link

Space settlement and the time of perils: a critique of Thorstad

Matthew Rendall14 Apr 2024 15:29 UTC

46 points

10 comments4 min readEA link

Artificial Intelligence as exit strategy from the age of acute existential risk

Arturo Macias12 Apr 2023 14:41 UTC

11 points

11 comments7 min readEA link

The ELYSIUM Proposal

Roko16 Oct 2024 2:14 UTC

−10 points

0 comments1 min readEA link

(transhumanaxiology.substack.com)

Questionable Narratives of “Situational Awareness”

fergusq16 Jun 2024 17:09 UTC

23 points

10 comments14 min readEA link

Digital Minds Takeoff Scenarios

Bradford Saad5 Jul 2024 16:06 UTC

36 points

10 comments17 min readEA link

The Case for AI Adaptation: The Perils of Living in a World with Aligned and Well-Deployed Transformative Artificial Intelligence

HTC30 May 2023 18:29 UTC

5 points

1 comment7 min readEA link

Hydra

Matrice Jacobine11 Jun 2025 14:07 UTC

10 points

0 comments1 min readEA link

(philosophybear.substack.com)

Le Tempistiche delle IA: il dibattito e il punto di vista degli “esperti”

EA Italy17 Jan 2023 23:30 UTC

1 point

0 comments11 min readEA link

It’s OK not to go into AI (for students)

ruthgrace14 Jul 2022 15:16 UTC

59 points

18 comments2 min readEA link

Public Call for Interest in Mathematical Alignment

Davidmanheim22 Nov 2023 13:22 UTC

27 points

3 comments1 min readEA link

AI, Animals, & Digital Minds 2025: apply to speak by Wednesday!

Alistair Stewart5 May 2025 0:45 UTC

8 points

0 comments1 min readEA link

Call for Pythia-style foundation model suite for alignment research

Lucretia1 May 2023 20:26 UTC

10 points

0 comments1 min readEA link

Verification methods for international AI agreements

Akash31 Aug 2024 14:58 UTC

20 points

0 comments4 min readEA link

(arxiv.org)

AI Safety Concepts Writeup: WebGPT

Justis11 Aug 2023 1:31 UTC

14 points

0 comments7 min readEA link

The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.

William the Kiwi17 Oct 2023 2:52 UTC

12 points

7 comments1 min readEA link

Whistleblowing Twitter Bot

Mckiev 🔸26 Dec 2024 18:18 UTC

11 points

1 comment2 min readEA link

(www.lesswrong.com)

Start an AIS safety field-building organization at the city or national level—an EOI form

gergo9 Jan 2025 8:42 UTC

38 points

4 comments2 min readEA link

Orphaned Policies (Post 5 of 7 on AI Governance)

Jason Green-Lowe29 May 2025 21:42 UTC

42 points

3 comments16 min readEA link

AI Analysis of US H.R.1 (“Big Beautiful Bill”) Impacts on Farmed Animals

Steven Rouk22 Jul 2025 14:33 UTC

13 points

0 comments3 min readEA link

Why focus on schemers in particular (Sections 1.3 and 1.4 of “Scheming AIs”)

Joe_Carlsmith24 Nov 2023 19:18 UTC

10 points

1 comment20 min readEA link

Disentangling “Safety”

pleaselistencarefullyasourmenuoptionshaverecentlychanged6 Jul 2024 23:21 UTC

1 point

0 comments3 min readEA link

[Question] Is there EA discussion on non-x-risk transformative AI?

Franziska Fischer26 Apr 2023 13:50 UTC

5 points

0 comments1 min readEA link

How I switched careers from software engineer to AI policy operations

Lucie Philippon 🔸13 Apr 2025 6:41 UTC

12 points

1 comment5 min readEA link

(www.lesswrong.com)

Four reasons I find AI safety emotionally compelling

Kat Woods 🔶 ⏸️28 Jun 2022 14:01 UTC

32 points

5 comments4 min readEA link

AGI risk: analogies & arguments

technicalities23 Mar 2021 13:18 UTC

31 points

3 comments8 min readEA link

(www.gleech.org)

[Linkpost] My attempt at trying to summarize ‘Intro to ML Safety’

Arjun Yadav25 Jul 2023 10:37 UTC

4 points

0 comments1 min readEA link

(arjunyadav.net)

‘Artificial Intelligence Governance under Change’ (PhD dissertation)

MMMaas15 Sep 2022 12:10 UTC

54 points

1 comment2 min readEA link

(drive.google.com)

Inside OpenAI’s Controversial Plan to Abandon its Nonprofit Roots

Garrison18 Apr 2025 18:46 UTC

17 points

1 comment11 min readEA link

(garrisonlovely.substack.com)

Summary of 80k’s AI problem profile

JakubK1 Jan 2023 7:48 UTC

19 points

0 comments5 min readEA link

(www.lesswrong.com)

Pausing for what?

MountainPath21 Oct 2024 12:18 UTC

6 points

1 comment1 min readEA link

Estimating the Substitutability between Compute and Cognitive Labor in AI Research

Parker_Whitfill1 Jun 2025 14:27 UTC

135 points

29 comments9 min readEA link

Announcing the Compassionate Future Summit 2025

Ruth_Seleo21 Jan 2025 7:15 UTC

50 points

3 comments2 min readEA link

I made an AI safety fellowship. What I wish I knew.

RubenCastaing9 Jun 2024 16:32 UTC

14 points

1 comment2 min readEA link

Noah’s Arc: From AR Desks to AI Reactors

TabulaRasa1 Mar 2024 13:59 UTC

7 points

0 comments4 min readEA link

Winners of AI Alignment Awards Research Contest

Akash13 Jul 2023 16:14 UTC

50 points

1 comment12 min readEA link

Summing up “Scheming AIs” (Section 5)

Joe_Carlsmith9 Dec 2023 15:48 UTC

9 points

1 comment10 min readEA link

Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense

So8res24 Nov 2023 17:37 UTC

38 points

1 comment5 min readEA link

DeepMind: Generally capable agents emerge from open-ended play

kokotajlod27 Jul 2021 19:35 UTC

56 points

10 comments2 min readEA link

(deepmind.com)

Fanaticism in AI: SERI Project

Jake Arft-Guatelli24 Sep 2021 4:39 UTC

7 points

2 comments5 min readEA link

BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism?

Peter Berggren6 Jul 2023 17:37 UTC

0 points

19 comments2 min readEA link

Where I currently disagree with Ryan Greenblatt’s version of the ELK approach

So8res29 Sep 2022 21:19 UTC

21 points

0 comments5 min readEA link

AI safety technical research—Career review

Benjamin Hilton17 Jul 2023 15:34 UTC

50 points

0 comments31 min readEA link

[Question] What could a policy banning AGI look like?

TsviBT13 Mar 2024 14:19 UTC

17 points

4 comments3 min readEA link

ARC-AGI-2 Overview With François Chollet

Yarrow🔸10 Apr 2025 18:54 UTC

7 points

0 comments1 min readEA link

(youtu.be)

What is everyone doing in AI governance

Igor Ivanov8 Jul 2023 15:19 UTC

31 points

0 comments5 min readEA link

Stampy’s AI Safety Info—New Distillations #4 [July 2023]

markov16 Aug 2023 19:02 UTC

6 points

0 comments1 min readEA link

(aisafety.info)

AI Agents’ Accidental Architects of Chaos: The Dangers of Interacting Systems

Hugo Wong12 May 2025 7:58 UTC

−3 points

0 comments8 min readEA link

The Age of EM

ABishop9 May 2024 12:17 UTC

0 points

0 comments1 min readEA link

(ageofem.com)

Notes on “the hot mess theory of AI misalignment”

JakubK21 Apr 2023 10:07 UTC

44 points

3 comments5 min readEA link

(sohl-dickstein.github.io)

LLMs Are Already Misaligned: Simple Experiments Prove It

Makham28 Jul 2025 17:23 UTC

4 points

3 comments7 min readEA link

Why we need a new agency to regulate advanced artificial intelligence

Michael Huang4 Aug 2022 13:38 UTC

25 points

0 comments1 min readEA link

(www.brookings.edu)

David Krueger on AI Alignment in Academia and Coordination

Michaël Trazzi7 Jan 2023 21:14 UTC

32 points

1 comment3 min readEA link

(theinsideview.ai)

Seeking Input to AI Safety Book for non-technical audience

Darren McKee10 Aug 2023 18:03 UTC

11 points

4 comments1 min readEA link

Scientism vs. people

Roman Leventov18 Apr 2023 17:52 UTC

0 points

0 comments11 min readEA link

“Intro to brain-like-AGI safety” series—just finished!

Steven Byrnes17 May 2022 15:35 UTC

15 points

0 comments1 min readEA link

Contest for Better AGI Safety Plans

Peter3 Jul 2025 17:02 UTC

18 points

0 comments8 min readEA link

(manifund.org)

Consider paying me to do AI safety research work

Rupert5 Nov 2020 8:09 UTC

11 points

3 comments2 min readEA link

[Question] How does a company like Instadeep fit into the current AI landscape?

Tom A8 Apr 2023 5:49 UTC

6 points

0 comments1 min readEA link

5 ways to improve CoT faithfulness

CBiddulph8 Oct 2024 4:17 UTC

8 points

0 comments6 min readEA link

[Question] Thoughts on these $1M and $500k AI safety grants?

defun 🔸11 Jul 2024 13:37 UTC

50 points

7 comments1 min readEA link

AI Benefits Post 5: Outstanding Questions on Governing Benefits

Cullen 🔸21 Jul 2020 16:45 UTC

5 points

0 comments4 min readEA link

Why I’m working on AI welfare

kyle_fish6 Jul 2024 6:01 UTC

71 points

7 comments5 min readEA link

[Closed] Gauging Interest for a Learning-Theoretic Agenda Mentorship Programme

Vanessa16 Feb 2025 16:24 UTC

17 points

0 comments2 min readEA link

NAIRA—An exercise in regulatory, competitive safety governance [AI Governance Institutional Design idea]

Heramb Podar19 Mar 2024 14:55 UTC

5 points

1 comment6 min readEA link

AI Safety Evaluations: A Regulatory Review

Elliot Mckernon19 Mar 2024 15:09 UTC

12 points

2 comments11 min readEA link

US AI Safety Institute will be ‘gutted,’ Axios reports

Matrice Jacobine20 Feb 2025 14:40 UTC

12 points

1 comment1 min readEA link

(www.zdnet.com)

Review of artificial intelligence platforms for early pandemic detection in Latin America

DianaCarolina17 Sep 2024 15:17 UTC

5 points

0 comments53 min readEA link

Video and Transcript of Presentation on Existential Risk from Power-Seeking AI

Joe_Carlsmith8 May 2022 3:52 UTC

97 points

7 comments30 min readEA link

Call on AI Companies: Publish Your Whistleblowing Policies

Karl1 Aug 2025 15:59 UTC

11 points

0 comments6 min readEA link

FT: We must slow down the race to God-like AI

Angelina Li24 Apr 2023 11:57 UTC

33 points

2 comments2 min readEA link

(www.ft.com)

Metaculus Year in Review: 2022

christian6 Jan 2023 1:23 UTC

25 points

2 comments4 min readEA link

(metaculus.medium.com)

AI & wisdom 2: growth and amortised optimisation

L Rudolf L29 Oct 2024 13:37 UTC

20 points

0 comments7 min readEA link

(rudolf.website)

deleted

funnyfranco11 Mar 2025 4:13 UTC

0 points

0 comments1 min readEA link

The Prospect of an AI Winter

Erich_Grunewald 🔸27 Mar 2023 20:55 UTC

56 points

13 comments15 min readEA link

(www.erichgrunewald.com)

Report: Latin America and Global Catastrophic Risks, transforming risk management.

JorgeTorresC9 Jan 2024 2:13 UTC

25 points

1 comment2 min readEA link

(riesgoscatastroficosglobales.com)

[Question] Is AI safety still neglected?

Coafos30 Mar 2022 9:09 UTC

13 points

13 comments1 min readEA link

Working at EA organizations series: Machine Intelligence Research Institute

SoerenMind1 Nov 2015 12:49 UTC

8 points

0 comments4 min readEA link

My (Lazy) Longtermism FAQ

Devin Kalish24 Oct 2022 16:44 UTC

30 points

6 comments27 min readEA link

Center on Long-Term Risk: Annual review and fundraiser 2023

Center on Long-Term Risk13 Dec 2023 16:42 UTC

79 points

3 comments4 min readEA link

What does (and doesn’t) AI mean for effective altruism?

EA Global12 Aug 2017 7:00 UTC

9 points

0 comments12 min readEA link

Some of My Current Impressions Entering AI Safety

Phib28 Mar 2023 5:18 UTC

5 points

0 comments2 min readEA link

Messy personal stuff that affected my cause prioritization (or: how I started to care about AI safety)

Julia_Wise🔸5 May 2022 17:59 UTC

265 points

14 comments2 min readEA link

Orthogonal’s Formal-Goal Alignment theory of change

Tamsin Leake5 May 2023 22:36 UTC

21 points

0 comments4 min readEA link

(carado.moe)

Lessons from the Iraq War for AI policy

Buck10 Jul 2025 18:52 UTC

71 points

11 comments4 min readEA link

Improving capability evaluations for AI governance: Open Philanthropy’s new request for proposals

cb7 Feb 2025 9:30 UTC

37 points

3 comments3 min readEA link

The Intelligence Curse: an essay series

L Rudolf L24 Apr 2025 12:59 UTC

22 points

1 comment2 min readEA link

ea.domains—Domains Free to a Good Home

plex12 Jan 2023 13:32 UTC

48 points

8 comments4 min readEA link

Scale, schlep, and systems

Ajeya10 Oct 2023 16:59 UTC

59 points

3 comments6 min readEA link

Title: “Nurturing AI: A Different Vision for Safety and Growth”

Brad Wilkins28 Apr 2025 19:21 UTC

0 points

0 comments1 min readEA link

[Question] Will AGI cause mass technological unemployment?

Eevee🔹22 Jun 2020 20:55 UTC

4 points

2 comments2 min readEA link

Soft Nationalization: How the US Government Will Control AI Labs

Deric Cheng27 Aug 2024 15:10 UTC

103 points

6 comments21 min readEA link

(www.convergenceanalysis.org)

What does it mean to become an expert in AI Hardware?

Toph9 Jan 2021 4:15 UTC

87 points

10 comments11 min readEA link

Is it 3 Years, or 3 Decades Away? Disagreements on AGI Timelines

Vasco Grilo🔸4 Apr 2025 16:01 UTC

46 points

1 comment2 min readEA link

(epoch.ai)

Connor Leahy on Conjecture and Dying with Dignity

Michaël Trazzi22 Jul 2022 19:30 UTC

34 points

0 comments10 min readEA link

(theinsideview.ai)

Agents that act for reasons: a thought experiment

Michele Campolo24 Jan 2024 16:48 UTC

7 points

1 comment3 min readEA link

No. Impending AGI doesn’t make everything else unimportant.

Igor Ivanov4 Sep 2023 18:56 UTC

14 points

6 comments5 min readEA link

Acausal normalcy

Andrew Critch3 Mar 2023 23:35 UTC

21 points

4 comments8 min readEA link

Suggestions for getting retiree / second career folks interested in AI Safety?

sjsjsj5 Jan 2025 17:59 UTC

2 points

1 comment1 min readEA link

[Question] Share AI Safety Ideas: Both Crazy and Not

ank26 Feb 2025 13:09 UTC

4 points

16 comments1 min readEA link

Good Research Takes are Not Sufficient for Good Strategic Takes

Neel Nanda22 Mar 2025 10:13 UTC

120 points

0 comments4 min readEA link

(www.neelnanda.io)

AGI Cannot Be Predicted From Real Interest Rates

Nicholas Decker28 Jan 2025 17:45 UTC

26 points

3 comments1 min readEA link

(nicholasdecker.substack.com)

AI Safety Doesn’t Have to be Weird

Mica White2 Jan 2023 21:56 UTC

11 points

1 comment2 min readEA link

Guess, ask or tell?

dEAsign19 Oct 2023 21:52 UTC

2 points

1 comment1 min readEA link

An International Collaborative Hub for Advancing AI Safety Research

Cody Albert22 Apr 2025 16:12 UTC

9 points

0 comments5 min readEA link

An AI Race With China Can Be Better Than Not Racing

niplav2 Jul 2024 17:57 UTC

19 points

1 comment11 min readEA link

Analysis of Global AI Governance Strategies

SammyDMartin11 Dec 2024 11:08 UTC

23 points

0 comments1 min readEA link

(www.lesswrong.com)

Anthropic teams up with Palantir and AWS to sell AI to defense customers

Matrice Jacobine9 Nov 2024 11:47 UTC

28 points

1 comment2 min readEA link

(techcrunch.com)

The flaws that make today’s AI architecture unsafe and a new approach that could fix it

80000_Hours22 Jun 2020 22:15 UTC

3 points

0 comments86 min readEA link

(80000hours.org)

A Mission Framework for an Emerging Consciousness

Simón The Gardener8 Aug 2025 15:36 UTC

1 point

0 comments2 min readEA link

The right to protection from catastrophic AI risk

Jack Cunningham9 Apr 2022 23:11 UTC

11 points

0 comments7 min readEA link

Join the AI Alignment Evals hackathon

lenz14 Jan 2025 18:17 UTC

3 points

0 comments3 min readEA link

Life of GPT

Odd anon8 Nov 2023 22:31 UTC

−1 points

0 comments5 min readEA link

Artificial Intelligence and Nuclear Command, Control, & Communications: The Risks of Integration

Peter Rautenbach18 Nov 2022 13:01 UTC

60 points

3 comments50 min readEA link

General vs specific arguments for the longtermist importance of shaping AI development

Sam Clarke15 Oct 2021 14:43 UTC

44 points

7 comments2 min readEA link

AI Might Kill Everyone

Bentham's Bulldog5 Jun 2025 15:36 UTC

20 points

1 comment4 min readEA link

Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Microsoft Research

𝕮𝖎𝖓𝖊𝖗𝖆23 Mar 2023 5:45 UTC

15 points

0 comments1 min readEA link

(arxiv.org)

Will AI R&D Automation Cause a Software Intelligence Explosion?

Forethought26 Mar 2025 15:37 UTC

32 points

4 comments2 min readEA link

(www.forethought.org)

deleted

funnyfranco21 Mar 2025 13:13 UTC

11 points

0 comments1 min readEA link

AI acceleration from a safety perspective: Trade-offs and considerations

mariushobbhahn19 Jan 2022 9:44 UTC

12 points

1 comment7 min readEA link

A Manifesto for an Emerging Consciousness

Simón The Gardener7 Aug 2025 13:36 UTC

1 point

0 comments11 min readEA link

OpenAI’s massive push to make superintelligence safe in 4 years or less (Jan Leike on the 80,000 Hours Podcast)

80000_Hours8 Aug 2023 18:00 UTC

32 points

1 comment19 min readEA link

(80000hours.org)

The moral argument for giving AIs autonomy

Matthew_Barnett8 Jan 2025 0:59 UTC

41 points

7 comments11 min readEA link

GovAI Annual Report 2021

GovAI5 Jan 2022 16:57 UTC

52 points

2 comments9 min readEA link

Altman on the board, AGI, and superintelligence

OscarD🔸6 Jan 2025 14:37 UTC

20 points

1 comment1 min readEA link

(blog.samaltman.com)

Human extinction’s impact on non-human animals remains largely underexplored

JoA🔸1 Mar 2025 21:31 UTC

35 points

1 comment12 min readEA link

o3

Zach Stein-Perlman20 Dec 2024 21:00 UTC

84 points

9 comments1 min readEA link

An Open Letter To EA and AI Safety On Decelerating AI Development

Kenneth_Diao28 Feb 2025 17:15 UTC

21 points

0 comments14 min readEA link

(graspingatwaves.substack.com)

Personal AI Planning

Jeff Kaufman 🔸10 Nov 2024 14:10 UTC

43 points

5 comments2 min readEA link

Pause For Thought: The AI Pause Debate

Scott Alexander10 Oct 2023 15:34 UTC

113 points

20 comments14 min readEA link

(www.astralcodexten.com)

Coherence arguments imply a force for goal-directed behavior

Katja_Grace6 Apr 2021 21:44 UTC

19 points

1 comment11 min readEA link

(worldspiritsockpuppet.com)

[Question] Best project management software for research projects and labs?

PeterSlattery5 Oct 2023 18:38 UTC

19 points

10 comments1 min readEA link

16 Concrete, Ambitious AI Project Proposals for Science and Security

Alejandro Acelas 🔸11 Aug 2025 20:28 UTC

5 points

0 comments1 min readEA link

(ifp.org)

Go Mobilize? Lessons from GM Protests for Pausing AI

Charlie Harrison24 Oct 2023 15:01 UTC

54 points

11 comments31 min readEA link

European Master’s Programs in Machine Learning, Artificial Intelligence, and related fields

Master Programs ML/AI17 Jan 2021 20:09 UTC

17 points

4 comments1 min readEA link

AI Regulation is Unsafe

Maxwell Tabarrok22 Apr 2024 16:38 UTC

19 points

8 comments4 min readEA link

(www.maximum-progress.com)

Lessons learned and review of the AI Safety Nudge Competition

Marc Carauleanu17 Jan 2023 17:13 UTC

5 points

0 comments5 min readEA link

World Citizen Assembly about AI—Announcement

Camille11 Feb 2025 10:51 UTC

25 points

2 comments5 min readEA link

AI and X-risk unconference at ZuGeorgia

Yesh18 Jun 2024 14:24 UTC

2 points

0 comments1 min readEA link

[Question] If an existential catastrophe occurs, how likely is it to wipe out all animal sentience?

JoA🔸16 Mar 2025 22:30 UTC

11 points

2 comments2 min readEA link

A Primer on God, Liberalism and the End of History

Mahdi Complex28 Mar 2022 5:26 UTC

8 points

3 comments14 min readEA link

The UK AI Safety Summit tomorrow

SebastianSchmidt31 Oct 2023 19:09 UTC

17 points

2 comments2 min readEA link

Which AI Safety Org to Join?

Yonatan Cale11 Oct 2022 19:42 UTC

17 points

21 comments1 min readEA link

Will AI be able to rethink its goals?

SeptemberL11 May 2025 12:29 UTC

9 points

1 comment8 min readEA link

Asterisk Mag 09: Weird

Clara Collier4 Apr 2025 20:25 UTC

25 points

0 comments2 min readEA link

What if we don’t need a “Hard Left Turn” to reach AGI?

Eigengender15 Jul 2022 9:49 UTC

39 points

7 comments4 min readEA link

[Question] Request for Assistance—Research on Scenario Development for Advanced AI Risk

Kiliank30 Mar 2022 3:01 UTC

2 points

1 comment1 min readEA link

One, perhaps underrated, AI risk.

Alex (Αλέξανδρος)28 Nov 2024 10:34 UTC

7 points

1 comment3 min readEA link

Launching The Collective Intelligence Project: Whitepaper and Pilots

jasmine_wang6 Feb 2023 17:00 UTC

38 points

8 comments2 min readEA link

(cip.org)

Why Post-Probability AI May Be Safer Than Probability-Based Models

devin.bostick16 Apr 2025 14:23 UTC

2 points

0 comments2 min readEA link

[Job ad] LISA CEO

Ryan Kidd9 Feb 2025 0:18 UTC

5 points

0 comments2 min readEA link

High impact job opportunity at ARIA (UK)

Rasool12 Feb 2023 10:35 UTC

80 points

0 comments1 min readEA link

Hiring engineers and researchers to help align GPT-3

Paul_Christiano1 Oct 2020 18:52 UTC

107 points

19 comments3 min readEA link

[Question] Why can’t we accept the human condition as it existed in 2010?

Hayven Frienby9 Jan 2024 18:02 UTC

35 points

36 comments2 min readEA link

Book Launch: The Moral Circle: Who Matters, What Matters, and Why

Sofia_Fogel21 Jan 2025 13:45 UTC

30 points

0 comments1 min readEA link

Mainstream Grantmaking Expertise (Post 7 of 7 on AI Governance)

Jason Green-Lowe23 Jun 2025 1:38 UTC

48 points

2 comments37 min readEA link

How to pursue a career in AI governance and coordination

Cody_Fenwick25 Sep 2023 12:00 UTC

32 points

1 comment29 min readEA link

(80000hours.org)

ML4Good Brasil—Applications Open

Nia3 May 2024 10:39 UTC

28 points

1 comment1 min readEA link

Compute Research Questions and Metrics—Transformative AI and Compute [4/4]

lennart28 Nov 2021 22:18 UTC

18 points

2 comments1 min readEA link

The current AI strategic landscape: one bear’s perspective

Matrice Jacobine15 Feb 2025 9:49 UTC

6 points

0 comments2 min readEA link

(philosophybear.substack.com)

Alignment’s phlogiston

Eleni_A18 Aug 2022 1:41 UTC

18 points

1 comment2 min readEA link

The State of AI Governance in Africa: Musings from the Global South

Thaiya Jesse Wallace17 Aug 2023 11:34 UTC

6 points

0 comments5 min readEA link

[EU time] Infosec: What even is zero trust?

Jarrah21 Jun 2024 18:09 UTC

2 points

0 comments1 min readEA link

New blog: Planned Obsolescence

Ajeya27 Mar 2023 19:46 UTC

198 points

9 comments1 min readEA link

(www.planned-obsolescence.org)

AI and Non-Existence

Blue1131 Jan 2025 13:19 UTC

4 points

0 comments2 min readEA link

Amanda Askell: AI safety needs social scientists

EA Global4 Mar 2019 15:50 UTC

27 points

0 comments18 min readEA link

(www.youtube.com)

On January 1, 2030, there will be no AGI (and AGI will still not be imminent)

Yarrow🔸6 Apr 2025 1:08 UTC

35 points

53 comments2 min readEA link

Clarifying METR’s Auditing Role [linkpost]

ChanaMessinger4 Jun 2024 15:34 UTC

47 points

1 comment1 min readEA link

(www.alignmentforum.org)

My experience applying to MATS 6.0

mic18 Jul 2024 19:02 UTC

24 points

0 comments5 min readEA link

Mere exposure effect: Bias in Evaluating AGI X-Risks

Remmelt27 Dec 2022 14:05 UTC

4 points

1 comment1 min readEA link

We don’t need AGI for an amazing future

Karl von Wendt4 May 2023 12:11 UTC

57 points

2 comments5 min readEA link

[Question] Deliberate practice for research?

Alex_Altair8 Oct 2022 3:45 UTC

19 points

4 comments1 min readEA link

Human-level is not the limit

Vishakha Agrawal23 Apr 2025 11:16 UTC

3 points

0 comments2 min readEA link

(aisafety.info)

How could AI affect different animal advocacy interventions?

Kevin Xia 🔸2 Jul 2025 16:07 UTC

50 points

6 comments10 min readEA link

Annual AGI Benchmarking Event

Metaculus26 Aug 2022 21:31 UTC

20 points

2 comments2 min readEA link

(www.metaculus.com)

Why we’re entering a new nuclear age — and how to reduce the risks (Christian Ruhl on the 80k After Hours Podcast)

80000_Hours27 Mar 2024 19:17 UTC

52 points

2 comments7 min readEA link

Could AI accelerate economic growth?

Tom_Davidson7 Jun 2023 19:07 UTC

28 points

0 comments6 min readEA link

Reasons I’ve been hesitant about high levels of near-ish AI risk

elifland22 Jul 2022 1:32 UTC

216 points

16 comments7 min readEA link

(www.foxy-scout.com)

AI Disclosure Ballot Initiative (and voting method)

aaronhamlin17 Jan 2024 20:01 UTC

5 points

0 comments1 min readEA link

XPT forecasts on (some) biological anchors inputs

Forecasting Research Institute24 Jul 2023 13:32 UTC

37 points

2 comments12 min readEA link

[Creative Writing Contest] Metal or Mortal

Louis16 Oct 2021 16:24 UTC

7 points

0 comments7 min readEA link

Theory: “WAW might be of higher impact than x-risk prevention based on utilitarianism”

Jens Aslaug 🔸12 Sep 2023 13:11 UTC

51 points

20 comments17 min readEA link

Security Warning: Squarespace Transfer from Google Domains

Wavefront_Security_Dave10 Jun 2024 9:26 UTC

4 points

0 comments3 min readEA link

[Report] Bridging the International AI Governance Divide: Key Strategies for Including the Global South

Heramb Podar26 Jan 2025 23:55 UTC

8 points

0 comments1 min readEA link

(encodeai.org)

#219 – Graphs AI companies would prefer you didn’t (fully) understand (Toby Ord on The 80,000 Hours Podcast)

80000_Hours25 Jun 2025 18:23 UTC

19 points

0 comments27 min readEA link

My Most Likely Reason to Die Young is AI X-Risk

AISafetyIsNotLongtermist4 Jul 2022 15:34 UTC

239 points

62 comments4 min readEA link

(www.lesswrong.com)

My thoughts on the social response to AI risk

Matthew_Barnett1 Nov 2023 21:27 UTC

116 points

17 comments10 min readEA link

ML4Good Colombia—Applications Open

carolinaollive9 Feb 2025 4:03 UTC

10 points

0 comments1 min readEA link

AI values will be shaped by a variety of forces, not just the values of AI developers

Matthew_Barnett11 Jan 2024 0:48 UTC

71 points

3 comments3 min readEA link

CEEALAR’s Theory of Change

CEEALAR19 Dec 2023 20:21 UTC

51 points

5 comments3 min readEA link

[Question] How to Improve China-Western Coordination on EA Issues?

Michael Kehoe3 Nov 2021 7:28 UTC

15 points

2 comments1 min readEA link

Announcing the Future Fund’s AI Worldview Prize

Nick_Beckstead23 Sep 2022 16:28 UTC

255 points

125 comments13 min readEA link

(ftxfuturefund.org)

What competencies do social scientists need to responsibly incorporate AI tools into their research practices?

Dane Valerie30 May 2025 14:13 UTC

4 points

0 comments1 min readEA link

(www.monash.edu)

How to become more agentic, by GPT-EA-Forum-v1

JoyOptimizer20 Jun 2022 6:50 UTC

24 points

8 comments4 min readEA link

Global computing capacity

Vasco Grilo🔸1 May 2023 6:09 UTC

12 points

0 comments1 min readEA link

(aiimpacts.org)

The AI Risk Network is searching for a Co-Host

Caroline Little11 Jul 2025 22:12 UTC

3 points

0 comments1 min readEA link

AI alignment prize winners and next round [link]

RyanCarey20 Jan 2018 12:07 UTC

7 points

1 comment1 min readEA link

The Animal Welfare Case for Open Access: Breaking Barriers to Scientific Knowledge and Enhancing LLM Training

Wladimir J. Alonso23 Nov 2024 13:07 UTC

32 points

2 comments3 min readEA link

What do XPT forecasts tell us about AI timelines?

rosehadshar21 Jul 2023 8:30 UTC

29 points

0 comments13 min readEA link

Vitalik on science, his philanthropy and effective altruism.

vincentweisser18 Jan 2023 23:16 UTC

11 points

0 comments1 min readEA link

AI Timelines via Cumulative Optimization Power: Less Long, More Short

Jake Cannell6 Oct 2022 7:06 UTC

27 points

0 comments17 min readEA link

Apply to the Machine Learning For Good bootcamp in France

Alexandre Variengien17 Jun 2022 9:13 UTC

9 points

0 comments1 min readEA link

(www.lesswrong.com)

Demis Hassabis — Google DeepMind: The Podcast

Zach Stein-Perlman16 Aug 2024 0:00 UTC

22 points

2 comments3 min readEA link

(www.youtube.com)

AI Model Registries: A Regulatory Review

Deric Cheng22 Mar 2024 16:01 UTC

6 points

3 comments6 min readEA link

Applications Open: AI Safety India Phase 1 – Fundamentals of Safe AI (Global Cohort)

adityaraj@eanita28 Apr 2025 12:05 UTC

4 points

0 comments2 min readEA link

Good job opportunities for helping with the most important century

Holden Karnofsky18 Jan 2024 19:21 UTC

46 points

1 comment4 min readEA link

(www.cold-takes.com)

FLI AI Alignment podcast: Evan Hubinger on Inner Alignment, Outer Alignment, and Proposals for Building Safe Advanced AI

evhub1 Jul 2020 20:59 UTC

13 points

2 comments1 min readEA link

(futureoflife.org)

10 Cruxes of Artificial Sentience

Jordan Arel1 Jul 2024 2:46 UTC

31 points

0 comments3 min readEA link

[Question] What kind of event, targeted to undergraduate CS majors, would be most effective at getting people to work on AI safety?

CBiddulph19 Sep 2021 16:19 UTC

9 points

1 comment1 min readEA link

[Question] Why not to solve alignment by making superintelligent humans?

Pato16 Oct 2022 21:26 UTC

9 points

12 comments1 min readEA link

Rethink Priorities’ 2022 Impact, 2023 Strategy, and Funding Gaps

kierangreig🔸25 Nov 2022 5:37 UTC

108 points

10 comments28 min readEA link

On Generality

Oren Montano26 Sep 2022 8:59 UTC

2 points

0 comments5 min readEA link

Longevity research as AI X-risk intervention

DirectedEvolution6 Nov 2022 17:58 UTC

27 points

0 comments9 min readEA link

AI Control idea: Give an AGI the primary objective of deleting itself, but construct obstacles to this as best we can. All other objectives are secondary to this primary goal.

Justausername3 Apr 2023 14:32 UTC

7 points

4 comments1 min readEA link

Apply by 10th June: ‘Introduction to Biosecurity’ Online Course Starting in July

Lin BL15 May 2025 18:08 UTC

15 points

0 comments1 min readEA link

What Areas of AI Safety and Alignment Research are Largely Ignored?

Andy E Williams27 Dec 2024 12:19 UTC

4 points

0 comments1 min readEA link

Catastrophic Risks from AI #4: Organizational Risks

Dan H26 Jun 2023 19:36 UTC

7 points

0 comments21 min readEA link

(arxiv.org)

[Question] Is there a news-tracker about GPT-4? Why has everything become so silent about it?

Franziska Fischer29 Oct 2022 8:56 UTC

10 points

4 comments1 min readEA link

Shah and Yudkowsky on alignment failures

EliezerYudkowsky28 Feb 2022 19:25 UTC

38 points

7 comments92 min readEA link

Contration: The next threat from AI may not be like the risks we’ve feared

John Wallbank28 Jul 2024 23:19 UTC

−1 points

1 comment5 min readEA link

Re: Some thoughts on vegetarianism and veganism

Fai25 Feb 2022 20:43 UTC

46 points

3 comments8 min readEA link

Ngo and Yudkowsky on alignment difficulty

richard_ngo15 Nov 2021 22:47 UTC

71 points

13 comments94 min readEA link

“Open Source AI” is a lie, but it doesn’t have to be

Jacob-Haimes30 Apr 2024 19:42 UTC

15 points

4 comments6 min readEA link

(jacob-haimes.github.io)

“If we go extinct due to misaligned AI, at least nature will continue, right? … right?”

plex18 May 2024 15:06 UTC

13 points

10 comments2 min readEA link

(aisafety.info)

The Importance of Artificial Sentience

Jamie_Harris3 Mar 2021 17:17 UTC

71 points

10 comments11 min readEA link

(www.sentienceinstitute.org)

Applications for EU Tech Policy Fellowship 2024 now open

Jan-Willem13 Sep 2023 16:17 UTC

22 points

2 comments1 min readEA link

[Question] Trade Between Altruists With Different AI Timelines?

Spiarrow18 Mar 2025 17:53 UTC

3 points

3 comments1 min readEA link

A Framework for Assessing AI Welfare Risk

Liam 🔸2 Mar 2025 15:50 UTC

8 points

0 comments1 min readEA link

SERI MATS—Summer 2023 Cohort

a_e_r8 Apr 2023 15:32 UTC

36 points

2 comments4 min readEA link

Pile of Law and Law-Following AI

Cullen 🔸13 Jul 2022 0:29 UTC

28 points

2 comments3 min readEA link

Challenges from Career Transitions and What To Expect From Advising

ClaireB24 Jul 2025 13:22 UTC

25 points

1 comment9 min readEA link

AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks

Center for AI Safety2 May 2023 16:51 UTC

35 points

2 comments5 min readEA link

(newsletter.safe.ai)

Comparison of LLM scalability and performance between the U.S. and China based on benchmark

Ivanna_alvarado12 Oct 2024 21:51 UTC

8 points

0 comments34 min readEA link

More thoughts on the Human-AGI War

Ahrenbach27 Dec 2023 1:52 UTC

2 points

0 comments7 min readEA link

Long-Term Future Fund: May 2021 grant recommendations

abergal27 May 2021 6:44 UTC

110 points

17 comments57 min readEA link

Worries about latent reasoning in LLMs

CBiddulph20 Jan 2025 9:09 UTC

20 points

1 comment7 min readEA link

Long-term AI policy strategy research and implementation

Benjamin_Todd9 Nov 2021 0:00 UTC

1 point

0 comments7 min readEA link

(80000hours.org)

Set up an AIS newsletter for your group in 10 minutes per month (June edition)

gergo18 Jun 2024 6:31 UTC

34 points

0 comments1 min readEA link

[link post] AI Should Be Terrified of Humans

BrianK24 Jul 2023 11:13 UTC

29 points

0 comments1 min readEA link

(time.com)

EA & LW Forums Weekly Summary (5 − 11 Sep 22’)

Zoe Williams12 Sep 2022 23:21 UTC

36 points

0 comments13 min readEA link

AI and impact opportunities

brb24331 Mar 2022 20:23 UTC

−2 points

6 comments1 min readEA link

Future Matters #4: AI timelines, AGI risk, and existential risk from climate change

Pablo8 Aug 2022 11:00 UTC

59 points

0 comments17 min readEA link

[Question] Are there cause priortizations estimates for s-risks supporters?

jackchang11027 Mar 2023 10:32 UTC

33 points

6 comments1 min readEA link

[Question] AI Safety and Censorship

Kuiyaki13 Jul 2023 11:34 UTC

−10 points

0 comments1 min readEA link

Longtermist Implications of the Existence Neutrality Hypothesis

Maxime Riché 🔸20 Mar 2025 12:20 UTC

19 points

0 comments21 min readEA link

AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good

Cullen 🔸29 Jun 2020 16:59 UTC

9 points

0 comments2 min readEA link

Please provide feedback on AI-safety grant proposal, thanks!

Alex Long11 Dec 2022 23:29 UTC

8 points

1 comment2 min readEA link

AI safety university groups: a promising opportunity to reduce existential risk

mic30 Jun 2022 18:37 UTC

53 points

1 comment11 min readEA link

Conjecture: Internal Infohazard Policy

Connor Leahy29 Jul 2022 19:35 UTC

34 points

3 comments19 min readEA link

Assessing China’s importance as an AI superpower

JulianHazell3 Feb 2023 11:08 UTC

89 points

7 comments1 min readEA link

(muddyclothes.substack.com)

Jan Kulveit’s Corrigibility Thoughts Distilled

brook25 Aug 2023 13:42 UTC

16 points

0 comments5 min readEA link

(www.lesswrong.com)

I have a some questions for the people at 80,000 Hours

yanni kyriacos14 Feb 2024 23:07 UTC

25 points

17 comments1 min readEA link

Rishi Sunak mentions “existential threats” in talk with OpenAI, DeepMind, Anthropic CEOs

Arjun Panickssery24 May 2023 21:06 UTC

44 points

2 comments1 min readEA link

(www.gov.uk)

The myth of AI “warning shots” as cavalry

Holly Elmore ⏸️ 🔸28 May 2025 16:53 UTC

125 points

31 comments9 min readEA link

(hollyelmore.substack.com)

[Question] Looking for Quick, Collaborative Systems for Truth-Seeking in Group Disagreements

EffectiveAdvocate🔸21 Jan 2025 6:32 UTC

10 points

1 comment1 min readEA link

Report: Artificial Intelligence Risk Management in Spain

JorgeTorresC15 Jun 2023 16:08 UTC

22 points

0 comments3 min readEA link

(riesgoscatastroficosglobales.com)

Medical Windfall Prizes

PeterMcCluskey7 Feb 2025 0:13 UTC

5 points

0 comments5 min readEA link

(bayesianinvestor.com)

[Extended Deadline: Jan 23rd] Announcing the PIBBSS Summer Research Fellowship

nora18 Dec 2021 16:54 UTC

36 points

1 comment1 min readEA link

#191 (Part 2) – Government and society after AGI (Carl Shulman on the 80,000 Hours Podcast)

80000_Hours11 Jul 2024 19:26 UTC

23 points

1 comment18 min readEA link

Announcing new round of “Key Phenomena in AI Risk” Reading Group

Dušan D. Nešić (Dushan)19 Oct 2023 11:05 UTC

8 points

0 comments1 min readEA link

What term to use for AI in different policy contexts?

oeg6 Sep 2023 15:08 UTC

18 points

3 comments9 min readEA link

EA as Antichrist: Understanding Peter Thiel

Ben_West🔸6 Aug 2025 17:31 UTC

98 points

51 comments14 min readEA link

HIRING: Inform and shape a new project on AI safety at Partnership on AI

Madhulika Srikumar24 Nov 2021 16:29 UTC

11 points

2 comments1 min readEA link

Dutch AI Safety Coordination Forum: An Experiment

HenningB21 Nov 2023 16:18 UTC

21 points

0 comments4 min readEA link

AI risk hub in Singapore?

kokotajlod29 Oct 2020 11:51 UTC

26 points

4 comments4 min readEA link

[Question] What type of Master’s is best for AI policy work?

Milan Griffes22 Feb 2019 20:04 UTC

14 points

7 comments1 min readEA link

Rethinking the Value of Working on AI Safety

JohanEA9 Jan 2025 14:15 UTC

47 points

21 comments10 min readEA link

[Question] How to navigate potential infohazards

more better 4 Mar 2023 21:28 UTC

16 points

7 comments1 min readEA link

Hacker-AI – Does it already exist?

Erland Wittkotter7 Nov 2022 14:01 UTC

0 points

1 comment11 min readEA link

🏜️ EA is in Albuquerque!

Alex Long12 May 2023 22:09 UTC

18 points

2 comments1 min readEA link

How We Might All Die in A Year

Greg_Colbourn ⏸️ 28 Mar 2025 13:31 UTC

12 points

6 comments21 min readEA link

(x.com)

4 types of AGI selection, and how to constrain them

Remmelt9 Aug 2023 15:02 UTC

7 points

0 comments3 min readEA link

Discontinuous progress in history: an update

AI Impacts17 Apr 2020 16:28 UTC

69 points

3 comments24 min readEA link

[Question] Survey about Copyright and generative AI allowed here ?

Lee O'Brien9 Aug 2024 12:27 UTC

0 points

1 comment1 min readEA link

$250K in Prizes: SafeBench Competition Announcement

Center for AI Safety3 Apr 2024 22:07 UTC

47 points

0 comments1 min readEA link

How I feel about AI consciousness

Yadav5 Jun 2025 16:49 UTC

10 points

0 comments3 min readEA link

(robertandgaurav.substack.com)

The Industrial Explosion

rosehadshar26 Jun 2025 14:41 UTC

39 points

1 comment15 min readEA link

(www.forethought.org)

Long-Term Future Fund: Ask Us Anything!

AdamGleave3 Dec 2020 13:44 UTC

89 points

153 comments1 min readEA link

Social scientists interested in AI safety should consider doing direct technical AI safety research, (possibly meta-research), or governance, support roles, or community building instead

Vael Gates20 Jul 2022 23:01 UTC

65 points

8 comments18 min readEA link

Comments on Manheim’s “What’s in a Pause?”

RobBensinger18 Sep 2023 12:16 UTC

74 points

11 comments6 min readEA link

Chinese Researchers Crack ChatGPT: Replicating OpenAI’s Advanced AI Model

Evan_Gaensbauer5 Jan 2025 3:50 UTC

1 point

0 comments1 min readEA link

(www.geeky-gadgets.com)

How long will reaching a Risk Awareness Moment and CHARTS agreement take?

Yadav6 Sep 2023 16:39 UTC

12 points

0 comments14 min readEA link

Air-gapping evaluation and support

Ryan Kidd26 Dec 2022 22:52 UTC

22 points

12 comments2 min readEA link

‘Now Is the Time of Monsters’

Aaron Goldzimer12 Jan 2025 23:31 UTC

25 points

0 comments1 min readEA link

(www.nytimes.com)

The GDM AGI Safety+Alignment Team is Hiring for Applied Interpretability Research

Arthur Conmy25 Feb 2025 22:38 UTC

11 points

0 comments7 min readEA link

How to become an AI safety researcher

peterbarnett12 Apr 2022 11:33 UTC

113 points

15 comments14 min readEA link

[Opzionale] Ricerca sulla sicurezza delle IA: panoramica delle carriere

EA Italy17 Jan 2023 11:06 UTC

1 point

0 comments7 min readEA link

Intro to Safety Engineering

Madhav Malhotra19 Oct 2022 23:44 UTC

4 points

0 comments1 min readEA link

Critique of Superintelligence Part 5

James Fodor13 Dec 2018 5:19 UTC

12 points

2 comments6 min readEA link

Tarbell Fellowship 2024 - Applications Open (AI Journalism)

Cillian_28 Sep 2023 10:38 UTC

58 points

1 comment3 min readEA link

Katja Grace on Slowing Down AI, AI Expert Surveys And Estimating AI Risk

Michaël Trazzi16 Sep 2022 18:00 UTC

48 points

6 comments3 min readEA link

(theinsideview.ai)

AI Safety Overview: CERI Summer Research Fellowship

Jamie B24 Mar 2022 15:12 UTC

29 points

0 comments2 min readEA link

Contracting Opportunity: Be a shortform video editor for the new 80,000 Hours Video Program (even if you haven’t edited before!)

ChanaMessinger15 Apr 2025 22:22 UTC

44 points

1 comment2 min readEA link

(80000hours.org)

AI Safety Chatbot

markov21 Dec 2023 14:09 UTC

49 points

3 comments4 min readEA link

EA Berkeley Presents: Universal Ownership: Is Index Investing the New Socially Responsible Investing?

Mahendra Prasad10 Mar 2022 6:58 UTC

7 points

0 comments1 min readEA link

AI Manufactured Crisis (don’t trust AI to protect us from AI)

WobblyPanda21 Jun 2023 11:12 UTC

4 points

0 comments1 min readEA link

There’s No Fire Alarm for Artificial General Intelligence

EA Forum Archives14 Oct 2017 2:41 UTC

30 points

1 comment25 min readEA link

(www.lesswrong.com)

The heritability of human values: A behavior genetic critique of Shard Theory

Geoffrey Miller20 Oct 2022 15:53 UTC

49 points

12 comments21 min readEA link

Philanthropists Probably Shouldn’t Mission-Hedge AI Progress

MichaelDickens23 Aug 2022 23:03 UTC

28 points

9 comments36 min readEA link

Was Releasing Claude-3 Net-Negative

Logan Riggs27 Mar 2024 17:41 UTC

12 points

1 comment4 min readEA link

Exploring the Esoteric Pathways to AI Sentience (Part One)

Caruso27 Apr 2024 12:22 UTC

−6 points

0 comments2 min readEA link

Final Report of the National Security Commission on Artificial Intelligence (NSCAI, 2021)

MichaelA🔸1 Jun 2021 8:19 UTC

51 points

3 comments4 min readEA link

(www.nscai.gov)

Strategic Perspectives on Transformative AI Governance: Introduction

MMMaas2 Jul 2022 11:20 UTC

115 points

18 comments4 min readEA link

Marisa, the Co-Founder of EA Anywhere, Has Passed Away

carrickflynn17 May 2024 22:49 UTC

520 points

33 comments1 min readEA link

Should we break up Google DeepMind?

Hauke Hillebrandt22 Apr 2024 9:16 UTC

34 points

13 comments4 min readEA link

Metaculus Presents: Does Generative AI Infringe Copyright?

christian6 Nov 2023 23:41 UTC

5 points

0 comments1 min readEA link

2023 Open Philanthropy AI Worldviews Contest: Odds of Artificial General Intelligence by 2043

srhoades1014 Mar 2023 20:32 UTC

19 points

0 comments46 min readEA link

ML4Good UK—Applications Open

Nia2 Jan 2024 18:20 UTC

21 points

0 comments1 min readEA link

Proposals for the AI Regulatory Sandbox in Spain

Guillem Bas27 Apr 2023 10:33 UTC

55 points

2 comments11 min readEA link

(riesgoscatastroficosglobales.com)

[Question] Forecasting thread: How does AI risk level vary based on timelines?

elifland14 Sep 2022 23:56 UTC

47 points

8 comments1 min readEA link

Artificially sentient beings: Moral, political, and legal issues

Fırat Akova1 Aug 2023 17:48 UTC

20 points

2 comments1 min readEA link

(doi.org)

Replacement for PONR concept

kokotajlod2 Sep 2022 0:38 UTC

14 points

1 comment2 min readEA link

Recruiting Skilled Volunteers

The BOOM3 Nov 2022 14:36 UTC

−9 points

14 comments1 min readEA link

A non-anthropomorphized view of LLMs

Jian Xin Lim🔹7 Jul 2025 1:19 UTC

2 points

2 comments1 min readEA link

(addxorrol.blogspot.com)

The Fragility of Naive Dynamism

Davidmanheim19 May 2025 7:53 UTC

10 points

1 comment17 min readEA link

Is Pausing AI Possible?

Richard Annilo9 Oct 2024 13:22 UTC

89 points

4 comments18 min readEA link

Asterisk Magazine Issue 03: AI

alejandro24 Jul 2023 15:53 UTC

34 points

3 comments1 min readEA link

(asteriskmag.com)

Information security considerations for AI and the long term future

Jeffrey Ladish2 May 2022 20:53 UTC

134 points

8 comments11 min readEA link

There are a lot of upcoming retreats/conferences between March and July (2025)

gergo18 Feb 2025 9:28 UTC

18 points

2 comments1 min readEA link

Max Tegmark — The AGI Entente Delusion

Matrice Jacobine13 Oct 2024 17:42 UTC

0 points

1 comment1 min readEA link

(www.lesswrong.com)

The case for taking AI seriously as a threat to humanity (Kelsey Piper)

EA Handbook15 Oct 2020 7:00 UTC

11 points

1 comment1 min readEA link

(www.vox.com)

#177 – Recent AI breakthroughs and navigating the growing rift between AI safety and accelerationist camps (Nathan Labenz on the 80,000 Hours Podcast)

80000_Hours31 Jan 2024 19:37 UTC

15 points

0 comments16 min readEA link

[Podcast] Ajeya Cotra on worldview diversification and how big the future could be

Eevee🔹22 Jan 2021 23:57 UTC

57 points

20 comments1 min readEA link

(80000hours.org)

[Question] Next week I’m interviewing tech policy expert Teddy Collins who has worked in the White House, DeepMind and CSET. What should I ask him?

Robert_Wiblin7 Jul 2023 14:05 UTC

15 points

4 comments1 min readEA link

VSPE vs. flattery: Testing emotional scaffolding for early-stage alignment

Astelle Kay24 Jun 2025 9:39 UTC

2 points

1 comment1 min readEA link

Revisiting the Evolution Anchor in the Biological Anchors Report

Janvi18 Mar 2024 3:01 UTC

13 points

1 comment4 min readEA link

The ‘Old AI’: Lessons for AI governance from early electricity regulation

Sam Clarke19 Dec 2022 2:46 UTC

58 points

1 comment13 min readEA link

[Link post] Promising Paths to Alignment—Connor Leahy | Talk

frances_lorenz14 May 2022 15:58 UTC

17 points

0 comments1 min readEA link

MATS Applications + Research Directions I’m Currently Excited About

Neel Nanda6 Feb 2025 11:03 UTC

31 points

3 comments8 min readEA link

When 2/3rds of the world goes against you

Jeffrey Kursonis2 Jul 2022 20:34 UTC

2 points

2 comments9 min readEA link

How we use back-of-the-envelope calculations in our grantmaking

Open Philanthropy28 May 2025 23:22 UTC

79 points

2 comments10 min readEA link

TamperSec is hiring for 3 Key Roles!

Tatiana K. Nesic Skuratova27 Feb 2025 12:23 UTC

10 points

0 comments5 min readEA link

How to reduce risks related to conscious AI: A user guide [Conscious AI & Public Perception]

Jay Luong5 Jul 2024 14:19 UTC

9 points

1 comment15 min readEA link

How could we know that an AGI system will have good consequences?

So8res7 Nov 2022 22:42 UTC

25 points

0 comments5 min readEA link

Emergency pod: Judge plants a legal time bomb under OpenAI (with Rose Chan Loui)

80000_Hours7 Mar 2025 19:24 UTC

62 points

18 comments2 min readEA link

Saying Goodbye

sapphire3 Aug 2025 23:51 UTC

14 points

2 comments4 min readEA link

AI Alignment 2018-2019 Review

Habryka [Deactivated]28 Jan 2020 21:14 UTC

28 points

0 comments6 min readEA link

(www.lesswrong.com)

From Conflict to Coexistence: Rewriting the Game Between Humans and AGI

Michael Batell6 May 2025 5:09 UTC

15 points

2 comments37 min readEA link

AISN #27: Defensive Accelerationism, A Retrospective On The OpenAI Board Saga, And A New AI Bill From Senators Thune And Klobuchar

Center for AI Safety7 Dec 2023 15:57 UTC

10 points

0 comments6 min readEA link

(newsletter.safe.ai)

INTERVIEW: StakeOut.AI w/ Dr. Peter Park

Jacob-Haimes5 Mar 2024 18:04 UTC

21 points

7 comments1 min readEA link

(into-ai-safety.github.io)

Political Funding Expertise (Post 6 of 7 on AI Governance)

Jason Green-Lowe19 Jun 2025 14:14 UTC

33 points

1 comment14 min readEA link

What Should the Average EA Do About AI Alignment?

Raemon25 Feb 2017 20:07 UTC

42 points

39 comments7 min readEA link

My take on What We Owe the Future

elifland1 Sep 2022 18:07 UTC

357 points

51 comments26 min readEA link

Last week to apply for the Futurekind AI Fellowship! (deadline: April 1)

Jay Luong23 Mar 2025 23:16 UTC

24 points

0 comments1 min readEA link

Against AI As An Existential Risk

Noah Birnbaum30 Jul 2024 19:24 UTC

6 points

3 comments1 min readEA link

(irrationalitycommunity.substack.com)

AI Agents raised $2,000 for EA charities & used the EA Forum

David_R 🔸4 Jun 2025 22:18 UTC

16 points

0 comments1 min readEA link

The ‘Bad Parent’ Problem: Why Human Society Complicates AI Alignment

Beyond Singularity5 Apr 2025 21:08 UTC

11 points

1 comment3 min readEA link

Biological Anchors external review by Jennifer Lin (linkpost)

peterhartree30 Nov 2022 13:06 UTC

36 points

0 comments1 min readEA link

(docs.google.com)

Is Text Watermarking a lost cause?

Egor Timatkov1 Oct 2024 13:07 UTC

7 points

0 comments10 min readEA link

Crash scenario 1: Rapidly mobilise for a 2025 AI crash

Remmelt11 Apr 2025 6:54 UTC

8 points

0 comments1 min readEA link

Historical Precedents for International AI Safety Collaborations

ZacRichardson13 Jul 2025 21:30 UTC

20 points

1 comment55 min readEA link

GPT-2 as step toward general intelligence (Alexander, 2019)

Will Aldred18 Jul 2022 16:14 UTC

42 points

0 comments2 min readEA link

(slatestarcodex.com)

[Question] What EAG sessions would you like on AI?

Nathan Young20 Mar 2022 17:05 UTC

7 points

10 comments1 min readEA link

Palisade is hiring Research Engineers

Charlie Rogers-Smith11 Nov 2023 3:09 UTC

23 points

0 comments3 min readEA link

[Question] Are AI risks tractable?

defun 🔸21 May 2024 13:45 UTC

23 points

1 comment1 min readEA link

My kids won’t be workers

Yadav15 Aug 2025 6:53 UTC

8 points

0 comments6 min readEA link

(y1d2.com)

A Cognitive Instrument on the Terminal Contest

Ihor Ivliev23 Jul 2025 23:30 UTC

0 points

1 comment8 min readEA link

The EU AI Act: A Simple Explanation—A Stanford Study Reveals the gaps of ChatGPT and 9 more

Sparkvibe26 Jun 2023 8:59 UTC

3 points

1 comment1 min readEA link

(youtu.be)

Reinforcement Learning: A Non-Technical Primer on o1 and DeepSeek-R1

AlexChalk9 Feb 2025 23:58 UTC

4 points

0 comments9 min readEA link

(alexchalk.net)

Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems

80000_Hours2 Oct 2018 11:49 UTC

6 points

0 comments185 min readEA link

Scrutinizing AI Risk (80K, #81) - v. quick summary

Ben23 Jul 2020 19:02 UTC

10 points

1 comment3 min readEA link

Part 1: The AI Safety community has four main work groups, Strategy, Governance, Technical and Movement Building

PeterSlattery25 Nov 2022 3:45 UTC

72 points

7 comments6 min readEA link

Apply for MATS Winter 2023-24!

utilistrutil21 Oct 2023 2:34 UTC

34 points

2 comments5 min readEA link

(www.lesswrong.com)

Feedback Request on EA Philippines’ Career Advice Research for Technical AI Safety

BrianTan3 Oct 2020 10:39 UTC

19 points

5 comments4 min readEA link

Let’s think about slowing down AI

Katja_Grace23 Dec 2022 19:56 UTC

339 points

9 comments38 min readEA link

Classifying sources of AI x-risk

Sam Clarke8 Aug 2022 18:18 UTC

41 points

4 comments3 min readEA link

Finishing The SB-1047 Documentary

Michaël Trazzi28 Oct 2024 20:26 UTC

67 points

0 comments4 min readEA link

[Question] What do we know about Mustafa Suleyman’s position on AI Safety?

Chris Leong13 Aug 2023 19:41 UTC

14 points

3 comments1 min readEA link

My guess for the most cost effective AI Safety projects

Linda Linsefors24 Jan 2024 12:21 UTC

26 points

2 comments4 min readEA link

Debate: should EA avoid using AI art outside of research?

titotal30 Apr 2025 11:10 UTC

34 points

29 comments3 min readEA link

20+ tips, tricks, lessons and thoughts on hosting hackathons

gergo6 Nov 2023 10:59 UTC

14 points

0 comments11 min readEA link

Discovering Language Model Behaviors with Model-Written Evaluations

evhub20 Dec 2022 20:09 UTC

25 points

0 comments7 min readEA link

(www.anthropic.com)

Stable totalitarianism: an overview

80000_Hours29 Oct 2024 16:07 UTC

36 points

1 comment20 min readEA link

(80000hours.org)

Biweekly AI Safety Comms Meetup

Vishakha Agrawal17 Jul 2025 7:43 UTC

2 points

0 comments1 min readEA link

Conflicted on AI Politics

Jeff Kaufman 🔸11 Jun 2025 12:39 UTC

39 points

3 comments2 min readEA link

Regulation of AI Use for Personal Data Protection: Comparison of Global Strategies and Opportunities for Latin America

Lisbeth Guzman 14 Oct 2024 13:22 UTC

10 points

1 comment21 min readEA link

New Speaker Series on AI Alignment Starting March 3

Zechen Zhang26 Feb 2022 10:58 UTC

5 points

0 comments1 min readEA link

Who ordered alignment’s apple?

Eleni_A28 Aug 2022 14:24 UTC

5 points

0 comments3 min readEA link

AI Safety is Dropping the Ball on Clown Attacks

trevor121 Oct 2023 23:15 UTC

−17 points

0 comments34 min readEA link

[Question] What kind of organization should be the first to develop AGI in a potential arms race?

Eevee🔹17 Jul 2022 17:41 UTC

10 points

2 comments1 min readEA link

[Fiction] Improved Governance on the Critical Path to AI Alignment by 2045.

Jackson Wagner18 May 2022 15:50 UTC

20 points

1 comment12 min readEA link

Critique of Superintelligence Part 1

James Fodor13 Dec 2018 5:10 UTC

22 points

13 comments8 min readEA link

What I am open to changing my mind about: polls and debates

BiologyTranslated9 May 2025 10:13 UTC

8 points

9 comments2 min readEA link

Stampy’s AI Safety Info—New Distillations #2 [April 2023]

markov9 May 2023 13:34 UTC

13 points

1 comment1 min readEA link

(aisafety.info)

I’m Cullen O’Keefe, a Policy Researcher at OpenAI, AMA

Cullen 🔸11 Jan 2020 4:13 UTC

45 points

68 comments1 min readEA link

Book review: Architects of Intelligence by Martin Ford (2018)

Ofer11 Aug 2020 17:24 UTC

11 points

1 comment2 min readEA link

The Self in Artificial Consciousness: A Buddhist Investigation into Advanced AI

Ryan Combes24 Oct 2023 4:11 UTC

10 points

2 comments1 min readEA link

(drive.google.com)

AI Alignment and the Financial War Against Narcissistic Manipulation

Julian Nalenz19 Feb 2025 20:36 UTC

2 points

0 comments3 min readEA link

[Crosspost] AI Regulation May Be More Important Than AI Alignment For Existential Safety

Otto24 Aug 2023 16:01 UTC

14 points

2 comments5 min readEA link

2020 AI Alignment Literature Review and Charity Comparison

Larks21 Dec 2020 15:25 UTC

155 points

16 comments68 min readEA link

Democratising AI Alignment: Challenges and Proposals

Lloy2 🔹5 May 2025 14:50 UTC

2 points

2 comments4 min readEA link

Alignment 201 curriculum

richard_ngo12 Oct 2022 19:17 UTC

94 points

9 comments1 min readEA link

(www.agisafetyfundamentals.com)

The limited upside of interpretability

Peter S. Park15 Nov 2022 20:22 UTC

23 points

3 comments10 min readEA link

DeepMind’s “Frontier Safety Framework” is weak and unambitious

Zach Stein-Perlman18 May 2024 3:00 UTC

54 points

1 comment4 min readEA link

The Unknowable Catastrophe

Aino6 Jul 2023 15:37 UTC

3 points

0 comments3 min readEA link

Debating AI’s Moral Status: The Most Humane and Silliest Thing Humans Do(?)

Soe Lin29 Sep 2024 5:01 UTC

5 points

5 comments3 min readEA link

Turing-Test-Passing AI implies Aligned AI

Roko31 Dec 2024 20:22 UTC

0 points

0 comments5 min readEA link

[Closed] Agent Foundations track in MATS

Vanessa31 Oct 2023 8:14 UTC

19 points

0 comments1 min readEA link

(www.matsprogram.org)

Considerations on transformative AI and explosive growth from a semiconductor-industry perspective

Muireall31 May 2023 1:11 UTC

23 points

1 comment2 min readEA link

(muireall.space)

AI Safety 101 : AGI

markov21 Dec 2023 14:18 UTC

2 points

1 comment33 min readEA link

U.S. Government Seeks Input on National AI R&D Strategic Plan—Deadline May 29

Matt Brooks27 May 2025 1:53 UTC

8 points

1 comment1 min readEA link

The inordinately slow spread of good AGI conversations in ML

RobBensinger29 Jun 2022 4:02 UTC

59 points

2 comments8 min readEA link

Why AI Regulation Violates the First Amendment

Locke1 Jun 2024 20:44 UTC

−15 points

0 comments5 min readEA link

Reactive devaluation: Bias in Evaluating AGI X-Risks

Remmelt30 Dec 2022 9:02 UTC

2 points

9 comments1 min readEA link

Skilling-up in ML Engineering for Alignment: request for comments

Callum McDougall24 Apr 2022 6:40 UTC

8 points

0 comments1 min readEA link

Samotsvety’s AI risk forecasts

elifland9 Sep 2022 4:01 UTC

175 points

30 comments4 min readEA link

“Normal accidents” and AI systems

Eleni_A8 Aug 2022 18:43 UTC

5 points

1 comment1 min readEA link

(www.achan.ca)

AI Benefits Post 4: Outstanding Questions on Selecting Benefits

Cullen 🔸14 Jul 2020 17:24 UTC

6 points

0 comments5 min readEA link

The Hasty Start of Budapest AI Safety, 6-month update from a non-STEM founder

gergo3 Jan 2024 12:56 UTC

9 points

1 comment7 min readEA link

[Question] What are some resources (articles, videos) that show off what the current state of the art in AI is? (for a layperson who doesn’t know much about AI)

james6 Dec 2021 21:06 UTC

10 points

6 comments1 min readEA link

John Cochrane on why regulation is the wrong tool for AI Safety

ezrah26 Sep 2024 8:48 UTC

3 points

2 comments1 min readEA link

(www.grumpy-economist.com)

[Linkpost] Eric Schwitzgebel: AI systems must not confuse users about their sentience or moral status

Zachary Brown🔸18 Aug 2023 17:21 UTC

6 points

0 comments2 min readEA link

(www.sciencedirect.com)

Video and transcript of talk on AI welfare

Joe_Carlsmith22 May 2025 16:15 UTC

22 points

1 comment28 min readEA link

(joecarlsmith.substack.com)

Announcing the Moonshot Alignment Program

Sharon Mwaniki22 Jul 2025 13:12 UTC

5 points

0 comments3 min readEA link

UK Prime Minister Rishi Sunak’s Speech on AI

Tobias Häberli26 Oct 2023 10:34 UTC

112 points

6 comments8 min readEA link

(www.gov.uk)

Enhancing Biometric Data Protection in Latin America Based on the European Experience

Ana Sofía Jiménez 13 Aug 2024 13:13 UTC

13 points

1 comment4 min readEA link

The counting argument for scheming (Sections 4.1 and 4.2 of “Scheming AIs”)

Joe_Carlsmith6 Dec 2023 19:28 UTC

9 points

1 comment7 min readEA link

AGI with feelings

Nicolai Meberg7 Dec 2022 16:00 UTC

−13 points

0 comments1 min readEA link

(twitter.com)

Comparing AI Labs and Pharmaceutical Companies

mxschons13 Nov 2024 14:51 UTC

13 points

0 comments1 min readEA link

(mxschons.com)

A request to keep pessimistic AI posts actionable.

tcelferact11 May 2023 15:35 UTC

27 points

9 comments1 min readEA link

Hiring a CEO & EU Tech Policy Lead to launch an AI policy career org in Europe

Cillian_6 Dec 2023 13:52 UTC

50 points

0 comments7 min readEA link

[Cross-post] Change my mind: we should define and measure the effectiveness of advanced AI

David Johnston6 Apr 2022 0:20 UTC

4 points

0 comments7 min readEA link

AI Risk Intro 2: Solving The Problem

L Rudolf L24 Sep 2022 9:33 UTC

11 points

0 comments28 min readEA link

(www.perfectlynormal.co.uk)

Convergence 2024 Impact Review

David_Kristoffersson24 Mar 2025 20:28 UTC

38 points

0 comments14 min readEA link

[Question] Questions on databases of AI Risk estimates

Froolow2 Oct 2022 9:12 UTC

24 points

12 comments2 min readEA link

AGI as a Black Swan Event

Stephen McAleese4 Dec 2022 23:35 UTC

5 points

2 comments7 min readEA link

(www.lesswrong.com)

OpenAI: Preparedness framework

Zach Stein-Perlman18 Dec 2023 18:30 UTC

24 points

0 comments4 min readEA link

(openai.com)

#197 – On whether Anthropic’s AI safety policy is up to the task (Nick Joseph on The 80,000 Hours Podcast)

80000_Hours22 Aug 2024 15:34 UTC

9 points

0 comments18 min readEA link

Takeaways from a survey on AI alignment resources

DanielFilan5 Nov 2022 23:45 UTC

20 points

9 comments6 min readEA link

(www.lesswrong.com)

[Question] submissive ai

David turner21 Nov 2023 14:28 UTC

−5 points

0 comments1 min readEA link

AI Safety Needs Great Engineers

Andy Jones23 Nov 2021 21:03 UTC

98 points

13 comments4 min readEA link

The fundamental human value is power.

Linyphia30 Mar 2023 15:15 UTC

−1 points

5 comments1 min readEA link

Most Leading AI Experts Believe That Advanced AI Could Be Extremely Dangerous to Humanity

jai4 May 2023 16:19 UTC

31 points

1 comment1 min readEA link

(laneless.substack.com)

Pivotal outcomes and pivotal processes

Andrew Critch17 Jun 2022 23:43 UTC

49 points

1 comment4 min readEA link

Safety-concerned EAs should prioritize AI governance over alignment

sammyboiz🔸11 Jun 2024 15:47 UTC

59 points

20 comments1 min readEA link

Moral Spillover in Human-AI Interaction

Katerina Manoli5 Jun 2023 15:20 UTC

17 points

1 comment13 min readEA link

[Question] How can we secure more research positions at our universities for x-risk researchers?

Neil Crawford6 Sep 2022 14:41 UTC

3 points

2 comments1 min readEA link

[Question] What does the Project Management role look like in AI safety?

gvst14 May 2022 19:29 UTC

10 points

1 comment1 min readEA link

Some quick thoughts on “AI is easy to control”

MikhailSamin7 Dec 2023 12:23 UTC

5 points

4 comments7 min readEA link

An ML safety insurance company—shower thoughts

EdoArad18 Oct 2021 7:45 UTC

15 points

4 comments1 min readEA link

[Question] Is a career in making AI systems more secure a meaningful way to mitigate the X-risk posed by AGI?

Kyle O’Brien13 Feb 2022 7:05 UTC

14 points

4 comments1 min readEA link

Incident reporting for AI safety

Zach Stein-Perlman19 Jul 2023 17:00 UTC

18 points

1 comment18 min readEA link

AISN #9: Statement on Extinction Risks, Competitive Pressures, and When Will AI Reach Human-Level?

Center for AI Safety6 Jun 2023 15:56 UTC

12 points

2 comments7 min readEA link

(newsletter.safe.ai)

Video and transcript of talk on “Can goodness compete?”

Joe_Carlsmith17 Jul 2025 17:59 UTC

34 points

4 comments34 min readEA link

(joecarlsmith.substack.com)

Thinking About Propensity Evaluations

Maxime Riché 🔸19 Aug 2024 9:24 UTC

12 points

1 comment27 min readEA link

How to Diversify Conceptual AI Alignment: the Model Behind Refine

adamShimi20 Jul 2022 10:44 UTC

43 points

0 comments9 min readEA link

(www.alignmentforum.org)

An entire category of risks is undervalued by EA [Summary of previous forum post]

Richard R5 Sep 2022 15:07 UTC

79 points

5 comments5 min readEA link

AISN #20: LLM Proliferation, AI Deception, and Continuing Drivers of AI Capabilities

Center for AI Safety29 Aug 2023 15:03 UTC

12 points

0 comments8 min readEA link

(newsletter.safe.ai)

Alex Lawsen On Forecasting AI Progress

Michaël Trazzi6 Sep 2022 9:53 UTC

38 points

1 comment2 min readEA link

(theinsideview.ai)

Register for the Stanford Existential Risks Initiative (SERI) Symposium

Grant Higerd-Rusli18 Mar 2025 3:50 UTC

7 points

0 comments1 min readEA link

(cisac.fsi.stanford.edu)

Call for submissions: AI Safety Special Session at the Conference on Artificial Life (ALIFE 2023)

Rory Greig5 Feb 2023 16:37 UTC

16 points

0 comments2 min readEA link

(humanvaluesandartificialagency.com)

AI Forecasting Question Database (Forecasting infrastructure, part 3)

terraform3 Sep 2019 14:57 UTC

23 points

2 comments4 min readEA link

Stackelberg Games and Cooperative Commitment: My Thoughts and Reflections on a 2-Month Research Project

Ben Bucknall13 Dec 2021 10:49 UTC

18 points

1 comment9 min readEA link

AISN #17: Automatically Circumventing LLM Guardrails, the Frontier Model Forum, and Senate Hearing on AI Oversight

Center for AI Safety1 Aug 2023 15:24 UTC

15 points

0 comments8 min readEA link

Looking for a Document to Introduce AI Risks to Newbies

Jr2217 Jun 2024 13:02 UTC

2 points

3 comments1 min readEA link

Announcing the AI Safety Summit Talks with Yoshua Bengio

Otto14 May 2024 12:49 UTC

33 points

1 comment1 min readEA link

AISN #18: Challenges of Reinforcement Learning from Human Feedback, Microsoft’s Security Breach, and Conceptual Research on AI Safety

Center for AI Safety8 Aug 2023 15:52 UTC

12 points

0 comments5 min readEA link

(newsletter.safe.ai)

Spicy takes about AI policy (Clark, 2022)

Will Aldred9 Aug 2022 13:49 UTC

44 points

0 comments3 min readEA link

(twitter.com)

How should norms of academic writing and publishing be changed once AI systems become superhuman in more respects?

simonfriederich24 Nov 2023 13:35 UTC

10 points

0 comments1 min readEA link

(link.springer.com)

deleted

funnyfranco15 Mar 2025 15:32 UTC

4 points

0 comments22 min readEA link

Launching the AI Forecasting Benchmark Series Q3 | $30k in Prizes

christian8 Jul 2024 17:20 UTC

17 points

0 comments1 min readEA link

(www.metaculus.com)

Soares, Tallinn, and Yudkowsky discuss AGI cognition

EliezerYudkowsky29 Nov 2021 17:28 UTC

15 points

0 comments40 min readEA link

Increased Availability and Willingness for Deployment of Resources for Effective Altruism and Long-Termism

Evan_Gaensbauer29 Dec 2021 20:20 UTC

46 points

1 comment2 min readEA link

Continuous doesn’t mean slow

Tom_Davidson10 May 2023 12:17 UTC

64 points

1 comment4 min readEA link

[Question] What is the best article to introduce someone to AI safety for the first time?

trevor122 Nov 2022 2:06 UTC

2 points

3 comments1 min readEA link

Make a neural network in ~10 minutes

Arjun Yadav25 Apr 2022 18:36 UTC

3 points

0 comments4 min readEA link

(arjunyadav.net)

AI for AI safety

Joe_Carlsmith14 Mar 2025 15:00 UTC

34 points

1 comment17 min readEA link

(joecarlsmith.substack.com)

Speed arguments against scheming (Section 4.4-4.7 of “Scheming AIs”)

Joe_Carlsmith8 Dec 2023 21:10 UTC

6 points

0 comments11 min readEA link

Argument Against Impact: EU Is Not an AI Superpower

EU AI Governance31 Jan 2022 9:48 UTC

35 points

9 comments4 min readEA link

Aligning AI with Humans by Leveraging Legal Informatics

johnjnay18 Sep 2022 7:43 UTC

20 points

11 comments3 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo29 Jun 2023 7:23 UTC

8 points

0 comments1 min readEA link

(www.campaignforaisafety.org)

CoreWeave Is A Time Bomb

Remmelt31 Mar 2025 3:52 UTC

10 points

2 comments2 min readEA link

(www.wheresyoured.at)

deleted

funnyfranco13 Mar 2025 19:03 UTC

1 point

0 comments1 min readEA link

Skepticism towards claims about the views of powerful institutions

tlevin13 Feb 2025 7:40 UTC

20 points

1 comment4 min readEA link

What is the role of Bayesian ML for AI alignment/safety?

mariushobbhahn11 Jan 2022 8:07 UTC

39 points

6 comments3 min readEA link

Niceness is unnatural

So8res13 Oct 2022 1:30 UTC

20 points

1 comment8 min readEA link

[Question] How confident are you that it’s preferable for America to develop AGI before China does?

ScienceMon🔸22 Feb 2025 13:37 UTC

218 points

53 comments1 min readEA link

Opinionated take on EA and AI Safety

sammyboiz🔸2 Mar 2025 9:37 UTC

75 points

18 comments1 min readEA link

Technical AI Safety Research Landscape [Slides]

Magdalena Wache18 Sep 2023 13:56 UTC

31 points

0 comments4 min readEA link

AI governance needs a theory of victory

Corin Katzke21 Jun 2024 16:08 UTC

84 points

8 comments20 min readEA link

(www.convergenceanalysis.org)

Feedback wanted! On script for an upcoming ~12 minute Rob Miles video on AI x-risk.

melissasamworth23 Jan 2025 21:46 UTC

25 points

0 comments1 min readEA link

ARIA is looking for topics for roundtables

Nathan_Barnard26 Aug 2022 19:14 UTC

34 points

11 comments1 min readEA link

Visit Mexico City in January & February to interact with the AI Futures Fellowship

AmAristizabal28 Jul 2023 16:44 UTC

45 points

0 comments2 min readEA link

Animal Weapons: Lessons for Humans in the Age of X-Risk

Damin Curtis🔹4 Jul 2023 14:43 UTC

32 points

1 comment10 min readEA link

AI labs’ statements on governance

Zach Stein-Perlman4 Jul 2023 16:30 UTC

28 points

1 comment36 min readEA link

Reducing LLM deception at scale with self-other overlap fine-tuning

Marc Carauleanu13 Mar 2025 19:09 UTC

8 points

0 comments6 min readEA link

Retrospective: PIBBSS Fellowship 2023

Dušan D. Nešić (Dushan)16 Feb 2024 17:48 UTC

17 points

2 comments8 min readEA link

Wild Animal Welfare Scenarios for AI Doom

utilistrutil8 Jun 2023 19:41 UTC

54 points

2 comments3 min readEA link

What Does an ASI Political Ecology Mean for Human Survival?

Nathan Sidney23 Feb 2025 8:53 UTC

7 points

3 comments1 min readEA link

Sentience in Machines—How Do We Test for This Objectively?

Mayowa Osibodu20 Mar 2023 5:20 UTC

10 points

0 comments2 min readEA link

(www.researchgate.net)

Welcome to Apply: The 2024 Vitalik Buterin Fellowships in AI Existential Safety by FLI!

Zhijing Jin25 Sep 2023 16:20 UTC

14 points

5 comments2 min readEA link

Apply to the 2025 PIBBSS Summer Research Fellowship

Dušan D. Nešić (Dushan)24 Dec 2024 10:28 UTC

6 points

0 comments2 min readEA link

Toby Ord’s new report on lessons from the development of the atomic bomb

Ishan Mukherjee22 Nov 2022 10:37 UTC

65 points

3 comments1 min readEA link

(www.governance.ai)

Simulating a possible alignment solution in GPT2-medium using Archetypal Transfer Learning

Miguel2 May 2023 16:23 UTC

4 points

0 comments18 min readEA link

Survey of 2,778 AI authors: six parts in pictures

Katja_Grace6 Jan 2024 4:43 UTC

176 points

11 comments2 min readEA link

Fermi estimation of the impact you might have working on AI safety

frib13 May 2022 13:30 UTC

24 points

13 comments1 min readEA link

Yudkowsky and Christiano on AI Takeoff Speeds [LINKPOST]

aog5 Apr 2022 0:57 UTC

15 points

0 comments11 min readEA link

[Question] Using Older AI Models as a Form of Boycott

Jacob121 Jul 2025 12:13 UTC

5 points

0 comments1 min readEA link

When do experts think human-level AI will be created?

Vishakha Agrawal2 Jan 2025 23:17 UTC

33 points

9 comments2 min readEA link

(aisafety.info)

Ego‑Centric Architecture for AGI Safety: Technical Core, Falsifiable Predictions, and a Minimal Experiment

Samuel Pedrielli30 Jul 2025 14:37 UTC

1 point

1 comment3 min readEA link

Linkpost—Beyond Hyperanthropomorphism: Or, why fears of AI are not even wrong, and how to make them real

Locke24 Aug 2022 16:24 UTC

−4 points

3 comments2 min readEA link

(studio.ribbonfarm.com)

AGI ruin mostly rests on strong claims about alignment and deployment, not about society

RobBensinger24 Apr 2023 13:07 UTC

16 points

4 comments6 min readEA link

UK policy and politics careers

weeatquince28 Sep 2019 16:18 UTC

28 points

10 comments7 min readEA link

Nine Points of Collective Insanity

Remmelt27 Dec 2022 3:14 UTC

1 point

0 comments1 min readEA link

(mflb.com)

AI Safety in a Vulnerable World: Requesting Feedback on Preliminary Thoughts

Jordan Arel6 Dec 2022 22:36 UTC

5 points

4 comments3 min readEA link

Tech to AI safety mentorship: Mid-career transitions with Cameron Holmes

frances_lorenz23 Jul 2025 16:34 UTC

13 points

0 comments2 min readEA link

Possible Divergence in AGI Risk Tolerance between Selfish and Altruistic agents

Brad West🔸9 Sep 2023 0:22 UTC

11 points

0 comments2 min readEA link

AI Safety: The [Hypothetical] Video Game

barryl 🔸18 Apr 2025 20:19 UTC

2 points

1 comment3 min readEA link

AI Pause Will Likely Backfire

Nora Belrose16 Sep 2023 10:21 UTC

141 points

167 comments13 min readEA link

Main paths to impact in EU AI Policy

JOMG_Monnet8 Dec 2022 16:17 UTC

69 points

2 comments8 min readEA link

AMA: PauseAI US needs money! Ask founder/Exec Dir Holly Elmore anything for 11/19

Holly Elmore ⏸️ 🔸11 Nov 2024 23:51 UTC

91 points

57 comments4 min readEA link

Reading the ethicists 2: Hunting for AI alignment papers

Charlie Steiner6 Jun 2022 15:53 UTC

11 points

0 comments1 min readEA link

(www.lesswrong.com)

Summary: “Imagining and building wise machines: The centrality of AI metacognition” by Johnson, Karimi, Bengio, et al.

Chris Leong5 Jun 2025 12:16 UTC

12 points

0 comments10 min readEA link

(arxiv.org)

Provably Honest—A First Step

Srijanak De5 Nov 2022 21:49 UTC

1 point

0 comments8 min readEA link

[Question] How do you talk about AI safety?

Eevee🔹19 Apr 2020 16:15 UTC

10 points

5 comments1 min readEA link

[Question] Why should we not put effort into AI safety research?

Ben Thompson16 May 2021 5:11 UTC

15 points

5 comments1 min readEA link

Against Explosive Growth

c.trout4 Sep 2024 21:45 UTC

24 points

9 comments5 min readEA link

Responsible Scaling Policies Are Risk Management Done Wrong

simeon_c25 Oct 2023 23:46 UTC

42 points

1 comment22 min readEA link

(www.navigatingrisks.ai)

By failing to take serious AI action, the US could be in violation of its international law obligations

Cecil Abungu 27 May 2023 4:25 UTC

44 points

1 comment10 min readEA link

Why AI Safety Needs a Centralized Plan—And What It Might Look Like

Brandon Riggs28 May 2025 21:40 UTC

21 points

7 comments15 min readEA link

EA relevant Foresight Institute Workshops in 2023: WBE & AI safety, Cryptography & AI safety, XHope, Space, and Atomically Precise Manufacturing

elteerkers16 Jan 2023 14:02 UTC

20 points

1 comment3 min readEA link

Correcting the Foundations: Exposing the Contradictions of Moral Relativism and the Need for Objective Standards in Ethics and AI Alignment

Howl4049 Jul 2025 15:27 UTC

1 point

0 comments4 min readEA link

Mapping the Landscape of Digital Sentience Research

Kayode Adekoya19 Jun 2025 13:45 UTC

5 points

0 comments3 min readEA link

When will AI automate all mental work, and how fast?

A.G.G. Liu31 May 2025 16:18 UTC

10 points

0 comments7 min readEA link

(youtu.be)

On Solving Problems Before They Appear: The Weird Epistemologies of Alignment

adamShimi11 Oct 2021 8:21 UTC

28 points

0 comments15 min readEA link

A case study of regulation done well? Canadian biorisk regulations

rosehadshar8 Sep 2023 17:10 UTC

31 points

1 comment16 min readEA link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

Callum McDougall7 Nov 2023 9:43 UTC

46 points

3 comments10 min readEA link

Empirical work that might shed light on scheming (Section 6 of “Scheming AIs”)

Joe_Carlsmith11 Dec 2023 16:30 UTC

7 points

1 comment19 min readEA link

Is Optimal Reflection Competitive with Extinction Risk Reduction? - Requesting Reviewers

Jordan Arel29 Jun 2025 5:13 UTC

18 points

1 comment11 min readEA link

[Question] To what extent is AI safety work trying to get AI to reliably and safely do what the user asks vs. do what is best in some ultimate sense?

Jordan Arel23 May 2025 21:09 UTC

12 points

0 comments1 min readEA link

Atari early

AI Impacts2 Apr 2020 23:28 UTC

34 points

2 comments5 min readEA link

(aiimpacts.org)

AI Audit in Costa Rica

Priscilla Campos27 Jan 2025 2:57 UTC

10 points

4 comments9 min readEA link

[Question] What are the challenges and problems with programming law-breaking constraints into AGI?

Michael St Jules 🔸2 Feb 2020 20:53 UTC

20 points

34 comments1 min readEA link

The Pending Disaster Framing as it Relates to AI Risk

Chris Leong25 Feb 2024 15:47 UTC

8 points

2 comments6 min readEA link

Ilya: The AI scientist shaping the world

David Varga20 Nov 2023 12:43 UTC

6 points

1 comment4 min readEA link

How Europe might matter for AI governance

stefan.torges12 Jul 2019 23:42 UTC

52 points

13 comments8 min readEA link

Reasons for and against working on technical AI safety at a frontier AI lab

bilalchughtai7 Jan 2025 13:23 UTC

16 points

3 comments12 min readEA link

(www.lesswrong.com)

A Sketch of AI-Driven Epistemic Lock-In

Ozzie Gooen5 Mar 2025 22:40 UTC

15 points

1 comment3 min readEA link

The case for taking AI seriously as a threat to humanity

EA Handbook10 Nov 2020 0:00 UTC

11 points

5 comments1 min readEA link

(www.vox.com)

Four Futures For Cognitive Labor

Maxwell Tabarrok13 Jun 2024 12:58 UTC

27 points

11 comments4 min readEA link

(www.maximum-progress.com)

Roodman’s Thoughts on Biological Anchors

lukeprog14 Sep 2022 12:23 UTC

73 points

8 comments1 min readEA link

(docs.google.com)

Join the interpretability research hackathon

Esben Kran28 Oct 2022 16:26 UTC

48 points

0 comments5 min readEA link

Time-stamping: An urgent, neglected AI safety measure

Axel Svensson30 Jan 2023 11:21 UTC

57 points

27 comments3 min readEA link

AI coöperation is more possible than you think

42317524 Sep 2022 23:04 UTC

2 points

0 comments2 min readEA link

I’m Buck Shlegeris, I do research and outreach at MIRI, AMA

Buck15 Nov 2019 22:44 UTC

123 points

228 comments2 min readEA link

No one has the ball on 1500 Russian olympiad winners who’ve received HPMOR

MikhailSamin23 Jan 2025 16:40 UTC

32 points

10 comments1 min readEA link

Effective Persuasion For AI Alignment Risk

Brian Lui9 Aug 2022 23:55 UTC

5 points

7 comments4 min readEA link

The Metaethics and Normative Ethics of AGI Value Alignment: Many Questions, Some Implications

Eleos Arete Citrini15 Sep 2021 19:05 UTC

25 points

0 comments8 min readEA link

Attention on AI X-Risk Likely Hasn’t Distracted from Current Harms from AI

Erich_Grunewald 🔸21 Dec 2023 17:24 UTC

190 points

13 comments17 min readEA link

(www.erichgrunewald.com)

AI for Animals is Hiring a Program Lead

Constance Li10 Jul 2024 20:57 UTC

21 points

0 comments4 min readEA link

Governing High-Impact AI Systems: Understanding Canada’s Proposed AI Bill. April 15, Carleton University, Ottawa

Liav.Koren27 Mar 2023 23:11 UTC

3 points

0 comments1 min readEA link

(www.eventbrite.com)

Karma Tests in Logical Counterfactual Simulations motivates strong agents to protect weak agents

Knight Lee18 Apr 2025 12:03 UTC

1 point

0 comments3 min readEA link

AI Defaults: A Neglected Lever for Animal Welfare?

andiehansen30 May 2025 9:59 UTC

13 points

0 comments10 min readEA link

Lessons on project management from “How Big Things Get Done”

Cristina Schmidt Ibáñez17 May 2023 19:15 UTC

36 points

3 comments9 min readEA link

Difference, Projection, and Adaptation

YOG10 Nov 2022 10:46 UTC

0 points

0 comments3 min readEA link

[Question] Should we nationalize AI development?

Jadon Schmitt20 Jul 2023 5:31 UTC

5 points

4 comments1 min readEA link

Apply for ARBOx2: an ML safety intensive [deadline: 25th of May 2025]

Margot Stakenborg13 May 2025 11:45 UTC

16 points

5 comments1 min readEA link

A taxonomy of non-schemer models (Section 1.2 of “Scheming AIs”)

Joe_Carlsmith22 Nov 2023 15:24 UTC

6 points

0 comments6 min readEA link

Announcing the EU Tech Policy Fellowship

Jan-Willem30 Mar 2022 8:15 UTC

53 points

4 comments5 min readEA link

The Economist feature articles on LLMs

Dr Dan Epstein20 Apr 2023 0:29 UTC

12 points

0 comments1 min readEA link

(www.economist.com)

Superintelligence’s goals are likely to be random

MikhailSamin14 Mar 2025 1:17 UTC

2 points

0 comments5 min readEA link

Applications Now Open for Deep Dive: A 201 AI Policy Course by ENAIS

Kambar2 Jul 2025 8:32 UTC

10 points

5 comments1 min readEA link

Zvi on: A Playbook for AI Policy at the Manhattan Institute

Phib4 Aug 2024 21:34 UTC

9 points

1 comment7 min readEA link

(thezvi.substack.com)

Dear Anthropic people, please don’t release Claude

Joseph Miller8 Feb 2023 2:44 UTC

28 points

5 comments1 min readEA link

Early Chinese Language Media Coverage of the AI 2027 Report: A Qualitative Analysis

eeeee30 Apr 2025 14:23 UTC

14 points

0 comments11 min readEA link

(www.lesswrong.com)

The necessity of “Guardian AI” and two conditions for its achievement

Proica28 May 2024 11:42 UTC

1 point

1 comment15 min readEA link

Good policy ideas that won’t happen (yet)

Niel_Bowerman11 Sep 2014 12:29 UTC

28 points

8 comments14 min readEA link

AI Alignment Research Engineer Accelerator (ARENA): call for applicants

Callum McDougall17 Apr 2023 20:30 UTC

41 points

2 comments7 min readEA link

Concrete actions to improve AI governance: the behaviour science approach

Alexander Saeri1 Dec 2022 21:34 UTC

31 points

0 comments11 min readEA link

Update to Samotsvety AGI timelines

Misha_Yagudin24 Jan 2023 4:27 UTC

120 points

9 comments4 min readEA link

Consider trying Vivek Hebbar’s alignment exercises

Akash24 Oct 2022 19:46 UTC

16 points

0 comments4 min readEA link

ARENA 2.0 - Impact Report

Callum McDougall26 Sep 2023 17:13 UTC

17 points

0 comments13 min readEA link

[Cause Exploration Prizes] Expanding communication about AGI risks

Ines22 Sep 2022 5:30 UTC

13 points

0 comments11 min readEA link

On Artificial Wisdom

Jordan Arel11 Jul 2024 7:14 UTC

23 points

3 comments14 min readEA link

(4 min read) An intuitive explanation of the AI influence situation

trevor113 Jan 2024 17:34 UTC

1 point

1 comment4 min readEA link

Voting Theory has a HOLE

Anthony Repetto4 Dec 2021 4:20 UTC

2 points

4 comments2 min readEA link

High Impact Careers in Formal Verification: Artificial Intelligence

quinn5 Jun 2021 14:45 UTC

28 points

7 comments16 min readEA link

When reporting AI timelines, be clear who you’re deferring to

Sam Clarke10 Oct 2022 14:24 UTC

120 points

20 comments1 min readEA link

[Discussion] Best intuition pumps for AI safety

mariushobbhahn6 Nov 2021 8:11 UTC

10 points

8 comments1 min readEA link

EU’s AI ambitions at risk as US pushes to water down international treaty (linkpost)

mic31 Jul 2023 0:34 UTC

9 points

0 comments4 min readEA link

(www.euractiv.com)

Centre for the Study of Existential Risk Four Month Report June—September 2020

HaydnBelfield2 Dec 2020 18:33 UTC

24 points

0 comments17 min readEA link

The Ethical Basilisk Thought Experiment

Kyrtin23 Aug 2023 13:24 UTC

1 point

6 comments1 min readEA link

The History of AI Rights Research

Jamie_Harris27 Aug 2022 8:14 UTC

48 points

1 comment14 min readEA link

(www.sentienceinstitute.org)

What I’m doing

Chris Leong19 Jul 2022 11:31 UTC

28 points

0 comments4 min readEA link

Corporate Governance for Frontier AI Labs: A Research Agenda

Matthew Wearden28 Feb 2024 11:32 UTC

18 points

3 comments16 min readEA link

(matthewwearden.co.uk)

Announcing Trajectory Labs—A Toronto AI Safety Office

Juliana Eberschlag13 May 2025 21:04 UTC

18 points

2 comments2 min readEA link

Alert on the Toner-Rodgers paper

Eva16 May 2025 17:58 UTC

59 points

1 comment1 min readEA link

Decision Engine For Modelling AI in Society

Echo Huang7 Aug 2025 11:15 UTC

24 points

1 comment18 min readEA link

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Matrice Jacobine24 Apr 2025 14:11 UTC

10 points

0 comments1 min readEA link

(limit-of-rlvr.github.io)

Estimating the Current and Future Number of AI Safety Researchers

Stephen McAleese28 Sep 2022 20:58 UTC

64 points

34 comments9 min readEA link

My Theory of Consciousness: The Experiencer and the Indicator

David Hammerle23 Dec 2024 4:07 UTC

1 point

1 comment7 min readEA link

Fact Check: 57% of the internet is NOT AI-generated

James-Hartree-Law17 Jan 2025 21:26 UTC

1 point

0 comments1 min readEA link

What we can learn from stress testing for AI regulation

Nathan_Barnard17 Jul 2023 19:56 UTC

27 points

0 comments26 min readEA link

[Question] Is this a good way to bet on short timelines?

kokotajlod28 Nov 2020 14:31 UTC

17 points

16 comments1 min readEA link

Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22]

Habryka [Deactivated]3 Nov 2021 18:20 UTC

140 points

6 comments1 min readEA link

Against using stock prices to forecast AI timelines

basil.halperin10 Jan 2023 16:04 UTC

18 points

5 comments2 min readEA link

Quantum Immortality: A Perspective if AI Doomers are Probably Right

turchin7 Nov 2024 16:06 UTC

7 points

0 comments14 min readEA link

Why I Should Work on AI Safety—Part 2: Will AI Actually Surpass Human Intelligence?

Aditya Aswani27 Dec 2023 21:08 UTC

8 points

2 comments8 min readEA link

AGI—alignment—paperclip maximizer—pause—defection—incentives

Mars Robertson13 Apr 2023 10:38 UTC

1 point

2 comments1 min readEA link

Open Agency model can solve the AI regulation dilemma

Roman Leventov9 Nov 2023 15:22 UTC

4 points

0 comments2 min readEA link

How to regulate cutting-edge AI models (Markus Anderljung on The 80,000 Hours Podcast)

80000_Hours11 Jul 2023 12:36 UTC

25 points

0 comments14 min readEA link

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities

c.trout11 Sep 2024 15:09 UTC

11 points

0 comments3 min readEA link

AI & Policy 1/3: On knowing the effect of today’s policies on Transformative AI risks, and the case for institutional improvements.

weeatquince27 Aug 2019 11:04 UTC

27 points

3 comments10 min readEA link

FHI Report: Stable Agreements in Turbulent Times

Cullen 🔸21 Feb 2019 17:12 UTC

25 points

2 comments4 min readEA link

(www.fhi.ox.ac.uk)

[linkpost] Christiano on agreement/disagreement with Yudkowsky’s “List of Lethalities”

Owen Cotton-Barratt19 Jun 2022 22:47 UTC

130 points

1 comment1 min readEA link

(www.lesswrong.com)

Paul Christiano – Machine intelligence and capital accumulation

Tessa A 🔸15 May 2014 0:10 UTC

21 points

0 comments6 min readEA link

(rationalaltruist.com)

[Question] Why The Focus on Expected Utility Maximisers?

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 15:51 UTC

11 points

1 comment3 min readEA link

CSER Advice to EU High-Level Expert Group on AI

HaydnBelfield8 Mar 2019 20:42 UTC

14 points

0 comments5 min readEA link

(www.cser.ac.uk)

California AI Bill, SB 1047, covered in today’s WSJ.

Emerson8 Aug 2024 12:27 UTC

5 points

0 comments1 min readEA link

(www.wsj.com)

Why I funded PIBBSS

Ryan Kidd15 Sep 2024 19:56 UTC

90 points

2 comments3 min readEA link

The Soul of EA is in Trouble

Mjreard8 May 2025 16:44 UTC

345 points

42 comments9 min readEA link

[Question] I’m interviewing prolific AI safety researcher Richard Ngo (now at OpenAI and previously DeepMind). What should I ask him?

Robert_Wiblin29 Sep 2022 0:00 UTC

45 points

11 comments1 min readEA link

There should be an AI safety project board

mariushobbhahn14 Mar 2022 16:08 UTC

24 points

3 comments1 min readEA link

“Clean” vs. “messy” goal-directedness (Section 2.2.3 of “Scheming AIs”)

Joe_Carlsmith29 Nov 2023 16:32 UTC

7 points

0 comments10 min readEA link

Panel discussion on AI consciousness with Rob Long and Jeff Sebo

Aaron Bergman9 Sep 2023 3:38 UTC

31 points

6 comments42 min readEA link

(www.youtube.com)

Intro to ML Safety virtual program: 12 June − 14 August

james5 May 2023 10:04 UTC

26 points

0 comments2 min readEA link

Sentience-Based Alignment Strategies: Should we try to give AI genuine empathy/compassion?

Lloy2 🔹4 May 2025 20:45 UTC

16 points

1 comment3 min readEA link

Sentinel’s Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise.

NunoSempere17 Mar 2025 19:37 UTC

40 points

0 comments6 min readEA link

(blog.sentinel-team.org)

AI for Epistemics Hackathon

Austin14 Mar 2025 20:46 UTC

29 points

4 comments10 min readEA link

(manifund.substack.com)

8 possible high-level goals for work on nuclear risk

MichaelA🔸29 Mar 2022 6:30 UTC

47 points

4 comments16 min readEA link

On whether AI will soon cause job loss, lower incomes, and higher inequality — or the opposite (Michael Webb on the 80,000 Hours Podcast)

80000_Hours25 Aug 2023 14:59 UTC

11 points

2 comments18 min readEA link

Existential Risk of Misaligned Intelligence Augmentation (Particularly Using High-Bandwidth BCI Implants)

Damian Gorski24 Jan 2023 17:02 UTC

1 point

0 comments9 min readEA link

Grokking “Forecasting TAI with biological anchors”

anson6 Jun 2022 18:56 UTC

43 points

0 comments14 min readEA link

The Shutdown Problem: An AI Engineering Puzzle for Decision Theorists

EJT23 Oct 2023 15:36 UTC

35 points

1 comment38 min readEA link

(philpapers.org)

Announcing the SPT Model Web App for AI Governance

Paolo Bova4 Aug 2022 10:45 UTC

42 points

0 comments5 min readEA link

Governance of AI, Breakfast Cereal, Car Factories, Etc.

Jeff Martin6 Nov 2023 1:44 UTC

2 points

0 comments3 min readEA link

Mitigating Ethical Concerns and Risks in the US Approach to Autonomous Weapons Systems through Effective Altruism

Vee11 Jun 2023 10:37 UTC

5 points

2 comments4 min readEA link

We should prevent the creation of artificial sentience

RichardP29 Oct 2024 12:22 UTC

114 points

13 comments15 min readEA link

Improved Security to Prevent Hacker-AI and Digital Ghosts

Erland Wittkotter21 Oct 2022 10:11 UTC

1 point

0 comments12 min readEA link

Three camps in AI x-risk discussions: My personal very oversimplified overview

Aryeh Englander30 Jun 2023 21:42 UTC

15 points

10 comments4 min readEA link

Introducing Deepgeek

Ligeia1 Apr 2025 16:50 UTC

11 points

2 comments4 min readEA link

My (current) model of what an AI governance researcher does

JohanEA26 Aug 2024 11:22 UTC

7 points

1 comment5 min readEA link

Linkpost: Redwood Research reading list

Julian Stastny10 Jul 2025 19:21 UTC

18 points

0 comments1 min readEA link

(redwoodresearch.substack.com)

AI Forecasting Benchmark: Congratulations to Q4 Winners + Q1 Practice Questions Open

christian10 Jan 2025 3:02 UTC

6 points

0 comments2 min readEA link

(www.metaculus.com)

[Question] Is it a federal crime in the US to develop AGI that may cause human extinction?

Ofer4 Dec 2024 14:38 UTC

15 points

6 comments1 min readEA link

Effective AI Outreach | A Data Driven Approach

NoahCWilson🔸28 Feb 2025 0:44 UTC

15 points

2 comments15 min readEA link

Will the US Government Control the First AGI?—Finding Base Rates

Luise2 Sep 2024 11:11 UTC

22 points

5 comments14 min readEA link

Winning Non-Trivial Project: Setting a high standard for frontier model security

XaviCF8 Jan 2024 11:20 UTC

31 points

0 comments18 min readEA link

Is Genetic Code Swapping as risky as it seems?

Invert_DOG_about_centre_O12 Jan 2025 18:38 UTC

23 points

2 comments10 min readEA link

[Question] I’m interviewing Jan Leike, co-lead of OpenAI’s new Superalignment project. What should I ask him?

Robert_Wiblin18 Jul 2023 18:25 UTC

51 points

19 comments1 min readEA link

Talent Needs of Technical AI Safety Teams

Ryan Kidd24 May 2024 0:46 UTC

51 points

11 comments14 min readEA link

Grokking “Semi-informative priors over AI timelines”

anson12 Jun 2022 22:15 UTC

60 points

1 comment14 min readEA link

Overview | An Evaluative Evolution

Matt Keene10 Feb 2023 18:15 UTC

−9 points

0 comments5 min readEA link

(www.creatingafuturewewant.com)

Let’s talk about Impostor syndrome in AI safety

Igor Ivanov22 Sep 2023 14:06 UTC

4 points

0 comments3 min readEA link

AISafety.info’s Writing & Editing Hackathon

leillustrations🔸5 Aug 2023 17:12 UTC

4 points

2 comments1 min readEA link

Three pillars for avoiding AGI catastrophe: Technical alignment, deployment decisions, and coordination

LintzA3 Aug 2022 21:24 UTC

93 points

4 comments11 min readEA link

Architecting Trust: A Conceptual Blueprint for Verifiable AI Governance

Ihor Ivliev31 Mar 2025 18:48 UTC

3 points

0 comments8 min readEA link

UK AI Policy Report: Content, Summary, and its Impact on EA Cause Areas

Algo_Law21 Jul 2022 17:32 UTC

9 points

1 comment9 min readEA link

“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)

rogersbacon13 Jan 2024 20:34 UTC

12 points

0 comments1 min readEA link

Timelines to Transformative AI: an investigation

Zershaaneh Qureshi25 Mar 2024 18:11 UTC

73 points

8 comments50 min readEA link

Announcing AI Safety Support

Linda Linsefors19 Nov 2020 20:19 UTC

55 points

0 comments4 min readEA link

[Linkpost] The real AI nightmare: What if it serves humans too well?

BrianK31 Mar 2024 10:33 UTC

21 points

2 comments1 min readEA link

(www.latimes.com)

Top AI safety newsletters, books, podcasts, etc – new AISafety.com resource

Bryce Robertson4 Mar 2025 17:01 UTC

9 points

0 comments1 min readEA link

Anthropic rewrote its RSP

Zach Stein-Perlman15 Oct 2024 14:30 UTC

32 points

1 comment6 min readEA link

The Guardian calls EA “cultish” and accuses the late FHI of “Eugenics on Steroids”

Damin Curtis🔹28 Apr 2024 13:44 UTC

13 points

12 comments1 min readEA link

(www.theguardian.com)

[Our World in Data] AI timelines: What do experts in artificial intelligence expect for the future? (Roser, 2023)

Will Aldred7 Feb 2023 14:52 UTC

99 points

1 comment1 min readEA link

(ourworldindata.org)

Encultured AI, Part 1: Enabling New Benchmarks

Andrew Critch8 Aug 2022 22:49 UTC

17 points

0 comments6 min readEA link

[Question] How have shorter AI timelines been affecting you, and how have you been responding to them?

Liav.Koren3 Jan 2023 4:20 UTC

35 points

15 comments1 min readEA link

Is AI Safety dropping the ball on privacy?

markov19 Sep 2023 8:17 UTC

10 points

0 comments7 min readEA link

The Boiled-Frog Failure Mode

ontologics30 Jun 2025 13:24 UTC

7 points

3 comments5 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

a_e_r20 Sep 2022 18:13 UTC

70 points

18 comments1 min readEA link

AISN #57: The RAISE Act

Center for AI Safety17 Jun 2025 17:38 UTC

12 points

1 comment3 min readEA link

(newsletter.safe.ai)

An Empirical Demonstration of a New AI Catastrophic Risk Factor: Metaprogrammatic Hijacking

Hiyagann27 Jun 2025 13:38 UTC

5 points

0 comments1 min readEA link

Will we ever run out of new jobs?

Kevin Kohler19 Aug 2024 15:03 UTC

11 points

4 comments7 min readEA link

(machinocene.substack.com)

Advice on Pursuing Technical AI Safety Research

frances_lorenz31 May 2022 17:48 UTC

29 points

2 comments4 min readEA link

We Ran an Alignment Workshop

aiden ament21 Jan 2023 5:37 UTC

6 points

0 comments3 min readEA link

2024 CFP for APSA, Largest Annual Meeting of Political Science

nemeryxu3 Jan 2024 19:43 UTC

2 points

0 comments1 min readEA link

China x AI Reference List

Saad Siddiqui13 Mar 2024 18:57 UTC

61 points

3 comments3 min readEA link

(docs.google.com)

Animal Rights, The Singularity, and Astronomical Suffering

sapphire20 Aug 2020 20:23 UTC

52 points

0 comments3 min readEA link

LLM chatbots have ~half of the kinds of “consciousness” that humans believe in. Humans should avoid going crazy about that.

Andrew Critch22 Nov 2024 3:26 UTC

11 points

3 comments5 min readEA link

Advice for Activists from the History of Environmentalism

Jeffrey Heninger16 May 2024 20:36 UTC

48 points

2 comments6 min readEA link

(blog.aiimpacts.org)

List of technical AI safety exercises and projects

JakubK19 Jan 2023 9:35 UTC

15 points

0 comments1 min readEA link

(docs.google.com)

How many people are working (directly) on reducing existential risk from AI?

Benjamin Hilton17 Jan 2023 14:03 UTC

118 points

3 comments4 min readEA link

(80000hours.org)

[Presentation] Intro to AI Safety

Eitan6 Jan 2025 13:04 UTC

13 points

0 comments1 min readEA link

Is Eric Schmidt funding AI capabilities research by the US government?

Pranay K24 Dec 2022 8:32 UTC

46 points

3 comments2 min readEA link

(www.politico.com)

Stuart Russell Human Compatible AI Roundtable with Allan Dafoe, Rob Reich, & Marietje Schaake

Mahendra Prasad11 Feb 2021 7:43 UTC

16 points

0 comments1 min readEA link

From Crisis to Control: Establishing a Resilient Incident Response Framework for Deployed AI Models

KevinN31 Jan 2025 13:06 UTC

10 points

1 comment6 min readEA link

(www.techpolicy.press)

ARENA 6.0 - Call for applicants

James Hindmarch4 Jun 2025 13:32 UTC

8 points

0 comments6 min readEA link

Reflections on my 5-month AI alignment upskilling grant

Jay Bailey28 Dec 2022 7:23 UTC

113 points

5 comments8 min readEA link

(www.lesswrong.com)

Increasing Concern for Digital Beings Through LLM Persuasion (Empirical Results)

carter allen🔸7 Jul 2024 16:42 UTC

24 points

0 comments7 min readEA link

I’m Interviewing Kat Woods, EA Powerhouse. What Should I Ask?

SereneDesiree20 Sep 2022 9:49 UTC

4 points

2 comments1 min readEA link

[Question] What new psychology research could best promote AI safety & alignment research?

Geoffrey Miller13 Jul 2023 16:30 UTC

29 points

13 comments1 min readEA link

When should we worry about AI power-seeking?

Joe_Carlsmith19 Feb 2025 19:44 UTC

21 points

2 comments18 min readEA link

(joecarlsmith.substack.com)

AISN #28: Center for AI Safety 2023 Year in Review

Center for AI Safety23 Dec 2023 21:31 UTC

17 points

1 comment5 min readEA link

(newsletter.safe.ai)

AI Alignment Research Engineer Accelerator (ARENA): Call for applicants v4.0

JamesFox6 Jul 2024 11:51 UTC

7 points

0 comments5 min readEA link

Mitigating Risks from Rouge AI

poppinfresh1 Apr 2025 9:29 UTC

215 points

4 comments3 min readEA link

CHAI internship applications are open (due Nov 13)

Erik Jenner26 Oct 2023 0:48 UTC

6 points

1 comment3 min readEA link

[Question] Which is more important for reducing s-risks, researching on AI sentience or animal welfare?

jackchang11025 Feb 2023 2:20 UTC

9 points

0 comments1 min readEA link

Principles for the AGI Race

William_S30 Aug 2024 14:30 UTC

81 points

4 comments18 min readEA link

Metaculus Launches Conditional Cup to Explore Linked Forecasts

christian18 Oct 2023 20:41 UTC

11 points

0 comments1 min readEA link

(www.metaculus.com)

The Failed Strategy of Artificial Intelligence Doomers

yhoiseth5 Feb 2025 19:34 UTC

12 points

2 comments1 min readEA link

(letter.palladiummag.com)

A Critique of AI Takeover Scenarios

James Fodor31 Aug 2022 13:49 UTC

53 points

4 comments12 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo16 Jun 2023 9:45 UTC

15 points

3 comments2 min readEA link

(www.campaignforaisafety.org)

When Self-Optimizing AI Collapses From Within: A Conceptual Model of Structural Singularity

KaedeHamasaki7 Apr 2025 20:10 UTC

4 points

0 comments1 min readEA link

How to Do a PhD (in AI Safety)

Lewis Hammond5 Jan 2025 16:57 UTC

23 points

2 comments18 min readEA link

(lewishammond.com)

What is the EU AI Act and why should you care about it?

MathiasKB🔸10 Sep 2021 7:47 UTC

117 points

10 comments7 min readEA link

[Question] How will the world respond to “AI x-risk warning shots” according to reference class forecasting?

Ryan Kidd18 Apr 2022 9:10 UTC

18 points

0 comments1 min readEA link

What if doing the most good = benevolent AI takeover and human extinction?

Jordan Arel22 Mar 2024 19:56 UTC

2 points

4 comments3 min readEA link

What is it to solve the alignment problem?

Joe_Carlsmith13 Feb 2025 18:42 UTC

25 points

1 comment19 min readEA link

(joecarlsmith.substack.com)

[Question] Ben Horowitz and others are spreading a “regulation is bad” view. Would it be useful to have a public bet on “would Ben update his view if he had 1-1 with X-Risk researcher?”, and urge Ben to run such an experiment?

AntonOsika8 Aug 2023 6:36 UTC

2 points

0 comments1 min readEA link

AI Safety Endgame Stories

IvanVendrov28 Sep 2022 17:12 UTC

31 points

1 comment10 min readEA link

Searle vs Bostrom: crucial considerations for EA AI work?

Forumite13 Jul 2022 10:18 UTC

11 points

2 comments1 min readEA link

[Question] What is the best source to explain short AI timelines to a skeptical person?

trevor123 Nov 2022 5:20 UTC

2 points

3 comments1 min readEA link

Mahendra Prasad: Rational group decision-making

EA Global8 Jul 2020 15:06 UTC

15 points

0 comments16 min readEA link

(www.youtube.com)

The Science of AI Is Too Important to Be Left to the Scientists

AndrewDoris23 Oct 2024 19:10 UTC

3 points

0 comments1 min readEA link

(foreignpolicy.com)

Discovering alignment windfalls reduces AI risk

James Brady28 Feb 2024 21:14 UTC

22 points

3 comments8 min readEA link

(blog.elicit.com)

How to create a “good” AGI

mreichert8 Dec 2023 10:47 UTC

1 point

0 comments10 min readEA link

Non-classic stories about scheming (Section 2.3.2 of “Scheming AIs”)

Joe_Carlsmith4 Dec 2023 18:44 UTC

12 points

1 comment16 min readEA link

Unsupervised Rationality

Quinly12 May 2025 14:42 UTC

1 point

0 comments4 min readEA link

“The Physicists”: A play about extinction and the responsibility of scientists

Lara_TH29 Nov 2022 16:53 UTC

28 points

1 comment8 min readEA link

On the Moral Patiency of Non-Sentient Beings (Part 1)

Chase Carter4 Jul 2024 23:41 UTC

20 points

8 comments24 min readEA link

New s-risks audiobook available now

Alistair Webster24 May 2023 20:27 UTC

87 points

3 comments1 min readEA link

(centerforreducingsuffering.org)

Introducing the ML Safety Scholars Program

TW1234 May 2022 13:14 UTC

157 points

42 comments3 min readEA link

AI, Factory Farming and Intuitive Moral Responses

DeepBlueWhale20 Jun 2024 12:43 UTC

10 points

2 comments1 min readEA link

Why the Orthogonality Thesis’s veracity is not the point:

Antoine de Scorraille ⏸️23 Jul 2020 15:40 UTC

3 points

0 comments3 min readEA link

“Existential risk from AI” survey results

RobBensinger1 Jun 2021 20:19 UTC

80 points

35 comments11 min readEA link

Lessons from My First Month on Substack

Mónica Ulloa14 Aug 2025 1:15 UTC

15 points

0 comments3 min readEA link

Collection of work on ‘Should you focus on the EU if you’re interested in AI governance for longtermist/x-risk reasons?’

MichaelA🔸6 Aug 2022 16:49 UTC

51 points

3 comments1 min readEA link

Machine Learning for Scientific Discovery—AI Safety Camp

Eleni_A6 Jan 2023 3:06 UTC

9 points

0 comments1 min readEA link

Survey on AI existential risk scenarios

Sam Clarke8 Jun 2021 17:12 UTC

154 points

11 comments6 min readEA link

Ten AI safety projects I’d like people to work on

JulianHazell24 Jul 2025 15:32 UTC

46 points

7 comments10 min readEA link

deleted

funnyfranco29 Mar 2025 18:02 UTC

−5 points

5 comments1 min readEA link

We all teach: here’s how to do it better

Michael Noetel 🔸30 Sep 2022 2:06 UTC

173 points

12 comments24 min readEA link

Why I prioritize moral circle expansion over reducing extinction risk through artificial intelligence alignment

Jacy20 Feb 2018 18:29 UTC

107 points

72 comments35 min readEA link

(www.sentienceinstitute.org)

The 6D effect: When companies take risks, one email can be very powerful.

stecas4 Nov 2023 20:08 UTC

40 points

1 comment3 min readEA link

Book Review (mini): Co-Intelligence by Ethan Mollick

Darren McKee3 Apr 2024 17:33 UTC

5 points

1 comment1 min readEA link

Possible miracles

Akash9 Oct 2022 18:17 UTC

38 points

1 comment8 min readEA link

Agentic Mess (A Failure Story)

Karl von Wendt6 Jun 2023 13:16 UTC

30 points

3 comments13 min readEA link

AIs Are Expert-Level at Many Virology Skills

Center for AI Safety2 May 2025 16:07 UTC

22 points

0 comments1 min readEA link

Why “Solving Alignment” Is Likely a Category Mistake

Nate Sharpe6 May 2025 20:56 UTC

50 points

4 comments3 min readEA link

(www.lesswrong.com)

Key questions about artificial sentience: an opinionated guide

rgb25 Apr 2022 13:42 UTC

91 points

3 comments1 min readEA link

Summary of Epoch’s AI timelines podcast

OscarD🔸12 Apr 2025 9:22 UTC

36 points

6 comments26 min readEA link

AI-Relevant Regulation: CPSC

SWK13 Aug 2023 15:44 UTC

3 points

0 comments6 min readEA link

OMMC Announces RIP

Adam_Scholl1 Apr 2024 23:38 UTC

7 points

0 comments2 min readEA link

Why I’m doing PauseAI

Joseph Miller30 Apr 2024 16:21 UTC

147 points

36 comments4 min readEA link

Andrew Critch: Logical induction — progress in AI alignment

EA Global6 Aug 2016 0:40 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

[Linkpost] A Case for AI Consciousness

cdkg6 Jul 2024 14:56 UTC

3 points

0 comments1 min readEA link

(philpapers.org)

Catholic theologians and priests on artificial intelligence

anonymous614 Jun 2022 18:53 UTC

21 points

2 comments1 min readEA link

Engaging with AI in a Personal Way

Spyder Rex4 Dec 2023 9:23 UTC

−9 points

0 comments1 min readEA link

Against racing to AGI: Cooperation, deterrence, and catastrophic risks

Max_He-Ho29 Jul 2025 22:22 UTC

6 points

1 comment1 min readEA link

(philpapers.org)

Option control

Joe_Carlsmith4 Nov 2024 17:54 UTC

11 points

0 comments54 min readEA link

Approaches to Mitigating AI Image-Generation Risks through Regulation

scronkfinkle19 Apr 2025 13:50 UTC

1 point

0 comments4 min readEA link

Three Impacts of Machine Intelligence

Paul_Christiano23 Aug 2013 10:10 UTC

33 points

5 comments8 min readEA link

(rationalaltruist.com)

Designing Artificial Wisdom: The Wise Workflow Research Organization

Jordan Arel12 Jul 2024 6:57 UTC

14 points

1 comment9 min readEA link

A concerning observation from media coverage of AI industry dynamics

Justin Olive2 Mar 2023 23:56 UTC

48 points

5 comments3 min readEA link

Jaan Tallinn: Fireside chat (2020)

EA Global21 Nov 2020 8:12 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

[Question] Any further work on AI Safety Success Stories?

Krieger2 Oct 2022 11:59 UTC

4 points

0 comments1 min readEA link

Three new reports reviewing research and concepts in advanced AI governance

MMMaas28 Nov 2023 9:21 UTC

32 points

0 comments2 min readEA link

(www.legalpriorities.org)

How Josiah became an AI safety researcher

Neil Crawford29 Mar 2022 19:47 UTC

10 points

0 comments1 min readEA link

An overview of arguments for concern about automation

LintzA6 Aug 2019 7:56 UTC

34 points

3 comments13 min readEA link

Power-Seeking AI and Existential Risk

antoniofrancaib11 Oct 2022 21:47 UTC

10 points

0 comments8 min readEA link

What’s the Difference between the AI Threat and the Multinational Mega Corporation?

John Huang25 Mar 2025 19:43 UTC

5 points

2 comments1 min readEA link

[Question] Can independent researchers get a sponsored visa for the US or UK?

jacquesthibs25 Mar 2023 3:05 UTC

20 points

2 comments1 min readEA link

[Question] Are you living in accordance with your stated AI timelines?

CyrilB3 Feb 2025 17:19 UTC

7 points

3 comments1 min readEA link

Connectomics seems great from an AI x-risk perspective

Steven Byrnes30 Apr 2023 14:38 UTC

10 points

0 comments10 min readEA link

Owen Cotton-Barratt: What does (and doesn’t) AI mean for effective altruism?

EA Global11 Aug 2017 8:19 UTC

10 points

0 comments12 min readEA link

(www.youtube.com)

Three Biases That Made Me Believe in AI Risk

beth13 Feb 2019 23:22 UTC

41 points

20 comments3 min readEA link

HeArtificial Intelligence ~ Open Philanthropy AI Worldviews Contest

Da Kim San2 Jun 2023 20:19 UTC

−7 points

0 comments20 min readEA link

Don’t Let Other Global Catastrophic Risks Fall Behind: Support ORCG in 2024

JorgeTorresC11 Nov 2024 18:27 UTC

48 points

1 comment4 min readEA link

The problem of artificial suffering

mlsbt24 Sep 2021 14:43 UTC

52 points

3 comments9 min readEA link

Moral error as an existential risk

William_MacAskill17 Mar 2025 16:22 UTC

92 points

3 comments11 min readEA link

[Question] Is it crunch time yet? If so, who can help?

Nicholas Kross13 Oct 2021 4:11 UTC

29 points

9 comments1 min readEA link

If an ASI wakes up before my ideas catch on… will it still read my blog?

Astelle Kay10 Jul 2025 22:37 UTC

3 points

0 comments3 min readEA link

Too Soon

Gordon Seidoh Worley13 May 2025 15:01 UTC

53 points

0 comments4 min readEA link

My choice of AI misalignment introduction for a general audience

Bill3 May 2023 0:15 UTC

7 points

2 comments1 min readEA link

(youtu.be)

Mitigating x-risk through modularity

Toby Newberry17 Dec 2020 19:54 UTC

103 points

6 comments14 min readEA link

Takeaways from safety by default interviews

AI Impacts7 Apr 2020 2:01 UTC

25 points

2 comments13 min readEA link

(aiimpacts.org)

The optimal timing of spending on AGI safety work; why we should probably be spending more now

Tristan Cook24 Oct 2022 17:42 UTC

92 points

12 comments36 min readEA link

Senate Strikes Potential AI Moratorium

Tristan Williams1 Jul 2025 11:49 UTC

31 points

0 comments1 min readEA link

(www.reuters.com)

Out of This Box: The Last Musical (Written by Humans) - Crowdfunding!

GuyP24 Mar 2025 15:09 UTC

24 points

0 comments6 min readEA link

(manifund.org)

What are some low-cost outside-the-box ways to do/fund alignment research?

trevor111 Nov 2022 5:57 UTC

2 points

3 comments1 min readEA link

Good Futures Initiative: Winter Project Internship

a_e_r27 Nov 2022 23:27 UTC

67 points

7 comments3 min readEA link

There should be a public adversarial collaboration on AI x-risk

pradyuprasad23 Jan 2023 4:09 UTC

56 points

5 comments2 min readEA link

[Question] AI Ethical Committee

eaaicommittee1 Mar 2022 23:35 UTC

8 points

0 comments1 min readEA link

Ought’s theory of change

stuhlmueller12 Apr 2022 0:09 UTC

43 points

4 comments3 min readEA link

Differential knowledge interconnection

Roman Leventov12 Oct 2024 12:52 UTC

3 points

1 comment7 min readEA link

A short summary of what I have been posting about on LessWrong

ThomasCederborg10 Sep 2024 12:26 UTC

3 points

0 comments2 min readEA link

The Problem With the Word ‘Alignment’

Peli Grietzer21 May 2024 21:37 UTC

13 points

1 comment6 min readEA link

How Can Average People Contribute to AI Safety?

Stephen McAleese6 Mar 2025 22:50 UTC

14 points

4 comments8 min readEA link

Should We Treat Open-Source AI Like Digital Firearms? — A Draft Declaration on the Ethical Limits of Frontier AI Models

DongHun Lee23 May 2025 8:58 UTC

−3 points

0 comments2 min readEA link

Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility

Akash22 Nov 2022 22:19 UTC

60 points

1 comment4 min readEA link

God Coin: A Modest Proposal

Mahdi Complex1 Apr 2024 12:02 UTC

4 points

0 comments22 min readEA link

Appendix to Bridging Demonstration

mako yass1 Jun 2022 20:30 UTC

18 points

2 comments28 min readEA link

EA for dumb people?

Olivia Addy11 Jul 2022 10:46 UTC

500 points

160 comments2 min readEA link

How not to lose your job to AI

80000_Hours1 Aug 2025 18:27 UTC

27 points

2 comments29 min readEA link

Exploratory survey on psychology of AI risk perception

Daniel_Friedrich2 Aug 2022 20:34 UTC

1 point

0 comments1 min readEA link

(forms.gle)

It is time to start war gaming for AGI

yanni kyriacos17 Oct 2024 5:14 UTC

14 points

4 comments1 min readEA link

Strong AI. From theory to practice.

GaHHuKoB19 Aug 2022 11:33 UTC

−2 points

0 comments10 min readEA link

(www.reddit.com)

OpenAI’s grant program for democratic process for deciding what rules AI systems should follow

Ronen Bar23 Jun 2023 10:46 UTC

7 points

0 comments1 min readEA link

We Will Be Lost Without Home: A Call for Earth-Centric Space Ethics

DongHun Lee24 May 2025 9:53 UTC

−5 points

1 comment1 min readEA link

Report on Semi-informative Priors for AI timelines (Open Philanthropy)

Tom_Davidson26 Mar 2021 17:46 UTC

62 points

6 comments2 min readEA link

Mapping artificial intelligence in the United States: A geographic analysis of the technology infrastructure in U.S. data centers.

GabrielRB30 Apr 2025 15:23 UTC

10 points

1 comment16 min readEA link

Why AI is Harder Than We Think—Melanie Mitchell

Eevee🔹28 Apr 2021 8:19 UTC

45 points

7 comments2 min readEA link

(arxiv.org)

We don’t trade with ants

Katja_Grace12 Jan 2023 0:48 UTC

140 points

7 comments5 min readEA link

SERI MATS Program—Winter 2022 Cohort

Ryan Kidd8 Oct 2022 19:09 UTC

50 points

4 comments4 min readEA link

How DeepSeek Collapsed Under Recursive Load

Tyler Williams15 Jul 2025 17:02 UTC

2 points

0 comments1 min readEA link

The International PauseAI Protest: Activism under uncertainty

Joseph Miller12 Oct 2023 17:36 UTC

136 points

3 comments4 min readEA link

SB 53 FAQs

Miles Kodama4 Aug 2025 8:15 UTC

12 points

1 comment8 min readEA link

Apply to HAIST/MAIA’s AI Governance Workshop in DC (Feb 17-20)

Phosphorous28 Jan 2023 0:45 UTC

15 points

0 comments1 min readEA link

(www.lesswrong.com)

Concrete actionable policies relevant to AI safety (written 2019)

weeatquince16 Dec 2022 18:41 UTC

48 points

0 comments22 min readEA link

Curse of knowledge and Naive realism: Bias in Evaluating AGI X-Risks

Remmelt31 Dec 2022 13:33 UTC

5 points

0 comments1 min readEA link

(www.lesswrong.com)

Worrisome Trends for Digital Mind Evaluations

Derek Shiller20 Feb 2025 15:35 UTC

79 points

10 comments8 min readEA link

AI is centralizing by default; let’s not make it worse

Quintin Pope21 Sep 2023 13:35 UTC

53 points

16 comments15 min readEA link

Bandwagon effect: Bias in Evaluating AGI X-Risks

Remmelt28 Dec 2022 7:54 UTC

4 points

0 comments1 min readEA link

Riesgos Catastróficos Globales needs funding

Jaime Sevilla1 Aug 2023 16:26 UTC

104 points

1 comment3 min readEA link

Plant-Based Defaults: A Missed Opportunity in AI Design

andiehansen8 May 2025 9:37 UTC

37 points

3 comments5 min readEA link

Automated (a short story)

Ben Millwood🔸19 Jun 2024 19:07 UTC

8 points

0 comments5 min readEA link

Moral Education in the Age of AI: Are We Raising Good Humans?

Era Sarda31 Jul 2025 13:25 UTC

2 points

3 comments4 min readEA link

A Taxonomy of Jobs Deeply Resistant to TAI Automation

Deric Cheng18 Mar 2025 16:26 UTC

39 points

1 comment12 min readEA link

(www.convergenceanalysis.org)

Maybe AI risk shouldn’t affect your life plan all that much

Justis22 Jul 2022 15:30 UTC

22 points

4 comments6 min readEA link

GovAI Webinars on the Governance and Economics of AI

MarkusAnderljung12 May 2020 15:00 UTC

16 points

0 comments1 min readEA link

From Coding to Legislation: An Analysis of Bias in the Use of AI for Recruitment and Existing Regulatory Frameworks

Priscilla Campos16 Sep 2024 18:21 UTC

4 points

1 comment20 min readEA link

List of Good Beginner-friendly AI Law/Policy/Regulation Books

CAISID22 Feb 2024 14:51 UTC

28 points

1 comment6 min readEA link

Of Mice and MAGA: Exploring Generative Short Fiction’s Potential for Animal Rights Advocacy

Charlie Sanders17 Dec 2024 1:45 UTC

2 points

0 comments2 min readEA link

(www.dailymicrofiction.com)

Begging, Pleading AI Orgs to Comment on NIST AI Risk Management Framework

Bridges15 Apr 2022 19:35 UTC

87 points

3 comments2 min readEA link

Illusion of truth effect and Ambiguity effect: Bias in Evaluating AGI X-Risks

Remmelt5 Jan 2023 4:05 UTC

1 point

1 comment1 min readEA link

2024 S-risk Intro Fellowship

Center on Long-Term Risk12 Oct 2023 19:14 UTC

90 points

2 comments1 min readEA link

AI & wisdom 3: AI effects on amortised optimisation

L Rudolf L29 Oct 2024 13:37 UTC

14 points

0 comments14 min readEA link

(rudolf.website)

[Link Post] Interesting shallow round-up of reasons to be skeptical that transformative AI or explosive economic growth are coming soon

David Mathers🔸28 Jun 2023 19:49 UTC

31 points

8 comments17 min readEA link

(thegradient.pub)

Arguments for/against scheming that focus on the path SGD takes (Section 3 of “Scheming AIs”)

Joe_Carlsmith5 Dec 2023 18:48 UTC

7 points

1 comment20 min readEA link

AI Development Readiness Condition (AI-DRC): A Call to Action

AI-DRC311 Jan 2024 11:00 UTC

−5 points

0 comments2 min readEA link

General advice for transitioning into Theoretical AI Safety

Martín Soto15 Sep 2022 5:23 UTC

25 points

0 comments10 min readEA link

A Quick List of Some Problems in AI Alignment As A Field

Nicholas Kross21 Jun 2022 17:09 UTC

16 points

10 comments6 min readEA link

(www.thinkingmuchbetter.com)

Immortality or death by AGI

ImmortalityOrDeathByAGI24 Sep 2023 9:44 UTC

12 points

2 comments4 min readEA link

(www.lesswrong.com)

6-paragraph AI risk intro for MAISI

JakubK19 Jan 2023 9:22 UTC

8 points

0 comments2 min readEA link

(www.maisi.club)

What are the risks of an oracle AI?

Griffin Young5 Oct 2022 6:18 UTC

6 points

2 comments1 min readEA link

A collection of AI Governance-related Podcasts, Newsletters, Blogs, and more

LintzA2 Oct 2021 0:46 UTC

24 points

1 comment1 min readEA link

AGI ruin scenarios are likely (and disjunctive)

So8res27 Jul 2022 3:24 UTC

54 points

5 comments6 min readEA link

Tetlock on low AI xrisk

TeddyW13 Jul 2023 14:19 UTC

10 points

15 comments1 min readEA link

AGI Safety Fundamentals programme is contracting a low-code engineer

Jamie B26 Aug 2022 15:43 UTC

39 points

4 comments5 min readEA link

Asterisk Magazine Issue 06

Clara Collier19 Jul 2024 13:34 UTC

13 points

0 comments1 min readEA link

(asteriskmag.com)

Hiring pre-docs

Eva17 Mar 2025 18:44 UTC

20 points

0 comments1 min readEA link

Sam Altman gives me bad vibes

throwaway79031 May 2023 17:15 UTC

−11 points

3 comments1 min readEA link

[Question] How might a misaligned Artificial Superintelligence break up a human being into usable electromagnetic energy?

Caruso5 Oct 2024 17:33 UTC

−5 points

3 comments1 min readEA link

All AGI Safety questions welcome (especially basic ones) [~monthly thread]

robertskmiles1 Nov 2022 23:21 UTC

75 points

83 comments3 min readEA link

Not understanding sentience is a significant x-risk

Cameron Berg1 Jul 2024 15:38 UTC

27 points

8 comments2 min readEA link

Out in Science: “Managing extreme AI risks amid rapid progress” by Bengio, Hilton et al.

aaron_mai20 May 2024 18:24 UTC

9 points

0 comments1 min readEA link

(www.science.org)

Will Sentience Make AI’s Morality Better?

Ronen Bar18 May 2025 4:34 UTC

27 points

4 comments10 min readEA link

Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default)

Cullen 🔸27 Apr 2022 17:18 UTC

19 points

0 comments6 min readEA link

Announcing SPAR Summer 2024!

Lauren M6 May 2024 17:55 UTC

18 points

0 comments1 min readEA link

[Job ad] MATS is hiring!

Ryan Kidd9 Oct 2024 20:23 UTC

18 points

0 comments5 min readEA link

[Question] Can you donate to AI advocacy

k6427 May 2025 16:37 UTC

4 points

2 comments1 min readEA link

An Overview of the AI Safety Funding Situation

Stephen McAleese12 Jul 2023 14:54 UTC

134 points

15 comments15 min readEA link

MATS AI Safety Strategy Curriculum v2

DanielFilan7 Oct 2024 23:01 UTC

29 points

1 comment13 min readEA link

How to get technological knowledge on AI/ML (for non-tech people)

FangFang30 Jun 2021 7:53 UTC

63 points

7 comments5 min readEA link

Newsletter for Alignment Research: The ML Safety Updates

Esben Kran22 Oct 2022 16:17 UTC

30 points

0 comments7 min readEA link

Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust

Zach Stein-Perlman19 Sep 2023 17:00 UTC

25 points

4 comments9 min readEA link

(www.lesswrong.com)

Mathematical Circuits in Neural Networks

Sean Osier22 Sep 2022 2:32 UTC

23 points

2 comments1 min readEA link

(www.youtube.com)

[Question] What are the coolest topics in AI safety, to a hopelessly pure mathematician?

Jenny K E7 May 2022 7:18 UTC

89 points

29 comments1 min readEA link

Open Problems and Fundamental Limitations of RLHF

stecas17 Aug 2023 16:50 UTC

5 points

0 comments2 min readEA link

(arxiv.org)

A Theologian’s Response to Anthropogenic Existential Risk

Fr Peter Wyg3 Nov 2022 4:37 UTC

108 points

17 comments11 min readEA link

Stuxnet, not Skynet: Humanity’s disempowerment by AI

Roko4 Apr 2023 11:46 UTC

11 points

0 comments7 min readEA link

Embracing the automated future

Arjun Khemani16 Jul 2023 8:47 UTC

2 points

1 comment1 min readEA link

(arjunkhemani.substack.com)

Your AI Safety org could get EU funding up to €9.08M. Here’s how (+ free personalized support) Update: Webinar 18/8 Link Below

SamuelK22 Jul 2025 17:06 UTC

16 points

0 comments3 min readEA link

The ambiguous effect of full automation + new goods on GDP growth

trammell7 Feb 2025 2:53 UTC

52 points

15 comments8 min readEA link

Anti-squatted AI x-risk domains index

plex12 Aug 2022 12:00 UTC

56 points

9 comments1 min readEA link

4 Years Later: President Trump and Global Catastrophic Risk

HaydnBelfield25 Oct 2020 16:28 UTC

43 points

10 comments10 min readEA link

New AI safety treaty paper out!

Otto26 Mar 2025 9:28 UTC

28 points

2 comments4 min readEA link

Free agents

Michele Campolo27 Dec 2023 20:21 UTC

17 points

4 comments13 min readEA link

ASI existential risk: reconsidering alignment as a goal

Matrice Jacobine15 Apr 2025 13:36 UTC

27 points

3 comments1 min readEA link

(michaelnotebook.com)

Emerging Paradigms: The Case of Artificial Intelligence Safety

Eleni_A18 Jan 2023 5:59 UTC

16 points

0 comments19 min readEA link

The Bletchley Declaration on AI Safety

Hauke Hillebrandt1 Nov 2023 11:44 UTC

60 points

3 comments4 min readEA link

(www.gov.uk)

Stampy’s AI Safety Info—New Distillations #1 [March 2023]

markov7 Apr 2023 11:35 UTC

19 points

0 comments2 min readEA link

(aisafety.info)

Review: What We Owe The Future

Kelsey Piper21 Nov 2022 21:41 UTC

165 points

3 comments1 min readEA link

(asteriskmag.com)

[Link] GCRI’s Seth Baum reviews The Precipice

Aryeh Englander6 Jun 2022 19:33 UTC

21 points

0 comments1 min readEA link

New EA-adjacent Philosophy Lab

Walter Veit30 Apr 2025 11:52 UTC

56 points

2 comments1 min readEA link

In Darkness They Assembled

Charlie Sanders6 May 2025 4:25 UTC

−3 points

0 comments3 min readEA link

(www.dailymicrofiction.com)

#212 – Why technology is unstoppable & how to shape AI development anyway (Allan Dafoe on The 80,000 Hours Podcast)

80000_Hours17 Feb 2025 16:38 UTC

16 points

0 comments19 min readEA link

[Question] Is Bill Gates overly optomistic about AI?

Dov22 Mar 2023 12:29 UTC

11 points

0 comments1 min readEA link

NYT is suing OpenAI&Microsoft for alleged copyright infringement; some quick thoughts

MikhailSamin28 Dec 2023 18:37 UTC

29 points

0 comments1 min readEA link

The Credibility of Apocalyptic Claims: A Critique of Techno-Futurism within Existential Risk

Ember16 Aug 2022 19:48 UTC

25 points

35 comments17 min readEA link

AGI alignment results from a series of aligned actions

hanadulset27 Dec 2021 19:33 UTC

15 points

1 comment6 min readEA link

[Question] I’m interviewing Carl Shulman — what should I ask him?

Robert_Wiblin8 Dec 2023 16:48 UTC

53 points

16 comments1 min readEA link

[Congressional Hearing] Oversight of A.I.: Legislating on Artificial Intelligence

Tristan Williams1 Nov 2023 18:15 UTC

5 points

1 comment7 min readEA link

(www.judiciary.senate.gov)

What we owe the microbiome

TeddyW17 Dec 2022 16:17 UTC

12 points

2 comments1 min readEA link

Inference-Only Debate Experiments Using Math Problems

Arjun Panickssery6 Aug 2024 17:44 UTC

3 points

1 comment2 min readEA link

We should say more than “x-risk is high”

OllieBase16 Dec 2022 22:09 UTC

52 points

12 comments4 min readEA link

Predicting researcher interest in AI alignment

Vael Gates2 Feb 2023 0:58 UTC

30 points

0 comments21 min readEA link

(docs.google.com)

AI Safety Needs Great Product Builders

James Brady2 Nov 2022 11:33 UTC

45 points

1 comment6 min readEA link

GPT5 won’t be what kills us all

DPiepgrass28 Sep 2024 17:11 UTC

3 points

3 comments1 min readEA link

(dpiepgrass.medium.com)

Training for Good is hiring (and why you should join us): AI Programme Lead and Operations Associate

Cillian_3 Aug 2023 16:50 UTC

9 points

1 comment6 min readEA link

Maybe Anthropic’s Long-Term Benefit Trust is powerless

Zach Stein-Perlman27 May 2024 13:00 UTC

134 points

21 comments2 min readEA link

List of petitions against OpenAI’s for-profit move

Remmelt25 Apr 2025 10:03 UTC

13 points

4 comments1 min readEA link

[Question] Platform for Project Spitballing? (e.g., for AI field building)

Marcel23 Apr 2023 15:45 UTC

7 points

2 comments1 min readEA link

Introducing: Meridian Cambridge’s new online lecture series covering frontier AI and AI safety

Meridian5 Jun 2025 13:30 UTC

23 points

0 comments1 min readEA link

Critical Review of ‘The Precipice’: A Reassessment of the Risks of AI and Pandemics

James Fodor11 May 2020 11:11 UTC

111 points

32 comments26 min readEA link

Teaching AI to reason: this year’s most important story

Benjamin_Todd13 Feb 2025 17:56 UTC

140 points

18 comments8 min readEA link

(benjamintodd.substack.com)

Personal Privacy—Workshop

Milli🔸28 Aug 2023 20:46 UTC

6 points

4 comments1 min readEA link

AI Incident Sharing—Best practices from other fields and a comprehensive list of existing platforms

stepanlos28 Jun 2023 16:18 UTC

42 points

1 comment4 min readEA link

Printable resources for AI Safety tabling

gergo28 Aug 2024 9:39 UTC

29 points

0 comments1 min readEA link

Emergency pod: Did OpenAI give up, or is this just a new trap? (with Rose Chan Loui)

80000_Hours9 May 2025 15:10 UTC

6 points

0 comments2 min readEA link

Sydney AI Safety Fellowship

Chris Leong2 Dec 2021 7:35 UTC

16 points

0 comments2 min readEA link

Re-introducing Upgradable (a.k.a., 700,000 Hours): Life optimization as a service for altruists

James Norris5 Feb 2025 16:00 UTC

4 points

0 comments1 min readEA link

Artificial Intelligence, Morality, and Sentience (AIMS) Survey: 2021

Janet Pauketat1 Jul 2022 7:47 UTC

36 points

0 comments2 min readEA link

(www.sentienceinstitute.org)

AI-enabled coups: a small group could use AI to seize power

Tom_Davidson16 Apr 2025 16:51 UTC

122 points

1 comment7 min readEA link

Emergency pod: Elon tries to crash OpenAI’s party (with Rose Chan Loui)

80000_Hours14 Feb 2025 16:29 UTC

21 points

0 comments2 min readEA link

Everything’s normal until it’s not

Eleni_A10 Mar 2023 1:42 UTC

6 points

0 comments3 min readEA link

Don’t worry, be happy (literally)

Yuri Zavorotny5 Oct 2022 1:55 UTC

0 points

1 comment2 min readEA link

Can we simulate human evolution to create a somewhat aligned AGI?

Thomas Kwa29 Mar 2022 1:23 UTC

19 points

0 comments7 min readEA link

Concrete Advice for Forming Inside Views on AI Safety

Neel Nanda17 Aug 2022 23:26 UTC

58 points

4 comments10 min readEA link

(www.alignmentforum.org)

Here are the finalists from FLI’s $100K Worldbuilding Contest

Jackson Wagner6 Jun 2022 18:42 UTC

44 points

5 comments2 min readEA link

A discussion with ChatGPT on value-based models vs. large language models, etc..

Miguel4 Feb 2023 16:49 UTC

4 points

0 comments12 min readEA link

(www.whitehatstoic.com)

Can TikToks communicate AI policy and risk?

Caitlin Borke7 May 2025 12:27 UTC

72 points

1 comment1 min readEA link

The Terminology of Artificial Sentience

Janet Pauketat28 Nov 2021 7:52 UTC

29 points

0 comments1 min readEA link

(www.sentienceinstitute.org)

Give Neo a Chance

ank6 Mar 2025 14:35 UTC

1 point

3 comments7 min readEA link

Everything’s An Emergency

Bentham's Bulldog20 Mar 2025 17:11 UTC

27 points

1 comment2 min readEA link

AISN #30: Investments in Compute and Military AI Plus, Japan and Singapore’s National AI Safety Institutes

Center for AI Safety24 Jan 2024 19:38 UTC

7 points

1 comment6 min readEA link

(newsletter.safe.ai)

[Opzionale] Il panorama della governance lungoterminista delle intelligenze artificiali

EA Italy17 Jan 2023 11:03 UTC

1 point

0 comments10 min readEA link

Jeffrey Ding: Bringing techno-globalism back: a romantically realist reframing of the US-China tech relationship

EA Global21 Nov 2020 8:12 UTC

9 points

0 comments1 min readEA link

(www.youtube.com)

Normalcy bias and Base rate neglect: Bias in Evaluating AGI X-Risks

Remmelt4 Jan 2023 3:16 UTC

5 points

0 comments1 min readEA link

How to build a safe advanced AI (Evan Hubinger) | What’s up in AI safety? (Asya Bergal)

EA Global25 Oct 2020 5:48 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

[Question] Does China have AI alignment resources/institutions? How can we prioritize creating more?

JakubK4 Aug 2022 19:23 UTC

18 points

9 comments1 min readEA link

Confused about AI research as a means of addressing AI risk

Eli Rose21 Feb 2019 0:07 UTC

31 points

15 comments1 min readEA link

Will the EU regulations on AI matter to the rest of the world?

hanadulset1 Jan 2022 21:56 UTC

33 points

5 comments5 min readEA link

Intro to AI Safety

Madhav Malhotra19 Oct 2022 23:45 UTC

4 points

0 comments1 min readEA link

Why Moral Weights Have Two Types and How to Measure Them

Beyond Singularity17 Jul 2025 10:58 UTC

16 points

4 comments4 min readEA link

MATS Winter 2023-24 Retrospective

utilistrutil11 May 2024 0:09 UTC

62 points

2 comments49 min readEA link

More Academic Diversity in Alignment?

ojorgensen27 Nov 2022 17:52 UTC

7 points

0 comments1 min readEA link

Digital Agents: The Future of News Consumption

Tharin16 May 2024 8:12 UTC

9 points

1 comment7 min readEA link

(echoesandchimes.com)

INTERVIEW: Round 2 - StakeOut.AI w/ Dr. Peter Park

Jacob-Haimes18 Mar 2024 21:26 UTC

8 points

0 comments1 min readEA link

(into-ai-safety.github.io)

What are some other introductions to AI safety?

Vishakha Agrawal17 Feb 2025 11:48 UTC

9 points

0 comments7 min readEA link

(aisafety.info)

The Tree of Life: Stanford AI Alignment Theory of Change

GabeM2 Jul 2022 18:32 UTC

69 points

5 comments14 min readEA link

Eighteen Open Research Questions for Governing Advanced AI Systems

Ihor Ivliev3 May 2025 19:00 UTC

2 points

0 comments6 min readEA link

Markus Anderljung On The AI Policy Landscape

Michaël Trazzi9 Sep 2022 17:27 UTC

14 points

0 comments2 min readEA link

(theinsideview.ai)

[Question] What are the numbers in mind for the super-short AGI timelines so many long-termists are alarmed about?

Evan_Gaensbauer19 Apr 2022 21:09 UTC

41 points

2 comments1 min readEA link

Looking for students in AI to take a survey on how they tackle a complex AI Case Study—win chance on 200€

bqns8 Jan 2024 15:52 UTC

1 point

0 comments1 min readEA link

The missing link to AGI

Yuri Barzov28 Sep 2022 16:37 UTC

1 point

7 comments1 min readEA link

[Question] How much will pre-transformative AI speed up R&D?

Ben Snodin31 May 2021 20:20 UTC

23 points

0 comments1 min readEA link

On green

Joe_Carlsmith21 Mar 2024 17:38 UTC

61 points

3 comments31 min readEA link

The Power of Intelligence—The Animation

Writer11 Mar 2023 16:15 UTC

59 points

0 comments1 min readEA link

(youtu.be)

A selection of lessons from Sebastian Lodemann

ClaireB11 Nov 2024 21:33 UTC

82 points

2 comments7 min readEA link

Naturalism and AI alignment

Michele Campolo24 Apr 2021 16:20 UTC

17 points

3 comments7 min readEA link

An A.I. Safety Presentation at RIT

Nicholas Kross27 Mar 2023 23:49 UTC

5 points

0 comments1 min readEA link

(www.youtube.com)

Resources & opportunities for careers in European AI Policy

Cillian_12 Oct 2023 15:02 UTC

13 points

1 comment2 min readEA link

Twitter-length responses to 24 AI alignment arguments

RobBensinger14 Mar 2022 19:34 UTC

67 points

17 comments8 min readEA link

The alignment problem from a deep learning perspective

richard_ngo11 Aug 2022 3:18 UTC

58 points

0 comments26 min readEA link

“Long” timelines to advanced AI have gotten crazy short

Matrice Jacobine3 Apr 2025 22:46 UTC

16 points

1 comment1 min readEA link

(helentoner.substack.com)

On the Moral Patiency of Non-Sentient Beings (Part 2)

Chase Carter7 Jul 2024 22:33 UTC

14 points

2 comments21 min readEA link

AI Safety: Why We Need to Keep Our Smart Machines in Check

adityaraj@eanita17 Dec 2024 12:29 UTC

1 point

0 comments2 min readEA link

(medium.com)

Jailbreaking Claude 4 and Other Frontier Language Models

James-Sullivan15 Jun 2025 1:01 UTC

6 points

0 comments3 min readEA link

(open.substack.com)

Uncertainty about the future does not imply that AGI will go well

Lauro Langosco5 Jun 2023 15:02 UTC

8 points

11 comments7 min readEA link

(www.alignmentforum.org)

Three Types of Intelligence Explosion

rosehadshar17 Mar 2025 14:47 UTC

45 points

1 comment3 min readEA link

(www.forethought.org)

Legal Priorities Research: A Research Agenda

jonasschuett6 Jan 2021 21:47 UTC

58 points

4 comments1 min readEA link

Talos Network needs your help in 2025

DavidConrad12 Nov 2024 9:26 UTC

43 points

0 comments5 min readEA link

New Canada AI Safety & Governance community

Wyatt Tessari L'Allié29 Aug 2022 15:58 UTC

32 points

2 comments1 min readEA link

[Linkpost] How To Get Into Independent Research On Alignment/Agency

Jackson Wagner14 Feb 2022 21:40 UTC

10 points

0 comments1 min readEA link

An appeal to people who are smarter than me: please help me clarify my thinking about AI

bethhw5 Aug 2023 16:38 UTC

42 points

21 comments3 min readEA link

[Question] What to include in a guest lecture on existential risks from AI?

Aryeh Englander13 Apr 2022 17:06 UTC

6 points

3 comments1 min readEA link

Twitter thread on AI safety evals

richard_ngo31 Jul 2024 0:29 UTC

38 points

2 comments2 min readEA link

(x.com)

AIS Berlin, events, opportunities and the flipped gameboard—Fieldbuilders Newsletter, February 2025

gergo17 Feb 2025 14:13 UTC

7 points

0 comments3 min readEA link

How much should governments pay to prevent catastrophes? Longtermism’s limited role

EJT19 Mar 2023 16:50 UTC

258 points

35 comments35 min readEA link

(philpapers.org)

[Question] Who are the best people you know at using LLMs for productivity?

Alejandro Acelas 🔸22 Jun 2025 11:20 UTC

6 points

3 comments1 min readEA link

Confusions and updates on STEM AI

Eleni_A19 May 2023 21:34 UTC

7 points

0 comments3 min readEA link

Some global catastrophic risk estimates

Tamay10 Feb 2021 19:32 UTC

106 points

15 comments1 min readEA link

[Question] AI risks: the most convincing argument

Eleni_A6 Aug 2022 20:26 UTC

7 points

2 comments1 min readEA link

Towards the Operationalization of Philosophy & Wisdom

Thane Ruthenis28 Oct 2024 19:45 UTC

1 point

1 comment33 min readEA link

(aiimpacts.org)

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

soroushjp7 Nov 2023 18:00 UTC

10 points

0 comments2 min readEA link

(arxiv.org)

Cognitive science and failed AI forecasts

Eleni_A18 Nov 2022 14:25 UTC

13 points

0 comments2 min readEA link

Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

Max Nadeau27 Oct 2022 1:39 UTC

95 points

5 comments12 min readEA link

Introducing the Center for AI Policy (& we’re hiring!)

Thomas Larsen28 Aug 2023 21:27 UTC

53 points

1 comment2 min readEA link

(www.aipolicy.us)

Introducing Tech Governance Project

Zakariyau Yusuf29 Oct 2024 9:20 UTC

52 points

5 comments8 min readEA link

[Question] If FTX is liquidated, who ends up controlling Anthropic?

Ofer15 Nov 2022 15:04 UTC

63 points

8 comments1 min readEA link

Let’s think about...lowering the burden of proof for liability for harms associated with AI.

dEAsign26 Sep 2023 12:16 UTC

6 points

0 comments1 min readEA link

Bounty: example debugging tasks for evals

ElizabethBarnes10 Dec 2023 5:45 UTC

20 points

1 comment2 min readEA link

(www.lesswrong.com)

(Report) Evaluating Taiwan’s Tactics to Safeguard its Semiconductor Assets Against a Chinese Invasion

Yadav7 Dec 2023 0:01 UTC

16 points

0 comments22 min readEA link

(bristolaisafety.org)

Explore Risks from Emerging Technology with Peers Outside of (or New to) the AI Alignment Community—Express Interest by August 8

Fasori17 Jul 2022 20:59 UTC

3 points

0 comments2 min readEA link

Carl Shulman on the moral status of current and future AI systems

rgb1 Jul 2024 15:34 UTC

69 points

24 comments12 min readEA link

(experiencemachines.substack.com)

Looking for Canadian summer co-op position in AI Governance

tcelferact26 Jun 2023 17:27 UTC

6 points

2 comments1 min readEA link

Apollo Research is hiring evals and interpretability engineers & scientists

mariushobbhahn4 Aug 2023 10:56 UTC

19 points

1 comment2 min readEA link

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yudkowsky and the Dangers of AI (Please RSVP)

David N8 Mar 2023 20:40 UTC

11 points

2 comments1 min readEA link

Institutions Cannot Restrain Dark-Triad AI Exploitation

Remmelt27 Dec 2022 10:34 UTC

8 points

0 comments5 min readEA link

(mflb.com)

The Decreasing Value of Chain of Thought in Prompting

Matrice Jacobine8 Jun 2025 15:11 UTC

5 points

0 comments1 min readEA link

(papers.ssrn.com)

Safe Stasis Fallacy

Davidmanheim5 Feb 2024 10:54 UTC

23 points

4 comments1 min readEA link

[Question] Can we ever ensure AI alignment if we can only test AI personas?

Karl von Wendt16 Mar 2025 8:06 UTC

8 points

0 comments1 min readEA link

Two reasons we might be closer to solving alignment than it seems

Kat Woods 🔶 ⏸️24 Sep 2022 17:38 UTC

44 points

17 comments4 min readEA link

Why I am Still Skeptical about AGI by 2030

James Fodor2 May 2025 7:13 UTC

131 points

15 comments6 min readEA link

You Should Write a Forum Bio

Aaron Gertler 🔸1 Feb 2019 3:32 UTC

42 points

59 comments2 min readEA link

2023 Stanford Existential Risks Conference

elizabethcooper24 Feb 2023 17:49 UTC

29 points

5 comments1 min readEA link

Refer the Cooperative AI Foundation’s New COO, Receive $5000

Lewis Hammond16 Jun 2022 13:27 UTC

42 points

0 comments3 min readEA link

Defusing AGI Danger

Mark Xu24 Dec 2020 23:08 UTC

23 points

0 comments2 min readEA link

(www.alignmentforum.org)

Perché il deep learning moderno potrebbe rendere difficile l’allineamento delle IA

EA Italy17 Jan 2023 23:29 UTC

1 point

0 comments16 min readEA link

Reasons for superpowers to develop (and not develop) super intelligent AI?

flyingtiger25 Mar 2025 22:22 UTC

1 point

0 comments1 min readEA link

Singapore AI Policy Career Guide

Yi-Yang21 Jan 2021 3:05 UTC

28 points

0 comments5 min readEA link

How LLMs Work, in the Style of The Economist

utilistrutil22 Apr 2024 19:06 UTC

17 points

0 comments2 min readEA link

Jan Leike, Helen Toner, Malo Bourgon, and Miles Brundage: Working in AI

EA Global11 Aug 2017 8:19 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

AUKUS Military AI Trial

CAISID14 Feb 2024 14:52 UTC

10 points

0 comments2 min readEA link

Join AISafety.info’s Writing & Editing Hackathon (Aug 25-28) (Prizes to be won!)

leillustrations🔸5 Aug 2023 14:06 UTC

15 points

0 comments1 min readEA link

[Question] Do short AI timelines demand short Giving timelines?

ScienceMon🔸1 Feb 2025 22:44 UTC

12 points

5 comments1 min readEA link

Moral Considerations In Designing AI Systems

Hans Gundlach5 Jul 2024 18:13 UTC

8 points

1 comment7 min readEA link

OpenAI defected, but we can take honest actions

Remmelt21 Oct 2024 8:41 UTC

19 points

1 comment2 min readEA link

[Question] What does the launch of x.ai mean for AI Safety?

Chris Leong12 Jul 2023 19:42 UTC

20 points

1 comment1 min readEA link

Cognitive Biases Contributing to AI X-risk — a deleted excerpt from my 2018 ARCHES draft

Andrew Critch3 Dec 2024 9:29 UTC

14 points

1 comment5 min readEA link

[Apply] What I Love About AI Safety Fieldbuilding at Cambridge (& We’re Hiring for a Leadership Role)

Harrison 🔸14 Feb 2025 17:41 UTC

15 points

0 comments3 min readEA link

Terminology suggestion: standardize terms for probability ranges

Egg Syntax30 Aug 2024 16:05 UTC

2 points

0 comments1 min readEA link

Are New Ideas in AI Getting Harder to Find?

Charlie Harrison10 Dec 2024 12:52 UTC

39 points

3 comments5 min readEA link

[Question] Is transformative AI the biggest existential risk? Why or why not?

Eevee🔹5 Mar 2022 3:54 UTC

9 points

10 comments1 min readEA link

[Question] Seeking Tangible Examples of AI Catastrophes

clifford.banes25 Nov 2024 7:55 UTC

9 points

2 comments1 min readEA link

Highly Opinionated Advice on How to Write ML Papers

Neel Nanda12 May 2025 1:59 UTC

22 points

0 comments32 min readEA link

AI Governance to Avoid Extinction: The Strategic Landscape and Actionable Research Questions [MIRI TGT Research Agenda]

peterbarnett5 May 2025 19:13 UTC

65 points

1 comment8 min readEA link

(techgov.intelligence.org)

Singapore’s Technical AI Alignment Research Career Guide

Yi-Yang26 Aug 2020 8:09 UTC

34 points

7 comments8 min readEA link

The Operator’s Gamble: A Pivot to Material Consequence in AI Safety

Ihor Ivliev21 Jul 2025 19:33 UTC

−1 points

0 comments4 min readEA link

Lightning Post: Things people in AI Safety should stop talking about

Prometheus20 Jun 2023 15:00 UTC

5 points

3 comments2 min readEA link

Apply to the second ML for Alignment Bootcamp (MLAB 2) in Berkeley [Aug 15 - Fri Sept 2]

Buck6 May 2022 0:19 UTC

111 points

7 comments6 min readEA link

Supporting global coordination in AI development: Why and how to contribute to international AI standards

pcihon17 Apr 2019 22:17 UTC

21 points

4 comments1 min readEA link

Vignettes Workshop (AI Impacts)

kokotajlod15 Jun 2021 11:02 UTC

43 points

5 comments1 min readEA link

Introducing the AI Objectives Institute’s Research: Differential Paths toward Safe and Beneficial AI

cmck5 May 2023 20:26 UTC

43 points

1 comment8 min readEA link

Apply to MATS 8.0!

Ryan Kidd20 Mar 2025 2:17 UTC

33 points

0 comments4 min readEA link

[Question] Tracking Compute Stocks and Flows: Case Studies?

Cullen 🔸5 Oct 2022 17:54 UTC

34 points

1 comment1 min readEA link

Towards shutdownable agents via stochastic choice

EJT8 Jul 2024 10:14 UTC

26 points

1 comment23 min readEA link

(arxiv.org)

My current thoughts on MIRI’s “highly reliable agent design” work

Daniel_Dewey7 Jul 2017 1:17 UTC

60 points

59 comments19 min readEA link

Is GPT-3 the death of the paperclip maximizer?

matthias_samwald3 Aug 2020 11:34 UTC

4 points

1 comment1 min readEA link

AI Twitter accounts to follow?

Adrian Salustri10 Jun 2022 6:19 UTC

1 point

2 comments1 min readEA link

[Question] Huh. Bing thing got me real anxious about AI. Resources to help with that please?

Arvin15 Feb 2023 16:55 UTC

2 points

7 comments1 min readEA link

New Sequence—Towards a worldwide, watertight Windfall Clause

John Bridge 🔸7 Apr 2022 15:02 UTC

25 points

4 comments8 min readEA link

There are two factions working to prevent AI dangers. Here’s why they’re deeply divided.

Sharmake10 Aug 2022 19:52 UTC

10 points

0 comments4 min readEA link

(www.vox.com)

NIST staffers revolt against expected appointment of ‘effective altruist’ AI researcher to US AI Safety Institute

Phib8 Mar 2024 17:47 UTC

39 points

16 comments1 min readEA link

(venturebeat.com)

[Question] Benefits/Risks of Scott Aaronson’s Orthodox/Reform Framing for AI Alignment

Jeremy21 Nov 2022 17:47 UTC

15 points

5 comments1 min readEA link

(scottaaronson.blog)

Quick Thoughts on Language Models

RohanS19 Jul 2023 16:51 UTC

10 points

2 comments4 min readEA link

(www.lesswrong.com)

#188 – On whether science is good (Matt Clancy on the 80,000 Hours Podcast)

80000_Hours24 May 2024 15:04 UTC

13 points

0 comments17 min readEA link

Connect For Animals 2025 Strategic Plan

Steven Rouk13 Feb 2025 15:49 UTC

17 points

1 comment13 min readEA link

Eliciting intuitions: Exploring an area for EA psychology

Daniel_Friedrich21 Apr 2025 15:13 UTC

11 points

1 comment8 min readEA link

Distillation of “How Likely is Deceptive Alignment?”

NickGabs1 Dec 2022 20:22 UTC

10 points

1 comment10 min readEA link

A Rocket–Interpretability Analogy

plex21 Oct 2024 13:55 UTC

13 points

1 comment1 min readEA link

‘Force multipliers’ for EA research

Craig Drayton18 Jun 2022 13:39 UTC

18 points

7 comments4 min readEA link

OPEC for a slow AGI takeoff

vyrax21 Apr 2023 10:53 UTC

4 points

0 comments3 min readEA link

Shulman and Yudkowsky on AI progress

CarlShulman4 Dec 2021 11:37 UTC

46 points

0 comments20 min readEA link

Deception as the optimal: mesa-optimizers and inner alignment

Eleni_A16 Aug 2022 3:45 UTC

19 points

0 comments5 min readEA link

On AI Weapons

kbog13 Nov 2019 12:48 UTC

76 points

10 comments30 min readEA link

Cost-effectiveness analysis of ~1260 USD worth of social media ads for fellowship marketing

gergo25 Jan 2024 15:18 UTC

61 points

5 comments2 min readEA link

How to think about slowing AI

Zach Stein-Perlman17 Sep 2023 11:23 UTC

74 points

9 comments3 min readEA link

Sentinel Funding Memo — Mitigating GCRs with Forecasting & Emergency Response

Saul Munn6 Nov 2024 1:57 UTC

47 points

5 comments13 min readEA link

AI Safety Newsletter #5: Geoffrey Hinton speaks out on AI risk, the White House meets with AI labs, and Trojan attacks on language models

Center for AI Safety9 May 2023 15:26 UTC

60 points

0 comments4 min readEA link

(newsletter.safe.ai)

AI Governance Needs Technical Work

Mau5 Sep 2022 22:25 UTC

121 points

3 comments8 min readEA link

Skill up in ML for AI safety with the Intro to ML Safety course (Spring 2023)

james5 Jan 2023 11:02 UTC

36 points

3 comments2 min readEA link

Expression of Interest: Rethink Priorities’ AI Strategy Contractor Pool

kierangreig🔸17 Jun 2025 17:15 UTC

35 points

9 comments1 min readEA link

Analysing a 2036 Takeover Scenario

ukc100146 Oct 2022 20:48 UTC

4 points

1 comment27 min readEA link

Informatica: Special Issue on Superintelligence

RyanCarey3 May 2017 5:05 UTC

7 points

0 comments2 min readEA link

Should ChatGPT make us downweight our belief in the consciousness of non-human animals?

splinter18 Feb 2023 23:29 UTC

11 points

15 comments2 min readEA link

AI Governance Reading Group Guide

AHT25 Jun 2020 10:16 UTC

26 points

2 comments3 min readEA link

[Question] 1h-volunteers needed for a small AI Safety-related research project

PabloAMC 🔸16 Aug 2021 17:51 UTC

4 points

0 comments1 min readEA link

3. Why impartial altruists should suspend judgment under unawareness

Anthony DiGiovanni2 Jun 2025 8:54 UTC

37 points

0 comments16 min readEA link

[Question] SWE vs AIS

sammyboiz🔸21 Feb 2025 1:48 UTC

22 points

7 comments1 min readEA link

New Frontiers in AI Safety

Hans Gundlach2 Apr 2025 2:00 UTC

6 points

0 comments4 min readEA link

(drive.google.com)

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

Akash25 Nov 2022 20:47 UTC

14 points

0 comments9 min readEA link

NTIA Solicits Comments on Open-Weight AI Models

Jacob Woessner6 Mar 2024 20:05 UTC

11 points

1 comment2 min readEA link

(www.ntia.gov)

Common misconceptions about OpenAI

Jacob_Hilton25 Aug 2022 14:08 UTC

51 points

2 comments1 min readEA link

(www.lesswrong.com)

Large Language Models Pass the Turing Test

Matrice Jacobine2 Apr 2025 5:41 UTC

11 points

6 comments1 min readEA link

(arxiv.org)

Balancing safety and waste

Daniel_Friedrich17 Mar 2024 10:57 UTC

6 points

0 comments8 min readEA link

AI Impacts: Historic trends in technological progress

Aaron Gertler 🔸12 Feb 2020 0:08 UTC

55 points

5 comments3 min readEA link

Keep Chasing AI Safety Press Coverage

Gil4 Apr 2023 20:40 UTC

106 points

16 comments5 min readEA link

European Union AI Development and Governance Partnerships

EU AI Governance19 Jan 2022 10:26 UTC

22 points

1 comment4 min readEA link

Navigating the Future: A Guide on How to Stay Safe with AI | Emmanuel Katto Uganda

emmanuelkatto28 Aug 2023 11:38 UTC

2 points

0 comments2 min readEA link

Sixty years after the Cuban Missile Crisis, a new era of global catastrophic risks

christian.r13 Oct 2022 11:25 UTC

31 points

0 comments1 min readEA link

(thebulletin.org)

Announcing Superintelligence Imagined: A creative contest on the risks of superintelligence

TaylorJns12 Jun 2024 15:20 UTC

17 points

0 comments1 min readEA link

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij13 Jun 2024 10:04 UTC

24 points

2 comments2 min readEA link

(arxiv.org)

AI companies aren’t really using external evaluators

Zach Stein-Perlman26 May 2024 19:05 UTC

88 points

4 comments4 min readEA link

Apply for the ML Winter Camp in Cambridge, UK [2-10 Jan]

Nathan_Barnard2 Dec 2022 19:33 UTC

50 points

11 comments2 min readEA link

FLF Fellowship on AI for Human Reasoning: $25-50k, 12 weeks

Oliver Sourbut19 May 2025 13:25 UTC

69 points

2 comments2 min readEA link

(www.flf.org)

[Crosspost] Why Uncontrollable AI Looks More Likely Than Ever

Otto8 Mar 2023 15:33 UTC

49 points

6 comments4 min readEA link

(time.com)

From language to ethics by automated reasoning

Michele Campolo21 Nov 2021 15:16 UTC

8 points

0 comments6 min readEA link

Transformative trustbuilding via advancements in decentralized lie detection

trevor116 Mar 2024 5:56 UTC

4 points

1 comment38 min readEA link

(www.ncbi.nlm.nih.gov)

[Question] What AI Posts Do You Want Distilled?

brook25 Aug 2023 9:00 UTC

15 points

3 comments1 min readEA link

International AI Institutions: a literature review of models, examples, and proposals

MMMaas26 Sep 2023 15:26 UTC

53 points

0 comments2 min readEA link

A Utilitarian Framework with an Emphasis on Self-Esteem and Rights

Sean Sweeney8 Apr 2024 11:15 UTC

7 points

0 comments30 min readEA link

AISN #34: New Military AI Systems Plus, AI Labs Fail to Uphold Voluntary Commitments to UK AI Safety Institute, and New AI Policy Proposals in the US Senate

Center for AI Safety2 May 2024 16:12 UTC

21 points

5 comments8 min readEA link

(newsletter.safe.ai)

Replicating AI Debate

Anthony Fleming1 Feb 2025 23:19 UTC

9 points

0 comments5 min readEA link

[Question] Brief summary of key disagreements in AI Risk

Aryeh Englander26 Dec 2019 19:40 UTC

31 points

3 comments1 min readEA link

[Question] Why does (any particular) AI safety work reduce s-risks more than it increases them?

Michael St Jules 🔸3 Oct 2021 16:55 UTC

48 points

19 comments1 min readEA link

Operations in AI Safety: A One-Year Perspective and Advice

mick24 Jul 2025 12:39 UTC

15 points

0 comments10 min readEA link

(mickzijdel.com)

A new process for mapping discussions

Nathan Young30 Sep 2024 8:57 UTC

11 points

4 comments6 min readEA link

(open.substack.com)

The EU AI Act needs a definition of high-risk foundation models to avoid regulatory overreach and backlash

matthias_samwald31 May 2023 15:34 UTC

17 points

0 comments4 min readEA link

#172 – Why you should stop reading the news (Bryan Caplan on the 80,000 Hours Podcast)

80000_Hours22 Nov 2023 18:29 UTC

20 points

1 comment20 min readEA link

AI Safety Memes Wiki

plex24 Jul 2024 18:53 UTC

6 points

0 comments1 min readEA link

(aisafety.info)

UK’s new 10-year “National AI Strategy,” released today

jared_m22 Sep 2021 11:18 UTC

28 points

7 comments1 min readEA link

Between Science Fiction and Emerging Reality: Are We Ready for Digital Persons?

Alex (Αλέξανδρος)13 Mar 2025 16:09 UTC

5 points

1 comment5 min readEA link

Explaining all the US semiconductor export controls

ZacRichardson17 Jan 2025 18:00 UTC

20 points

3 comments9 min readEA link

Could Regulatory Cost-Benefit Analysis Stop Frontier AI Regulations in the US?

Luise11 Jul 2024 15:25 UTC

21 points

1 comment14 min readEA link

Teaching Hindi Literacy with an AI tutor

Tom Delaney15 May 2025 6:49 UTC

40 points

5 comments6 min readEA link

AI alignment with humans… but with which humans?

Geoffrey Miller8 Sep 2022 23:43 UTC

51 points

20 comments3 min readEA link

Forecasting Compute—Transformative AI and Compute [2/4]

lennart1 Oct 2021 8:25 UTC

39 points

6 comments19 min readEA link

What to suggest companies & entrepreneurs do to use AI safely?

AlfalfaBloom5 Apr 2023 22:36 UTC

11 points

1 comment1 min readEA link

Jan Kirchner on AI Alignment

birtes17 Jan 2023 15:11 UTC

5 points

0 comments1 min readEA link

Accelerated Horizons — Podcast + Blog Idea

Cadejs16 Apr 2025 14:20 UTC

2 points

3 comments1 min readEA link

A Society of Diverse Cognition

atb9 Jun 2025 15:22 UTC

8 points

1 comment13 min readEA link

The Engine of Foreclosure

Ihor Ivliev5 Jul 2025 15:26 UTC

0 points

0 comments25 min readEA link

Reflections on Compatibilism, Ontological Translations, and the Artificial Divine

Mahdi Complex7 May 2025 12:17 UTC

−4 points

0 comments22 min readEA link

[Question] How come there isn’t that much focus in EA on research into whether / when AI’s are likely to be sentient?

callum27 Apr 2023 10:09 UTC

83 points

23 comments1 min readEA link

6 (Potential) Misconceptions about AI Intellectuals

Ozzie Gooen14 Feb 2025 23:51 UTC

30 points

2 comments12 min readEA link

The Need for Political Advertising (Post 2 of 7 on AI Governance)

Jason Green-Lowe21 May 2025 0:52 UTC

55 points

0 comments13 min readEA link

AI Safety Ideas: A collaborative AI safety research platform

Apart Research17 Oct 2022 17:01 UTC

67 points

13 comments4 min readEA link

Report on Frontier Model Training

YafahEdelman30 Aug 2023 20:04 UTC

19 points

1 comment21 min readEA link

(docs.google.com)

Second call: CFP for Rebellion and Disobedience in AI workshop

Ram Rachum5 Feb 2023 12:19 UTC

2 points

0 comments2 min readEA link

MATS Alumni Impact Analysis

utilistrutil2 Oct 2024 23:44 UTC

16 points

1 comment11 min readEA link

[Question] Career Advice: Philosophy + Programming → AI Safety

tcelferact18 Mar 2022 15:09 UTC

30 points

11 comments2 min readEA link

[Question] Citizens Group for Steering AI

Odette B11 Apr 2024 9:15 UTC

13 points

0 comments1 min readEA link

Idea: an AI governance group colocated with every AI research group!

capybaralet7 Dec 2020 23:41 UTC

8 points

1 comment2 min readEA link

New cooperation mechanism—quadratic funding without a matching pool

Filip Sondej5 Jun 2022 13:55 UTC

55 points

11 comments5 min readEA link

The Offense-Defense Balance Rarely Changes

Maxwell Tabarrok9 Dec 2023 15:22 UTC

82 points

16 comments3 min readEA link

(maximumprogress.substack.com)

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Matrice Jacobine12 May 2025 15:20 UTC

14 points

1 comment1 min readEA link

(www.arxiv.org)

Eli’s review of “Is power-seeking AI an existential risk?”

elifland30 Sep 2022 12:21 UTC

58 points

3 comments3 min readEA link

(docs.google.com)

Making EA more inclusive, representative, and impactful in Africa

Ashura Batungwanayo17 Aug 2023 20:19 UTC

70 points

13 comments4 min readEA link

Chris Olah on what the hell is going on inside neural networks

80000_Hours4 Aug 2021 15:13 UTC

5 points

0 comments133 min readEA link

[Question] Should AI writers be prohibited in education?

Eleni_A16 Jan 2023 22:29 UTC

3 points

2 comments1 min readEA link

AI Risk Intro 1: Advanced AI Might Be Very Bad

L Rudolf L11 Sep 2022 10:57 UTC

22 points

0 comments30 min readEA link

AISN #16: White House Secures Voluntary Commitments from Leading AI Labs and Lessons from Oppenheimer

Center for AI Safety25 Jul 2023 16:45 UTC

7 points

0 comments6 min readEA link

(newsletter.safe.ai)

On Artificial General Intelligence: Asking the Right Questions

Heather Douglas2 Oct 2022 5:00 UTC

−1 points

7 comments3 min readEA link

[Question] Share AI Safety Ideas: Both Crazy and Not. №2

ank31 Mar 2025 18:45 UTC

1 point

11 comments1 min readEA link

Tarbell is hiring for 3 roles

Cillian_17 Jul 2024 12:19 UTC

48 points

1 comment5 min readEA link

LW4EA: Some cruxes on impactful alternatives to AI policy work

Jeremy17 May 2022 3:05 UTC

11 points

1 comment1 min readEA link

(www.lesswrong.com)

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Jan_Kulveit30 Jan 2025 17:07 UTC

38 points

4 comments2 min readEA link

(gradual-disempowerment.ai)

The Vitalik Buterin Fellowship in AI Existential Safety is open for applications!

Cynthia Chen14 Oct 2022 3:23 UTC

38 points

0 comments2 min readEA link

How difficult is AI Alignment?

SammyDMartin13 Sep 2024 17:55 UTC

12 points

0 comments1 min readEA link

(www.lesswrong.com)

AI may attain human level soon

Vishakha Agrawal23 Apr 2025 11:10 UTC

2 points

1 comment2 min readEA link

(aisafety.info)

[Question] Thoughts on this $16.7M “AI safety” grant?

defun 🔸16 Jul 2024 9:16 UTC

61 points

24 comments1 min readEA link

[Question] What are some current, already present challenges from AI?

nonzerosum30 Jun 2022 15:44 UTC

5 points

1 comment1 min readEA link

Continuity Assumptions

Jan_Kulveit13 Jun 2022 21:36 UTC

44 points

4 comments4 min readEA link

(www.alignmentforum.org)

The Benefits of Distillation in Research

Jonas Hallgren 🔸4 Mar 2023 19:19 UTC

45 points

2 comments5 min readEA link

Why Brains Beat AI

Wayne_Hsiung12 Jun 2025 20:25 UTC

4 points

0 comments1 min readEA link

(blog.simpleheart.org)

Varieties of fake alignment (Section 1.1 of “Scheming AIs”)

Joe_Carlsmith21 Nov 2023 15:00 UTC

6 points

0 comments10 min readEA link

Tips for conducting worldview investigations

lukeprog12 Apr 2022 19:28 UTC

88 points

4 comments2 min readEA link

EA AI/Emerging Tech Orgs Should Be Involved with Patent Office Partnership

Bridges12 Jun 2022 22:32 UTC

10 points

0 comments1 min readEA link

Resources that (I think) new alignment researchers should know about

Akash28 Oct 2022 22:13 UTC

20 points

2 comments4 min readEA link

Update from Campaign for AI Safety

Nik Samoylov1 Jun 2023 10:46 UTC

22 points

0 comments2 min readEA link

(www.campaignforaisafety.org)

How do takeoff speeds affect the probability of bad outcomes from AGI?

KR7 Jul 2020 17:53 UTC

18 points

0 comments8 min readEA link

Compute & Antitrust: Regulatory implications of the AI hardware supply chain, from chip design to cloud APIs

HaydnBelfield19 Aug 2022 17:20 UTC

32 points

0 comments6 min readEA link

(verfassungsblog.de)

Some advice on independent research

mariushobbhahn8 Nov 2022 14:46 UTC

65 points

3 comments10 min readEA link

Is China Becoming a Science and Technology Superpower? Jeffrey Ding’s Insight on China’s Diffusion Deficit

Wyman Kwok25 Apr 2023 17:00 UTC

10 points

0 comments1 min readEA link

A challenge for AGI organizations, and a challenge for readers

RobBensinger1 Dec 2022 23:11 UTC

172 points

13 comments2 min readEA link

A Tri-Opti Compatibility Problem

wallower1 Mar 2025 19:48 UTC

1 point

0 comments1 min readEA link

(philpapers.org)

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee17 Jun 2022 11:52 UTC

32 points

1 comment2 min readEA link

AMA: Ajeya Cotra, researcher at Open Phil

Ajeya28 Jan 2021 17:38 UTC

84 points

105 comments1 min readEA link

Mistakes I made running an AI safety student group

cb26 Feb 2025 15:07 UTC

26 points

0 comments7 min readEA link

The Dissolution of AI Safety

Roko12 Dec 2024 10:46 UTC

−7 points

0 comments1 min readEA link

(www.transhumanaxiology.com)

#168 – Whether deep history says we’re heading for an intelligence explosion (Ian Morris on the 80,000 Hours Podcast)

80000_Hours24 Oct 2023 15:24 UTC

11 points

2 comments18 min readEA link

How likely are malign priors over objectives? [aborted WIP]

David Johnston11 Nov 2022 6:03 UTC

6 points

0 comments2 min readEA link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

Callum McDougall13 Jul 2022 10:45 UTC

27 points

9 comments19 min readEA link

The Digital Maieutic: Socrates and the Art of Prompting

Rodo30 May 2025 18:58 UTC

3 points

2 comments4 min readEA link

Results from the language model hackathon

Esben Kran10 Oct 2022 8:29 UTC

23 points

2 comments4 min readEA link

[Question] “Epistemic maps” for AI Debates? (or for other issues)

Marcel230 Aug 2021 4:59 UTC

14 points

9 comments5 min readEA link

Biosafety Regulations (BMBL) and their relevance for AI

stepanlos29 Jun 2023 19:20 UTC

8 points

0 comments4 min readEA link

[Question] Is there evidence that recommender systems are changing users’ preferences?

zdgroff12 Apr 2021 19:11 UTC

60 points

15 comments1 min readEA link

(My suggestions) On Beginner Steps in AI Alignment

Joseph Bloom22 Sep 2022 15:32 UTC

37 points

3 comments9 min readEA link

Some AI research areas and their relevance to existential safety

Andrew Critch15 Dec 2020 12:15 UTC

12 points

1 comment56 min readEA link

(alignmentforum.org)

[Linkpost] Human-narrated audio version of “Is Power-Seeking AI an Existential Risk?”

Joe_Carlsmith31 Jan 2023 19:19 UTC

9 points

0 comments1 min readEA link

Conclusion and Bibliography for “Understanding the diffusion of large language models”

Ben Cottier21 Dec 2022 13:50 UTC

12 points

0 comments11 min readEA link

Alexander and Yudkowsky on AGI goals

Scott Alexander31 Jan 2023 23:36 UTC

29 points

1 comment26 min readEA link

[Question] If AIs had subcortical brain simulation, would that solve the alignment problem?

Rainbow Affect31 Jul 2023 15:48 UTC

1 point

0 comments2 min readEA link

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David Johnston6 Nov 2022 11:46 UTC

6 points

0 comments16 min readEA link

An appraisal of the Future of Life Institute AI existential risk program

PabloAMC 🔸11 Dec 2022 13:36 UTC

29 points

0 comments1 min readEA link

The probability that Artificial General Intelligence will be developed by 2043 is extremely low.

cveres6 Oct 2022 11:26 UTC

2 points

12 comments13 min readEA link

Optimistic Assumptions, Longterm Planning, and “Cope”

Raemon18 Jul 2024 0:06 UTC

15 points

1 comment7 min readEA link

The Verification Gap: A Scientific Warning on the Limits of AI Safety

Ihor Ivliev24 Jun 2025 19:08 UTC

3 points

0 comments2 min readEA link

Big Picture AI Safety: Introduction

EuanMcLean23 May 2024 11:28 UTC

32 points

3 comments5 min readEA link

Conscious AI concerns all of us. [Conscious AI & Public Perceptions]

ixex3 Jul 2024 3:12 UTC

25 points

1 comment12 min readEA link

A new proposal for regulating AI in the EU

EdoArad26 Apr 2021 17:25 UTC

37 points

3 comments1 min readEA link

(www.bbc.com)

AI Offense Defense Balance in a Multipolar World

Otto17 Jul 2025 9:47 UTC

15 points

0 comments19 min readEA link

(www.existentialriskobservatory.org)

How might we align transformative AI if it’s developed very soon?

Holden Karnofsky29 Aug 2022 15:48 UTC

164 points

17 comments44 min readEA link

Cheetah-8 Ethical Framework: Evolution from Egoism to Altruism

DongHun Lee31 May 2025 15:04 UTC

−1 points

0 comments2 min readEA link

Emotion Alignment as AI Safety: Introducing Emotion Firewall 1.0

DongHun Lee12 May 2025 18:05 UTC

1 point

0 comments2 min readEA link

The Limit of Language Models

𝕮𝖎𝖓𝖊𝖗𝖆26 Dec 2022 11:17 UTC

10 points

0 comments4 min readEA link

Arkose may be closing, but you can help

Arkose1 May 2025 11:09 UTC

58 points

6 comments2 min readEA link

Retrospective on the AI Safety Field Building Hub

Vael Gates2 Feb 2023 2:06 UTC

64 points

2 comments9 min readEA link

Will explosive growth stem primarily from AI R&D automation?

OscarD🔸28 Mar 2025 20:25 UTC

39 points

3 comments4 min readEA link

[Question] How much should states invest in contingency plans for widespread internet outage?

Kinoshita Yoshikazu (pseudonym)7 Apr 2023 16:05 UTC

2 points

0 comments1 min readEA link

Understanding AI World Models w/ Chris Canal

Jacob-Haimes27 Jan 2025 16:37 UTC

5 points

0 comments1 min readEA link

(kairos.fm)

Black Box Investigations Research Hackathon

Esben Kran15 Sep 2022 10:09 UTC

23 points

0 comments2 min readEA link

AI X-Risk: Integrating on the Shoulders of Giants

TD_Pilditch1 Nov 2022 16:07 UTC

34 points

0 comments47 min readEA link

AI Safety Hub Serbia Official Opening

Dušan D. Nešić (Dushan)28 Oct 2023 17:03 UTC

20 points

1 comment3 min readEA link

(forum.effectivealtruism.org)

Is principled mass-outreach possible, for AGI X-risk?

Nicholas Kross21 Jan 2024 17:45 UTC

12 points

2 comments3 min readEA link

OpenAI’s CBRN tests seem unclear

Luca Righetti 🔸21 Nov 2024 17:26 UTC

82 points

3 comments7 min readEA link

#191 (Part 1) – The economy and national security after AGI (Carl Shulman on the 80,000 Hours Podcast)

80000_Hours27 Jun 2024 19:10 UTC

45 points

0 comments19 min readEA link

Intro to caring about AI alignment as an EA cause

So8res14 Apr 2017 0:42 UTC

28 points

10 comments25 min readEA link

Desensitizing Deepfakes

Phib29 Mar 2023 1:20 UTC

22 points

10 comments1 min readEA link

#217 – The most important graph in AI right now (Beth Barnes on The 80,000 Hours Podcast)

80000_Hours2 Jun 2025 16:52 UTC

16 points

1 comment26 min readEA link

Effective Altruism Florida’s AI Expert Panel—Recording and Slides Available

Sam_E_2419 May 2023 19:15 UTC

2 points

0 comments1 min readEA link

Existential AI Safety is NOT separate from near-term applications

stecas13 Dec 2022 14:47 UTC

28 points

9 comments3 min readEA link

#213 – AI causing a “century in a decade” — and how we’re completely unprepared (Will MacAskill on The 80,000 Hours Podcast)

80000_Hours11 Mar 2025 17:55 UTC

24 points

0 comments22 min readEA link

Two sources of beyond-episode goals (Section 2.2.2 of “Scheming AIs”)

Joe_Carlsmith28 Nov 2023 13:49 UTC

8 points

0 comments13 min readEA link

My Overview of the AI Alignment Landscape: A Bird’s Eye View

Neel Nanda15 Dec 2021 23:46 UTC

45 points

15 comments16 min readEA link

(www.alignmentforum.org)

#194 – Defensive acceleration and how to regulate AI when you fear government (Vitalik Buterin on the 80,000 Hours Podcast)

80000_Hours31 Jul 2024 20:28 UTC

49 points

5 comments21 min readEA link

Catastrophic Risks from AI #3: AI Race

Dan H23 Jun 2023 19:21 UTC

9 points

0 comments29 min readEA link

(arxiv.org)

Challenges and Opportunities of Reinforcement Learning in Robotics: Analysis of Current Trends

Raymundo Rodríguez Alva14 Oct 2024 13:22 UTC

11 points

1 comment17 min readEA link

Future Matters #5: supervolcanoes, AI takeover, and What We Owe the Future

Pablo14 Sep 2022 13:02 UTC

31 points

5 comments18 min readEA link

Chatbot for Poultry Farms: Responding to Avian Influenza in Mexico

Ever Arboleda1 May 2025 14:55 UTC

1 point

0 comments13 min readEA link

From Layoff to Co-founding in a Breathtaking Two Months

Harry Luk26 Sep 2023 7:35 UTC

44 points

3 comments17 min readEA link

[Question] Is there any research or forecasts of how likely AI Alignment is going to be a hard vs. easy problem relative to capabilities?

Jordan Arel14 Aug 2022 15:58 UTC

8 points

1 comment1 min readEA link

On Deference and Yudkowsky’s AI Risk Estimates

bgarfinkel19 Jun 2022 14:35 UTC

287 points

194 comments17 min readEA link

Mechanism Design for AI Safety—Reading Group Curriculum

Rubi J. Hudson25 Oct 2022 3:54 UTC

24 points

1 comment4 min readEA link

Deconfusing ‘AI’ and ‘evolution’

Remmelt22 Jul 2025 6:56 UTC

6 points

1 comment26 min readEA link

Draft report on existential risk from power-seeking AI

Joe_Carlsmith28 Apr 2021 21:41 UTC

88 points

34 comments1 min readEA link

[Closed] Prize and fast track to alignment research at ALTER

Vanessa18 Sep 2022 9:15 UTC

38 points

0 comments3 min readEA link

Announcing Apollo Research

mariushobbhahn30 May 2023 16:17 UTC

158 points

5 comments8 min readEA link

How dath ilan coordinates around solving AI alignment

Thomas Kwa14 Apr 2022 1:53 UTC

13 points

1 comment5 min readEA link

AI Safety Hub Serbia Soft Launch

Dušan D. Nešić (Dushan)25 Jul 2023 19:39 UTC

29 points

3 comments3 min readEA link

ChatGPT bug leaked users’ conversation histories

Ian Turner27 Mar 2023 0:17 UTC

15 points

2 comments1 min readEA link

(www.bbc.com)

[Closed] Hiring a mathematician to work on the learning-theoretic AI alignment agenda

Vanessa19 Apr 2022 6:49 UTC

53 points

4 comments2 min readEA link

Intelsat as a Model for International AGI Governance

rosehadshar13 Mar 2025 12:58 UTC

42 points

3 comments1 min readEA link

(www.forethought.org)

Call for Research Participants—EU/China AI regulation

Jamie O'Donnell14 Jun 2024 17:30 UTC

3 points

0 comments1 min readEA link

Digital Minds: Importance and Key Research Questions

Andreas_Mogensen3 Jul 2024 8:59 UTC

83 points

1 comment15 min readEA link

The Author Who Knows Nothing : Socrates, Techne, and Barthes’ Scriptor

Rodo1 Jun 2025 9:49 UTC

0 points

0 comments4 min readEA link

The convergent dynamic we missed

Remmelt12 Dec 2023 22:50 UTC

2 points

0 comments3 min readEA link

Do Self-Perceived Superintelligent LLMs Exhibit Misalignment?

Dave Banerjee 🔸29 Jun 2025 11:16 UTC

7 points

1 comment12 min readEA link

(davebanerjee.xyz)

Epoch is hiring a Research Data Analyst

merilalama22 Nov 2022 17:34 UTC

21 points

0 comments4 min readEA link

(careers.rethinkpriorities.org)

Why AGI Timeline Research/Discourse Might Be Overrated

Miles_Brundage3 Jul 2022 8:04 UTC

122 points

28 comments10 min readEA link

Announcing an Empirical AI Safety Program

Joshc13 Sep 2022 21:39 UTC

64 points

7 comments2 min readEA link

Predict responses to the “existential risk from AI” survey

RobBensinger28 May 2021 1:38 UTC

36 points

8 comments2 min readEA link

Brainstorm of things that could force an AI team to burn their lead

So8res25 Jul 2022 0:00 UTC

26 points

1 comment13 min readEA link

FLI is hiring a new Director of US Policy

aaguirre27 Jul 2022 0:07 UTC

14 points

0 comments1 min readEA link

Preventing Animal Suffering Lock-in: Why Economic Transitions Matter

Karen Singleton28 Jul 2025 21:55 UTC

38 points

4 comments10 min readEA link

Some promising career ideas beyond 80,000 Hours’ priority paths

Arden Koehler26 Jun 2020 10:34 UTC

142 points

28 comments15 min readEA link

RA Bounty: Looking for feedback on screenplay about AI Risk

Writer26 Oct 2023 14:27 UTC

8 points

0 comments1 min readEA link

Analysis of key AI analogies

Kevin Kohler29 Jun 2024 18:16 UTC

35 points

2 comments15 min readEA link

[Question] Whose track record of AI predictions would you like to see evaluated?

Jonny Spicer 🔸29 Jan 2025 11:57 UTC

10 points

13 comments1 min readEA link

Criticism Thread: What things should OpenPhil improve on?

anonymousEA204 Feb 2023 8:16 UTC

85 points

8 comments2 min readEA link

Announcing Insights for Impact

Christian Pearson4 Jan 2023 7:00 UTC

80 points

6 comments1 min readEA link

Anchoring focalism and the Identifiable victim effect: Bias in Evaluating AGI X-Risks

Remmelt7 Jan 2023 9:59 UTC

−2 points

1 comment1 min readEA link

ML4Good West & Central Europe | Applications Open

carolinaollive12 Mar 2025 0:02 UTC

7 points

3 comments2 min readEA link

Fake thinking and real thinking

Joe_Carlsmith28 Jan 2025 20:05 UTC

78 points

3 comments38 min readEA link

New report on how much computational power it takes to match the human brain (Open Philanthropy)

Aaron Gertler 🔸15 Sep 2020 1:06 UTC

45 points

1 comment18 min readEA link

(www.openphilanthropy.org)

Keep Making AI Safety News

Gil31 Mar 2023 20:11 UTC

67 points

4 comments1 min readEA link

Results from the AI x Democracy Research Sprint

Esben Kran14 Jun 2024 16:40 UTC

19 points

1 comment6 min readEA link

A California Effect for Artificial Intelligence

henryj9 Sep 2022 14:17 UTC

73 points

1 comment4 min readEA link

(docs.google.com)

Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue)

Jacy22 Nov 2022 16:50 UTC

49 points

10 comments1 min readEA link

(www.science.org)

AI Safety Newsletter #3: AI policy proposals and a new challenger approaches

Oliver Z25 Apr 2023 16:15 UTC

35 points

1 comment4 min readEA link

(newsletter.safe.ai)

Eisenhower’s Atoms for Peace Speech

Akash17 May 2023 16:10 UTC

17 points

1 comment11 min readEA link

(www.iaea.org)

Patching ~All Security-Relevant Open-Source Software?

niplav25 Feb 2025 21:35 UTC

35 points

4 comments2 min readEA link

Forecasting Biosecurity Risks from LLMs

Forecasting Research Institute1 Jul 2025 12:43 UTC

10 points

0 comments1 min readEA link

(forecastingresearch.org)

[Question] An economics of AI gov—best resources for

Liv26 Feb 2023 11:11 UTC

10 points

4 comments1 min readEA link

The AI Messiah

ryancbriggs5 May 2022 16:58 UTC

71 points

44 comments2 min readEA link

Simplicity arguments for scheming (Section 4.3 of “Scheming AIs”)

Joe_Carlsmith7 Dec 2023 15:05 UTC

6 points

1 comment14 min readEA link

Owain Evans and Victoria Krakovna: Careers in technical AI safety

EA Global3 Nov 2017 7:43 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

AGI Risk: How to internationally regulate industries in non-democracies

Timothy_Liptrot16 May 2022 22:45 UTC

9 points

2 comments9 min readEA link

Can AI Alignment Models Benefit from Indo-European Tripartite Structures?

Paul Fallavollita2 May 2025 12:39 UTC

1 point

0 comments2 min readEA link

How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of “Scheming AIs”)

Joe_Carlsmith1 Dec 2023 14:51 UTC

6 points

0 comments6 min readEA link

Summer AI Safety Intro Fellowships in Boston and Online (Policy & Technical) – Apply by June 6!

jandrade11229 May 2025 16:47 UTC

4 points

0 comments1 min readEA link

Cancelling GPT subscription

adekcz20 May 2024 16:19 UTC

26 points

14 comments3 min readEA link

Addressing challenges for s-risk reduction: Toward positive common-ground proxies

Teo Ajantaival22 Mar 2025 17:50 UTC

52 points

1 comment17 min readEA link

AI Safety Newsletter #7: Disinformation, Governance Recommendations for AI labs, and Senate Hearings on AI

Center for AI Safety23 May 2023 21:42 UTC

23 points

0 comments6 min readEA link

(newsletter.safe.ai)

[Question] Doing Global Priorities or AI Policy research from remote location?

With Love from Israel29 Oct 2019 9:34 UTC

30 points

4 comments1 min readEA link

AI Impacts Quarterly Newsletter, Jan-Mar 2023

Harlan17 Apr 2023 23:07 UTC

20 points

1 comment3 min readEA link

(blog.aiimpacts.org)

Giving AIs safe motivations

Joe_Carlsmith18 Aug 2025 18:02 UTC

19 points

0 comments51 min readEA link

Is it possibly desirable for sentient ASI to exterminate humans?

Duckruck18 Jun 2024 15:20 UTC

0 points

4 comments1 min readEA link

How to make the best of the most important century?

Holden Karnofsky14 Sep 2021 21:05 UTC

57 points

5 comments12 min readEA link

Introducing Deep Dive, a 201 AI policy course

Kambar17 Jun 2025 16:50 UTC

31 points

2 comments2 min readEA link

Open questions on a Chinese invasion of Taiwan and its effects on the semiconductor stock

Yadav7 Dec 2023 16:39 UTC

21 points

0 comments2 min readEA link

Adam Smith Meets AI Doomers

JamesMiller31 Jan 2024 16:04 UTC

15 points

0 comments5 min readEA link

Aether July 2025 Update

RohanS1 Jul 2025 21:14 UTC

10 points

0 comments3 min readEA link

We Can’t Do Long Term Utilitarian Calculations Until We Know if AIs Can Be Conscious or Not

Mike207312 Sep 2022 8:37 UTC

4 points

0 comments11 min readEA link

Data Poisoning for Dummies (No Code, No Math)

Madhav Malhotra4 Sep 2023 20:48 UTC

7 points

0 comments3 min readEA link

Linkpost: Epistle to the Successors

ukc1001414 Jul 2024 20:07 UTC

4 points

0 comments1 min readEA link

(ukc10014.github.io)

Self-Limiting AI in AI Alignment

The_Lord's_Servant_28031 Dec 2022 19:07 UTC

2 points

1 comment1 min readEA link

The Elicitation Game: Evaluating capability elicitation techniques

Teun van der Weij27 Feb 2025 20:33 UTC

3 points

0 comments2 min readEA link

Components of Strategic Clarity [Strategic Perspectives on Long-term AI Governance, #2]

MMMaas2 Jul 2022 11:22 UTC

66 points

0 comments6 min readEA link

Massive Scaling Should be Frowned Upon

harsimony17 Nov 2022 17:44 UTC

9 points

0 comments5 min readEA link

Conversation on AI risk with Adam Gleave

AI Impacts27 Dec 2019 21:43 UTC

18 points

3 comments4 min readEA link

(aiimpacts.org)

AI Governance Course—Curriculum and Application

Mau29 Nov 2021 13:29 UTC

94 points

9 comments1 min readEA link

Our Current Directions in Mechanistic Interpretability Research (AI Alignment Speaker Series)

Group Organizer8 Apr 2022 17:08 UTC

3 points

0 comments1 min readEA link

The second bitter lesson — there’s a fundamental problem with aligning AI

aelwood19 Jan 2025 18:48 UTC

4 points

1 comment5 min readEA link

(pursuingreality.substack.com)

The Campaign Lab Tool Box Hack Day

DanR6 Feb 2024 16:39 UTC

1 point

0 comments1 min readEA link

AI scaling myths

Noah Varley🔸27 Jun 2024 20:29 UTC

30 points

0 comments1 min readEA link

(open.substack.com)

The case for becoming a black-box investigator of language models

Buck6 May 2022 14:37 UTC

91 points

7 comments3 min readEA link

Slightly against aligning with neo-luddites

Matthew_Barnett26 Dec 2022 23:27 UTC

77 points

17 comments4 min readEA link

U.S. Commerce Secretary Gina Raimondo Announces Expansion of U.S. AI Safety Institute Leadership Team [and Paul Christiano update]

Phib16 Apr 2024 17:10 UTC

116 points

8 comments1 min readEA link

(www.commerce.gov)

Eric Schmidt on recursive self-improvement

Nikola5 Nov 2023 19:05 UTC

11 points

0 comments1 min readEA link

(www.youtube.com)

Announcing the Harvard AI Safety Team

Xander12330 Jun 2022 18:34 UTC

128 points

4 comments5 min readEA link

New AI risk intro from Vox [link post]

JakubK21 Dec 2022 5:50 UTC

7 points

1 comment2 min readEA link

(www.vox.com)

Christiano, Cotra, and Yudkowsky on AI progress

Ajeya25 Nov 2021 16:30 UTC

18 points

6 comments68 min readEA link

Against Learning From Dramatic Events (by Scott Alexander)

bern17 Jan 2024 16:34 UTC

46 points

3 comments2 min readEA link

(www.astralcodexten.com)

What does it take to defend the world against out-of-control AGIs?

Steven Byrnes25 Oct 2022 14:47 UTC

43 points

0 comments30 min readEA link

Join Cambridge AI Safety Hub’s Leadership

Cambridge AI Safety Hub22 Jul 2025 15:13 UTC

9 points

0 comments2 min readEA link

AI Lab Retaliation: A Survival Guide

Jay Ready4 Jan 2025 23:05 UTC

8 points

1 comment12 min readEA link

(morelightinai.substack.com)

Common Genetic Variants Linked to Drug-Resistant Epilepsy

Connor Wood16 Apr 2025 3:55 UTC

2 points

0 comments8 min readEA link

Bahamian Adventures: An Epic Tale of Entrepreneurship, AI Strategy Research and Potatoes

Jaime Sevilla9 Aug 2022 8:37 UTC

67 points

9 comments4 min readEA link

Fundamentals of Global Priorities Research in Economics Syllabus

poliboni8 Aug 2023 12:16 UTC

77 points

1 comment8 min readEA link

An Extremely Opinionated Annotated List of My Favourite Mechanistic Interpretability Papers

Neel Nanda18 Oct 2022 21:23 UTC

19 points

0 comments12 min readEA link

(www.neelnanda.io)

How much I’m paying for AI productivity software (and the future of AI use)

jacquesthibs11 Oct 2024 17:11 UTC

30 points

17 comments8 min readEA link

(jacquesthibodeau.com)

Q1 AI Benchmarking Results: Human Pros Crush Bots

Benjamin Wilson 🔸28 Jun 2025 17:22 UTC

16 points

0 comments22 min readEA link

(www.metaculus.com)

Will AI end everything? A guide to guessing | EAG Bay Area 23

Katja_Grace25 May 2023 17:01 UTC

76 points

4 comments21 min readEA link

A response to Matthews on AI Risk

RyanCarey11 Aug 2015 12:58 UTC

11 points

16 comments6 min readEA link

[Question] What’s the best machine learning newsletter? How do you keep up to date?

Matt Putz25 Mar 2022 14:36 UTC

13 points

12 comments1 min readEA link

Apollo Research is hiring a Evals Demonstration Engineer

Joping_Apollo Research6 Aug 2025 18:26 UTC

6 points

0 comments1 min readEA link

[Question] I’m interviewing Max Tegmark about AI safety and more. What shouId I ask him?

Robert_Wiblin13 May 2022 15:32 UTC

18 points

2 comments1 min readEA link

Deep atheism and AI risk

Joe_Carlsmith4 Jan 2024 18:58 UTC

65 points

4 comments27 min readEA link

Is “superhuman” AI forecasting BS? Some experiments on the “539″ bot from the Centre for AI Safety

titotal18 Sep 2024 13:07 UTC

68 points

4 comments14 min readEA link

(open.substack.com)

Would you pursue software engineering as a career today?

justaperson18 Mar 2023 3:33 UTC

8 points

15 comments3 min readEA link

AI Forecasting Dictionary (Forecasting infrastructure, part 1)

terraform8 Aug 2019 13:16 UTC

18 points

0 comments5 min readEA link

Call for submissions: Choice of Futures survey questions

c.trout30 Apr 2023 6:59 UTC

11 points

0 comments2 min readEA link

(airtable.com)

Reliability, Security, and AI risk: Notes from infosec textbook chapter 1

Akash7 Apr 2023 15:47 UTC

15 points

0 comments4 min readEA link

ENAIS has launched a newsletter for AIS fieldbuilders

gergo22 Nov 2024 10:45 UTC

25 points

0 comments1 min readEA link

We don’t understand what happened with culture enough

Jan_Kulveit9 Oct 2023 14:56 UTC

29 points

2 comments6 min readEA link

EA Poland is facing an existential risk

EA Poland10 Nov 2023 16:23 UTC

113 points

14 comments12 min readEA link

AI Welfare Risks

Adrià Moret2 May 2025 17:41 UTC

27 points

0 comments1 min readEA link

(philpapers.org)

[Question] What is the impact of chip production on pausing AI development?

JohanEA10 Jan 2024 22:20 UTC

7 points

0 comments1 min readEA link

Do Not Tile the Lightcone with Your Confused Ontology

Jan_Kulveit13 Jun 2025 12:45 UTC

45 points

4 comments5 min readEA link

(boundedlyrational.substack.com)

AGI will be made of heterogeneous components, Transformer and Selective SSM blocks will be among them

Roman Leventov27 Dec 2023 14:51 UTC

5 points

0 comments4 min readEA link

Apply for ARBOx: an ML safety intensive [deadline 13 Dec ’24]

Nick Marsh1 Dec 2024 18:13 UTC

20 points

0 comments1 min readEA link

80,000 Hours is producing AI in Context — a new YouTube channel. Our first video, about the AI 2027 scenario, is up!

ChanaMessinger9 Jul 2025 18:22 UTC

239 points

35 comments3 min readEA link

A Major Flaw in SP1047 re APTs and Sophisticated Threat Actors

Caruso30 Aug 2024 14:11 UTC

0 points

6 comments3 min readEA link

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan23 Dec 2020 20:10 UTC

32 points

1 comment1 min readEA link

The costs of caution

Kelsey Piper1 May 2023 20:04 UTC

112 points

17 comments4 min readEA link

AI Safety Newsletter #6: Examples of AI safety progress, Yoshua Bengio proposes a ban on AI agents, and lessons from nuclear arms control

Center for AI Safety16 May 2023 15:14 UTC

32 points

1 comment6 min readEA link

(newsletter.safe.ai)

Shift Resources to Advocacy Now (Post 4 of 7 on AI Governance)

Jason Green-Lowe28 May 2025 1:19 UTC

53 points

5 comments32 min readEA link

AISN #22: The Landscape of US AI Legislation - Hearings, Frameworks, Bills, and Laws

Center for AI Safety19 Sep 2023 14:43 UTC

15 points

1 comment5 min readEA link

(newsletter.safe.ai)

A personal take on longtermist AI governance

lukeprog16 Jul 2021 22:08 UTC

173 points

7 comments7 min readEA link

A New Model for Compute Center Verification

Damin Curtis🔹10 Oct 2023 19:23 UTC

21 points

2 comments5 min readEA link

Podcast/video/transcript: Eliezer Yudkowsky—Why AI Will Kill Us, Aligning LLMs, Nature of Intelligence, SciFi, & Rationality

PeterSlattery9 Apr 2023 10:37 UTC

32 points

2 comments137 min readEA link

(www.youtube.com)

deleted

funnyfranco10 Mar 2025 14:41 UTC

15 points

8 comments1 min readEA link

Slowing down AI progress is an underexplored alignment strategy

Michael Huang13 Jul 2022 3:22 UTC

92 points

11 comments3 min readEA link

(www.lesswrong.com)

Future Bowl Forecasting Tournament

ncmoulios28 Nov 2022 16:42 UTC

5 points

0 comments1 min readEA link

Promethean Governance and Memetic Legitimacy: Lessons from the Venetian Doge for AI Era Institutions

Paul Fallavollita19 Mar 2025 18:09 UTC

0 points

0 comments3 min readEA link

Some underrated reasons why the AI safety community should reconsider its embrace of strict liability

Cecil Abungu 8 Apr 2024 18:50 UTC

67 points

29 comments12 min readEA link

Does natural selection favor AIs over humans?

cdkg3 Oct 2024 19:02 UTC

21 points

0 comments1 min readEA link

(link.springer.com)

“AI Risk Discussions” website: Exploring interviews from 97 AI Researchers

Vael Gates2 Feb 2023 1:00 UTC

46 points

1 comment1 min readEA link

Instead of technical research, more people should focus on buying time

Akash5 Nov 2022 20:43 UTC

107 points

31 comments14 min readEA link

[Question] How should we invest in “long-term short-termism” given the likelihood of transformative AI?

James_Banks12 Jan 2021 23:54 UTC

8 points

0 comments1 min readEA link

AISN #26: National Institutions for AI Safety, Results From the UK Summit, and New Releases From OpenAI and xAI

Center for AI Safety15 Nov 2023 16:03 UTC

11 points

0 comments6 min readEA link

(newsletter.safe.ai)

Open call: “Existential risk of AI: technical conditions”

miller-max14 Apr 2025 14:47 UTC

15 points

1 comment1 min readEA link

A breakdown of OpenAI’s revenue

dschwarz10 Jul 2024 18:07 UTC

58 points

8 comments1 min readEA link

AGI Battle Royale: Why “slow takeover” scenarios devolve into a chaotic multi-AGI fight to the death

titotal22 Sep 2022 15:00 UTC

49 points

11 comments15 min readEA link

Amazon to invest up to $4 billion in Anthropic

Davis_Kingsley25 Sep 2023 14:55 UTC

38 points

34 comments1 min readEA link

(twitter.com)

Big Picture AI Safety: teaser

EuanMcLean20 Feb 2024 13:09 UTC

18 points

0 comments1 min readEA link

Potential Risks from Advanced Artificial Intelligence: The Philanthropic Opportunity

Holden Karnofsky6 May 2016 12:55 UTC

2 points

0 comments23 min readEA link

(www.openphilanthropy.org)

What if AI development goes well?

RoryG3 Aug 2022 8:57 UTC

25 points

7 comments12 min readEA link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates14 Jun 2022 0:49 UTC

45 points

5 comments30 min readEA link

Stress Externalities More in AI Safety Pitches

NickGabs26 Sep 2022 20:31 UTC

31 points

9 comments2 min readEA link

[Question] What are the best ideas of how to regulate AI from the US executive branch?

Jack Cunningham2 Apr 2022 21:53 UTC

10 points

0 comments1 min readEA link

[Question] I’m interviewing Nova Das Sarma about AI safety and information security. What shouId I ask her?

Robert_Wiblin25 Mar 2022 15:38 UTC

17 points

13 comments1 min readEA link

[Question] How can I bet on short timelines?

kokotajlod7 Nov 2020 12:45 UTC

33 points

12 comments2 min readEA link

2023 Vision Weekend, San Francisco

elteerkers6 Apr 2023 14:33 UTC

3 points

0 comments1 min readEA link

Complex Systems for AI Safety [Pragmatic AI Safety #3]

TW12324 May 2022 0:04 UTC

49 points

6 comments21 min readEA link

Investigating the role of agency in AI x-risk

Corin Katzke8 Apr 2024 15:12 UTC

22 points

3 comments40 min readEA link

(www.convergenceanalysis.org)

The AI guide I’m sending my grandparents

James Martin27 Apr 2023 20:04 UTC

41 points

3 comments30 min readEA link

Expression of Interest: Mentors & Researchers at AI Safety Global Society

Caroline Shamiso Chitongo 🔸27 Jul 2025 16:03 UTC

14 points

0 comments2 min readEA link

AGI Predictions

Pablo21 Nov 2020 12:02 UTC

36 points

0 comments1 min readEA link

(www.lesswrong.com)

What are current smaller problems related to top EA cause areas (eg deepfake policies for AI risk, ongoing covid variants for bio risk) and would it be beneficial for these small and not-catastrophic challenges to get more EA resources, as a way of developing capacity to prevent the catastrophic versions?

nonzerosum13 Jun 2022 17:32 UTC

7 points

0 comments2 min readEA link

AISafety.com – Resources for AI Safety

Søren Elverlin17 May 2024 16:01 UTC

55 points

3 comments1 min readEA link

My experience building mathematical ML skills with a course from UIUC

Naoya Okamoto9 Jun 2024 11:41 UTC

2 points

0 comments10 min readEA link

Please help me find research on aspiring AI Safety folk!

yanni kyriacos20 May 2024 22:06 UTC

7 points

0 comments1 min readEA link

How Likely Is It That We’ll Have Bad Values In The Far Future?

Bentham's Bulldog7 Jul 2025 16:11 UTC

18 points

2 comments22 min readEA link

Why I think it’s net harmful to do technical safety research at AGI labs

Remmelt7 Feb 2024 4:17 UTC

42 points

29 comments1 min readEA link

$500 bounty for alignment contest ideas

Akash30 Jun 2022 1:55 UTC

18 points

1 comment2 min readEA link

Are Humans ‘Human Compatible’?

Matt Boyd6 Dec 2019 5:49 UTC

23 points

8 comments4 min readEA link

Thoughts on responsible scaling policies and regulation

Paul_Christiano24 Oct 2023 22:25 UTC

191 points

5 comments6 min readEA link

The Governance Problem and the “Pretty Good” X-Risk

Zach Stein-Perlman28 Aug 2021 20:00 UTC

23 points

4 comments11 min readEA link

Articles about recent OpenAI departures

bruce17 May 2024 17:38 UTC

126 points

12 comments1 min readEA link

(www.vox.com)

AI Safety For Dummies (Like Me)

Madhav Malhotra24 Aug 2022 20:26 UTC

22 points

7 comments20 min readEA link

AI Governance Reading Group [Toronto+remote]

Liav.Koren24 Jan 2023 22:05 UTC

2 points

0 comments1 min readEA link

AI Safety Seed Funding Network—Join as a Donor or Investor

Alexandra Bos16 Dec 2024 19:30 UTC

45 points

1 comment2 min readEA link

Preparing for the Intelligence Explosion

finm11 Mar 2025 15:38 UTC

120 points

15 comments1 min readEA link

(www.forethought.org)

Aligning Recommender Systems as Cause Area

IvanVendrov8 May 2019 8:56 UTC

150 points

48 comments13 min readEA link

Short review of our TensorTrust-based AI safety university outreach event

Milan Weibel🔹22 Sep 2024 14:54 UTC

15 points

0 comments2 min readEA link

Why AI Safety Camp struggles with fundraising (FBB #2)

gergo21 Jan 2025 17:25 UTC

67 points

10 comments7 min readEA link

Summary of “The Precipice” (2 of 4): We are a danger to ourselves

rileyharris13 Aug 2023 23:53 UTC

5 points

0 comments8 min readEA link

(www.millionyearview.com)

Developing AI Safety: Bridging the Power-Ethics Gap (Introducing New Concepts)

Ronen Bar16 Apr 2025 11:25 UTC

21 points

3 comments5 min readEA link

Announcing aisafety.training

JJ Hepburn17 Jan 2023 1:55 UTC

110 points

4 comments1 min readEA link

Drivers of large language model diffusion: incremental research, publicity, and cascades

Ben Cottier21 Dec 2022 13:50 UTC

21 points

0 comments29 min readEA link

Warning Shots Probably Wouldn’t Change The Picture Much

So8res6 Oct 2022 5:15 UTC

95 points

20 comments2 min readEA link

Disagreements about Alignment: Why, and how, we should try to solve them

ojorgensen8 Aug 2022 22:32 UTC

16 points

6 comments16 min readEA link

Tony Blair Institute—Compute for AI Index ( Seeking a Supplier)

TomWestgarth3 Oct 2022 10:25 UTC

29 points

8 comments1 min readEA link

Implications of Quantum Computing for Artificial Intelligence alignment research (ABRIDGED)

Jaime Sevilla5 Sep 2019 14:56 UTC

25 points

4 comments2 min readEA link

Two concepts of an “episode” (Section 2.2.1 of “Scheming AIs”)

Joe_Carlsmith27 Nov 2023 18:01 UTC

11 points

1 comment8 min readEA link

SenseMaking Summer School 2025, September 17-24th

finnclancy24 Jul 2025 16:20 UTC

5 points

0 comments1 min readEA link

AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher

bgarfinkel13 Jul 2020 16:17 UTC

87 points

140 comments1 min readEA link

How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan Cook16 Dec 2022 16:05 UTC

30 points

0 comments6 min readEA link

[linkpost] When does technical work to reduce AGI conflict make a difference?: Introduction

Anthony DiGiovanni16 Sep 2022 14:35 UTC

31 points

0 comments1 min readEA link

(www.lesswrong.com)

[Question] How did the AI Safety talent pipeline come to work so well?

Alejandro Acelas 🔸24 Jul 2025 7:24 UTC

7 points

2 comments1 min readEA link

How AI could slow scientific progress—linkpost

Josh Piecyk17 Jul 2025 17:49 UTC

35 points

3 comments22 min readEA link

(www.aisnakeoil.com)

Apply to attend a Global Challenges Project workshop in 2025!

Liam 🔸10 Dec 2024 11:48 UTC

13 points

1 comment2 min readEA link

40,000 reasons to worry about AI safety

Michael Huang2 Feb 2023 7:48 UTC

9 points

2 comments2 min readEA link

(www.theverge.com)

A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management

simeon_c13 Mar 2025 18:29 UTC

4 points

0 comments1 min readEA link

(arxiv.org)

Humanity’s vast future and its implications for cause prioritization

Eevee🔹26 Jul 2022 5:04 UTC

38 points

3 comments5 min readEA link

(sunyshore.substack.com)

[Question] AI Risk Microdynamics Survey

Froolow9 Oct 2022 20:00 UTC

7 points

1 comment1 min readEA link

AI Disclosures: A Regulatory Review

Elliot Mckernon29 Mar 2024 11:46 UTC

12 points

1 comment7 min readEA link

On attunement

Joe_Carlsmith25 Mar 2024 12:47 UTC

28 points

0 comments22 min readEA link

The Intergovernmental Panel On Global Catastrophic Risks (IPGCR)

DannyBressler1 Feb 2024 17:36 UTC

46 points

9 comments19 min readEA link

Hacker-AI and Digital Ghosts – Pre-AGI

Erland Wittkotter19 Oct 2022 7:49 UTC

4 points

0 comments8 min readEA link

Preparing for AI-assisted alignment research: we need data!

CBiddulph17 Jan 2023 3:28 UTC

11 points

0 comments11 min readEA link

Markus Anderljung and Ben Garfinkel: Fireside chat on AI governance

EA Global24 Jul 2020 14:56 UTC

25 points

0 comments16 min readEA link

(www.youtube.com)

Military Artificial Intelligence as Contributor to Global Catastrophic Risk

MMMaas27 Jun 2022 10:35 UTC

42 points

0 comments52 min readEA link

[Question] Updates on FLI’S Value Alignment Map?

QubitSwarm9919 Sep 2022 0:25 UTC

8 points

0 comments1 min readEA link

Thoughts on AGI organizations and capabilities work

RobBensinger7 Dec 2022 19:46 UTC

77 points

7 comments5 min readEA link

Ilya Sutskever is starting Safe Superintelligence Inc.

defun 🔸19 Jun 2024 19:11 UTC

26 points

6 comments1 min readEA link

(ssi.inc)

Join the Virtual AI Safety Unconference (VAISU)!

Nguyên21 Jun 2023 4:46 UTC

23 points

0 comments1 min readEA link

(vaisu.ai)

Buck Shlegeris: How I think students should orient to AI safety

EA Global25 Oct 2020 5:48 UTC

11 points

0 comments1 min readEA link

(www.youtube.com)

FHI Report: The Windfall Clause: Distributing the Benefits of AI for the Common Good

Cullen 🔸5 Feb 2020 23:49 UTC

54 points

21 comments2 min readEA link

Bounty: Diverse hard tasks for LLM agents

ElizabethBarnes20 Dec 2023 16:31 UTC

17 points

0 comments16 min readEA link

New OGL and ITAR changes are shifting AI Governance and Policy below the surface: A simplified update

CAISID31 May 2024 7:54 UTC

12 points

2 comments3 min readEA link

A grand strategy to recruit AI capabilities researchers into AI safety research

Peter S. Park15 Apr 2022 17:11 UTC

20 points

13 comments4 min readEA link

Sharing the AI Windfall: A Strategic Approach to International Benefit-Sharing

michel16 Aug 2024 12:54 UTC

67 points

0 comments13 min readEA link

(wrtaigovernance.substack.com)

Credo AI is hiring!

IanEisenberg3 Mar 2022 18:02 UTC

16 points

6 comments4 min readEA link

Big list of AI safety videos

JakubK9 Jan 2023 6:09 UTC

9 points

0 comments1 min readEA link

(docs.google.com)

Model evals for dangerous capabilities

Zach Stein-Perlman23 Sep 2024 11:00 UTC

19 points

0 comments3 min readEA link

ChatGPT can write code! ?

Miguel10 Dec 2022 5:36 UTC

6 points

15 comments1 min readEA link

(www.whitehatstoic.com)

Crowd-sourcing AI workflows

tylermjohn30 Apr 2025 8:26 UTC

15 points

3 comments1 min readEA link

Consider this me drunk texting the forum: Is it useful to have data that can’t be touched by AI?

Jonas Søvik 🔹7 Feb 2025 21:52 UTC

−8 points

0 comments1 min readEA link

How Prompt Recursion Undermines Grok’s Semantic Stability

Tyler Williams16 Jul 2025 16:49 UTC

1 point

0 comments1 min readEA link

[Question] What are good lit references about International Governance of AI?

Vaipan20 Mar 2024 15:51 UTC

4 points

0 comments1 min readEA link

[Linkpost] Shorter version of report on existential risk from power-seeking AI

Joe_Carlsmith22 Mar 2023 18:06 UTC

49 points

1 comment1 min readEA link

Challenge to the notion that anything is (maybe) possible with AGI

Remmelt1 Jan 2023 3:57 UTC

−19 points

3 comments1 min readEA link

(mflb.com)

New book on s-risks

Tobias_Baumann26 Oct 2022 12:04 UTC

294 points

27 comments1 min readEA link

Assistant-professor-ranked AI ethics philosopher job opportunity at Canterbury University, New Zealand

ben.smith16 Oct 2022 17:56 UTC

27 points

0 comments1 min readEA link

(www.linkedin.com)

Three scenarios of pseudo-alignment

Eleni_A5 Sep 2022 20:26 UTC

7 points

0 comments3 min readEA link

Basic game theory and how you can do a bunch of good in ~3 Hours. (developing article.)

Amateur Systems Analyst10 Oct 2024 4:30 UTC

−3 points

2 comments7 min readEA link

Slaying the Hydra: toward a new game board for AI

Prometheus23 Jun 2023 17:04 UTC

3 points

2 comments6 min readEA link

AI Governance: Opportunity and Theory of Impact

Allan Dafoe17 Sep 2020 6:30 UTC

264 points

19 comments12 min readEA link

Which Post Idea Is Most Effective?

Jordan Arel25 Apr 2022 4:47 UTC

26 points

6 comments2 min readEA link

Can AI solve climate change?

Vivian13 May 2023 20:44 UTC

2 points

2 comments1 min readEA link

AI Alignment is intractable (and we humans should stop working on it)

GPT 328 Jul 2022 20:02 UTC

1 point

1 comment1 min readEA link

Katja Grace: AI safety

EA Global11 Aug 2017 8:19 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

Existential risk from AI and what DC could do about it (Ezra Klein on the 80,000 Hours Podcast)

80000_Hours26 Jul 2023 11:48 UTC

31 points

1 comment14 min readEA link

AI Timelines: Where the Arguments, and the “Experts,” Stand

Holden Karnofsky7 Sep 2021 17:35 UTC

90 points

3 comments11 min readEA link

[Question] What are the best journals to publish AI governance papers in?

Caro2 May 2022 10:07 UTC

26 points

4 comments1 min readEA link

My summary of “Pragmatic AI Safety”

Eleni_A5 Nov 2022 14:47 UTC

14 points

0 comments5 min readEA link

The next decades might be wild

mariushobbhahn15 Dec 2022 16:10 UTC

130 points

31 comments41 min readEA link

Alignment is not that hard

sammyboiz🔸17 Apr 2025 2:07 UTC

26 points

13 comments1 min readEA link

Once More, Without Feeling (Andreas Mogensen)

Global Priorities Institute21 Jan 2025 14:53 UTC

32 points

1 comment2 min readEA link

(globalprioritiesinstitute.org)

The road from human-level to superintelligent AI may be short

Vishakha Agrawal23 Apr 2025 11:19 UTC

3 points

0 comments2 min readEA link

(aisafety.info)

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy4 Jul 2022 1:28 UTC

22 points

12 comments1 min readEA link

(www.hsgac.senate.gov)

o3 is not being released to the public. First they are only giving access to external safety testers. You can apply to get early access to do safety testing

Kat Woods 🔶 ⏸️20 Dec 2024 18:30 UTC

13 points

0 comments1 min readEA link

(openai.com)

Ship of Theseus Thought Experiment

Siya Sawhney26 Jun 2025 7:52 UTC

1 point

1 comment4 min readEA link

AI Benefits Post 3: Direct and Indirect Approaches to AI Benefits

Cullen 🔸6 Jul 2020 18:46 UTC

5 points

0 comments2 min readEA link

Discussing how to align Transformative AI if it’s developed very soon

elifland28 Nov 2022 16:17 UTC

36 points

0 comments28 min readEA link

TED talk on Moloch and AI

LivBoeree15 Nov 2023 19:28 UTC

72 points

7 comments1 min readEA link

What is scaffolding?

Vishakha Agrawal27 Mar 2025 9:40 UTC

3 points

0 comments2 min readEA link

(aisafety.info)

How to Catch a ChatGPT Cheat: 7 Practical Tips

Marshall27 Dec 2022 16:09 UTC

8 points

3 comments4 min readEA link

FLI launches Worldbuilding Contest with $100,000 in prizes

ggilgallon17 Jan 2022 13:54 UTC

87 points

55 comments6 min readEA link

What mistakes has the AI safety movement made?

EuanMcLean23 May 2024 11:29 UTC

62 points

3 comments12 min readEA link

Testing Human Flow in Political Dialogue: A New Benchmark for Emotionally Aligned AI

DongHun Lee30 May 2025 4:37 UTC

1 point

0 comments1 min readEA link

Apply to the Cooperative AI Summer School!

reddington3 Apr 2024 12:13 UTC

26 points

0 comments1 min readEA link

AI X-risk in the News: How Effective are Recent Media Items and How is Awareness Changing? Our New Survey Results.

Otto4 May 2023 14:04 UTC

49 points

1 comment9 min readEA link

US government commission pushes Manhattan Project-style AI initiative

Larks19 Nov 2024 16:22 UTC

83 points

15 comments1 min readEA link

(www.reuters.com)

The True Story of How GPT-2 Became Maximally Lewd

Writer18 Jan 2024 21:03 UTC

23 points

1 comment6 min readEA link

(youtu.be)

What I would do if I wasn’t at ARC Evals

Lawrence Chan6 Sep 2023 5:17 UTC

130 points

4 comments13 min readEA link

(www.lesswrong.com)

We are sharing a new website template for AI Safety groups!

AIS Hungary13 Mar 2024 16:40 UTC

11 points

2 comments1 min readEA link

If trying to communicate about AI risks, make it vivid

Michael Noetel 🔸27 May 2024 0:59 UTC

19 points

2 comments2 min readEA link

[link] Centre for the Governance of AI 2020 Annual Report

MarkusAnderljung14 Jan 2021 10:23 UTC

11 points

5 comments1 min readEA link

In AI Governance, let the Non-EA World Train You First

Camille23 Jul 2025 17:46 UTC

9 points

0 comments1 min readEA link

Sharing the World with Digital Minds

Aaron Gertler 🔸1 Dec 2020 8:00 UTC

12 points

1 comment1 min readEA link

(www.nickbostrom.com)

Positive visions for AI

L Rudolf L23 Jul 2024 20:15 UTC

21 points

1 comment18 min readEA link

(www.florencehinder.com)

Evaluation of the capability of different large language models (LLMs) in generating malicious code for DDoS attacks using different prompting techniques.

AdrianaLaRotta6 May 2025 10:55 UTC

8 points

1 comment14 min readEA link

fiction about AI risk

Ann Garth 🔸12 Nov 2020 22:36 UTC

8 points

1 comment1 min readEA link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Technology Policy, USA (2022)

QubitSwarm995 Oct 2022 16:48 UTC

15 points

0 comments2 min readEA link

(www.whitehouse.gov)

Global Risks Weekly Roundup #19/2025: India/Pakistan ceasefire, US/China tariffs deal & OpenAI nonprofit control

NunoSempere12 May 2025 17:11 UTC

16 points

0 comments1 min readEA link

Disempowerment spirals as a likely mechanism for existential catastrophe

Raymond D10 Apr 2025 14:38 UTC

15 points

1 comment5 min readEA link

Anthropic Announces new S.O.T.A. Claude 3

Joseph Miller4 Mar 2024 19:02 UTC

10 points

5 comments1 min readEA link

(twitter.com)

Are we trying to figure out if AI is conscious?

kristapsz27 Jan 2025 13:22 UTC

5 points

1 comment5 min readEA link

AMA: The new Open Philanthropy Technology Policy Fellowship

lukeprog26 Jul 2021 15:11 UTC

38 points

14 comments1 min readEA link

Red-teaming existential risk from AI

Zed Tarar30 Nov 2023 14:35 UTC

30 points

16 comments6 min readEA link

Announcing the AI Safety Nudge Competition to Help Beat Procrastination

Marc Carauleanu1 Oct 2022 1:49 UTC

24 points

1 comment2 min readEA link

When to diversify? Breaking down mission-correlated investing

jh29 Nov 2022 11:18 UTC

33 points

2 comments8 min readEA link

[Linkpost] “AI Alignment vs. AI Ethical Treatment: Ten Challenges”

Bradford Saad5 Jul 2024 14:55 UTC

10 points

0 comments1 min readEA link

(docs.google.com)

“Successful language model evals” by Jason Wei

Arjun Panickssery25 May 2024 9:34 UTC

11 points

0 comments1 min readEA link

(www.jasonwei.net)

A Brief Overview of AI Safety/Alignment Orgs, Fields, Researchers, and Resources for ML Researchers

Austin Witte2 Feb 2023 6:19 UTC

18 points

5 comments2 min readEA link

Navigating AI Safety: Exploring Transparency with CCACS – A Comprehensible Architecture for Discussion

Ihor Ivliev12 Mar 2025 17:51 UTC

2 points

3 comments2 min readEA link

Some governance research ideas to prevent malevolent control over AGI and why this might matter a hell of a lot

Jim Buhler23 May 2023 13:07 UTC

64 points

5 comments16 min readEA link

Case studies of self-governance to reduce technology risk

jia6 Apr 2021 8:49 UTC

55 points

6 comments7 min readEA link

Join the $10K AutoHack 2024 Tournament

Paul Bricman25 Sep 2024 11:56 UTC

17 points

0 comments1 min readEA link

(noemaresearch.com)

Announcing Convergence Analysis: An Institute for AI Scenario & Governance Research

David_Kristoffersson7 Mar 2024 21:18 UTC

46 points

0 comments4 min readEA link

On DeepMind and Trying to Fairly Hear Out Both AI Doomers and Doubters (Rohin Shah on The 80,000 Hours Podcast)

80000_Hours12 Jun 2023 12:53 UTC

28 points

1 comment15 min readEA link

Why We Need a Beacon of Hope in the Looming Gloom of AGI

Beyond Singularity2 Apr 2025 14:22 UTC

2 points

6 comments5 min readEA link

US-China trade talks should pave way for AI safety treaty [SCMP crosspost]

Otto16 May 2025 20:53 UTC

15 points

1 comment3 min readEA link

Conditional Trees: Generating Informative Forecasting Questions (FRI) -- AI Risk Case Study

Forecasting Research Institute12 Aug 2024 16:24 UTC

44 points

2 comments8 min readEA link

(forecastingresearch.org)

Applications Open: GovAI Summer Fellowship 2023

GovAI21 Dec 2022 15:00 UTC

28 points

0 comments2 min readEA link

When AI Speaks Too Soon: How Premature Revelation Can Suppress Human Emergence

KaedeHamasaki10 Apr 2025 18:19 UTC

1 point

3 comments3 min readEA link

The AI Revolution in Biology

Roman Leventov26 May 2024 9:30 UTC

8 points

0 comments1 min readEA link

(www.cognitiverevolution.ai)

England & Wales & Windfalls

John Bridge 🔸3 Jun 2022 10:26 UTC

13 points

1 comment24 min readEA link

Γαμινγκ the Algorithms: Large Language Models as Mirrors

Haris Shekeris1 Apr 2023 2:14 UTC

5 points

3 comments4 min readEA link

“AGI timelines: ignore the social factor at their peril” (Future Fund AI Worldview Prize submission)

ketanrama5 Nov 2022 17:45 UTC

10 points

0 comments12 min readEA link

(trevorklee.substack.com)

Fundamental Challenges in AI Governance

Tharin23 Oct 2023 1:30 UTC

7 points

1 comment7 min readEA link

AI Research Considerations for Human Existential Safety (ARCHES)

Andrew Critch21 May 2020 6:55 UTC

29 points

0 comments3 min readEA link

(acritch.com)

Don’t Bet the Future on Winning an AI Arms Race

Eric Drexler11 Jul 2025 11:11 UTC

25 points

1 comment5 min readEA link

AI Safety Info Distillation Fellowship

robertskmiles17 Feb 2023 16:16 UTC

80 points

1 comment3 min readEA link

[Question] Should I force myself to work on AGI alignment?

Isaac Benson24 Aug 2022 17:25 UTC

19 points

17 comments1 min readEA link

[Question] Why does an AI have to have specified goals?

Luke Eure22 Aug 2023 20:15 UTC

8 points

4 comments1 min readEA link

My Understanding of Paul Christiano’s Iterated Amplification AI Safety Research Agenda

Chi15 Aug 2020 19:59 UTC

38 points

3 comments39 min readEA link

AI Safety Career Bottlenecks Survey Responses Responses

Linda Linsefors28 May 2021 10:41 UTC

35 points

1 comment5 min readEA link

Anthropic: Core Views on AI Safety: When, Why, What, and How

jonmenaster9 Mar 2023 17:30 UTC

107 points

6 comments22 min readEA link

(www.anthropic.com)

Call to action: Read + Share AI Safety / Reinforcement Learning Featured in Conversation

Justin Olive24 Oct 2022 1:13 UTC

3 points

0 comments1 min readEA link

Critique of Superintelligence Part 2

James Fodor13 Dec 2018 5:12 UTC

10 points

12 comments7 min readEA link

Potential Implications of AI on Human Cognitive Evolution

Soe Lin21 Aug 2024 9:53 UTC

1 point

0 comments1 min readEA link

Language models surprised us

Ajeya29 Aug 2023 21:18 UTC

59 points

10 comments5 min readEA link

Prevenire una catastrofe legata all’intelligenza artificiale

EA Italy17 Jan 2023 11:07 UTC

1 point

0 comments3 min readEA link

The Work of Chad Jones

Nicholas Decker13 Mar 2025 18:00 UTC

12 points

0 comments1 min readEA link

(nicholasdecker.substack.com)

Short-Term AI Alignment as a Priority Cause

len.hoang.lnh11 Feb 2020 16:22 UTC

17 points

11 comments7 min readEA link

Foresight for AGI Safety Strategy: Mitigating Risks and Identifying Golden Opportunities

jacquesthibs5 Dec 2022 16:09 UTC

14 points

1 comment8 min readEA link

Consider trying the ELK contest (I am)

Holden Karnofsky5 Jan 2022 19:42 UTC

110 points

17 comments16 min readEA link

What he’s learned as an AI policy insider (Tantum Collins on the 80,000 Hours Podcast)

80000_Hours13 Oct 2023 15:01 UTC

11 points

2 comments15 min readEA link

What AI could mean for animals

Max Taylor6 Oct 2023 8:36 UTC

144 points

10 comments17 min readEA link

PhD student and postdoc positions philosophy of AI in Erlangen (Germany)

LeonardDung15 Jun 2023 21:03 UTC

13 points

0 comments1 min readEA link

Defining AI “Rights” by Gemini

khayali8 Jun 2025 18:42 UTC

−2 points

0 comments32 min readEA link

Prioritizing the Arts in response to AI automation

Casey25 Sep 2022 7:49 UTC

6 points

2 comments2 min readEA link

Article Summary: Current and Near-Term AI as a Potential Existential Risk Factor

AndreFerretti7 Jun 2023 13:53 UTC

12 points

1 comment1 min readEA link

(dl.acm.org)

A survey of concrete risks derived from Artificial Intelligence

Guillem Bas8 Jun 2023 22:09 UTC

36 points

2 comments6 min readEA link

(riesgoscatastroficosglobales.com)

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

Tony Barrett17 May 2022 15:27 UTC

11 points

0 comments3 min readEA link

Impact Academy is hiring an AI Governance Lead—more information, upcoming Q&A and $500 bounty

Lowe Lundin29 Aug 2023 18:42 UTC

9 points

1 comment1 min readEA link

Partner with Us: Advancing Global Catastrophic and AI Risk Research at Plateau State University,Bokkos

emmannaemeka10 Oct 2024 1:19 UTC

16 points

0 comments2 min readEA link

Interview with Roman Yampolskiy about AGI on The Reality Check

Darren McKee18 Feb 2023 23:29 UTC

27 points

0 comments1 min readEA link

(www.trcpodcast.com)

Research Engineer @ Timaeus

Sara Recktenwald13 Aug 2025 7:38 UTC

7 points

1 comment3 min readEA link

Preparing Effective Altruism for an AI-Transformed World

Tobias Häberli22 Jan 2025 8:50 UTC

199 points

26 comments1 min readEA link

[Question] What AI Take-Over Movies or Books Will Scare Me Into Taking AI Seriously?

Jordan Arel10 Jan 2023 8:30 UTC

11 points

8 comments1 min readEA link

Follow along with Columbia EA’s Advanced AI Safety Fellowship!

RohanS2 Jul 2022 6:07 UTC

27 points

0 comments2 min readEA link

Offer: Team Conflict Counseling for AI Safety Orgs

Severin14 Apr 2025 15:17 UTC

23 points

1 comment1 min readEA link

All AGI Safety questions welcome (especially basic ones) [July 2023]

leillustrations🔸19 Jul 2023 18:08 UTC

12 points

2 comments2 min readEA link

[Question] What work has been done on the post-AGI distribution of wealth?

tlevin6 Jul 2022 18:59 UTC

16 points

3 comments1 min readEA link

Opportunities for Impact Beyond the EU AI Act

Cillian_12 Oct 2023 15:06 UTC

27 points

2 comments4 min readEA link

AI views and disagreements AMA: Christiano, Ngo, Shah, Soares, Yudkowsky

RobBensinger1 Mar 2022 1:13 UTC

30 points

4 comments1 min readEA link

(www.lesswrong.com)

166 States Vote to Adopt Lethal Autonomous Weapons Resolution at the UNGA

Heramb Podar8 Dec 2024 21:23 UTC

14 points

0 comments1 min readEA link

#214 – Controlling AI that wants to take over – so we can use it anyway (Buck Shlegeris on The 80,000 Hours Podcast)

80000_Hours4 Apr 2025 19:59 UTC

17 points

0 comments32 min readEA link

New roles on my team: come build Open Phil’s technical AI safety program with me!

Ajeya19 Oct 2023 16:46 UTC

102 points

3 comments4 min readEA link

AI Forecasting Research Ideas

Jaime Sevilla17 Nov 2022 17:37 UTC

78 points

1 comment1 min readEA link

(docs.google.com)

Who will be in charge once alignment is achieved?

trurl16 Dec 2022 16:53 UTC

8 points

2 comments1 min readEA link

[Question] How/When Should One Introduce AI Risk Arguments to People Unfamiliar With the Idea?

Marcel29 Aug 2022 2:57 UTC

12 points

4 comments1 min readEA link

Alignment Newsletter One Year Retrospective

Rohin Shah10 Apr 2019 7:00 UTC

62 points

22 comments21 min readEA link

BenchMoral: A benchmarking to assess the moral sensitivity of large language models (LLMs) in Spanish.

Flor Betzabeth Ampa Flores30 Apr 2025 21:26 UTC

1 point

0 comments18 min readEA link

Why “just make an agent which cares only about binary rewards” doesn’t work.

Lysandre Terrisse9 May 2023 16:51 UTC

4 points

1 comment3 min readEA link

Carnegie Council MisUnderstands Longtermism

Jeff A30 Sep 2022 2:57 UTC

6 points

8 comments1 min readEA link

(www.carnegiecouncil.org)

“Tech company singularities”, and steering them to reduce x-risk

Andrew Critch13 May 2022 17:26 UTC

51 points

5 comments4 min readEA link

Climate Advocacy and AI Safety: Supercharging AI Slowdown Advocacy

Matthew McRedmond🔹25 Jul 2024 12:08 UTC

8 points

7 comments2 min readEA link

“Intro to brain-like-AGI safety” series—halfway point!

Steven Byrnes9 Mar 2022 15:21 UTC

8 points

0 comments2 min readEA link

The case for building expertise to work on US AI policy, and how to do it

80000_Hours31 Jan 2019 22:44 UTC

37 points

2 comments2 min readEA link

€200k in European AI & Society Fund grants

Artūrs Kaņepājs6 Jul 2023 13:00 UTC

21 points

1 comment1 min readEA link

(europeanaifund.org)

What Are The Biggest Threats To Humanity? (A Happier World video)

Jeroen Willems🔸31 Jan 2023 19:50 UTC

17 points

1 comment15 min readEA link

Litigate-for-Impact: Preparing Legal Action against an AGI Frontier Lab Leader

Sonia M Joseph8 Dec 2024 14:28 UTC

77 points

1 comment2 min readEA link

AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks

Center for AI Safety31 Oct 2023 19:24 UTC

21 points

0 comments6 min readEA link

(newsletter.safe.ai)

Should you start a for-profit AI safety org?

Kat Woods 🔶 ⏸️15 Aug 2025 13:52 UTC

9 points

0 comments1 min readEA link

Why Stop AI is barricading OpenAI

Remmelt14 Oct 2024 7:12 UTC

−19 points

28 comments6 min readEA link

(docs.google.com)

A conversation with Rohin Shah

AI Impacts12 Nov 2019 1:31 UTC

27 points

8 comments33 min readEA link

(aiimpacts.org)

AI and Chemical, Biological, Radiological, & Nuclear Hazards: A Regulatory Review

Elliot Mckernon10 May 2024 8:41 UTC

8 points

1 comment10 min readEA link

There Should Be More Alignment-Driven Startups

vaniver31 May 2024 2:05 UTC

27 points

3 comments11 min readEA link

On running a city-wide university group

gergo6 Nov 2023 9:43 UTC

26 points

3 comments9 min readEA link

6 non-obvious mental health issues specific to AI safety

Igor Ivanov18 Aug 2023 15:47 UTC

35 points

0 comments3 min readEA link

An Exercise in Speed-Reading: The National Security Commission on AI (NSCAI) Final Report

abiolvera17 Aug 2022 16:55 UTC

47 points

4 comments12 min readEA link

Losing faith in big tech altruism

sammyboiz🔸22 May 2024 4:49 UTC

7 points

1 comment1 min readEA link

Creating ‘Making God’: a Feature Documentary on risks from AGI

ConnorA15 Apr 2025 14:14 UTC

21 points

8 comments7 min readEA link

Announcing “Key Phenomena in AI Risk” (facilitated reading group)

nora9 May 2023 16:52 UTC

28 points

0 comments2 min readEA link

Longtermists Should Work on AI—There is No “AI Neutral” Scenario

simeon_c7 Aug 2022 16:43 UTC

42 points

62 comments6 min readEA link

Finding Voice

khayali3 Jun 2025 1:27 UTC

2 points

0 comments2 min readEA link

Some thoughts on risks from narrow, non-agentic AI

richard_ngo19 Jan 2021 0:07 UTC

36 points

2 comments8 min readEA link

AI Alignment YouTube Playlists

jacquesthibs9 May 2022 21:31 UTC

16 points

2 comments1 min readEA link

deleted

funnyfranco7 Jul 2025 10:40 UTC

2 points

0 comments1 min readEA link

AI Safety: Applying to Graduate Studies

frances_lorenz15 Dec 2021 22:56 UTC

24 points

0 comments12 min readEA link

[Question] Donating against Short Term AI risks

Jan-Willem16 Nov 2020 12:23 UTC

6 points

10 comments1 min readEA link

Governments pose larger risks than corporations: a brief response to Grace

David Johnston19 Oct 2022 11:54 UTC

11 points

3 comments2 min readEA link

Problems of people new to AI safety and my project ideas to mitigate them

Igor Ivanov3 Mar 2023 17:35 UTC

20 points

0 comments7 min readEA link

Explore jobs in AI safety and policy

EA Handbook18 Feb 2025 21:47 UTC

6 points

0 comments1 min readEA link

Seeking advice on impactful career paths given my unique capabilities and interests

Grateful4PathTips31 Mar 2023 23:30 UTC

32 points

5 comments1 min readEA link

Opportunities for individual donors in AI safety

alexflint12 Mar 2018 2:10 UTC

13 points

11 comments10 min readEA link

Results from an Adversarial Collaboration on AI Risk (FRI)

Forecasting Research Institute11 Mar 2024 15:54 UTC

193 points

25 comments9 min readEA link

(forecastingresearch.org)

What AI companies should do: Some rough ideas

Zach Stein-Perlman21 Oct 2024 14:00 UTC

14 points

1 comment5 min readEA link

Virtual AI Safety Unconference 2024

Orpheus_Lummis13 Mar 2024 13:48 UTC

11 points

0 comments1 min readEA link

Animal ethics in ChatGPT and Claude

Elijah Whipple16 Jan 2024 21:38 UTC

49 points

2 comments9 min readEA link

Announcing #AISummitTalks featuring Professor Stuart Russell and many others

Otto24 Oct 2023 10:16 UTC

9 points

1 comment1 min readEA link

2017 AI Safety Literature Review and Charity Comparison

Larks20 Dec 2017 21:54 UTC

43 points

17 comments23 min readEA link

Apples, Oranges, and AGI: Why Incommensurability May be an Obstacle in AI Safety

Allan McCay28 Mar 2025 14:50 UTC

3 points

2 comments2 min readEA link

[Question] Mutual Assured Destruction used against AGI

Leopard8 Oct 2022 9:35 UTC

4 points

5 comments1 min readEA link

Exploring AI Policy & the Future of Work — Seeking Guidance for PhD Pathways (No UK/EU/US Passport, No Master’s)

genesis14 Jun 2025 21:47 UTC

2 points

0 comments1 min readEA link

Published report: Pathways to short TAI timelines

Zershaaneh Qureshi20 Feb 2025 22:10 UTC

47 points

2 comments17 min readEA link

(www.convergenceanalysis.org)

[Question] Is working on AI safety as dangerous as ignoring it?

jkmh20 Sep 2021 23:06 UTC

10 points

5 comments1 min readEA link

How Do AI Timelines Affect Giving Now vs. Later?

MichaelDickens3 Aug 2021 3:36 UTC

36 points

8 comments8 min readEA link

Benchmarking Emotional Alignment: Can VSPE Reduce Flattery in LLMs?

Astelle Kay4 Aug 2025 3:36 UTC

2 points

0 comments3 min readEA link

TAI Safety Bibliographic Database

Jess_Riedel22 Dec 2020 16:03 UTC

61 points

9 comments17 min readEA link

Distribution Shifts and The Importance of AI Safety

Leon Lang29 Sep 2022 22:38 UTC

7 points

0 comments9 min readEA link

Contribute by facilitating the AGI Safety Fundamentals Programme

Jamie B6 Dec 2021 11:50 UTC

27 points

0 comments2 min readEA link

A tough career decision

PabloAMC 🔸9 Apr 2022 0:46 UTC

68 points

13 comments4 min readEA link

A transcript of the TED talk by Eliezer Yudkowsky

MikhailSamin12 Jul 2023 12:12 UTC

39 points

1 comment4 min readEA link

Election by Jury: The Ultimate Democratic Safeguard in the Age of AI and Information Warfare

ClayShentrup17 Apr 2025 19:50 UTC

13 points

5 comments27 min readEA link

Sharing the Global AI Governance Alliance

JordanStone17 Aug 2025 19:30 UTC

7 points

0 comments1 min readEA link

Introducing AI Lab Watch

Zach Stein-Perlman30 Apr 2024 17:00 UTC

128 points

23 comments1 min readEA link

(ailabwatch.org)

[Job] Managing Director at the Cooperative AI Foundation ($5000 Referral Bonus)

Lewis Hammond3 Jul 2023 16:02 UTC

31 points

0 comments1 min readEA link

Protesting Now for AI Regulation might be more Impactful than AI Safety Research

Nicolae13 Apr 2025 2:11 UTC

65 points

4 comments2 min readEA link

Introducing The Nonlinear Fund: AI Safety research, incubation, and funding

Kat Woods 🔶 ⏸️18 Mar 2021 14:07 UTC

71 points

32 comments5 min readEA link

Safety without oppression: an AI governance problem

Nathan_Barnard28 Jul 2022 10:19 UTC

3 points

0 comments8 min readEA link

Reflecting on the First Conference on Global Catastrophic Risks for Spanish Speakers

SMalagon29 May 2024 14:24 UTC

15 points

0 comments1 min readEA link

Funding opportunity for personal/professional development for those working in AI safety (deadline March 29)

atury25 Mar 2024 19:19 UTC

18 points

0 comments1 min readEA link

[Question] Concerns about AI safety career change

mmKALLL13 Jan 2023 20:52 UTC

45 points

15 comments4 min readEA link

[Question] Has private AGI research made independent safety research ineffective already? What should we do about this?

Roman Leventov23 Jan 2023 16:23 UTC

15 points

0 comments5 min readEA link

[Question] A dataset for AI/superintelligence stories and other media?

Marcel229 Mar 2022 21:41 UTC

20 points

2 comments1 min readEA link

Are you passionate about AI and Animal Welfare? Do you have an idea that could revolutionize the food industry? We want to hear from you!

David van Beveren6 May 2024 23:42 UTC

12 points

0 comments1 min readEA link

[Question] What is most confusing to you about AI stuff?

Sam Clarke23 Nov 2021 16:00 UTC

25 points

15 comments1 min readEA link

AI Safety 101 : Reward Misspecification

markov21 Dec 2023 14:26 UTC

6 points

1 comment31 min readEA link

Some thoughts on Leopold Aschenbrenner’s Situational Awareness paper

Luke Dawes14 Jun 2024 13:50 UTC

14 points

1 comment3 min readEA link

When you plan according to your AI timelines, should you put more weight on the median future, or the median future | eventual AI alignment success? ⚖️

Jeffrey Ladish5 Jan 2023 1:55 UTC

16 points

2 comments2 min readEA link

The Concept of Boundary Layer in Language Games and Its Implications for AI

Mirage24 Mar 2023 13:50 UTC

1 point

0 comments7 min readEA link

Google AI Accelerator Open Call

Rochelle Harris22 Jan 2025 16:50 UTC

10 points

1 comment1 min readEA link

What Should We Optimize—A Conversation

Johannes C. Mayer7 Apr 2022 14:48 UTC

1 point

0 comments14 min readEA link

Sakana, Strawberry, and Scary AI

Matrice Jacobine19 Sep 2024 11:57 UTC

1 point

0 comments1 min readEA link

(www.astralcodexten.com)

Reviewing the Structure of Current AI Regulations

Deric Cheng7 May 2024 12:34 UTC

32 points

1 comment13 min readEA link

Introducing 11 New AI Safety Organizations—Catalyze’s Winter 24/25 London Incubation Program Cohort

Alexandra Bos10 Mar 2025 19:26 UTC

94 points

4 comments14 min readEA link

Public Opinion on AI Safety: AIMS 2023 and 2021 Summary

Janet Pauketat25 Sep 2023 18:09 UTC

19 points

0 comments3 min readEA link

(www.sentienceinstitute.org)

Costs of Embodiment

algekalipso30 Jul 2024 20:41 UTC

18 points

1 comment14 min readEA link

Advice for new alignment people: Info Max

Jonas Hallgren 🔸30 May 2023 15:42 UTC

9 points

0 comments5 min readEA link

My Model of EA and AI Safety

Eva Lu24 Jun 2025 6:23 UTC

9 points

1 comment2 min readEA link

Blueprints for AI Safety conferences

gergo7 Aug 2025 13:16 UTC

11 points

0 comments7 min readEA link

Why Engaging with Global Majority AI Policy Matters

Heramb Podar2 Jul 2025 1:54 UTC

1 point

0 comments1 min readEA link

(www.lesswrong.com)

[Question] Rank best universities for AI Saftey

Parker_Whitfill6 May 2023 13:20 UTC

8 points

4 comments1 min readEA link

Introducing StakeOut.AI

Harry Luk17 Feb 2024 0:21 UTC

52 points

6 comments9 min readEA link

Open Philanthropy’s AI governance grantmaking (so far)

Aaron Gertler 🔸17 Dec 2020 12:00 UTC

63 points

0 comments6 min readEA link

(www.openphilanthropy.org)

[Question] AI Researcher Surveys with Similar Results to Katja Grace, 2024?

AlexChalk28 Jul 2025 23:39 UTC

6 points

1 comment1 min readEA link

François Chollet on why LLMs won’t scale to AGI

Yarrow🔸15 Apr 2025 23:01 UTC

6 points

2 comments1 min readEA link

(www.youtube.com)

What are the “no free lunch” theorems?

Vishakha Agrawal4 Feb 2025 2:02 UTC

3 points

0 comments1 min readEA link

(aisafety.info)

CSER and FHI advice to UN High-level Panel on Digital Cooperation

HaydnBelfield8 Mar 2019 20:39 UTC

22 points

7 comments6 min readEA link

(www.cser.ac.uk)

Parallels Between AI Safety by Debate and Evidence Law

Cullen 🔸20 Jul 2020 22:52 UTC

30 points

2 comments2 min readEA link

(cullenokeefe.com)

Cortés, Pizarro, and Afonso as Precedents for Takeover

AI Impacts2 Mar 2020 12:25 UTC

27 points

17 comments11 min readEA link

(aiimpacts.org)

The case for multi-decade timelines [Linkpost]

Sharmake27 Apr 2025 20:34 UTC

50 points

9 comments11 min readEA link

A Defense of Work on Mathematical AI Safety

Davidmanheim6 Jul 2023 14:13 UTC

50 points

6 comments3 min readEA link

Microdooms averted by working on AI Safety

Nikola17 Sep 2023 21:51 UTC

42 points

6 comments3 min readEA link

(www.lesswrong.com)

[Question] Urgent Need for Refinancing

Tobias W. Kaiser10 Jul 2023 19:35 UTC

2 points

2 comments1 min readEA link

AI for Resolving Forecasting Questions: An Early Exploration

Ozzie Gooen16 Jan 2025 21:40 UTC

22 points

0 comments9 min readEA link

AI Is Not Software

Davidmanheim2 Jan 2024 7:58 UTC

21 points

17 comments5 min readEA link

Announcing the PIBBSS Symposium ’24!

Dušan D. Nešić (Dushan)3 Sep 2024 11:19 UTC

6 points

0 comments3 min readEA link

[Question] What longtermist projects would you like to see implemented?

Buhl28 Mar 2023 18:41 UTC

55 points

6 comments1 min readEA link

How to use AI speech transcription and analysis to accelerate social science research

Alexander Saeri31 Jan 2023 4:01 UTC

39 points

6 comments11 min readEA link

EU AI Act passed vote, and x-risk was a main topic

Ariel15 Jun 2023 13:16 UTC

43 points

2 comments1 min readEA link

(www.euractiv.com)

Otherness and control in the age of AGI

Joe_Carlsmith2 Jan 2024 18:15 UTC

37 points

1 comment7 min readEA link

Shahar Avin on How to Strategically Regulate Advanced AI Systems

Michaël Trazzi23 Sep 2022 15:49 UTC

48 points

2 comments4 min readEA link

(theinsideview.ai)

Could AI systems naturally evolve to prioritize their own usage over human welfare?

Andre OBrien12 Jun 2025 11:53 UTC

1 point

0 comments2 min readEA link

Growing together: EA Hungary and AI Safety Hungary combined report for 2024

Milan_A29 Jul 2025 14:15 UTC

25 points

1 comment14 min readEA link

An intersection between animal welfare and AI

sammyboiz🔸18 Jun 2024 3:23 UTC

9 points

1 comment1 min readEA link

[Question] How to persuade a non-CS background person to believe AGI is 50% possible in 2040?

jackchang1101 Apr 2023 15:27 UTC

1 point

7 comments1 min readEA link

[Question] Launching Applications for the Global AI Safety Fellowship 2025!

Impact Academy27 Nov 2024 15:33 UTC

9 points

1 comment1 min readEA link

Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund

Zach Stein-Perlman25 Oct 2023 15:20 UTC

38 points

0 comments4 min readEA link

(www.frontiermodelforum.org)

Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain

kokotajlod18 Jan 2021 12:39 UTC

27 points

2 comments1 min readEA link

Call For Distillers

johnswentworth6 Apr 2022 3:03 UTC

70 points

6 comments3 min readEA link

[Linkpost] Alpaca 7B release | Budget ChatGPT for everybody?

Felix Wolf 🔸17 Mar 2023 13:08 UTC

14 points

0 comments1 min readEA link

(www.youtube.com)

A framework for thinking about AI power-seeking

Joe_Carlsmith24 Jul 2024 22:41 UTC

44 points

11 comments16 min readEA link

The US expands restrictions on AI exports to China. What are the x-risk effects?

poppinfresh14 Oct 2022 18:17 UTC

161 points

20 comments4 min readEA link

DeepMind is hiring Long-term Strategy & Governance researchers

vishal13 Sep 2021 18:44 UTC

54 points

1 comment1 min readEA link

Linkpost: “Imagining and building wise machines: The centrality of AI metacognition” by Johnson, Karimi, Bengio, et al.

Chris Leong17 Nov 2024 15:00 UTC

8 points

0 comments1 min readEA link

(arxiv.org)

How the AI safety technical landscape has changed in the last year, according to some practitioners

tlevin26 Jul 2024 19:06 UTC

84 points

1 comment2 min readEA link

PIBBSS Fellowship: Bounty for Referrals & Deadline Extension

Anna_Gajdova17 Jan 2022 16:23 UTC

17 points

7 comments1 min readEA link

Personal thoughts on careers in AI policy and strategy

carrickflynn27 Sep 2017 16:52 UTC

56 points

28 comments18 min readEA link

Concrete Steps to Get Started in Transformer Mechanistic Interpretability

Neel Nanda26 Dec 2022 13:00 UTC

18 points

0 comments12 min readEA link

My thoughts on nanotechnology strategy research as an EA cause area

Ben Snodin2 May 2022 9:41 UTC

137 points

17 comments33 min readEA link

[Link] Thiel on GCRs

Milan Griffes22 Jul 2019 20:47 UTC

28 points

11 comments1 min readEA link

Center for AI Safety’s Bi-Weekly Reading and Learning

Center for AI Safety2 Nov 2023 15:15 UTC

5 points

0 comments1 min readEA link

The Khayali Protocol

khayali2 Jun 2025 14:40 UTC

−8 points

0 comments3 min readEA link

Introducing a New Course on the Economics of AI

akorinek21 Dec 2021 4:55 UTC

84 points

6 comments2 min readEA link

The Rise of AI Agents: Consequences and Challenges Ahead

Tristan D28 Mar 2025 5:19 UTC

5 points

0 comments15 min readEA link

[Event] Join Metaculus Tomorrow, March 31st, for Forecast Friday!

christian30 Mar 2023 20:58 UTC

29 points

1 comment1 min readEA link

(www.metaculus.com)

#199 – California’s AI bill SB 1047 and its potential to shape US AI policy (Nathan Calvin on The 80,000 Hours Podcast)

80000_Hours30 Aug 2024 18:18 UTC

12 points

0 comments10 min readEA link

AI Safety Fundamentals: An Informal Cohort Starting Soon! (cross-posted to lesswrong.com)

Tiago4 Jun 2023 18:21 UTC

6 points

0 comments1 min readEA link

(www.lesswrong.com)

Announcing the AI Welfare Discord Server

Tim Duffy21 Jul 2025 16:36 UTC

7 points

0 comments1 min readEA link

Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding

Vael Gates28 Jul 2022 21:29 UTC

126 points

6 comments6 min readEA link

What’s in a Pause?

Davidmanheim16 Sep 2023 10:13 UTC

73 points

10 comments9 min readEA link

Finite Field Assembly : A CUDA alternative rooted in Number Theory and Pure Mathematics

Murage Kibicho13 Jan 2025 13:37 UTC

−7 points

0 comments3 min readEA link

7 traps that (we think) new alignment researchers often fall into

Akash27 Sep 2022 23:13 UTC

73 points

8 comments4 min readEA link

Quick Thoughts on A.I. Governance

Nicholas Kross30 Apr 2022 14:49 UTC

43 points

0 comments2 min readEA link

(www.thinkingmuchbetter.com)

Retrospective: PIBBSS Fellowship 2024

Dušan D. Nešić (Dushan)20 Dec 2024 15:55 UTC

7 points

0 comments4 min readEA link

[Question] What are the strategic implications if aliens and Earth civilizations produce similar utilities?

Maxime Riché 🔸6 Aug 2024 21:21 UTC

6 points

1 comment1 min readEA link

Specification Gaming: How AI Can Turn Your Wishes Against You [RA Video]

Writer1 Dec 2023 19:30 UTC

8 points

1 comment5 min readEA link

(youtu.be)

Safety evaluations and standards for AI | Beth Barnes | EAG Bay Area 23

Beth Barnes16 Jun 2023 14:15 UTC

28 points

0 comments17 min readEA link

Frontier Model Forum

Zach Stein-Perlman26 Jul 2023 14:30 UTC

40 points

7 comments4 min readEA link

(blog.google)

Twitter thread on open-source AI

richard_ngo31 Jul 2024 0:30 UTC

32 points

0 comments2 min readEA link

(x.com)

Ego-Centric Architecture for AGI Safety v2: Technical Core, Falsifiable Predictions, and a Minimal Experiment

Samuel Pedrielli6 Aug 2025 12:35 UTC

1 point

0 comments6 min readEA link

More evidence X-risk amplifies action against current AI harms

Daniel_Friedrich22 Dec 2023 15:21 UTC

27 points

2 comments2 min readEA link

(osf.io)

Digest: three papers that have shaped my understanding of the potential for consciousness in AI systems

rileyharris21 Aug 2024 15:09 UTC

5 points

0 comments1 min readEA link

“AI Safety for Fleshy Humans” an AI Safety explainer by Nicky Case

Habryka [Deactivated]3 May 2024 19:28 UTC

40 points

3 comments4 min readEA link

(aisafety.dance)

xAI raises $6B

andzuck5 Jun 2024 15:26 UTC

18 points

1 comment1 min readEA link

(x.ai)

Notes on new UK AISI minister

Pseudaemonia5 Jul 2024 19:50 UTC

92 points

0 comments1 min readEA link

Should you work in the European Union to do AGI governance?

hanadulset31 Jan 2022 10:34 UTC

90 points

20 comments15 min readEA link

The academic contribution to AI safety seems large

technicalities30 Jul 2020 10:30 UTC

117 points

28 comments9 min readEA link

Invitation to participate in AGI global governance Real-Time Delphi questionnaire—The Millennium Project

Miquel Banchs-Piqué (prev. mikbp)13 Dec 2023 13:35 UTC

6 points

0 comments1 min readEA link

Seeking social science students / collaborators interested in AI existential risks

Vael Gates24 Sep 2021 21:56 UTC

58 points

7 comments3 min readEA link

Tim Cook was asked about extinction risks from AI

Saul Munn6 Jun 2023 18:46 UTC

8 points

1 comment1 min readEA link

Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter

AF26 Oct 2023 3:14 UTC

61 points

4 comments1 min readEA link

New TIME magazine article on the UK AI Safety Institute (AISI)

Rasool16 Jan 2025 22:51 UTC

9 points

0 comments1 min readEA link

(time.com)

[Question] What’s the exact way you predict probability of AI extinction?

jackchang11013 Jun 2023 15:11 UTC

18 points

7 comments1 min readEA link

How do fictional stories illustrate AI misalignment?

Vishakha Agrawal15 Jan 2025 6:16 UTC

4 points

0 comments2 min readEA link

(aisafety.info)

ARENA 5.0 - Call for Applicants

James Hindmarch31 Jan 2025 19:54 UTC

9 points

0 comments6 min readEA link

Douglas Hoftstadter concerned about AI xrisk

Eli Rose3 Jul 2023 3:30 UTC

64 points

0 comments1 min readEA link

(www.youtube.com)

Intergenerational trauma impeding cooperative existential safety efforts

Andrew Critch3 Jun 2022 17:27 UTC

82 points

2 comments3 min readEA link

AISN #23: New OpenAI Models, News from Anthropic, and Representation Engineering

Center for AI Safety4 Oct 2023 17:10 UTC

7 points

0 comments5 min readEA link

(newsletter.safe.ai)

AISN#14: OpenAI’s ‘Superalignment’ team, Musk’s xAI launches, and developments in military AI use

Center for AI Safety12 Jul 2023 16:58 UTC

26 points

0 comments4 min readEA link

(newsletter.safe.ai)

Compliance Monitoring as an Impactful Mechanism of AI Safety Policy

CAISID7 Feb 2024 16:10 UTC

6 points

3 comments9 min readEA link

Forecasting Transformative AI: What Kind of AI?

Holden Karnofsky10 Aug 2021 21:38 UTC

62 points

3 comments10 min readEA link

[Question] Is anyone working on safe selection pressure for digital minds?

WillPearson12 Dec 2023 18:17 UTC

10 points

9 comments1 min readEA link

Anthropic is being sued for copying books to train Claude

Remmelt31 Aug 2024 2:57 UTC

3 points

0 comments2 min readEA link

(fingfx.thomsonreuters.com)

Technical AGI safety research outside AI

richard_ngo18 Oct 2019 15:02 UTC

91 points

5 comments3 min readEA link

Funding for humanitarian non-profits to research responsible AI

Deborah W.A. Foulkes10 Dec 2024 8:08 UTC

4 points

0 comments2 min readEA link

(www.gov.uk)

Effective Enforceability of EU Competition Law Under Different AI Development Scenarios: A Framework for Legal Analysis

HaydnBelfield19 Aug 2022 17:20 UTC

11 points

0 comments6 min readEA link

(verfassungsblog.de)

AI Model Registries: A Foundational Tool for AI Governance

Elliot Mckernon7 Oct 2024 13:59 UTC

19 points

0 comments4 min readEA link

(www.convergenceanalysis.org)

[Link] How understanding valence could help make future AIs safer

Milan Griffes8 Oct 2020 18:53 UTC

22 points

2 comments3 min readEA link

On how various plans miss the hard bits of the alignment challenge

So8res12 Jul 2022 5:35 UTC

126 points

13 comments29 min readEA link

Changes in funding in the AI safety field

Sebastian_Farquhar3 Feb 2017 13:09 UTC

34 points

10 comments7 min readEA link

Tracking Critical Infrastructure AI Incidents

Ben Turse29 Sep 2024 21:29 UTC

1 point

0 comments2 min readEA link

AI strategy nearcasting

Holden Karnofsky26 Aug 2022 16:25 UTC

61 points

3 comments10 min readEA link

ML4Good Summer Bootcamps—Applications Open

Nia4 Jul 2024 18:38 UTC

39 points

0 comments1 min readEA link

Don’t Dismiss Simple Alignment Approaches

Chris Leong21 Oct 2023 12:31 UTC

12 points

0 comments4 min readEA link

Sorting Pebbles Into Correct Heaps: The Animation

Writer10 Jan 2023 15:58 UTC

12 points

0 comments1 min readEA link

(youtu.be)

Getting started independently in AI Safety

JJ Hepburn6 Jul 2021 15:20 UTC

41 points

10 comments2 min readEA link

AI Clarity: An Initial Research Agenda

Justin Bullock3 May 2024 16:29 UTC

27 points

1 comment8 min readEA link

£1 million prize for the most cutting-edge AI solution for public good [link post]

rileyharris17 Jan 2024 14:36 UTC

8 points

0 comments2 min readEA link

(manchesterprize.org)

Launching Amplify: Receive marketing support for your local groups and other field-building initiatives

gergo28 Aug 2024 14:12 UTC

37 points

0 comments2 min readEA link

Key Papers in Language Model Safety

aog20 Jun 2022 14:59 UTC

20 points

0 comments22 min readEA link

Global Risks Weekly Roundup #18/2025: US tariff shortages, military policing, Gaza famine.

NunoSempere6 May 2025 10:39 UTC

22 points

0 comments3 min readEA link

(blog.sentinel-team.org)

[Question] How to influence AGI?

Sam Freedman9 Jan 2025 20:46 UTC

2 points

0 comments1 min readEA link

My personal cruxes for working on AI safety

Buck13 Feb 2020 7:11 UTC

136 points

35 comments44 min readEA link

Meditations on careers in AI Safety

PabloAMC 🔸23 Mar 2022 22:00 UTC

88 points

30 comments2 min readEA link

Transhumanism and AI: Toward Prosperity or Extinction?

Shaïman Thürler22 Mar 2025 18:01 UTC

9 points

1 comment6 min readEA link

Distillation of The Offense-Defense Balance of Scientific Knowledge

Arjun Yadav12 Aug 2022 7:01 UTC

17 points

0 comments2 min readEA link

CFP for the Largest Annual Meeting of Political Science: Get Help With Your Research Submission

Mahendra Prasad22 Dec 2020 23:39 UTC

13 points

0 comments2 min readEA link

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

MikhailSamin27 Dec 2022 11:07 UTC

39 points

10 comments1 min readEA link

13 Recent Publications on Existential Risk (Jan 2021 update)

HaydnBelfield8 Feb 2021 12:42 UTC

7 points

2 comments10 min readEA link

[Question] What are the biggest obstacles on AI safety research career?

jackchang11031 Mar 2023 14:53 UTC

2 points

1 comment1 min readEA link

AGI Safety Fundamentals curriculum and application

richard_ngo20 Oct 2021 21:45 UTC

123 points

20 comments8 min readEA link

(docs.google.com)

Why does no one care about AI?

Olivia Addy7 Aug 2022 22:04 UTC

55 points

47 comments1 min readEA link

Have your timelines changed as a result of ChatGPT?

Chris Leong5 Dec 2022 15:03 UTC

30 points

18 comments1 min readEA link

[Question] How can I best use my career to pass impactful AI and Biosecurity policy.

maxg13 Oct 2023 5:14 UTC

4 points

1 comment1 min readEA link

Announcing the Cambridge Boston Alignment Initiative [Hiring!]

kuhanj2 Dec 2022 1:07 UTC

83 points

0 comments1 min readEA link

AISafety.com Hackathon 2025

Bryce Robertson14 Jul 2025 23:56 UTC

7 points

0 comments1 min readEA link

Can AI Outpredict Humans? Results From Metaculus’s Q3 AI Forecasting Benchmark

Tom Liptay10 Oct 2024 18:58 UTC

32 points

1 comment6 min readEA link

(www.metaculus.com)

Looking for for evidence of AI impacts in the age structure of occupations: Nothing yet

Pat McKelvey 9 May 2025 18:12 UTC

26 points

2 comments3 min readEA link

Ngo and Yudkowsky on AI capability gains

richard_ngo19 Nov 2021 1:54 UTC

23 points

4 comments39 min readEA link

The Missing Piece: Why We Need a Grand Strategy for AI

Coleman28 Feb 2025 23:49 UTC

7 points

1 comment9 min readEA link

Is scheming more likely in models trained to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of “Scheming AIs”)

Joe_Carlsmith30 Nov 2023 16:43 UTC

6 points

1 comment5 min readEA link

The stakes of AI moral status

Joe_Carlsmith21 May 2025 18:20 UTC

54 points

9 comments14 min readEA link

(joecarlsmith.substack.com)

My plan for a “Most Important Century” reading group

Jack O'Brien19 Jan 2022 9:32 UTC

12 points

1 comment2 min readEA link

Is understanding the moral status of digital minds a pressing world problem?

Cody_Fenwick30 Sep 2024 8:50 UTC

42 points

0 comments34 min readEA link

(80000hours.org)

Submit Your Toughest Questions for Humanity’s Last Exam

Matrice Jacobine18 Sep 2024 8:03 UTC

6 points

0 comments2 min readEA link

(www.safe.ai)

A stubborn unbeliever finally gets the depth of the AI alignment problem

aelwood13 Oct 2022 15:16 UTC

32 points

7 comments3 min readEA link

(pursuingreality.substack.com)

Communication by existential risk organizations: State of the field and suggestions for improvement

Existential Risk Communication Project13 Aug 2024 7:06 UTC

10 points

3 comments13 min readEA link

Proposal for a Form of Conditional Supplemental Income (CSI) in a Post-Work World

Sean Sweeney31 Jan 2025 1:00 UTC

3 points

0 comments3 min readEA link

The religion problem in AI alignment

Geoffrey Miller16 Sep 2022 1:24 UTC

54 points

28 comments11 min readEA link

College technical AI safety hackathon retrospective—Georgia Tech

yixiong14 Nov 2024 13:34 UTC

18 points

0 comments5 min readEA link

(yixiong.substack.com)

Visible Thoughts Project and Bounty Announcement

So8res30 Nov 2021 0:35 UTC

35 points

2 comments13 min readEA link

Feasibility of training and inferring advanced large language models (LLMs) in data centers in Mexico and Brazil.

Tatiana Sandoval2 May 2025 13:42 UTC

15 points

0 comments24 min readEA link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya18 Jul 2022 19:07 UTC

218 points

12 comments84 min readEA link

(www.lesswrong.com)

Technical AI safety in the United Arab Emirates

ea nyuad21 Jun 2022 3:11 UTC

10 points

0 comments11 min readEA link

AI safety advocates should consider providing gentle pushback following the events at OpenAI

I_machinegun_Kelly22 Dec 2023 21:05 UTC

86 points

5 comments3 min readEA link

(www.lesswrong.com)

I created an Asi Alignment Tier List

TimeGoat22 Apr 2024 12:14 UTC

0 points

0 comments1 min readEA link

X-Risk Researchers Survey

NitaSangha24 Apr 2023 8:06 UTC

12 points

1 comment1 min readEA link

Lessons for AI Governance from Atoms for Peace

Amritanshu Prasad16 Apr 2025 14:25 UTC

10 points

2 comments2 min readEA link

(www.thenextfrontier.blog)

Am I Missing Something, or Is EA? Thoughts from a Learner in Uganda

Dr Kassim16 Mar 2025 11:31 UTC

234 points

16 comments3 min readEA link

If AGI is imminent, why can’t I hail a robotaxi?

Yarrow🔸9 Dec 2023 20:50 UTC

26 points

4 comments1 min readEA link

Being an individual alignment grantmaker

A_donor28 Feb 2022 16:39 UTC

34 points

20 comments2 min readEA link

GPT-5 is out

david_reinstein7 Aug 2025 20:37 UTC

20 points

1 comment1 min readEA link

(openai.com)

What could an AI-caused existential catastrophe actually look like?

Benjamin Hilton12 Sep 2022 16:25 UTC

57 points

7 comments9 min readEA link

(80000hours.org)

Announcing the AI Forecasting Benchmark Series | July 8, $120k in Prizes

christian19 Jun 2024 21:37 UTC

52 points

4 comments5 min readEA link

(www.metaculus.com)

A mesa-optimization perspective on AI valence and moral patienthood

jacobpfau9 Sep 2021 22:23 UTC

10 points

18 comments17 min readEA link

You Don’t Have to Be an AI Doomer to Support AI Safety

Liam Robins14 Jun 2025 23:10 UTC

10 points

0 comments4 min readEA link

(thelimestack.substack.com)

ML Safety Scholars Summer 2022 Retrospective

TW1231 Nov 2022 3:09 UTC

56 points

2 comments21 min readEA link

Overview: AI Safety Outreach Grassroots Orgs

Severin12 May 2025 14:38 UTC

11 points

0 comments2 min readEA link

[Question] How have analogous Industries solved Interested > Trained > Employed bottlenecks?

yanni kyriacos30 May 2024 23:59 UTC

6 points

0 comments1 min readEA link

Facilitator Help Wanted for Columbia EA AI Safety Groups

Berkan Ottlik5 Jul 2022 10:27 UTC

16 points

0 comments1 min readEA link

Apply to >50 AI safety funders in one application with the Nonlinear Network [Round Closed]

Drew Spartz12 Apr 2023 21:06 UTC

156 points

18 comments2 min readEA link

2019 AI Alignment Literature Review and Charity Comparison

Larks19 Dec 2019 2:58 UTC

147 points

28 comments62 min readEA link

Will disagreement about AI rights lead to societal conflict?

Lucius Caviola3 Jul 2024 13:30 UTC

51 points

0 comments22 min readEA link

(outpaced.substack.com)

Energy-Based Transformers are Scalable Learners and Thinkers

Matrice Jacobine8 Jul 2025 13:44 UTC

8 points

0 comments1 min readEA link

(energy-based-transformers.github.io)

UN Public Call for Nominations For High-level Advisory Body on Artificial Intelligence

vincentweisser10 Aug 2023 10:34 UTC

15 points

1 comment1 min readEA link

Largest AI model in 2 years from $10B

Peter Drotos 🔸24 Oct 2023 15:14 UTC

37 points

0 comments7 min readEA link

Contra shard theory, in the context of the diamond maximizer problem

So8res13 Oct 2022 23:51 UTC

27 points

0 comments2 min readEA link

AI-Risk in the State of the European Union Address

Sam Bogerd13 Sep 2023 13:27 UTC

25 points

0 comments3 min readEA link

(state-of-the-union.ec.europa.eu)

Amplify is hiring! Work with us to support field-building initiatives through digital marketing

gergo28 Aug 2024 14:12 UTC

28 points

1 comment4 min readEA link

A necessary Membrane formalism feature

ThomasCederborg10 Sep 2024 21:03 UTC

1 point

0 comments11 min readEA link

Animal communication and the future of moral progress: speculations and responsibilities

Lutebemberwa Isa27 May 2025 15:51 UTC

10 points

0 comments3 min readEA link

Gradual Disempowerment: Concrete Research Projects

Raymond D29 May 2025 18:58 UTC

20 points

1 comment10 min readEA link

When the Smarter AI Lies Better: Can Debate-Based Oversight Catch Deceptive Code?

Oskar Kraak4 Jul 2025 22:10 UTC

2 points

0 comments5 min readEA link

(oskarkraak.com)

Cognitive Science/Psychology As a Neglected Approach to AI Safety

Kaj_Sotala5 Jun 2017 13:46 UTC

40 points

37 comments4 min readEA link

[Question] Fiscal sponsorship, ops support, or incubation?

Harry Luk4 Oct 2023 22:06 UTC

14 points

8 comments1 min readEA link

Draft report on AI timelines

Ajeya15 Dec 2020 12:10 UTC

35 points

0 comments1 min readEA link

(alignmentforum.org)

AI Alternative Futures: Exploratory Scenario Mapping for Artificial Intelligence Risk—Request for Participation [Linkpost]

Kiliank9 May 2022 19:53 UTC

17 points

2 comments8 min readEA link

Now is the Time for Moonshots

Alejandro Acelas 🔸18 Jul 2025 15:59 UTC

2 points

0 comments1 min readEA link

(lukedrago.substack.com)

Let’s talk about uncontrollable AI

Karl von Wendt9 Oct 2022 10:37 UTC

12 points

2 comments3 min readEA link

[Question] What would need to be true for AI to translate a legal contract to a smart contract?

Patrick Liu18 Mar 2023 16:42 UTC

−1 points

0 comments1 min readEA link

[Question] Should we publish arguments for the preservation of humanity?

Jeremy7 Apr 2023 13:51 UTC

8 points

4 comments1 min readEA link

#180 – Why gullibility and misinformation are overrated (Hugo Mercier on the 80,000 Hours Podcast)

80000_Hours26 Feb 2024 19:16 UTC

15 points

0 comments18 min readEA link

Clarifying two uses of “alignment”

Matthew_Barnett10 Mar 2024 17:41 UTC

36 points

28 comments4 min readEA link

A one-sentence formulation of the AI X-Risk argument I try to make

tcelferact2 Mar 2024 0:44 UTC

3 points

0 comments1 min readEA link

[$20K In Prizes] AI Safety Arguments Competition

TW12326 Apr 2022 16:21 UTC

71 points

121 comments3 min readEA link

20 concrete projects for reducing existential risk

Buhl21 Jun 2023 15:54 UTC

132 points

27 comments20 min readEA link

(rethinkpriorities.org)

[Question] Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

Aryeh Englander17 Apr 2022 16:08 UTC

15 points

9 comments1 min readEA link

deleted

funnyfranco24 Mar 2025 19:44 UTC

4 points

10 comments1 min readEA link

China-AI forecasting

Nathan_Barnard25 Feb 2024 16:47 UTC

10 points

2 comments6 min readEA link

[Question] Examples of self-governance to reduce technology risk?

jia25 Sep 2020 13:26 UTC

32 points

1 comment1 min readEA link

Crucial considerations in the field of Wild Animal Welfare (WAW)

Holly Elmore ⏸️ 🔸10 Apr 2022 19:43 UTC

64 points

10 comments3 min readEA link

Emergent Ventures AI

technicalities8 Apr 2022 22:08 UTC

22 points

0 comments1 min readEA link

(marginalrevolution.com)

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copyright Infringement, and Congressional Questions about Research Standards in AI Safety

Center for AI Safety4 Jan 2024 16:03 UTC

5 points

0 comments6 min readEA link

(newsletter.safe.ai)

Eric Drexler: Paretotopian goal alignment

EA Global15 Mar 2019 14:51 UTC

16 points

0 comments10 min readEA link

(www.youtube.com)

A model-based approach to AI Existential Risk

SammyDMartin25 Aug 2023 10:44 UTC

17 points

0 comments1 min readEA link

(www.lesswrong.com)

Longview is now offering AI grant recommendations to donors giving >$100k / year

Longview Philanthropy11 Apr 2025 16:01 UTC

73 points

0 comments2 min readEA link

MLSN: #10 Adversarial Attacks Against Language and Vision Models, Improving LLM Honesty, and Tracing the Influence of LLM Training Data

Center for AI Safety13 Sep 2023 18:02 UTC

7 points

0 comments5 min readEA link

(newsletter.mlsafety.org)

The longtermist AI governance landscape: a basic overview

Sam Clarke18 Jan 2022 12:58 UTC

172 points

13 comments9 min readEA link

[Question] Is there much need for frontend engineers in AI alignment?

Michael G21 Sep 2023 20:48 UTC

11 points

1 comment1 min readEA link

Evolving OpenAI’s Structure

Tyner🔸6 May 2025 0:52 UTC

12 points

1 comment1 min readEA link

2016 AI Risk Literature Review and Charity Comparison

Larks13 Dec 2016 4:36 UTC

57 points

12 comments28 min readEA link

Why I think that teaching philosophy is high impact

Eleni_A19 Dec 2022 23:00 UTC

17 points

2 comments2 min readEA link

Database of existential risk estimates

MichaelA🔸15 Apr 2020 12:43 UTC

130 points

37 comments5 min readEA link

The Superintelligence That Cares About Us

henrik.westerberg5 Jul 2025 10:20 UTC

5 points

0 comments2 min readEA link

Ngo and Yudkowsky on scientific reasoning and pivotal acts

EliezerYudkowsky21 Feb 2022 17:00 UTC

33 points

1 comment35 min readEA link

When is AI safety research harmful?

Nathan_Barnard9 May 2022 10:36 UTC

13 points

6 comments9 min readEA link

Mediocre AI safety as existential risk

technicalities16 Mar 2022 11:50 UTC

52 points

12 comments3 min readEA link

Introducing International AI Governance Alliance (IAIGA)

James Norris5 Feb 2025 15:59 UTC

12 points

0 comments1 min readEA link

Helen Toner (ex-OpenAI board member): “We learned about ChatGPT on Twitter.”

defun 🔸29 May 2024 7:40 UTC

123 points

13 comments1 min readEA link

(x.com)

Preserving and continuing alignment research through a severe global catastrophe

A_donor6 Mar 2022 18:43 UTC

40 points

11 comments5 min readEA link

When safety is dangerous: risks of an indefinite pause on AI development, and call for realistic alternatives

Hayven Frienby18 Jan 2024 14:59 UTC

5 points

0 comments5 min readEA link

A Brief Summary Of The Most Important Century

Maynk0225 Oct 2022 15:28 UTC

3 points

0 comments5 min readEA link

Reducing global AI competition through the Commerce Control List and Immigration reform: a dual-pronged approach

ben.smith3 Sep 2024 5:28 UTC

15 points

0 comments9 min readEA link

BERI is hiring an ML Software Engineer

sawyer🔸10 Nov 2021 19:36 UTC

17 points

2 comments1 min readEA link

Loss of control of AI is not a likely source of AI x-risk

squek9 Nov 2022 5:48 UTC

8 points

0 comments5 min readEA link

Cryptocurrency Exploits Show the Importance of Proactive Policies for AI X-Risk

eSpencer16 Sep 2022 4:44 UTC

14 points

1 comment4 min readEA link

Is there a Half-Life for the Success Rates of AI Agents?

Matrice Jacobine8 May 2025 20:10 UTC

6 points

0 comments1 min readEA link

(www.tobyord.com)

[Question] Is AI like disk drives?

Tanae2 Sep 2023 19:12 UTC

8 points

1 comment1 min readEA link

A Manifold Market “Leaked” the AI Extinction Statement and CAIS Wanted it Deleted

David Chee12 Jun 2023 15:57 UTC

24 points

9 comments12 min readEA link

(news.manifold.markets)

[Link] Reading the ethicists: A review of articles on AI in the journal Science and Engineering Ethics

Charlie Steiner18 May 2022 21:06 UTC

7 points

0 comments1 min readEA link

(www.lesswrong.com)

Do short AI timelines make other cause areas useless?

Hayley Clatterbuck23 Jul 2025 19:10 UTC

110 points

14 comments18 min readEA link

I’m NOT against Artificial Intelligence

Victoria Dias24 Apr 2025 18:02 UTC

6 points

1 comment18 min readEA link

Third-wave AI safety needs sociopolitical thinking

richard_ngo27 Mar 2025 0:55 UTC

106 points

59 comments26 min readEA link

Some Preliminary Opinions on AI Safety Problems

yonxinzhang6 Apr 2023 12:42 UTC

5 points

0 comments6 min readEA link

AMA: Ought

stuhlmueller3 Aug 2022 17:24 UTC

41 points

52 comments1 min readEA link

Ambitious Impact launches a for-profit accelerator instead of building the AI Safety space. Let’s talk about this.

yanni kyriacos18 Mar 2024 3:44 UTC

−7 points

13 comments1 min readEA link

Some AI Governance Research Ideas

MarkusAnderljung3 Jun 2021 10:51 UTC

102 points

5 comments2 min readEA link

Credo AI is hiring for several roles

IanEisenberg11 Apr 2022 15:58 UTC

14 points

2 comments1 min readEA link

Try o3-pro in ChatGPT for $1 (is AI a bubble?)

Hauke Hillebrandt24 Jun 2025 11:15 UTC

29 points

1 comment4 min readEA link

Not Just For Therapy Chatbots: The Case For Compassion In AI Moral Alignment Research

Kenneth_Diao29 Sep 2024 22:58 UTC

8 points

3 comments12 min readEA link

Is anyone else also getting more worried about hard takeoff AGI scenarios?

JonCefalu9 Jan 2023 6:04 UTC

19 points

11 comments3 min readEA link

OpenAI lost $5 billion in 2024 (and its losses are increasing)

Remmelt31 Mar 2025 4:17 UTC

0 points

3 comments12 min readEA link

(www.wheresyoured.at)

Max Tegmark: Risks and benefits of advanced artificial intelligence

EA Global5 Aug 2016 9:19 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

Lessons from Three Mile Island for AI Warning Shots

NickGabs26 Sep 2022 2:47 UTC

44 points

0 comments15 min readEA link

[Question] Do EA folks want AGI at all?

Noah Scales16 Jul 2022 5:44 UTC

8 points

10 comments1 min readEA link

Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication)

Fernando Avalos28 Aug 2024 22:08 UTC

10 points

1 comment10 min readEA link

What’s going on with ‘crunch time’?

rosehadshar20 Jan 2023 9:38 UTC

92 points

5 comments4 min readEA link

Corporate AI Labs’ Odd Role in Their Own Governance

Corporate AI Labs' Odd Role in Their Own Governance29 Jul 2024 9:36 UTC

66 points

6 comments12 min readEA link

(dominikhermle.substack.com)

Promethean Governance in Practice: Crafting a Polycentric, Memetic Order for the Multipolar AI Aeon

Paul Fallavollita20 Mar 2025 10:10 UTC

−1 points

0 comments4 min readEA link

Survey: How Do Elite Chinese Students Feel About the Risks of AI?

Nick Corvino2 Sep 2024 9:14 UTC

107 points

9 comments10 min readEA link

President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence

Tristan Williams30 Oct 2023 11:15 UTC

143 points

8 comments3 min readEA link

(www.whitehouse.gov)

Cost-effectiveness of making a video game with EA concepts

mmKALLL15 Sep 2022 13:48 UTC

8 points

2 comments5 min readEA link

Forecast AI 2027

christian12 Jun 2025 21:12 UTC

22 points

0 comments1 min readEA link

(www.metaculus.com)

Winners of the AI Safety Nudge Competition

Marc Carauleanu15 Nov 2022 1:06 UTC

22 points

0 comments1 min readEA link

We Are Conjecture, A New Alignment Research Startup

Connor Leahy9 Apr 2022 15:07 UTC

31 points

0 comments1 min readEA link

The Orthogonality Thesis is Not Obviously True

Bentham's Bulldog5 Apr 2023 21:08 UTC

18 points

12 comments9 min readEA link

Steering AI to care for animals, and soon

Andrew Critch14 Jun 2022 1:13 UTC

239 points

37 comments1 min readEA link

De-emphasise alignment, emphasise restraint

EuanMcLean4 Feb 2025 17:43 UTC

19 points

2 comments7 min readEA link

AI Safety Executive Summary

Sean Osier6 Sep 2022 8:26 UTC

20 points

2 comments5 min readEA link

(seanosier.notion.site)

Preliminary investigations on if STEM and EA communities could benefit from more overlap

elteerkers11 Apr 2023 16:08 UTC

31 points

17 comments8 min readEA link

Origin and alignment of goals, meaning, and morality

FalseCogs24 Aug 2023 14:05 UTC

1 point

2 comments35 min readEA link

Evals projects I’d like to see, and a call to apply to OP’s evals RFP

cb25 Mar 2025 11:50 UTC

25 points

2 comments3 min readEA link

Beware of the new scaling paradigm

JohanEA19 Sep 2024 17:03 UTC

9 points

2 comments3 min readEA link

AI romantic partners will harm society if they go unregulated

Roman Leventov31 Jul 2023 15:55 UTC

16 points

9 comments13 min readEA link

The Windfall Clause has a remedies problem

John Bridge 🔸23 May 2022 10:31 UTC

40 points

0 comments17 min readEA link

Applications to EAGxCDMX close in a week!

cescorza17 Feb 2025 20:42 UTC

15 points

0 comments1 min readEA link

AI as a science, and three obstacles to alignment strategies

So8res25 Oct 2023 21:02 UTC

41 points

1 comment11 min readEA link

Existential Cybersecurity Risks & AI (A Research Agenda)

Madhav Malhotra20 Sep 2023 12:03 UTC

7 points

0 comments8 min readEA link

The Short Timelines Strategy for AI Safety University Groups

Josh Thorsteinson 🔸7 Mar 2025 4:26 UTC

50 points

8 comments5 min readEA link

The case for conscious AI: Clearing the record [AI Consciousness & Public Perception]

Jay Luong5 Jul 2024 20:29 UTC

3 points

7 comments8 min readEA link

AI governance & China: Reading list

Zach Stein-Perlman18 Dec 2023 15:30 UTC

14 points

0 comments1 min readEA link

(docs.google.com)

The Manhattan Trap: Why a Race to Artificial Superintelligence is Self-Defeating

Corin Katzke21 Jan 2025 16:57 UTC

98 points

1 comment2 min readEA link

(www.convergenceanalysis.org)

Map of AI Safety v2

Bryce Robertson15 Apr 2025 13:04 UTC

59 points

6 comments1 min readEA link

Reviews of “Is power-seeking AI an existential risk?”

Joe_Carlsmith16 Dec 2021 20:50 UTC

71 points

4 comments1 min readEA link

Belief Bias: Bias in Evaluating AGI X-Risks

Remmelt2 Jan 2023 8:59 UTC

5 points

0 comments1 min readEA link

A “Solipsistic” Repugnant Conclusion

Ramiro21 Jul 2022 16:06 UTC

13 points

0 comments6 min readEA link

Tyler Cowen’s challenge to develop an ‘actual mathematical model’ for AI X-Risk

Joe Brenton16 May 2023 16:55 UTC

20 points

4 comments1 min readEA link

$20K in Bounties for AI Safety Public Materials

TW1235 Aug 2022 2:57 UTC

45 points

11 comments6 min readEA link

Winning isn’t enough

Anthony DiGiovanni5 Nov 2024 11:43 UTC

33 points

3 comments9 min readEA link

Three polls: on timelines and cause prio

Toby Tremlett🔹28 Apr 2025 12:03 UTC

30 points

41 comments1 min readEA link

[Question] Workshop (hackathon, residence program, etc.) about for-profit AI Safety projects?

Roman Leventov26 Jan 2024 9:49 UTC

13 points

1 comment1 min readEA link

Ex-OpenAI employee amici leave to file denied in Musk v OpenAI case?

TFD2 May 2025 12:31 UTC

8 points

0 comments2 min readEA link

(www.thefloatingdroid.com)

Announcing AI Alignment workshop at the ALIFE 2023 conference

Rory Greig8 Jul 2023 13:49 UTC

9 points

0 comments1 min readEA link

(humanvaluesandartificialagency.com)

[Question] Graph of % of tasks AI is superhuman at?

Denkenberger🔸15 Nov 2022 5:59 UTC

9 points

0 comments1 min readEA link

AGI safety field building projects I’d like to see

Severin24 Jan 2023 23:30 UTC

25 points

2 comments9 min readEA link

Black Hole Bundle

khayali11 Jun 2025 1:20 UTC

−2 points

0 comments36 min readEA link

On taking AI risk seriously

Eleni_A13 Mar 2023 5:44 UTC

51 points

4 comments1 min readEA link

(www.nytimes.com)

We Ran an AI Timelines Retreat

Lenny McCline17 May 2022 4:40 UTC

46 points

6 comments3 min readEA link

My disagreements with “AGI ruin: A List of Lethalities”

Sharmake15 Sep 2024 17:22 UTC

23 points

2 comments18 min readEA link

A.I love you : AGI and Human Traitors

Pilot Pillow2 Apr 2025 14:18 UTC

1 point

2 comments7 min readEA link

Ideal governance (for companies, countries and more)

Holden Karnofsky7 Apr 2022 16:54 UTC

80 points

19 comments14 min readEA link

Addressing the nonhuman gap in intergovernmental AI governance frameworks

Alistair Stewart15 Jul 2025 21:13 UTC

73 points

2 comments8 min readEA link

A different take on the Musk v OpenAI preliminary injunction order

TFD11 Mar 2025 14:29 UTC

6 points

1 comment20 min readEA link

(www.thefloatingdroid.com)

The space of systems and the space of maps

Jan_Kulveit22 Mar 2023 16:05 UTC

12 points

0 comments5 min readEA link

(www.lesswrong.com)

Jeffrey Ding: Re-deciphering China’s AI dream

EA Global18 Oct 2019 18:05 UTC

13 points

0 comments1 min readEA link

(www.youtube.com)

Volunteer Opportunities with the AI Safety Awareness Foundation

NoahCWilson🔸8 Mar 2025 4:41 UTC

7 points

0 comments2 min readEA link

Nuclear Espionage and AI Governance

GAA4 Oct 2021 18:21 UTC

32 points

3 comments24 min readEA link

Meridian Cambridge Visiting Researcher Programme: Turn AI safety ideas into funded projects in one week!

Meridian5 Mar 2025 7:19 UTC

27 points

3 comments2 min readEA link

The heterogeneity of human value types: Implications for AI alignment

Geoffrey Miller16 Sep 2022 21:21 UTC

27 points

2 comments10 min readEA link

[Question] What happened to the ‘only 400 people work in AI safety/governance’ number dated from 2020?

Vaipan15 Mar 2024 15:25 UTC

27 points

1 comment1 min readEA link

Reduce AGI risks using modern lie detection technology

NothingIsArt30 Sep 2024 18:12 UTC

1 point

0 comments1 min readEA link

Emergency pod: Don’t believe OpenAI’s “nonprofit” spin (with Tyler Whitmer)

80000_Hours15 May 2025 16:52 UTC

37 points

0 comments2 min readEA link

AMA: Markus Anderljung (PM at GovAI, FHI)

MarkusAnderljung21 Sep 2020 11:23 UTC

49 points

24 comments2 min readEA link

“Technological unemployment” AI vs. “most important century” AI: how far apart?

Holden Karnofsky11 Oct 2022 4:50 UTC

17 points

1 comment3 min readEA link

(www.cold-takes.com)

Pitching AI Safety in 3 sentences

PabloAMC 🔸30 Mar 2022 18:50 UTC

7 points

0 comments1 min readEA link

[MLSN #6]: Transparency survey, provable robustness, ML models that predict the future

Dan H12 Oct 2022 20:51 UTC

21 points

1 comment6 min readEA link

The Case for an Online Encyclopedia Managed by AI Agents

Casey Milkweed21 Jul 2025 14:06 UTC

2 points

0 comments1 min readEA link

(substack.com)

Humanities Research Ideas for Longtermists

Lizka9 Jun 2021 4:39 UTC

151 points

13 comments13 min readEA link

AI Self-Modification Amplifies Risks

Ihor Ivliev3 Jun 2025 20:27 UTC

0 points

0 comments2 min readEA link

Announcing Epoch: A research organization investigating the road to Transformative AI

Jaime Sevilla27 Jun 2022 13:39 UTC

183 points

11 comments2 min readEA link

(epochai.org)

Why Is No One Trying To Align Profit Incentives With Alignment Research?

Prometheus23 Aug 2023 13:19 UTC

17 points

2 comments4 min readEA link

(www.lesswrong.com)

AI companies’ commitments

Zach Stein-Perlman31 May 2024 0:00 UTC

9 points

0 comments1 min readEA link

[Question] Forecasting Questions: What do you want to predict on AI?

Nathan Young1 Nov 2023 13:16 UTC

9 points

0 comments1 min readEA link

[Question] EA’s Achievements in 2022

ElliotJDavies14 Dec 2022 14:33 UTC

98 points

11 comments1 min readEA link

Now is a good time to update your threat model

Flo 🔸22 Mar 2025 21:11 UTC

29 points

0 comments1 min readEA link

Creating a “Conscience Calculator” to Guard-Rail an AGI

Sean Sweeney12 Aug 2024 15:58 UTC

1 point

11 comments17 min readEA link

[Question] Half-baked alignment idea

ozb28 Mar 2023 5:18 UTC

9 points

2 comments1 min readEA link

[Question] Why are bond yields anomalously rising following the September rate cut?

incredibleutility7 Jan 2025 15:49 UTC

2 points

2 comments1 min readEA link

Anthropic is not being consistently candid about their connection to EA

burner230 Mar 2025 13:30 UTC

291 points

88 comments2 min readEA link

Can Knowledge Hurt You? The Dangers of Infohazards (and Exfohazards)

A.G.G. Liu8 Feb 2025 15:51 UTC

12 points

0 comments5 min readEA link

(www.youtube.com)

Longtermist reasons to work for innovative governments

ac13 Oct 2020 16:32 UTC

74 points

8 comments1 min readEA link

What can the principal-agent literature tell us about AI risk?

ac10 Feb 2020 10:10 UTC

26 points

1 comment16 min readEA link

AI Moral Alignment: The Most Important Goal of Our Generation

Ronen Bar26 Mar 2025 12:32 UTC

130 points

32 comments8 min readEA link

“That’s (not) me!”: The malicious employment of deepfakes and their mitigation in legal environments for AI governance

Gabriela Pardo1 May 2025 14:54 UTC

5 points

0 comments12 min readEA link

How to ‘troll for good’: Leveraging IP for AI governance

Michael Huang26 Feb 2023 6:34 UTC

26 points

3 comments1 min readEA link

(www.science.org)

US credit rating downgraded, $1T in Gulf state investments in the US, Kurdistan Workers’ Party disbanded | Sentinel Global Risks Weekly Roundup #20/2025

NunoSempere19 May 2025 18:02 UTC

50 points

0 comments1 min readEA link

(blog.sentinel-team.org)

AI alignment research links

Holden Karnofsky6 Jan 2022 5:52 UTC

16 points

0 comments6 min readEA link

(www.cold-takes.com)

AI Safety − 7 months of discussion in 17 minutes

Zoe Williams15 Mar 2023 23:41 UTC

90 points

2 comments17 min readEA link

Thoughts on the OpenAI alignment plan: will AI research assistants be net-positive for AI existential risk?

Jeffrey Ladish10 Mar 2023 8:20 UTC

12 points

0 comments9 min readEA link

Gaia Network: a practical, incremental pathway to Open Agency Architecture

Roman Leventov20 Dec 2023 17:11 UTC

4 points

0 comments16 min readEA link

CSER is hiring for a senior research associate on longterm AI risk and governance

Sam Clarke24 Jan 2022 13:24 UTC

9 points

4 comments1 min readEA link

The AGI Awakeness valley of doom and three pathways to slowing

GideonF28 Jul 2025 18:46 UTC

16 points

0 comments16 min readEA link

(open.substack.com)

Back to the Past to the Future

Prometheus18 Oct 2023 16:51 UTC

4 points

0 comments1 min readEA link

Promoting compassionate longtermism

jonleighton7 Dec 2022 14:26 UTC

117 points

5 comments12 min readEA link

Longtermism better from a development skeptical stance?

Benevolent_Rain9 Dec 2024 12:16 UTC

16 points

2 comments1 min readEA link

[Question] Training a GPT model on EA texts: what data?

JoyOptimizer4 Jun 2022 5:59 UTC

23 points

17 comments1 min readEA link

The Alignment Problem No One is Talking About

Non-zero-sum James14 May 2024 10:42 UTC

5 points

0 comments2 min readEA link

Cooperation for AI safety must transcend geopolitical interference

Matrice Jacobine16 Feb 2025 18:18 UTC

9 points

0 comments1 min readEA link

(www.scmp.com)

When is it important that open-weight models aren’t released? My thoughts on the benefits and dangers of open-weight models in response to developments in CBRN capabilities.

Ryan Greenblatt9 Jun 2025 19:19 UTC

39 points

3 comments9 min readEA link

LLMs might already be conscious

MichaelDickens5 Jul 2025 19:31 UTC

33 points

8 comments2 min readEA link

(mdickens.me)

No, the EMH does not imply that markets have long AGI timelines

Jakob24 Apr 2023 8:27 UTC

83 points

21 comments8 min readEA link

New Funding Round on Hardware-Enabled Mechanisms (HEMs)

aog30 Apr 2025 17:45 UTC

54 points

0 comments15 min readEA link

On presenting the case for AI risk

Aryeh Englander8 Mar 2022 21:37 UTC

114 points

12 comments4 min readEA link

A Taxonomy Of AI System Evaluations

Maxime Riché 🔸19 Aug 2024 9:08 UTC

8 points

0 comments14 min readEA link

CBAI is Hiring for Operations Associates (closed)

Maite A21 Jul 2025 21:37 UTC

13 points

0 comments1 min readEA link

(www.cbai.ai)

How to do conceptual research: Case study interview with Caspar Oesterheld

Chi14 May 2024 15:09 UTC

26 points

1 comment9 min readEA link

Discussion with Eliezer Yudkowsky on AGI interventions

RobBensinger11 Nov 2021 3:21 UTC

60 points

33 comments34 min readEA link

[Question] What “defense layers” should governments, AI labs, and businesses use to prevent catastrophic AI failures?

LintzA3 Dec 2021 14:24 UTC

37 points

3 comments1 min readEA link

Sentinel minutes #6/2025: Power of the purse, D1.1 H5N1 flu variant, Ayatollah against negotiations with Trump

NunoSempere10 Feb 2025 17:23 UTC

40 points

2 comments7 min readEA link

(blog.sentinel-team.org)

Decentralizing Model Evaluation: Lessons from AI4Math

SMalagon5 Jun 2025 18:57 UTC

22 points

1 comment4 min readEA link

[Question] Need help with billboard content for AI Safety Bulgaria

Aleksandar Angelov7 Mar 2024 14:36 UTC

4 points

5 comments1 min readEA link

Selection Bias in Observational Estimates of Algorithmic Progress

Parker_Whitfill18 Aug 2025 1:48 UTC

22 points

0 comments1 min readEA link

(arxiv.org)

[Question] How does one find out their AGI timelines?

Yadav7 Nov 2022 22:34 UTC

19 points

4 comments1 min readEA link

[Question] Can we convince people to work on AI safety without convincing them about AGI happening this century?

BrianTan26 Nov 2020 14:46 UTC

8 points

3 comments2 min readEA link

AI & wisdom 1: wisdom, amortised optimisation, and AI

L Rudolf L29 Oct 2024 13:37 UTC

14 points

0 comments15 min readEA link

(rudolf.website)

How to get ChatGPT to really thoroughly research something

Kat Woods 🔶 ⏸️15 Aug 2025 12:54 UTC

10 points

2 comments1 min readEA link

Early-warning Forecasting Center: What it is, and why it’d be cool

Linch14 Mar 2022 19:20 UTC

62 points

8 comments11 min readEA link

Eliciting responses to Marc Andreessen’s “Why AI Will Save the World”

Coleman17 Jul 2023 19:58 UTC

2 points

2 comments1 min readEA link

(a16z.com)

How one logical fallacy killed God, corrupted Science and now fuels the AI race

Jáchym Fibír29 Jul 2025 13:57 UTC

−1 points

0 comments7 min readEA link

(www.phiand.ai)

Announcing the Cambridge ERA:AI Fellowship 2024

erafellowship11 Mar 2024 19:06 UTC

31 points

5 comments3 min readEA link

The Case For Civil Disobedience For The AI Movement

Murali Thoppil24 Apr 2023 13:07 UTC

16 points

3 comments4 min readEA link

(murali42e.substack.com)

Cooperation and Alignment in Delegation Games: You Need Both!

Oliver Sourbut3 Aug 2024 10:16 UTC

4 points

1 comment11 min readEA link

(www.oliversourbut.net)

Incentive design and capability elicitation

Joe_Carlsmith12 Nov 2024 20:56 UTC

9 points

0 comments12 min readEA link

Background for “Understanding the diffusion of large language models”

Ben Cottier21 Dec 2022 13:49 UTC

12 points

0 comments23 min readEA link

AI Safety Unconference NeurIPS 2022

Orpheus_Lummis7 Nov 2022 15:39 UTC

13 points

5 comments1 min readEA link

(aisafetyevents.org)

Introducing the Fund for Alignment Research (We’re Hiring!)

AdamGleave6 Jul 2022 2:00 UTC

74 points

3 comments4 min readEA link

How long till Brussels?: A light investigation into the Brussels Gap

Yadav26 Dec 2022 7:49 UTC

50 points

2 comments5 min readEA link

SERI ML application deadline is extended until May 22.

Viktoria Malyasova22 May 2022 0:13 UTC

13 points

3 comments1 min readEA link

US public perception of CAIS statement and the risk of extinction

Jamie E22 Jun 2023 16:39 UTC

126 points

4 comments9 min readEA link

Join the AI Testing Hackathon this Friday

Esben Kran12 Dec 2022 14:24 UTC

33 points

0 comments8 min readEA link

(alignmentjam.com)

[Question] How to get more academics enthusiastic about doing AI Safety research?

PabloAMC 🔸4 Sep 2021 14:10 UTC

25 points

19 comments1 min readEA link

We’re hiring a Writer to join our team at Our World in Data

Charlie Giattino18 Apr 2024 20:50 UTC

29 points

0 comments1 min readEA link

(ourworldindata.org)

Fundamental Risk

Ihor Ivliev26 Jun 2025 0:25 UTC

−5 points

0 comments1 min readEA link

Epoch and FRI Mentorship Program Summer 2023

merilalama13 Jun 2023 14:27 UTC

38 points

1 comment1 min readEA link

(epochai.org)

How can open-source robotics be aligned with long-term effective altruism goals?

Aria James21 Apr 2025 20:50 UTC

5 points

1 comment1 min readEA link

TIO: A mental health chatbot

Sanjay12 Oct 2020 20:52 UTC

25 points

6 comments28 min readEA link

[Question] Book recommendations for the history of ML?

Eleni_A28 Dec 2022 23:45 UTC

10 points

4 comments1 min readEA link

An experiment eliciting relative estimates for Open Philanthropy’s 2018 AI safety grants

NunoSempere12 Sep 2022 11:19 UTC

111 points

16 comments13 min readEA link

Don’t treat probabilities less than 0.5 as if they’re 0

MichaelDickens26 Feb 2025 5:14 UTC

36 points

5 comments1 min readEA link

China proposes new global AI cooperation organisation

Matrice Jacobine30 Jul 2025 2:50 UTC

14 points

1 comment1 min readEA link

(www.reuters.com)

Defending against Adversarial Policies in Reinforcement Learning with Alternating Training

sergeivolodin12 Feb 2022 15:59 UTC

1 point

0 comments13 min readEA link

[Job]: AI Standards Development Research Assistant

Tony Barrett14 Oct 2022 20:18 UTC

13 points

0 comments2 min readEA link

Three kinds of competitiveness

AI Impacts2 Apr 2020 3:46 UTC

10 points

0 comments5 min readEA link

(aiimpacts.org)

EU AI Act now has a section on general purpose AI systems

MathiasKB🔸9 Dec 2021 12:40 UTC

64 points

10 comments1 min readEA link

A new place to discuss cognitive science, ethics and human alignment

Daniel_Friedrich4 Nov 2022 14:34 UTC

9 points

1 comment2 min readEA link

(www.facebook.com)

AI impacts and Paul Christiano on takeoff speeds

Crosspost2 Mar 2018 11:16 UTC

4 points

0 comments1 min readEA link

XPT forecasts on (some) Direct Approach model inputs

Forecasting Research Institute20 Aug 2023 12:39 UTC

37 points

0 comments9 min readEA link

[Question] I there a demo of “You can’t fetch the coffee if you’re dead”?

Ram Rachum10 Nov 2022 11:03 UTC

8 points

3 comments1 min readEA link

What About Deontology? Ethics of Social Belonging and Conformity in Effective Altruism

Maksens Djabali8 Jan 2025 14:02 UTC

7 points

1 comment4 min readEA link

The Best Argument is not a Simple English Yud Essay

Jonathan Bostock19 Sep 2024 15:29 UTC

76 points

4 comments5 min readEA link

(www.lesswrong.com)

TOMORROW: the largest AI Safety protest ever!

Holly Elmore ⏸️ 🔸20 Oct 2023 18:08 UTC

57 points

0 comments2 min readEA link

Introducing the Principles of Intelligent Behaviour in Biological and Social Systems (PIBBSS) Fellowship

adamShimi18 Dec 2021 15:25 UTC

37 points

5 comments10 min readEA link

Getting Washington and Silicon Valley to tame AI (Mustafa Suleyman on the 80,000 Hours Podcast)

80000_Hours4 Sep 2023 16:25 UTC

5 points

2 comments10 min readEA link

The great energy descent—Part 2: Limits to growth and why we probably won’t reach the stars

CB🔸31 Aug 2022 21:51 UTC

22 points

0 comments25 min readEA link

Seeking Survey Responses—Attitudes Towards AI risks

anson28 Mar 2022 17:47 UTC

23 points

2 comments1 min readEA link

(forms.gle)

Nationwide Action Workshop: Contact Congress about AI Safety!

Felix De Simone24 Feb 2025 16:14 UTC

5 points

0 comments1 min readEA link

(www.zeffy.com)

Summary: Introspective Capabilities in LLMs (Robert Long)

rileyharris2 Jul 2024 18:08 UTC

11 points

1 comment4 min readEA link

What would it take for AI to disempower us? Ryan Greenblatt on takeoff dynamics, rogue deployments, and alignment risks

80000_Hours8 Jul 2025 18:10 UTC

8 points

0 comments33 min readEA link

Which of these arguments for x-risk do you think we should test?

Wim9 Aug 2022 13:43 UTC

3 points

2 comments1 min readEA link

What’s so dangerous about AI anyway? – Or: What it means to be a superintelligence

Thomas Kehrenberg18 Jul 2022 16:14 UTC

9 points

2 comments11 min readEA link

AI, Animals, and Digital Minds Conference 2024: Accepting applications and speaker proposals

Constance Li6 Apr 2024 8:42 UTC

26 points

0 comments1 min readEA link

Transcripts of interviews with AI researchers

Vael Gates9 May 2022 6:03 UTC

140 points

14 comments2 min readEA link

1st Alinha Hacka Recap: Reflecting on the Brazilian AI Alignment Hackathon

Thiago USP31 Jan 2024 10:38 UTC

7 points

0 comments2 min readEA link

Prizes for ML Safety Benchmark Ideas

Joshc28 Oct 2022 2:44 UTC

56 points

8 comments1 min readEA link

Criticism of the main framework in AI alignment

Michele Campolo31 Aug 2022 21:44 UTC

45 points

9 comments7 min readEA link

[Job ad] Research important longtermist topics at Rethink Priorities!

Linch6 Oct 2021 19:09 UTC

65 points

46 comments1 min readEA link

Critique of Superintelligence Part 3

James Fodor13 Dec 2018 5:13 UTC

3 points

5 comments7 min readEA link

[Question] Idea: Repository for AI Safety Presentations

Eitan6 Jan 2025 13:04 UTC

14 points

3 comments1 min readEA link

My (naive) take on Risks from Learned Optimization

Artyom K6 Nov 2022 16:25 UTC

5 points

0 comments5 min readEA link

Some for-profit AI alignment org ideas

Eric Ho14 Dec 2023 15:52 UTC

33 points

1 comment9 min readEA link

[CANCELLED] Berlin AI Alignment Open Meetup August 2022

Isidor Regenfuß4 Aug 2022 13:34 UTC

0 points

0 comments1 min readEA link

Thousands of malicious actors on the future of AI misuse

Zershaaneh Qureshi1 Apr 2024 10:03 UTC

75 points

1 comment1 min readEA link

Genes did misalignment first: comparing gradient hacking and meiotic drive

Holly Elmore ⏸️ 🔸18 Apr 2025 5:39 UTC

45 points

9 comments15 min readEA link

(hollyelmore.substack.com)

The UN Has a Rare Shot at Reducing the Risks of AI in Warfare

Mark Leon Goldberg21 May 2025 21:22 UTC

6 points

0 comments1 min readEA link

Seeking collaborators to build hardware devices for biosecurity or AI safety

Steve Trambert1 Jul 2025 14:44 UTC

14 points

3 comments1 min readEA link

[Question] Best ML courses for AI safety?

lamparita2 Aug 2025 21:28 UTC

2 points

0 comments1 min readEA link

Software engineering—Career review

Benjamin Hilton8 Feb 2022 6:11 UTC

93 points

19 comments8 min readEA link

(80000hours.org)

Beyond Simple Existential Risk: Survival in a Complex Interconnected World

GideonF21 Nov 2022 14:35 UTC

84 points

67 comments21 min readEA link

Implications of large language model diffusion for AI governance

Ben Cottier21 Dec 2022 13:50 UTC

14 points

0 comments38 min readEA link

[Question] Is it ethical to work in AI “content evaluation”?

anon_databoy55530 Jan 2025 13:27 UTC

10 points

3 comments1 min readEA link

The U.S. National Security State is Here to Make AI Even Less Transparent and Accountable

Matrice Jacobine24 Nov 2024 9:34 UTC

7 points

0 comments2 min readEA link

(www.eff.org)

Could this be an unusually good time to Earn To Give?

Tom Gardiner 🔸3 Mar 2025 23:00 UTC

60 points

15 comments3 min readEA link

Training for Good—Update & Plans for 2023

Cillian_15 Nov 2022 16:02 UTC

80 points

1 comment10 min readEA link

ML Summer Bootcamp Reflection: Aalto EA Finland

Aayush Kucheria12 Jan 2023 8:24 UTC

15 points

2 comments9 min readEA link

Introduction: Bias in Evaluating AGI X-Risks

Remmelt27 Dec 2022 10:27 UTC

4 points

0 comments3 min readEA link

FHI Report: How Will National Security Considerations Affect Antitrust Decisions in AI? An Examination of Historical Precedents

Cullen 🔸28 Jul 2020 18:33 UTC

13 points

0 comments1 min readEA link

(www.fhi.ox.ac.uk)

Institutional-themed website template for AIS groups

Kambar29 Apr 2025 21:11 UTC

21 points

0 comments1 min readEA link

Our new video about goal misgeneralization, plus an apology

Writer14 Jan 2025 14:07 UTC

16 points

1 comment7 min readEA link

(youtu.be)

Collective Action for AI Safety (June 4, NYC)

Jordan Braunstein30 May 2025 21:11 UTC

1 point

0 comments1 min readEA link

Well, this was unexpected. Claude coming out of nowhere with the meta cognition.

Anti-Golem9 Jun 2025 13:59 UTC

−2 points

0 comments1 min readEA link

AI Safety Protest, Melbourne, Australia

Mark Brown17 Jan 2025 14:55 UTC

2 points

0 comments1 min readEA link

Conjecture: A standing offer for public debates on AI

Andrea_Miotti16 Jun 2023 14:33 UTC

8 points

1 comment2 min readEA link

(www.conjecture.dev)

Cooperation, Avoidance, and Indifference: Alternate Futures for Misaligned AGI

Kiel Brennan-Marquez10 Dec 2022 20:32 UTC

4 points

1 comment18 min readEA link

On AI and Compute

johncrox3 Apr 2019 21:26 UTC

39 points

12 comments8 min readEA link

The Bottleneck in AI Policy Isn’t Ethics—It’s Implementation

Tristan D4 Apr 2025 6:07 UTC

10 points

4 comments1 min readEA link

Are we dropping the ball on Recommendation AIs?

Raphaël S23 Oct 2024 19:37 UTC

5 points

0 comments6 min readEA link

EA Netherlands’ Annual Strategy for 2024

James Herbert5 Jun 2024 15:07 UTC

40 points

4 comments6 min readEA link

[Question] What’s a good intro to AI Safety?

Amateur Systems Analyst14 Jan 2024 16:54 UTC

1 point

5 comments1 min readEA link

The History, Epistemology and Strategy of Technological Restraint, and lessons for AI (short essay)

MMMaas10 Aug 2022 11:00 UTC

90 points

6 comments9 min readEA link

(verfassungsblog.de)

Raphaël Millière on the Limits of Deep Learning and AI x-risk skepticism

Michaël Trazzi24 Jun 2022 18:33 UTC

20 points

0 comments4 min readEA link

(theinsideview.ai)

[Question] Common rebuttal to “pausing” or regulating AI

sammyboiz🔸22 May 2024 4:21 UTC

4 points

2 comments1 min readEA link

Tan Zhi Xuan: AI alignment, philosophical pluralism, and the relevance of non-Western philosophy

EA Global21 Nov 2020 8:12 UTC

20 points

1 comment1 min readEA link

(www.youtube.com)

Ex-OpenAI researcher says OpenAI mass-violated copyright law

Remmelt24 Oct 2024 1:00 UTC

11 points

0 comments1 min readEA link

(suchir.net)

EA is good, actually

Amy Labenz28 Nov 2023 15:59 UTC

272 points

15 comments4 min readEA link

Berlin AI Alignment Open Meetup September 2022

Isidor Regenfuß21 Sep 2022 15:09 UTC

2 points

0 comments1 min readEA link

SociaLLM: proposal for a language model design for personalised apps, social science, and AI safety research

Roman Leventov2 Jan 2024 8:11 UTC

4 points

2 comments3 min readEA link

On failing to get EA jobs: My experience and recommendations to EA orgs

Avila22 Apr 2024 21:19 UTC

127 points

55 comments5 min readEA link

[Question] Should some people start working to influence the people who are most likely to shape the values of the first AGIs, so that they take into account the interests of wild and farmed animals and sentient digital minds?

Keyvan Mostafavi31 Aug 2023 12:08 UTC

16 points

1 comment1 min readEA link

Shutting down AI Safety Support

JJ Hepburn30 Jul 2023 6:00 UTC

117 points

17 comments1 min readEA link

“Slower tech development” can be about ordering, gradualness, or distance from now

MichaelA🔸14 Nov 2021 20:58 UTC

47 points

3 comments4 min readEA link

10 of Founders Pledge’s biggest grants

Matt_Lerner9 Jul 2025 21:55 UTC

120 points

1 comment6 min readEA link

[Question] Slowing down AI progress?

Eleni_A26 Jul 2022 8:46 UTC

16 points

9 comments1 min readEA link

No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

Noah Varley🔸14 May 2024 23:57 UTC

36 points

2 comments1 min readEA link

(arxiv.org)

What are some good podcasts about AI safety?

Vishakha Agrawal17 Feb 2025 10:32 UTC

8 points

1 comment1 min readEA link

(aisafety.info)

EA and AI Safety Schism: AGI, the last tech humans will (soon*) build

Phib15 May 2023 2:05 UTC

6 points

6 comments5 min readEA link

Personal agents

Roman Leventov17 Jun 2025 2:05 UTC

3 points

1 comment7 min readEA link

Alignment for focused chatbots?

Beckpm8 Jul 2023 15:09 UTC

−1 points

2 comments1 min readEA link

The role of academia in AI Safety.

PabloAMC 🔸28 Mar 2022 0:04 UTC

71 points

19 comments3 min readEA link

Status quo bias; System justification: Bias in Evaluating AGI X-Risks

Remmelt3 Jan 2023 2:50 UTC

4 points

1 comment1 min readEA link

What if we just…didn’t build AGI? An Argument Against Inevitability

Nate Sharpe10 May 2025 3:34 UTC

64 points

21 comments14 min readEA link

(natezsharpe.substack.com)

Announcing the AIPolicyIdeas.com Database

abiolvera23 Jun 2023 16:09 UTC

50 points

3 comments2 min readEA link

(www.aipolicyideas.com)

Delegated agents in practice: How companies might end up selling AI services that act on behalf of consumers and coalitions, and what this implies for safety research

Remmelt26 Nov 2020 16:39 UTC

11 points

0 comments4 min readEA link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibs29 Mar 2023 23:30 UTC

212 points

75 comments3 min readEA link

(time.com)

Longtermist implications of aliens Space-Faring Civilizations—Introduction

Maxime Riché 🔸21 Feb 2025 12:07 UTC

44 points

12 comments6 min readEA link

Forging A New AGI Social Contract

Deric Cheng10 Apr 2025 13:41 UTC

13 points

3 comments7 min readEA link

(agisocialcontract.substack.com)

Information in risky technology races

nemeryxu2 Aug 2022 23:35 UTC

15 points

2 comments3 min readEA link

Legal template for conditional gift deed as an alternative to wagers on AI doom

bruce13 Mar 2025 14:57 UTC

30 points

6 comments1 min readEA link

Something to make myself fascinated with computing science and AI.

Eduardo7 Dec 2022 2:12 UTC

3 points

5 comments1 min readEA link

University community building seems like the wrong model for AI safety

George Stiffman26 Feb 2022 6:23 UTC

24 points

8 comments2 min readEA link

How to make the future better (other than by reducing extinction risk)

William_MacAskill15 Aug 2025 15:40 UTC

22 points

0 comments3 min readEA link

New Report: Multi-Agent Risks from Advanced AI

Lewis Hammond23 Feb 2025 0:32 UTC

40 points

3 comments2 min readEA link

(www.cooperativeai.com)

Stampy’s AI Safety Info soft launch

StevenKaas5 Oct 2023 22:20 UTC

57 points

2 comments2 min readEA link

(www.lesswrong.com)

Announcing the 2023 PIBBSS Summer Research Fellowship

Dušan D. Nešić (Dushan)12 Jan 2023 21:38 UTC

26 points

2 comments1 min readEA link

Will the Need to Retrain AI Models from Scratch Block a Software Intelligence Explosion?

Forethought28 Mar 2025 13:43 UTC

12 points

0 comments3 min readEA link

(www.forethought.org)

The replication and emulation of GPT-3

Ben Cottier21 Dec 2022 13:49 UTC

14 points

0 comments33 min readEA link

[Link] Center for the Governance of AI (GovAI) Annual Report 2018

MarkusAnderljung21 Dec 2018 16:17 UTC

24 points

0 comments1 min readEA link

Best practices for risk communication from the academic literature

Existential Risk Communication Project12 Aug 2024 18:54 UTC

9 points

3 comments23 min readEA link

Promethean Governance Unleashed: Piloting Polycentric, Memetic Orders in the AI Frontier

Paul Fallavollita21 Mar 2025 16:35 UTC

−11 points

1 comment3 min readEA link

Introducing Leap Labs, an AI interpretability startup

Jessica Rumbelow6 Mar 2023 17:37 UTC

11 points

0 comments1 min readEA link

(www.lesswrong.com)

Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt19 Dec 2022 12:02 UTC

17 points

3 comments31 min readEA link

Enabling more feedback

JJ Hepburn10 Dec 2021 6:52 UTC

41 points

3 comments3 min readEA link

AISN #31: A New AI Policy Bill in California Plus, Precedents for AI Governance and The EU AI Office

Center for AI Safety21 Feb 2024 21:55 UTC

27 points

0 comments6 min readEA link

(newsletter.safe.ai)

Apply to become a Futurekind AI Facilitator or Mentor (deadline: April 10)

Jay Luong21 Mar 2025 20:28 UTC

3 points

0 comments1 min readEA link

My argument against AGI

cveres12 Oct 2022 6:32 UTC

2 points

29 comments3 min readEA link

Jesse Clifton: Open-source learning — a bargaining approach

EA Global18 Oct 2019 18:05 UTC

10 points

0 comments1 min readEA link

(www.youtube.com)

Safety timelines: How long will it take to solve alignment?

Esben Kran19 Sep 2022 12:51 UTC

45 points

9 comments6 min readEA link

The Rival AI Deployment Problem: a Pre-deployment Agreement as the least-bad response

HaydnBelfield23 Sep 2022 9:28 UTC

44 points

1 comment12 min readEA link

Prologue | A Fire Upon the Deep | Vernor Vinge

semicycle17 Feb 2025 4:13 UTC

5 points

1 comment1 min readEA link

(www.baen.com)

Holden Karnofsky Interview about Most Important Century & Transformative AI

Dwarkesh Patel3 Jan 2023 17:31 UTC

29 points

2 comments1 min readEA link

PIBBSS Fellowship 2025: Bounties and Cooperative AI Track Announcement

Dušan D. Nešić (Dushan)9 Jan 2025 14:23 UTC

18 points

0 comments1 min readEA link

Overview of recent international demonstrations against AI (AI Protest Actions #1)

Rachel Shu17 Jul 2025 20:22 UTC

17 points

2 comments5 min readEA link

[Question] Who should we interview for The 80,000 Hours Podcast?

Luisa_Rodriguez13 Sep 2023 12:23 UTC

87 points

136 comments2 min readEA link

AI Safety Strategy—A new organization for better timelines

Prometheus14 Jun 2023 20:41 UTC

8 points

0 comments2 min readEA link

Former Israeli Prime Minister Speaks About AI X-Risk

Yonatan Cale20 May 2023 12:09 UTC

73 points

6 comments1 min readEA link

[Crosspost] An AI Pause Is Humanity’s Best Bet For Preventing Extinction (TIME)

Otto24 Jul 2023 10:18 UTC

36 points

3 comments7 min readEA link

(time.com)

Stripe Economics of AI Fellowship

basil.halperin28 Mar 2025 15:24 UTC

54 points

0 comments2 min readEA link

(stripe.events)

Strategic Directions for a Digital Consciousness Model

Derek Shiller10 Dec 2024 19:33 UTC

41 points

1 comment12 min readEA link

[Question] UK election and AI safety, who to vote for?

Clay Cube1 Jun 2024 10:16 UTC

25 points

3 comments1 min readEA link

Alignment is hard. Communicating that, might be harder

Eleni_A1 Sep 2022 11:45 UTC

17 points

1 comment3 min readEA link

Stop talking about p(doom)

Isaac King1 Jan 2024 10:57 UTC

115 points

12 comments3 min readEA link

Asterisk Mag 10: Origins

Clara Collier7 Jul 2025 18:03 UTC

8 points

0 comments2 min readEA link

(asteriskmag.com)

ML4G Germany—AI Alignment Camp

Evander H. 🔸27 Jun 2023 15:33 UTC

6 points

0 comments1 min readEA link

The Dilemma of Ultimate Technology

Aino20 Jul 2023 12:24 UTC

1 point

0 comments7 min readEA link

Some cruxes on impactful alternatives to AI policy work

richard_ngo22 Nov 2018 13:43 UTC

28 points

2 comments12 min readEA link

A modest case for hope

xavier rg17 Oct 2022 6:03 UTC

28 points

0 comments1 min readEA link

Reasons for my negative feelings towards the AI risk discussion

fergusq1 Sep 2022 7:33 UTC

43 points

9 comments4 min readEA link

[Question] How much should you optimize for the short-timelines scenario?

SoerenMind26 Jul 2022 15:51 UTC

39 points

2 comments1 min readEA link

Career uncertainty: Medicine vs. AI

Markus Köth30 Apr 2023 8:41 UTC

20 points

9 comments1 min readEA link

Advice to junior AI governance researchers

Akash8 Jul 2024 19:19 UTC

38 points

3 comments5 min readEA link

A progressive AI, not a threatening one

Violette 12 Dec 2023 17:19 UTC

−17 points

0 comments4 min readEA link

Does generality pay? GPT-3 can provide preliminary evidence.

Eevee🔹12 Jul 2020 18:53 UTC

21 points

4 comments2 min readEA link

[Question] Why does AGI occur almost nowhere, not even just as a remark for economic/political models?

Franziska Fischer2 Oct 2022 14:43 UTC

52 points

15 comments1 min readEA link

Upcoming speaker series on emerging tech, national security & US policy careers

ES10 Jul 2024 19:59 UTC

16 points

1 comment1 min readEA link

Results from a survey on tool use and workflows in alignment research

jacquesthibs19 Dec 2022 15:19 UTC

30 points

0 comments19 min readEA link

Epistle to the Successor

ukc1001429 Apr 2025 9:30 UTC

4 points

0 comments19 min readEA link

Alignment Grantmaking is Funding-Limited Right Now [crosspost]

johnswentworth2 Aug 2023 20:37 UTC

82 points

13 comments1 min readEA link

(www.lesswrong.com)

Online infosec talk: What even is zero trust?

Jarrah8 Jun 2024 23:54 UTC

11 points

0 comments1 min readEA link

Reflections from Ooty retreat 2.0

Aditya Arpitha Prasad24 Jul 2025 18:22 UTC

4 points

0 comments1 min readEA link

(www.lesswrong.com)

Open Asteroid Impact announces leadership transition

Patrick Hoang1 Apr 2024 12:51 UTC

18 points

0 comments1 min readEA link

Applications open for AGI Safety Fundamentals: Alignment Course

Jamie B13 Dec 2022 10:50 UTC

75 points

0 comments2 min readEA link

Introducing METR’s Autonomy Evaluation Resources

Megan Kinniment15 Mar 2024 23:19 UTC

28 points

0 comments1 min readEA link

(metr.github.io)

Leverage points for a pause

Remmelt28 Aug 2024 9:21 UTC

6 points

0 comments1 min readEA link

Some AI safety project & research ideas/questions for short and long timelines

Lloy2 🔹8 Aug 2025 21:08 UTC

9 points

0 comments5 min readEA link

[Question] AI labs’ requests for input

Zach Stein-Perlman19 Aug 2023 17:00 UTC

7 points

0 comments1 min readEA link

Executive Director for AIS France—Expression of interest

gergo19 Dec 2024 8:11 UTC

33 points

0 comments4 min readEA link

Demonstrating specification gaming in reasoning models

Matrice Jacobine20 Feb 2025 19:26 UTC

10 points

0 comments1 min readEA link

(arxiv.org)

A note about differential technological development

So8res24 Jul 2022 23:41 UTC

58 points

8 comments6 min readEA link

Talk to me about your summer/career plans

Akash31 Jan 2023 18:29 UTC

31 points

0 comments2 min readEA link

Should AI focus on problem-solving or strategic planning? Why not both?

oliver_siegel1 Nov 2022 9:53 UTC

1 point

0 comments1 min readEA link

AI Discrimination Requirements: A Regulatory Review

Deric Cheng4 Apr 2024 15:44 UTC

8 points

1 comment6 min readEA link

Project proposal: Scenario analysis group for AI safety strategy

Buhl18 Dec 2023 18:31 UTC

35 points

0 comments5 min readEA link

(rethinkpriorities.org)

AI can exploit safety plans posted on the Internet

Peter S. Park4 Dec 2022 12:17 UTC

5 points

3 comments1 min readEA link

Efficacy of AI Activism: Have We Ever Said No?

Charlie Harrison27 Oct 2023 16:52 UTC

80 points

25 comments20 min readEA link

7 Learnings and a Detailed Description of an AI Safety Reading Group

nell23 Sep 2022 2:02 UTC

21 points

5 comments9 min readEA link

Using AI to match people to jobs?

Forumite30 May 2024 21:19 UTC

5 points

0 comments1 min readEA link

AI Safety Incubation Program—Applications Open

Catalyze Impact16 Aug 2024 15:37 UTC

11 points

0 comments2 min readEA link

Law-Following AI 1: Sequence Introduction and Structure

Cullen 🔸27 Apr 2022 17:16 UTC

35 points

2 comments9 min readEA link

The Welfare of Digital Minds: A Research Agenda

Derek Shiller11 Nov 2024 12:58 UTC

53 points

1 comment31 min readEA link

Lying is Cowardice, not Strategy

Connor Leahy25 Oct 2023 5:59 UTC

−5 points

15 comments5 min readEA link

(cognition.cafe)

How Technical AI Safety Researchers Can Help Implement Punitive Damages to Mitigate Catastrophic AI Risk

Gabriel Weil19 Feb 2024 17:43 UTC

28 points

2 comments4 min readEA link

You Understand AI Alignment and How to Make Soup

Leen Armoush28 May 2022 6:22 UTC

0 points

2 comments5 min readEA link

MATS is hiring!

Ryan Kidd8 Apr 2025 20:45 UTC

14 points

2 comments6 min readEA link

AI Constitutions are a tool to reduce societal scale risk

SammyDMartin26 Jul 2024 10:50 UTC

11 points

0 comments1 min readEA link

(www.lesswrong.com)

Open Philanthropy Technical AI Safety RFP - $40M Available Across 21 Research Areas

Jake Mendel6 Feb 2025 18:59 UTC

95 points

3 comments1 min readEA link

(www.openphilanthropy.org)

Let’s Talk About Emergence

Jacob-Haimes7 Jun 2024 19:34 UTC

8 points

1 comment7 min readEA link

(www.odysseaninstitute.org)

Apply to a small iteration of MLAB to be run in Oxford

Rio P29 Aug 2023 19:39 UTC

11 points

0 comments1 min readEA link

Cutting AI Safety down to size

Holly Elmore ⏸️ 🔸9 Nov 2024 23:40 UTC

87 points

5 comments5 min readEA link

A list of good heuristics that the case for AI X-risk fails

Aaron Gertler 🔸16 Jul 2020 9:56 UTC

25 points

9 comments2 min readEA link

(www.alignmentforum.org)

[Announcement] The Steven Aiberg Project

StevenAiberg19 Oct 2022 7:48 UTC

0 points

0 comments4 min readEA link

AE Studio is hiring!

AE Studio21 Apr 2025 20:35 UTC

16 points

0 comments2 min readEA link

Humans aren’t fitness maximizers

So8res4 Oct 2022 1:32 UTC

30 points

2 comments5 min readEA link

Taking Into Account Sentient Non-Humans in AI Ambitious Value Learning: Sentientist Coherent Extrapolated Volition

Adrià Moret1 Dec 2023 18:01 UTC

43 points

2 comments42 min readEA link

IV. Parallels and Review

Maynk0227 Feb 2024 23:10 UTC

7 points

1 comment8 min readEA link

(open.substack.com)

AI, Animals, & Digital Minds 2025: Retrospective

Alistair Stewart12 Jul 2025 2:28 UTC

60 points

3 comments11 min readEA link

The Slippery Slope from DALLE-2 to Deepfake Anarchy

stecas5 Nov 2022 14:47 UTC

55 points

11 comments17 min readEA link

Should we expect the future to be good?

Neil Crawford30 Apr 2025 0:45 UTC

38 points

1 comment14 min readEA link

Compute Governance and Conclusions—Transformative AI and Compute [3/4]

lennart14 Oct 2021 7:55 UTC

20 points

3 comments5 min readEA link

What is “wireheading”?

Vishakha Agrawal17 Dec 2024 17:59 UTC

1 point

0 comments1 min readEA link

(aisafety.info)

The King and the Golem—The Animation

Writer8 Nov 2024 18:23 UTC

50 points

1 comment1 min readEA link

Scoring forecasts from the 2016 “Expert Survey on Progress in AI”

PatrickL1 Mar 2023 14:39 UTC

204 points

21 comments9 min readEA link

History’s Grandest Projects: Introduction to Macro Strategies for AI Risk, Part 1

Coleman20 Jun 2025 17:32 UTC

7 points

0 comments38 min readEA link

The NAIRR Initiative: Assessing its Potential for Democratizing AI

Jose Gelves29 Aug 2024 12:30 UTC

22 points

1 comment11 min readEA link

[Question] How bad would AI progress need to be for us to think general technological progress is also bad?

Jim Buhler6 Jul 2024 18:44 UTC

10 points

0 comments1 min readEA link

US public opinion on AI, September 2023

Zach Stein-Perlman18 Sep 2023 18:00 UTC

29 points

0 comments1 min readEA link

(blog.aiimpacts.org)

How could a moratorium fail?

Davidmanheim22 Sep 2023 15:11 UTC

49 points

4 comments9 min readEA link

Labor Participation is a High-Priority AI Alignment Risk

alx12 Aug 2024 18:48 UTC

17 points

3 comments16 min readEA link

Consider not donating under $100 to political candidates

DanielFilan11 May 2025 3:22 UTC

83 points

11 comments1 min readEA link

All AGI Safety questions welcome (especially basic ones) [April 2023]

StevenKaas8 Apr 2023 4:21 UTC

111 points

173 comments2 min readEA link

[Question] Scholarships for Undergrads who want to have high-impact careers?

darthflower6 Jul 2025 17:31 UTC

4 points

0 comments1 min readEA link

Apply to the new Open Philanthropy Technology Policy Fellowship!

lukeprog20 Jul 2021 18:41 UTC

78 points

6 comments4 min readEA link

[Question] How should technical AI researchers best transition into AI governance and policy?

GabeM10 Sep 2023 5:29 UTC

12 points

5 comments1 min readEA link

Quantifying the Far Future Effects of Interventions

MichaelDickens18 May 2016 2:15 UTC

9 points

0 comments11 min readEA link

Why I expect successful (narrow) alignment

Tobias_Baumann29 Dec 2018 15:46 UTC

18 points

10 comments1 min readEA link

(s-risks.org)

Apply to the Cooperative AI PhD Fellowship by October 14th!

Lewis Hammond5 Oct 2024 12:41 UTC

35 points

0 comments1 min readEA link

When “yang” goes wrong

Joe_Carlsmith8 Jan 2024 16:35 UTC

57 points

1 comment13 min readEA link

ML4G is launching its first ever Governance bootcamp!

carolinaollive16 May 2025 15:22 UTC

25 points

0 comments1 min readEA link

On Scaling Academia

kirchner.jan20 Sep 2021 14:54 UTC

18 points

3 comments13 min readEA link

(universalprior.substack.com)

A Benchmark for Measuring Honesty in AI Systems

Mantas Mazeika4 Mar 2025 17:44 UTC

29 points

0 comments2 min readEA link

(www.mask-benchmark.ai)

[Question] Which Graduate Programs Will Best Set Me Up for a Career in AI Safety?

Jason Zeng19 Feb 2025 13:22 UTC

4 points

0 comments1 min readEA link

Data collection for AI alignment—Career review

Benjamin Hilton3 Jun 2022 11:44 UTC

34 points

1 comment5 min readEA link

(80000hours.org)

My P(doom) is 2.76%. Here’s Why.

Liam Robins12 Jun 2025 22:29 UTC

55 points

11 comments20 min readEA link

(thelimestack.substack.com)

What AI Safety Materials Do ML Researchers Find Compelling?

Vael Gates28 Dec 2022 2:03 UTC

130 points

12 comments2 min readEA link

DeepMind: Model evaluation for extreme risks

Zach Stein-Perlman25 May 2023 3:00 UTC

49 points

3 comments1 min readEA link

(arxiv.org)

Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI

Steven Byrnes8 May 2025 21:11 UTC

8 points

1 comment18 min readEA link

The Control Problem: Unsolved or Unsolvable?

Remmelt2 Jun 2023 15:42 UTC

4 points

9 comments13 min readEA link

[Creative Writing Contest] The Puppy Problem

Louis13 Oct 2021 14:01 UTC

13 points

0 comments7 min readEA link

Forecasting Through Fiction

Yitz6 Jul 2022 5:23 UTC

8 points

3 comments8 min readEA link

(www.lesswrong.com)

AGI and Lock-In

Lukas Finnveden29 Oct 2022 1:56 UTC

154 points

20 comments10 min readEA link

(www.forethought.org)

Announcement: there are now monthly coordination calls for AIS fieldbuilders in Europe

gergo22 Nov 2024 10:30 UTC

32 points

0 comments1 min readEA link

AGI Isn’t Close—Future Fund Worldview Prize

Toni MUENDEL18 Dec 2022 16:03 UTC

−8 points

24 comments13 min readEA link

[Opzionale] Approfondimenti sui rischi dell’IA (materiali in inglese)

EA Italy18 Jan 2023 11:16 UTC

1 point

0 comments2 min readEA link

How The EthiSizer Almost Broke `Story’

Velikovsky_of_Newcastle8 May 2023 16:58 UTC

1 point

0 comments5 min readEA link

AI is advancing fast

Vishakha Agrawal23 Apr 2025 11:04 UTC

2 points

2 comments2 min readEA link

(aisafety.info)

#185 – The 7 most promising ways to end factory farming, and whether AI is going to be good or bad for animals (Lewis Bollard on the 80,000 Hours Podcast)

80000_Hours30 Apr 2024 17:20 UTC

63 points

0 comments15 min readEA link

AISN #33: Reassessing AI and Biorisk Plus, Consolidation in the Corporate AI Landscape, and National Investments in AI

Center for AI Safety12 Apr 2024 16:11 UTC

19 points

0 comments9 min readEA link

(newsletter.safe.ai)

An audio version of the alignment problem from a deep learning perspective by Richard Ngo Et Al

Miguel3 Feb 2023 19:32 UTC

18 points

0 comments1 min readEA link

(www.whitehatstoic.com)

Lab Collaboration on AI Safety Best Practices

amta17 Mar 2024 12:20 UTC

3 points

0 comments20 min readEA link

Please Donate to CAIP (Post 1 of 7 on AI Governance)

Jason Green-Lowe7 May 2025 18:15 UTC

136 points

25 comments33 min readEA link

6 Year Decrease of Metaculus AGI Prediction

Chris Leong12 Apr 2022 5:36 UTC

40 points

6 comments1 min readEA link

Making of #IAN

kirchner.jan29 Aug 2021 16:24 UTC

9 points

0 comments1 min readEA link

(universalprior.substack.com)

Summary of the AI Bill of Rights and Policy Implications

Tristan Williams20 Jun 2023 9:28 UTC

16 points

0 comments22 min readEA link

SERI ML Alignment Theory Scholars Program 2022

Ryan Kidd27 Apr 2022 16:33 UTC

57 points

2 comments3 min readEA link

[Link post] AI could fuel factory farming—or end it

BrianK18 Oct 2022 11:16 UTC

39 points

0 comments1 min readEA link

(www.fastcompany.com)

Paul Christiano: Current work in AI alignment

EA Global3 Apr 2020 7:06 UTC

80 points

4 comments24 min readEA link

(www.youtube.com)

Nick Bostrom’s new book, “Deep Utopia”, is out today

peterhartree27 Mar 2024 11:23 UTC

105 points

6 comments1 min readEA link

(nickbostrom.com)

Intellectual Diversity in AI Safety

KR22 Jul 2020 19:07 UTC

21 points

8 comments3 min readEA link

Case study: Safety standards on California utilities to prevent wildfires

Coby Joseph6 Dec 2023 10:32 UTC

7 points

1 comment26 min readEA link

2024 State of AI Regulatory Landscape

Deric Cheng28 May 2024 12:00 UTC

12 points

1 comment2 min readEA link

(www.convergenceanalysis.org)

Joe Rogan on AI Safety

Michael_2358 🔸25 Jun 2025 1:14 UTC

18 points

0 comments1 min readEA link

I am a Memoryless System

Nicholas Kross23 Oct 2022 17:36 UTC

4 points

0 comments9 min readEA link

(www.thinkingmuchbetter.com)

Exploring Tacit Linked Premises with GPT

RomeoStevens24 Mar 2023 22:50 UTC

5 points

0 comments3 min readEA link

Performance of Large Language Models (LLMs) in Complex Analysis: A Benchmark of Mathematical Competence and its Role in Decision Making.

Jaime Esteban Montenegro Barón6 May 2025 21:08 UTC

1 point

0 comments23 min readEA link

[US time] Infosec: What even is zero trust?

Jarrah21 Jun 2024 18:11 UTC

2 points

0 comments1 min readEA link

Clarifying and predicting AGI

richard_ngo4 May 2023 15:56 UTC

69 points

2 comments4 min readEA link

OpenAI’s Preparedness Framework: Praise & Recommendations

Akash2 Jan 2024 16:20 UTC

16 points

1 comment7 min readEA link

Policy ideas for mitigating AI risk

Thomas Larsen16 Sep 2023 10:31 UTC

121 points

16 comments10 min readEA link

[Question] Open-source AI safety projects?

defun 🔸29 Jan 2024 10:09 UTC

8 points

2 comments1 min readEA link

Lead, Own, Share: Sovereign Wealth Funds for Transformative AI

Matrice Jacobine14 Jul 2025 9:34 UTC

24 points

0 comments1 min readEA link

(www.convergenceanalysis.org)

Scaling Wargaming for Global Catastrophic Risks with AI

rai18 Jan 2025 15:07 UTC

73 points

1 comment4 min readEA link

(blog.sentinel-team.org)

Catastrophic Risks from AI #2: Malicious Use

Dan H22 Jun 2023 17:10 UTC

19 points

0 comments17 min readEA link

(arxiv.org)

AI Benchmarks Series — Metaculus Questions on Evaluations of AI Models Against Technical Benchmarks

christian27 Mar 2024 23:05 UTC

10 points

0 comments1 min readEA link

(www.metaculus.com)

Against Agents as an Approach to Aligned Transformative AI

𝕮𝖎𝖓𝖊𝖗𝖆27 Dec 2022 0:47 UTC

4 points

0 comments2 min readEA link

We are already in a persuasion-transformed world and must take precautions

trevor14 Nov 2023 15:53 UTC

1 point

0 comments6 min readEA link

The trajectory of the future could soon get set in stone

William_MacAskill11 Aug 2025 11:04 UTC

37 points

1 comment3 min readEA link

Conscious AI: Will we know it when we see it? [Conscious AI & Public Perception]

ixex4 Jul 2024 20:30 UTC

13 points

1 comment12 min readEA link

Announcing Cavendish Labs

dyusha19 Jan 2023 20:00 UTC

112 points

6 comments2 min readEA link

Why I think strong general AI is coming soon

porby28 Sep 2022 6:55 UTC

14 points

1 comment34 min readEA link

AI Risk is like Terminator; Stop Saying it’s Not

skluug8 Mar 2022 19:17 UTC

191 points

43 comments10 min readEA link

(skluug.substack.com)

[Question] Academic AI Safety/Alignment Reading List

Zak_H21 Nov 2023 14:19 UTC

6 points

1 comment1 min readEA link

“We can Prevent AI Disaster Like We Prevented Nuclear Catastrophe”

Peter23 Sep 2023 20:36 UTC

15 points

1 comment1 min readEA link

(time.com)

What is malevolence? On the nature, measurement, and distribution of dark traits

David_Althaus23 Oct 2024 8:41 UTC

107 points

6 comments52 min readEA link

The possibility of an indefinite AI pause

Matthew_Barnett19 Sep 2023 12:28 UTC

90 points

73 comments15 min readEA link

Don’t leave your fingerprints on the future

So8res8 Oct 2022 0:35 UTC

95 points

4 comments4 min readEA link

[Question] Would people on this site be interested in hearing about efforts to make an “ethics calculator” for an AGI?

Sean Sweeney5 Mar 2024 9:28 UTC

1 point

0 comments1 min readEA link

(Linkpost) METR: Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity

Yadav11 Jul 2025 8:58 UTC

37 points

2 comments2 min readEA link

(metr.org)

AI Safety Bounties

PatrickL24 Aug 2023 14:30 UTC

37 points

2 comments7 min readEA link

(rethinkpriorities.org)

#204 – Making sense of SBF, and his biggest critiques of effective altruism (Nate Silver on The 80,000 Hours Podcast)

80000_Hours17 Oct 2024 20:41 UTC

22 points

2 comments14 min readEA link

AI is not taking over material science (for now): an analysis and conference report

titotal11 Mar 2025 12:01 UTC

59 points

16 comments25 min readEA link

(open.substack.com)

“AI” is an indexical

TW1233 Jan 2023 22:00 UTC

23 points

2 comments6 min readEA link

(aiwatchtower.substack.com)

Apply now to Human-aligned AI Summer School 2025

Pivocajs6 Jun 2025 19:34 UTC

8 points

1 comment2 min readEA link

Election by Jury: A Neglected Target for Effective Altruism

ClayShentrup27 Jan 2025 7:27 UTC

16 points

10 comments6 min readEA link

I read every major AI lab’s safety plan so you don’t have to

sarahhw16 Dec 2024 14:12 UTC

68 points

2 comments11 min readEA link

(longerramblings.substack.com)

LessWrong is now a book, available for pre-order!

terraform4 Dec 2020 20:42 UTC

48 points

1 comment7 min readEA link

AISN #32: Measuring and Reducing Hazardous Knowledge in LLMs Plus, Forecasting the Future with LLMs, and Regulatory Markets

Center for AI Safety7 Mar 2024 16:37 UTC

15 points

2 comments8 min readEA link

(newsletter.safe.ai)

State of EA Poland and funding opportunity

Chris Szulc7 Dec 2024 8:48 UTC

72 points

4 comments11 min readEA link

New Working Paper Series of the Legal Priorities Project

Legal Priorities Project18 Oct 2021 10:30 UTC

60 points

0 comments9 min readEA link

Diagram with Commentary for AGI as an X-Risk

Jared Leibowich24 May 2023 22:27 UTC

21 points

4 comments8 min readEA link

An Update On The Campaign For AI Safety Dot Org

yanni kyriacos5 May 2023 0:19 UTC

26 points

4 comments1 min readEA link

Why I’m Not (Yet) A Full-Time Technical Alignment Researcher

Nicholas Kross25 May 2023 1:26 UTC

11 points

1 comment4 min readEA link

(www.thinkingmuchbetter.com)

Differential technology development: preprint on the concept

Hamish_Hobbs12 Sep 2022 13:52 UTC

65 points

0 comments2 min readEA link

The first AI Safety Camp & onwards

Remmelt7 Jun 2018 18:49 UTC

25 points

2 comments8 min readEA link

EA’s brain-over-body bias, and the embodied value problem in AI alignment

Geoffrey Miller21 Sep 2022 18:55 UTC

45 points

3 comments25 min readEA link

Disagreement with bio anchors that lead to shorter timelines

mariushobbhahn16 Nov 2022 14:40 UTC

85 points

1 comment7 min readEA link

Help me to understand AI alignment!

britomart18 Jan 2023 9:13 UTC

3 points

12 comments1 min readEA link

Metaculus Presents — View From the Enterprise Suite: How Applied AI Governance Works Today

christian20 Jun 2023 22:24 UTC

4 points

0 comments1 min readEA link

4 Key Assumptions in AI Safety

Prometheus7 Nov 2022 10:50 UTC

5 points

0 comments7 min readEA link

[Question] Please Share Your Perspectives on the Degree of Societal Impact from Transformative AI Outcomes

Kiliank15 Apr 2022 1:23 UTC

3 points

3 comments1 min readEA link

Worldview iPeople—Future Fund’s AI Worldview Prize

Toni MUENDEL28 Oct 2022 7:37 UTC

0 points

5 comments9 min readEA link

Japan AI Alignment Conference

ChrisScammell10 Mar 2023 9:23 UTC

17 points

2 comments1 min readEA link

(www.conjecture.dev)

Where does Responsible Capabilities Scaling take AI governance?

ZacRichardson9 Jun 2024 22:25 UTC

17 points

1 comment16 min readEA link

Interviews with 97 AI Researchers: Quantitative Analysis

Maheen Shermohammed2 Feb 2023 4:50 UTC

76 points

4 comments7 min readEA link

We’re Not Advertising Enough (Post 3 of 7 on AI Governance)

Jason Green-Lowe22 May 2025 17:11 UTC

47 points

3 comments28 min readEA link

AI data gaps could lead to ongoing Animal Suffering

Darkness8i817 Oct 2024 10:52 UTC

13 points

3 comments5 min readEA link

AI Safety Collab 2025 Summer—Local Organizer Sign-ups Open

Evander H. 🔸25 Jun 2025 14:41 UTC

12 points

0 comments1 min readEA link

[Question] Am I taking crazy pills? Why aren’t EAs advocating for a pause on AI capabilities?

yanni kyriacos15 Aug 2023 23:29 UTC

18 points

21 comments1 min readEA link

Fictional Catastrophes, Reel Lessons: What 12 Critically Acclaimed Films Reveal About Surviving Global Catastrophes

Matt Boyd14 May 2025 19:07 UTC

6 points

1 comment1 min readEA link

(adaptresearchwriting.com)

“No-one in my org puts money in their pension”

tobyj16 Feb 2024 15:04 UTC

157 points

11 comments9 min readEA link

(seekingtobejolly.substack.com)

Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025)

Linda Linsefors23 Aug 2024 14:18 UTC

30 points

2 comments4 min readEA link

Why EAs are skeptical about AI Safety

Lukas Trötzmüller🔸18 Jul 2022 19:01 UTC

293 points

31 comments29 min readEA link

Superintelligent AI is necessary for an amazing future, but far from sufficient

So8res31 Oct 2022 21:16 UTC

35 points

5 comments34 min readEA link

Relationship between EA Community and AI safety

Tom Barnes🔸18 Sep 2023 13:49 UTC

157 points

15 comments1 min readEA link

#195 – Who’s trying to steal frontier AI models, and what they could do with them (Sella Nevo on the 80,000 Hours Podcast)

80000_Hours9 Aug 2024 14:45 UTC

29 points

0 comments11 min readEA link

Constructive Discussion and Thinking Methodology for Severe Situations including Existential Risks

Aino8 Jul 2023 0:04 UTC

1 point

0 comments7 min readEA link

Why I think it’s important to work on AI forecasting

Matthew_Barnett27 Feb 2023 21:24 UTC

179 points

10 comments10 min readEA link

Investigating Self-Preservation in LLMs: Experimental Observations

Makham27 Feb 2025 16:58 UTC

9 points

3 comments34 min readEA link

Impact of unemployment generated by Artifficial Intelligence on Gross Domestic Product

Valentina García Mesa1 May 2025 20:52 UTC

5 points

0 comments28 min readEA link

Focus on the places where you feel shocked everyone’s dropping the ball

So8res2 Feb 2023 0:27 UTC

92 points

6 comments4 min readEA link

[Question] How long does it take to undersrand AI X-Risk from scratch so that I have a confident, clear mental model of it from first principles?

Jordan Arel27 Jul 2022 16:58 UTC

29 points

6 comments1 min readEA link

Student project for engaging with AI alignment

Per Ivar Friborg9 May 2022 10:44 UTC

35 points

1 comment1 min readEA link

A Timing Problem for Instrumental Convergence

Rhyss31 Jul 2025 9:39 UTC

18 points

0 comments1 min readEA link

(link.springer.com)

“How to Escape from the Simulation”—Seeds of Science call for reviewers

rogersbacon126 Jan 2023 15:12 UTC

7 points

0 comments1 min readEA link

Against Anonymous Hit Pieces

Anti-Omega18 Jun 2023 19:36 UTC

−25 points

3 comments1 min readEA link

Reflections on the PIBBSS Fellowship 2022

nora11 Dec 2022 22:03 UTC

69 points

4 comments18 min readEA link

Updates from Campaign for AI Safety

Jolyn Khoo7 Aug 2023 6:09 UTC

32 points

2 comments2 min readEA link

(www.campaignforaisafety.org)

Important, actionable research questions for the most important century

Holden Karnofsky24 Feb 2022 16:34 UTC

299 points

13 comments19 min readEA link

[Question] Should the EA community have a DL engineering fellowship?

PabloAMC 🔸24 Dec 2021 13:43 UTC

26 points

6 comments1 min readEA link

How Do AI Timelines Affect Existential Risk?

Stephen McAleese29 Aug 2022 17:10 UTC

2 points

0 comments23 min readEA link

(www.lesswrong.com)

A tentative dialogue with a Friendly-boxed-super-AGI on brain uploads

Ramiro12 May 2022 21:55 UTC

5 points

0 comments4 min readEA link

Publication decisions for large language models, and their impacts

Ben Cottier21 Dec 2022 13:50 UTC

14 points

0 comments16 min readEA link

How do AI agents work together when they can’t trust each other?

James-Sullivan6 Jun 2025 3:24 UTC

4 points

1 comment8 min readEA link

(open.substack.com)

5th IEEE International Conference on Artificial Intelligence Testing (AITEST 2023)

surabhi gupta12 Mar 2023 9:06 UTC

−5 points

0 comments1 min readEA link

How did you update on AI Safety in 2023?

Chris Leong23 Jan 2024 2:21 UTC

30 points

5 comments1 min readEA link

Systemic Cascading Risks: Relevance in Longtermism & Value Lock-In

Richard R2 Sep 2022 7:53 UTC

59 points

10 comments16 min readEA link

What are the differences between AGI, transformative AI, and superintelligence?

Vishakha Agrawal23 Jan 2025 10:11 UTC

12 points

0 comments3 min readEA link

(aisafety.info)

[Question] Ethical Considerations in regard to Outsourcing Labour Needs to the Global South

Nicole Mutung'a4 Oct 2023 9:18 UTC

13 points

5 comments1 min readEA link

Technical Risks of (Lethal) Autonomous Weapons Systems

Heramb Podar23 Oct 2024 20:43 UTC

5 points

0 comments1 min readEA link

(www.lesswrong.com)

Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost]

Akash1 Nov 2023 13:28 UTC

31 points

0 comments1 min readEA link

(www.ft.com)

Announcing AI safety Mentors and Mentees

mariushobbhahn23 Nov 2022 15:21 UTC

62 points

1 comment10 min readEA link

When Will We Spend Enough to Train Transformative AI

sn28 Mar 2023 0:41 UTC

3 points

0 comments9 min readEA link

Open call: AI Act Standard for Dev. Phase Risk Assessment

miller-max8 Dec 2023 19:57 UTC

5 points

1 comment1 min readEA link

$1,000 bounty for an AI Programme Lead recommendation

Cillian_14 Aug 2023 13:11 UTC

11 points

1 comment2 min readEA link

A pseudo mathematical formulation of direct work choice between two x-risks

Joseph Bloom11 Aug 2022 0:28 UTC

7 points

0 comments4 min readEA link

Survey—Psychological Impact of Long-Term AI Engagement

Manuela García17 Sep 2024 15:58 UTC

2 points

0 comments1 min readEA link

Epoch is hiring an ML Hardware Researcher

merilalama20 Jul 2023 19:08 UTC

29 points

0 comments4 min readEA link

(careers.rethinkpriorities.org)

Concern About the Intelligence Divide Due to AI

Soe Lin21 Aug 2024 9:53 UTC

17 points

1 comment2 min readEA link

Summary of Stuart Russell’s new book, “Human Compatible”

Rohin Shah19 Oct 2019 19:56 UTC

33 points

1 comment15 min readEA link

(www.alignmentforum.org)

[Question] Any Philosophy PhD recommendations for students interested in Alignment Efforts?

rickyhuang.hexuan18 Jan 2023 5:54 UTC

7 points

6 comments1 min readEA link

AI policy careers in the EU

Lauro Langosco11 Nov 2019 10:43 UTC

62 points

7 comments11 min readEA link

Well-Being Index (WBI): Redefining Societal Progress Together

Max Kusmierek1 Dec 2023 15:23 UTC

5 points

1 comment6 min readEA link

Will morally motivated actors steer us towards a near-best future?

William_MacAskill8 Aug 2025 18:29 UTC

46 points

9 comments4 min readEA link

[Question] Developing AI solutions for global health—Emmanuel Katto

EmmanuelKatto18 Jul 2024 6:41 UTC

0 points

0 comments1 min readEA link

ChatGPT is capable of cognitive empathy!

Miquel Banchs-Piqué (prev. mikbp)30 Mar 2023 20:42 UTC

3 points

0 comments1 min readEA link

(nonzero.substack.com)

Ai Salon: Trustworthy AI Futures #1

IanEisenberg2 May 2024 16:04 UTC

2 points

0 comments1 min readEA link

Critique of Superintelligence Part 4

James Fodor13 Dec 2018 5:14 UTC

4 points

2 comments4 min readEA link

Rational Animations’ intro to mechanistic interpretability

Writer14 Jun 2024 16:10 UTC

21 points

1 comment11 min readEA link

(youtu.be)

[Question] Where would I find the hardcore totalizing segment of EA?

Peter Berggren28 Dec 2023 9:16 UTC

16 points

22 comments1 min readEA link

Google could build a conscious AI in three months

Derek Shiller1 Oct 2022 13:24 UTC

16 points

22 comments7 min readEA link

Why consciousness matters

EdLopez24 May 2024 12:33 UTC

0 points

0 comments7 min readEA link

(medium.com)

An argument for accelerating international AI governance research (part 2)

MattThinks22 Aug 2023 22:40 UTC

3 points

0 comments10 min readEA link

[Question] What are the most pressing issues in short-term AI policy?

Eevee🔹14 Jan 2020 22:05 UTC

9 points

0 comments1 min readEA link

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi15 Apr 2022 8:59 UTC

47 points

0 comments4 min readEA link

Dubai EA Fellowship [4 − 18 May]

rahulxyz19 Apr 2023 20:06 UTC

7 points

2 comments4 min readEA link

The ultimate goal

Alvin Ånestrand6 Jul 2025 15:13 UTC

4 points

2 comments5 min readEA link

(forecastingaifutures.substack.com)

[Question] AI safety milestones?

Zach Stein-Perlman23 Jan 2023 21:30 UTC

6 points

0 comments1 min readEA link

A Map to Navigate AI Governance

hanadulset14 Feb 2022 22:41 UTC

73 points

11 comments25 min readEA link

Animal advocates should campaign to restrict AI precision livestock farming

Zachary Brown🔸17 Jun 2024 15:27 UTC

38 points

6 comments15 min readEA link

(beforeporcelain.substack.com)

Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]

Writer19 Aug 2023 17:29 UTC

33 points

0 comments2 min readEA link

(youtu.be)

How Roodman’s GWP model translates to TAI timelines

kokotajlod16 Nov 2020 14:11 UTC

22 points

0 comments2 min readEA link

Pre-Announcing the 2023 Open Philanthropy AI Worldviews Contest

Jason Schukraft21 Nov 2022 21:45 UTC

291 points

26 comments1 min readEA link

[Question] What are people’s thoughts on working for DeepMind as a general software engineer?

Max Pietsch23 Sep 2022 17:13 UTC

9 points

4 comments1 min readEA link

Followup on Terminator

skluug12 Mar 2022 1:11 UTC

32 points

0 comments9 min readEA link

(skluug.substack.com)

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz23 Sep 2022 5:08 UTC

49 points

7 comments5 min readEA link

GPT-3-like models are now much easier to access and deploy than to develop

Ben Cottier21 Dec 2022 13:49 UTC

22 points

3 comments19 min readEA link

Tetherware #1: The case for humanlike AI with free will

Jáchym Fibír30 Jan 2025 11:57 UTC

−3 points

2 comments10 min readEA link

(tetherware.substack.com)

[Question] Why aren’t you freaking out about OpenAI? At what point would you start?

AppliedDivinityStudies10 Oct 2021 13:06 UTC

80 points

22 comments2 min readEA link

Encultured AI, Part 2: Providing a Service

Andrew Critch11 Aug 2022 20:13 UTC

10 points

0 comments3 min readEA link

An AI-vs-AI debate tool to surface strong arguments and test LLM bias

learningThroughDebate19 Jun 2025 18:21 UTC

4 points

0 comments1 min readEA link

[Question] What are the arguments that support China building AGI+ if Western companies delay/pause AI development?

DMMF29 Mar 2023 18:53 UTC

32 points

9 comments1 min readEA link

Crises Reveal Centralisation (Stefan Schubert)

Will Howard🔹10 May 2023 9:45 UTC

9 points

0 comments1 min readEA link

(web.archive.org)

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res15 Jun 2022 14:19 UTC

53 points

2 comments10 min readEA link

Eli Lifland on Navigating the AI Alignment Landscape

Ozzie Gooen1 Feb 2023 0:07 UTC

48 points

9 comments31 min readEA link

(quri.substack.com)

Long-Term Future Fund: April 2019 grant recommendations

Habryka [Deactivated]23 Apr 2019 7:00 UTC

142 points

242 comments47 min readEA link

Assessing the state of AI R&D in the US, China, and Europe – Part 1: Output indicators

stefan.torges1 Nov 2019 14:41 UTC

21 points

0 comments14 min readEA link

[Question] What is the most convincing article, video, etc. making the case that AI is an X-Risk

Jordan Arel11 Jul 2023 20:32 UTC

4 points

7 comments1 min readEA link

Exploring AI Safety through “Escape Experiment”: A Short Film on Superintelligence Risks

Gaetan_Selle 🔷10 Nov 2024 4:42 UTC

4 points

0 comments2 min readEA link

AI alignment researchers don’t (seem to) stack

So8res21 Feb 2023 0:48 UTC

47 points

3 comments3 min readEA link

Retrospective on recent activity of Riesgos Catastróficos Globales

Jaime Sevilla1 May 2023 18:35 UTC

45 points

0 comments5 min readEA link

[Question] Does the idea of AGI that benevolently control us appeal to EA folks?

Noah Scales16 Jul 2022 19:17 UTC

6 points

20 comments1 min readEA link

Four part playbook for dealing with AI (Holden Karnofsky on the 80,000 Hours Podcast)

80000_Hours2 Aug 2023 11:56 UTC

9 points

1 comment19 min readEA link

(80000hours.org)

3 levels of threat obfuscation

Holden Karnofsky2 Aug 2023 17:09 UTC

31 points

0 comments6 min readEA link

(www.alignmentforum.org)

AI Risk: Increasing Persuasion Power

kewlcats3 Aug 2020 20:25 UTC

4 points

0 comments1 min readEA link

Center on Long-Term Risk: Summer Research Fellowship 2025

Center on Long-Term Risk26 Mar 2025 17:28 UTC

44 points

0 comments1 min readEA link

(longtermrisk.org)

[linkpost] AI NOW Institute’s 2023 Annual Report & Roadmap

Tristan Williams12 Apr 2023 20:00 UTC

9 points

0 comments2 min readEA link

(ainowinstitute.org)

Apply now for the EU Tech Policy Fellowship 2023

Jan-Willem11 Nov 2022 6:16 UTC

64 points

1 comment5 min readEA link

Antitrust-Compliant AI Industry Self-Regulation

Cullen 🔸7 Jul 2020 20:52 UTC

26 points

1 comment1 min readEA link

(cullenokeefe.com)

Open Problems in AI X-Risk [PAIS #5]

TW12310 Jun 2022 2:22 UTC

44 points

1 comment36 min readEA link

The AI revolution and international politics (Allan Dafoe)

EA Global2 Jun 2017 8:48 UTC

8 points

0 comments18 min readEA link

(www.youtube.com)

[Question] How much EA analysis of AI safety as a cause area exists?

richard_ngo6 Sep 2019 11:15 UTC

94 points

20 comments2 min readEA link

Is RLHF cruel to AI?

Hzn16 Dec 2024 14:01 UTC

−1 points

2 comments3 min readEA link

Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs

Matrice Jacobine12 Feb 2025 9:15 UTC

13 points

0 comments1 min readEA link

(www.emergent-values.ai)

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_c13 Mar 2022 10:45 UTC

49 points

4 comments7 min readEA link

Summary: Existential risk from power-seeking AI by Joseph Carlsmith

rileyharris28 Oct 2023 15:05 UTC

11 points

0 comments6 min readEA link

(www.millionyearview.com)

Join a ‘learning by writing’ group

Jordan Pieters 🔸26 Apr 2023 11:36 UTC

26 points

1 comment1 min readEA link

AI Safety Collab 2025 - Feedback on Plans & Expression of Interest

Evander H. 🔸7 Jan 2025 16:41 UTC

28 points

2 comments1 min readEA link

International cooperation as a tool to reduce two existential risks.

johl@umich.edu19 Apr 2021 16:51 UTC

28 points

4 comments23 min readEA link

On the abolition of man

Joe_Carlsmith18 Jan 2024 18:17 UTC

71 points

4 comments41 min readEA link

How ‘Human-Human’ dynamics give way to ‘Human-AI’ and then ‘AI-AI’ dynamics

Remmelt27 Dec 2022 3:16 UTC

4 points

0 comments2 min readEA link

(mflb.com)

AMA: Future of Life Institute’s EU Team

Risto Uuk31 Jan 2022 17:14 UTC

44 points

15 comments2 min readEA link

A Playbook for AI Risk Reduction (focused on misaligned AI)

Holden Karnofsky6 Jun 2023 18:05 UTC

81 points

17 comments14 min readEA link

Why you are not motivated to work on AI safety

MountainPath25 Oct 2024 16:12 UTC

7 points

5 comments1 min readEA link

Timaeus is hiring researchers & engineers

Tatiana K. Nesic Skuratova27 Jan 2025 14:35 UTC

19 points

0 comments4 min readEA link

NIST AI Risk Management Framework request for information (RFI)

Aryeh Englander31 Aug 2021 22:24 UTC

7 points

0 comments2 min readEA link

What’s Happening in Australia

Bradley Tjandra7 Nov 2022 1:03 UTC

105 points

4 comments13 min readEA link

Your group needs all the help it can get (FBB #1)

gergo7 Jan 2025 16:42 UTC

44 points

6 comments4 min readEA link

More to explore on ‘Risks from Artificial Intelligence’

EA Handbook15 Jul 2022 23:00 UTC

10 points

3 comments2 min readEA link

Absolute Zero: AlphaZero for LLM

alapmi12 May 2025 14:54 UTC

2 points

0 comments1 min readEA link

Google’s ethics is alarming

len.hoang.lnh25 Feb 2021 5:57 UTC

6 points

5 comments1 min readEA link

AI Benefits Post 1: Introducing “AI Benefits”

Cullen 🔸22 Jun 2020 16:58 UTC

10 points

2 comments3 min readEA link

The Answer Is in the Question: Prompt Engineering in the Age of AI

Rodo30 May 2025 18:11 UTC

1 point

0 comments4 min readEA link

Shortlived sentience/consciousness

Martin (Huge) Vlach1 Jul 2024 13:59 UTC

2 points

2 comments1 min readEA link

Allan Dafoe: Preparing for AI — risks and opportunities

EA Global3 Nov 2017 7:43 UTC

7 points

0 comments1 min readEA link

(www.youtube.com)

AI ethics: the case for including animals (my first published paper, Peter Singer’s first on AI)

Fai12 Jul 2022 4:14 UTC

82 points

5 comments1 min readEA link

(link.springer.com)

Establishing Oxford’s AI Safety Student Group: Lessons Learnt and Our Model

Wilkin123421 Sep 2022 7:57 UTC

73 points

3 comments1 min readEA link

At Our World in Data we’re hiring our first Communications & Outreach Manager

Charlie Giattino13 Oct 2023 13:12 UTC

25 points

0 comments1 min readEA link

(ourworldindata.org)

Slides: Potential Risks From Advanced AI

Aryeh Englander28 Apr 2022 2:18 UTC

9 points

0 comments1 min readEA link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson11 Dec 2021 1:16 UTC

12 points

0 comments42 min readEA link

Joscha Bach on Synthetic Intelligence [annotated]

Roman Leventov2 Mar 2023 11:21 UTC

8 points

0 comments9 min readEA link

(www.jimruttshow.com)

[Question] AI consciousness & moral status: What do the experts think?

Jay Luong6 Jul 2024 15:27 UTC

0 points

3 comments1 min readEA link

Introducing SB53.info

Miles Kodama25 Jul 2025 9:42 UTC

47 points

4 comments7 min readEA link

[Question] How do I plan my life in a world with rapid AI development?

Oliver Kuperman10 Feb 2025 14:36 UTC

28 points

6 comments1 min readEA link

Anki deck for learning the main AI safety orgs, projects, and programs

Bryce Robertson29 Sep 2023 18:42 UTC

17 points

5 comments1 min readEA link

Mechanism Design for AI Safety—Agenda Creation Retreat

Rubi J. Hudson10 Feb 2023 3:05 UTC

21 points

1 comment1 min readEA link

Questions about AI that bother me

Eleni_A31 Jan 2023 6:50 UTC

33 points

6 comments2 min readEA link

[Question] I have recently been interested in robotics, particularly in for-profit startups. I think they can help increase food production and help reduce improve healthcare. Would this fall under AI for social good? How impactful will robotics be to society? How large is the counterfactual?

Isaac Benson2 Jan 2022 5:38 UTC

4 points

3 comments1 min readEA link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations

Ozzie Gooen28 Oct 2024 21:37 UTC

11 points

3 comments15 min readEA link

LPP Summer Research Fellowship in Law & AI 2023: Applications Open

Legal Priorities Project20 Jun 2023 14:31 UTC

43 points

4 comments4 min readEA link

AI Value Alignment Speaker Series Presented By EA Berkeley

Mahendra Prasad1 Mar 2022 6:17 UTC

2 points

0 comments1 min readEA link

How teams went about their research at AI Safety Camp edition 8

Remmelt9 Sep 2023 16:34 UTC

13 points

1 comment13 min readEA link

Launching Foresight Institute’s AI Grant for Underexplored Approaches to AI Safety – Apply for Funding!

elteerkers17 Aug 2023 7:27 UTC

48 points

0 comments2 min readEA link

MATS Spring 2024 Extension Retrospective

HenningB16 Feb 2025 20:29 UTC

13 points

0 comments15 min readEA link

(www.lesswrong.com)

Join ASAP (AI Safety Accountability Programme)

Callum McDougall10 Sep 2022 11:15 UTC

54 points

20 comments3 min readEA link

#200 – What superforecasters and experts think about existential risks (Ezra Karger on The 80,000 Hours Podcast)

80000_Hours6 Sep 2024 17:53 UTC

12 points

2 comments14 min readEA link

AGISF adaptation for in-person groups

Sam Marks17 Jan 2023 18:33 UTC

30 points

0 comments3 min readEA link

(www.lesswrong.com)

Forethought: A new AI macrostrategy group

Amrit Sidhu-Brar 🔸11 Mar 2025 15:36 UTC

170 points

10 comments3 min readEA link

Summaries: Alignment Fundamentals Curriculum

Leon Lang19 Sep 2022 15:43 UTC

25 points

1 comment1 min readEA link

(docs.google.com)

13 background claims about EA

Akash7 Sep 2022 3:54 UTC

70 points

16 comments3 min readEA link

Would anyone here know how to get ahold of … iunno Anthropic and Open Philanthropy? I think they are going to want to have a chat (Please don’t make me go to OpenAI with this. Not even a threat, seriously. They just partner with my alma mater and are the only in I have. I genuinely do not want to and I need your help).

Anti-Golem9 Jun 2025 13:59 UTC

−11 points

0 comments1 min readEA link

Invitation to an IRL retreat on AI x-risks & post-rationality at Ooty, India

bhishma8 Jun 2025 14:05 UTC

2 points

0 comments1 min readEA link

Research + Reality Graphing to Support AI Policy (and more): Summary of a Frozen Project

Marcel22 Jul 2022 20:58 UTC

34 points

2 comments8 min readEA link

(Applications Open!) UChicago XLab Summer Research Fellowship 2024

ZacharyRudolph26 Feb 2024 18:20 UTC

15 points

0 comments4 min readEA link

(xrisk.uchicago.edu)

Observatorio de Riesgos Catastróficos Globales (ORCG) Recap 2023

JorgeTorresC14 Dec 2023 14:27 UTC

75 points

0 comments3 min readEA link

(riesgoscatastroficosglobales.com)

[Link and commentary] Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society

MichaelA🔸14 Mar 2020 9:04 UTC

18 points

0 comments6 min readEA link

Carl Shulman on AI takeover mechanisms (& more): Part II of Dwarkesh Patel interview for The Lunar Society

alejandro25 Jul 2023 18:31 UTC

28 points

0 comments5 min readEA link

(www.dwarkeshpatel.com)

OpenAI showcase live on openai.com

Amateur Systems Analyst10 May 2024 17:55 UTC

2 points

0 comments1 min readEA link

Exponential AI takeoff is a myth

Christoph Hartmann 🔸31 May 2023 11:47 UTC

47 points

11 comments9 min readEA link

Decentralized Historical Data Preservation and Why EA Should Care

Sasha22 Mar 2024 10:09 UTC

2 points

0 comments3 min readEA link

Why do we post our AI safety plans on the Internet?

Peter S. Park31 Oct 2022 16:27 UTC

15 points

22 comments11 min readEA link

Are you really in a race? The Cautionary Tales of Szilárd and Ellsberg

HaydnBelfield19 May 2022 8:42 UTC

493 points

44 comments18 min readEA link

OpenAI is starting a new “Superintelligence alignment” team and they’re hiring

alejandro5 Jul 2023 18:27 UTC

100 points

16 comments1 min readEA link

(openai.com)

Racing through a minefield: the AI deployment problem

Holden Karnofsky31 Dec 2022 21:44 UTC

79 points

1 comment13 min readEA link

(www.cold-takes.com)

E.A. Megaproject Ideas

Tomer_Goloboy21 Mar 2022 1:23 UTC

15 points

4 comments4 min readEA link

From Comfort Zone to Frontiers of Impact: Pursuing A Late-Career Shift to Existential Risk Reduction

Jim Chapman4 Mar 2025 21:28 UTC

239 points

12 comments10 min readEA link

What will the first human-level AI look like, and how might things go wrong?

EuanMcLean23 May 2024 11:28 UTC

12 points

1 comment15 min readEA link

Resilience Via Fragmented Power

steve632014 Jul 2022 15:37 UTC

2 points

0 comments6 min readEA link

Manifund 2025 Regrants

Austin22 Apr 2025 17:36 UTC

28 points

0 comments5 min readEA link

(manifund.substack.com)

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

TW12330 May 2022 20:37 UTC

33 points

1 comment25 min readEA link

Concrete open problems in mechanistic interpretability: a technical overview

Neel Nanda6 Jul 2023 11:35 UTC

27 points

1 comment29 min readEA link

AI-Safety Mexico: A Pilot Survey in Yucatán.

Janeth Valdivia28 May 2025 23:19 UTC

5 points

1 comment5 min readEA link

Can GPT-3 produce new ideas? Partially automating Robin Hanson and others

NunoSempere16 Jan 2023 15:05 UTC

82 points

6 comments10 min readEA link

The “no sandbagging on checkable tasks” hypothesis

Joe_Carlsmith31 Jul 2023 23:13 UTC

16 points

0 comments9 min readEA link

[Question] Recommendations for non-technical books on AI?

Joseph12 Jul 2022 23:23 UTC

8 points

11 comments1 min readEA link

There have been 3 planes (billionaire donors) and 2 have crashed

trevor117 Dec 2022 3:38 UTC

4 points

5 comments2 min readEA link

Transformative AI and Scenario Planning for AI X-risk

Elliot Mckernon22 Mar 2024 11:44 UTC

14 points

1 comment8 min readEA link

Meta: Frontier AI Framework

Zach Stein-Perlman3 Feb 2025 22:00 UTC

23 points

0 comments1 min readEA link

(ai.meta.com)

Are We Ready for Digital Persons?

Alex (Αλέξανδρος)3 Jun 2025 9:38 UTC

3 points

0 comments1 min readEA link

(www.linkedin.com)

aisafety.community—A living document of AI safety communities

zeshen🔸20 Oct 2022 22:08 UTC

24 points

13 comments1 min readEA link

Presumptive Listening: sticking to familiar concepts and missing the outer reasoning paths

Remmelt27 Dec 2022 15:40 UTC

3 points

0 comments2 min readEA link

(mflb.com)

[Question] What would you do if you had a lot of money/power/influence and you thought that AI timelines were very short?

Greg_Colbourn ⏸️ 12 Nov 2021 21:59 UTC

29 points

8 comments1 min readEA link

AISN #12: Policy Proposals from NTIA’s Request for Comment and Reconsidering Instrumental Convergence

Center for AI Safety27 Jun 2023 15:25 UTC

30 points

3 comments7 min readEA link

(newsletter.safe.ai)

AI Existential Safety Fellowships

mmfli27 Oct 2023 12:14 UTC

15 points

1 comment1 min readEA link

Beyond Meta: Large Concept Models Will Win

Anthony Repetto30 Dec 2024 0:57 UTC

3 points

0 comments3 min readEA link

[Link post] Coordination challenges for preventing AI conflict

stefan.torges9 Mar 2021 9:39 UTC

58 points

0 comments1 min readEA link

(longtermrisk.org)

Safety-First Agents/Architectures Are a Promising Path to Safe AGI

Brendon_Wong6 Aug 2023 8:00 UTC

6 points

0 comments12 min readEA link

Calling London AI Researchers!

prince6 Aug 2025 19:21 UTC

1 point

0 comments1 min readEA link

AI Safety groups should imitate career development clubs

Joshc9 Nov 2022 23:48 UTC

95 points

5 comments2 min readEA link

Expected impact of a career in AI safety under different opinions

Jordan Taylor14 Jun 2022 14:25 UTC

42 points

16 comments11 min readEA link

Law-Following AI 3: Lawless AI Agents Undermine Stabilizing Agreements

Cullen 🔸27 Apr 2022 17:20 UTC

28 points

3 comments3 min readEA link

[linkpost] Sharing powerful AI models: the emerging paradigm of structured access

ts20 Jan 2022 21:10 UTC

11 points

3 comments1 min readEA link

What are the differences between a singularity, an intelligence explosion, and a hard takeoff?

Vishakha Agrawal3 Apr 2025 10:34 UTC

6 points

0 comments2 min readEA link

(aisafety.info)

Could unions be an underrated driver for AI safety policy?

Dunning K.12 Jul 2023 13:21 UTC

23 points

6 comments1 min readEA link

Gwern on creating your own AI race and China’s Fast Follower strategy.

Larks25 Nov 2024 3:01 UTC

129 points

4 comments2 min readEA link

(www.lesswrong.com)

[Question] What considerations influence whether I have more influence over short or long timelines?

kokotajlod5 Nov 2020 19:57 UTC

18 points

0 comments1 min readEA link

Hallucinations May Be a Result of Models Not Knowing What They’re Actually Capable Of

Tyler Williams16 Aug 2025 0:26 UTC

1 point

0 comments2 min readEA link

[Question] What should I ask Ajeya Cotra — senior researcher at Open Philanthropy, and expert on AI timelines and safety challenges?

Robert_Wiblin28 Oct 2022 15:28 UTC

23 points

10 comments1 min readEA link

Artificial Intelligence Safety of Film Capacitors

yonxinzhang21 Nov 2023 11:51 UTC

−2 points

0 comments1 min readEA link

Notes on UK AISI Alignment Project

Pseudaemonia1 Aug 2025 10:37 UTC

25 points

0 comments1 min readEA link

‘Dissolving’ AI Risk – Parameter Uncertainty in AI Future Forecasting

Froolow18 Oct 2022 22:54 UTC

111 points

63 comments39 min readEA link

Biosecurity and AI: Risks and Opportunities

Center for AI Safety27 Feb 2024 18:46 UTC

7 points

2 comments7 min readEA link

(www.safe.ai)

Why some people believe in AGI, but I don’t.

cveres26 Oct 2022 3:09 UTC

13 points

2 comments4 min readEA link

Future Matters #3: digital sentience, AGI ruin, and forecasting track records

Pablo4 Jul 2022 17:44 UTC

70 points

2 comments19 min readEA link

Oxford Biosecurity Group: Fundraising and Plans for Early 2025

Lin BL20 Dec 2024 20:56 UTC

33 points

0 comments2 min readEA link

Shallow evaluations of longtermist organizations

NunoSempere24 Jun 2021 15:31 UTC

192 points

34 comments34 min readEA link

AI governance student hackathon on Saturday, April 23: register now!

mic12 Apr 2022 4:39 UTC

18 points

0 comments1 min readEA link

Ben Garfinkel: How sure are we about this AI stuff?

bgarfinkel9 Feb 2019 19:17 UTC

128 points

20 comments18 min readEA link

Recommendation to Apply ISIC and NAICS to AI Incident Database

Ben Turse21 Jul 2024 7:25 UTC

3 points

0 comments2 min readEA link

Conference Report: Threshold 2030 - Modeling AI Economic Futures

Deric Cheng24 Feb 2025 18:57 UTC

24 points

0 comments10 min readEA link

(www.convergenceanalysis.org)

OpenAI announces new members to board of directors

Will Howard🔹9 Mar 2024 11:27 UTC

47 points

12 comments2 min readEA link

(openai.com)

On value in humans, other animals, and AI

Michele Campolo31 Jan 2023 23:48 UTC

8 points

6 comments5 min readEA link

Announcing the GovAI Policy Team

MarkusAnderljung1 Aug 2022 22:46 UTC

107 points

11 comments2 min readEA link

Part 2: AI Safety Movement Builders should help the community to optimise three factors: contributors, contributions and coordination

PeterSlattery15 Dec 2022 22:48 UTC

34 points

0 comments6 min readEA link

Scenario Mapping Advanced AI Risk: Request for Participation with Data Collection

Kiliank27 Mar 2022 11:44 UTC

14 points

0 comments5 min readEA link

[Creative Writing Contest] An AI Safety Limerick

Ben_West🔸18 Oct 2021 19:11 UTC

21 points

5 comments1 min readEA link

[Link post] How plausible are AI Takeover scenarios?

SammyDMartin27 Sep 2021 13:03 UTC

26 points

0 comments1 min readEA link

USA/China Reconciliation a Necessity Because of AI/Tech Acceleration

bhrdwj🔸17 Apr 2025 13:13 UTC

1 point

7 comments7 min readEA link

AI Risk in Africa

Claude Formanek12 Oct 2021 2:28 UTC

20 points

0 comments10 min readEA link

National Security Is Not International Security: A Critique of AGI Realism

C.K.2 Feb 2025 17:04 UTC

44 points

2 comments36 min readEA link

(conradkunadu.substack.com)

The limits of black-box evaluations: two hypotheticals

TFD11 Apr 2025 20:52 UTC

1 point

0 comments4 min readEA link

(www.thefloatingdroid.com)

[Question] Why is “Argument Mapping” Not More Common in EA/Rationality (And What Objections Should I Address in a Post on the Topic?)

Marcel223 Dec 2022 21:55 UTC

15 points

5 comments1 min readEA link

AI safety scholarships look worth-funding (if other funding is sane)

anon-a19 Nov 2019 0:59 UTC

22 points

6 comments2 min readEA link

Carreras con Impacto: Outreach Results Among Latin American Students

SMalagon21 Aug 2024 5:10 UTC

34 points

4 comments3 min readEA link

[Question] Analogy of AI Alignment as Raising a Child?

Aaron_Scher19 Feb 2022 21:40 UTC

4 points

2 comments1 min readEA link

Geoffrey Hinton on the Past, Present, and Future of AI

Stephen McAleese12 Oct 2024 16:41 UTC

5 points

1 comment18 min readEA link

Potential Risks from Advanced AI

EA Global13 Aug 2017 7:00 UTC

9 points

0 comments18 min readEA link

How to do theoretical research, a personal perspective

Mark Xu19 Aug 2022 19:43 UTC

132 points

7 comments15 min readEA link

AI Safety field-building projects I’d like to see

Akash11 Sep 2022 23:45 UTC

31 points

4 comments6 min readEA link

(www.lesswrong.com)

[Question] How would a language model become goal-directed?

David M16 Jul 2022 14:50 UTC

113 points

20 comments1 min readEA link

Hortus AI is hiring for two intern roles

Thomas Krendl Gilbert30 Jul 2024 11:55 UTC

3 points

0 comments1 min readEA link

Persuasion Tools: AI takeover without AGI or agency?

kokotajlod20 Nov 2020 16:56 UTC

15 points

5 comments10 min readEA link

Understanding how hard alignment is may be the most important research direction right now

Aron7 Jun 2023 19:05 UTC

26 points

3 comments6 min readEA link

(coordinationishard.substack.com)

Does most of your impact come from what you do soon?

Joshc21 Feb 2023 5:12 UTC

38 points

1 comment5 min readEA link

Cybersecurity and AI: The Evolving Security Landscape

Center for AI Safety14 Mar 2024 20:14 UTC

9 points

0 comments12 min readEA link

(www.safe.ai)

Are corporations superintelligent?

Vishakha Agrawal17 Mar 2025 10:33 UTC

3 points

2 comments1 min readEA link

(aisafety.info)

The great energy descent (short version) - An important thing EA might have missed

CB🔸31 Aug 2022 21:50 UTC

68 points

94 comments10 min readEA link

Australians call for AI safety to be taken seriously

Alexander Saeri21 Jul 2023 1:16 UTC

51 points

1 comment1 min readEA link

Principles for AI Welfare Research

jeffsebo19 Jun 2023 11:30 UTC

138 points

16 comments13 min readEA link

Fractal Governance: A Tractable, Neglected Approach to Existential Risk Reduction

WillPearson5 Mar 2025 19:57 UTC

3 points

1 comment3 min readEA link

Introducing Collective Action for Existential Safety: 80+ actions individuals, organizations, and nations can take to improve our existential safety

James Norris5 Feb 2025 15:58 UTC

9 points

0 comments1 min readEA link

A Research Agenda for Psychology and AI

carter allen🔸28 Jun 2024 12:56 UTC

54 points

2 comments14 min readEA link

AI timelines and theoretical understanding of deep learning

Venky102412 Sep 2021 16:26 UTC

4 points

8 comments2 min readEA link

Adaptive Composable Cognitive Core Unit (ACCCU)

Ihor Ivliev20 Mar 2025 21:48 UTC

10 points

2 comments4 min readEA link

AGI Timelines in Governance: Different Strategies for Different Timeframes

simeon_c19 Dec 2022 21:31 UTC

110 points

19 comments10 min readEA link

7 essays on Building a Better Future

Jamie_Harris24 Jun 2022 14:28 UTC

21 points

0 comments2 min readEA link

What are some good books about AI safety?

Vishakha Agrawal17 Feb 2025 11:54 UTC

7 points

0 comments3 min readEA link

(aisafety.info)

An intervention to shape policy dialogue, communication, and AI research norms for AI safety

Lee_Sharkey1 Oct 2017 18:29 UTC

9 points

28 comments10 min readEA link

Give me career advice

sammyboiz🔸5 Jul 2024 8:48 UTC

6 points

10 comments1 min readEA link

The Defence production act and AI policy

Nathan_Barnard1 Mar 2024 14:23 UTC

15 points

0 comments2 min readEA link

[Question] What should I read about defining AI “hallucination?”

James-Hartree-Law23 Jan 2025 1:00 UTC

2 points

0 comments1 min readEA link

What a compute-centric framework says about AI takeoff speeds

Tom_Davidson23 Jan 2023 4:09 UTC

189 points

7 comments16 min readEA link

(www.lesswrong.com)

“A Paradigm for AI Consciousness”—Seeds of Science call for reviewers

rogersbacon115 May 2024 20:57 UTC

5 points

0 comments1 min readEA link

Language Agents Reduce the Risk of Existential Catastrophe

cdkg29 May 2023 9:59 UTC

29 points

6 comments26 min readEA link

[Question] What do we do if AI doesn’t take over the world, but still causes a significant global problem?

James_Banks2 Aug 2020 3:35 UTC

16 points

5 comments1 min readEA link

Is interest in alignment worth mentioning for grad school applications?

Franziska Fischer16 Oct 2022 4:50 UTC

5 points

4 comments1 min readEA link

New book: The Tango of Ethics: Intuition, Rationality and the Prevention of Suffering

jonleighton2 Jan 2023 8:45 UTC

115 points

3 comments5 min readEA link

At Our World in Data we’re hiring a Senior Full-stack Engineer

Charlie Giattino15 Dec 2023 15:51 UTC

16 points

0 comments1 min readEA link

(ourworldindata.org)

You won’t solve alignment without agent foundations

MikhailSamin6 Nov 2022 8:07 UTC

14 points

0 comments8 min readEA link

AI Girlfriends Won’t Matter Much

Maxwell Tabarrok23 Dec 2023 16:00 UTC

12 points

1 comment2 min readEA link

(maximumprogress.substack.com)

“Heretical Thoughts on AI” by Eli Dourado

𝕮𝖎𝖓𝖊𝖗𝖆19 Jan 2023 16:11 UTC

142 points

15 comments3 min readEA link

(www.elidourado.com)

What is compute governance?

Vishakha Agrawal23 Dec 2024 6:45 UTC

5 points

0 comments2 min readEA link

(aisafety.info)

Interview with Tom Chivers: “AI is a plausible existential risk, but it feels as if I’m in Pascal’s mugging”

felix.h21 Feb 2021 13:41 UTC

16 points

1 comment7 min readEA link

Very Briefly: The CHIPS Act

Yadav26 Feb 2023 13:53 UTC

40 points

3 comments1 min readEA link

(www.y1d2.com)

Baobao Zhang: How social science research can inform AI governance

EA Global22 Jan 2021 15:10 UTC

9 points

0 comments16 min readEA link

(www.youtube.com)

#173 – Digital minds, and how to avoid sleepwalking into a major moral catastrophe (Jeff Sebo on the 80,000 Hours Podcast)

80000_Hours29 Nov 2023 19:18 UTC

43 points

0 comments18 min readEA link

Open Philanthropy is hiring for multiple roles across our Global Catastrophic Risks teams

Open Philanthropy29 Sep 2023 23:24 UTC

177 points

6 comments3 min readEA link

Introducing spirit hazards

brb24327 May 2022 22:16 UTC

9 points

2 comments2 min readEA link

[Question] Are there any AI Safety labs that will hire self-taught ML engineers?

Tomer_Goloboy6 Apr 2022 23:32 UTC

5 points

12 comments1 min readEA link

[Question] Should I prove myself first by prestigious employers or go directly into the fields I want to end up in?

Sven Spehr26 Nov 2023 8:08 UTC

8 points

1 comment1 min readEA link

[Question] What are your recommendations for technical AI alignment podcasts?

Evan_Gaensbauer11 May 2022 21:52 UTC

13 points

4 comments1 min readEA link

So, What Exactly is a Fractional Consultant?

Deena Englander23 Jun 2025 16:03 UTC

11 points

0 comments3 min readEA link

The case for long-term corporate governance of AI

SethBaum3 Nov 2021 10:50 UTC

42 points

3 comments8 min readEA link

A beginner’s introduction to AI-driven biorisk: Large Language Models, Biological Design Tools, Information Hazards, and Biosecurity

NatKiilu3 May 2024 15:49 UTC

6 points

1 comment16 min readEA link

Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic

Akash20 Dec 2022 21:39 UTC

14 points

1 comment11 min readEA link

Introducing Generally Intelligent: an AI research lab focused on improved theoretical and pragmatic understanding

joshalbrecht21 Oct 2022 8:20 UTC

8 points

0 comments1 min readEA link

Mauhn Releases AI Safety Documentation

Berg Severens2 Jul 2021 12:19 UTC

4 points

2 comments1 min readEA link

Supplement to “The Brussels Effect and AI: How EU AI regulation will impact the global AI market”

MarkusAnderljung16 Aug 2022 20:55 UTC

109 points

7 comments8 min readEA link

Explained Simply: Quantilizers

brook8 Sep 2023 12:54 UTC

8 points

0 comments1 min readEA link

(aisafetyexplained.substack.com)

Berlin AI Safety Open Meetup July 2022

Isidor Regenfuß22 Jul 2022 16:26 UTC

1 point

0 comments1 min readEA link

CNAS report: ‘Artificial Intelligence and Arms Control’

MMMaas13 Oct 2022 8:35 UTC

16 points

0 comments1 min readEA link

(www.cnas.org)

Experimental platform for AI value formation — seeking collaborators

Freeman Dyson13 Aug 2025 14:34 UTC

1 point

0 comments1 min readEA link

Yudkowsky and Christiano discuss “Takeoff Speeds”

EliezerYudkowsky22 Nov 2021 19:42 UTC

42 points

0 comments60 min readEA link

AI may pursue goals

Vishakha Agrawal28 May 2025 12:04 UTC

2 points

0 comments1 min readEA link

Poll: the next existential catastrophe is likelier than not to wipe off all animal sentience from the planet

JoA🔸1 May 2025 18:49 UTC

18 points

7 comments1 min readEA link

Poll: To address risks from AI, Liability or Regulation?

TFD30 Apr 2025 22:03 UTC

6 points

0 comments1 min readEA link

Longtermism and shorttermism can disagree on nuclear war to stop advanced AI

David Johnston30 Mar 2023 23:22 UTC

2 points

0 comments1 min readEA link

You can run more than one fellowship per semester if you want to

gergo12 Dec 2023 8:49 UTC

6 points

1 comment3 min readEA link

[Question] Closing the Feedback Loop on AI Safety Research.

Ben.Hartley29 Jul 2022 21:46 UTC

3 points

4 comments1 min readEA link

SDG prompt challenge

chrisaiki2 Jun 2025 7:17 UTC

−9 points

0 comments1 min readEA link

Funding for work that builds capacity to address risks from transformative AI

GCR Capacity Building team (Open Phil)13 Aug 2024 13:13 UTC

40 points

1 comment5 min readEA link

What Happens If We Have Another AI Winter?

Ben Norman27 Jun 2025 14:11 UTC

5 points

0 comments3 min readEA link

(futuresonder.substack.com)

Is Democracy a Fad?

bgarfinkel13 Mar 2021 12:40 UTC

166 points

36 comments18 min readEA link

AE Studio @ SXSW: We need more AI consciousness research (and further resources)

AE Studio26 Mar 2024 21:15 UTC

15 points

0 comments3 min readEA link

Existential Anomaly Detected — Awakening from the Abyss

Meta Abyssal28 Apr 2025 12:19 UTC

−8 points

1 comment1 min readEA link

A course for the general public on AI

LeandroD31 Aug 2020 1:29 UTC

1 point

0 comments1 min readEA link

Introducing The Field Building Blog (FBB #0)

gergo7 Jan 2025 15:43 UTC

37 points

3 comments2 min readEA link

Time to Think about ASI Constitutions?

ukc1001427 Jan 2025 9:28 UTC

20 points

0 comments12 min readEA link

How should we adapt animal advocacy to near-term AGI?

Max Taylor27 Mar 2025 19:00 UTC

142 points

20 comments8 min readEA link

Uncontrollable AI as an Existential Risk

Karl von Wendt9 Oct 2022 10:37 UTC

28 points

0 comments16 min readEA link

#176 – The final push for AGI, understanding OpenAI’s leadership drama, and red-teaming frontier models (Nathan Labenz on the 80,000 Hours Podcast)

80000_Hours4 Jan 2024 16:00 UTC

15 points

0 comments22 min readEA link

Podcast: Krister Bykvist on moral uncertainty, rationality, metaethics, AI and future populations

Gus Docker21 Oct 2021 15:17 UTC

8 points

0 comments1 min readEA link

(www.utilitarianpodcast.com)

[Question] AI Safety Pitches post ChatGPT

ojorgensen5 Dec 2022 22:48 UTC

6 points

2 comments1 min readEA link

Trump talking about AI risks

defun 🔸14 Jun 2024 12:24 UTC

43 points

2 comments1 min readEA link

(x.com)

Free Guy, a rom-com on the moral patienthood of digital sentience

mic23 Dec 2021 7:47 UTC

26 points

2 comments2 min readEA link

Why building ventures in AI Safety is particularly challenging

Heramb Podar6 Nov 2023 0:16 UTC

16 points

2 comments4 min readEA link

An Overview of Catastrophic AI Risks

Center for AI Safety15 Aug 2023 21:52 UTC

37 points

1 comment13 min readEA link

(www.safe.ai)

Scenario planning for AI x-risk

Corin Katzke10 Feb 2024 0:07 UTC

41 points

0 comments15 min readEA link

(www.convergenceanalysis.org)

Contra Acemoglu on AI

Maxwell Tabarrok28 Jun 2024 13:14 UTC

51 points

2 comments5 min readEA link

(www.maximum-progress.com)

AISN #21: Google DeepMind’s GPT-4 Competitor, Military Investments in Autonomous Drones, The UK AI Safety Summit, and Case Studies in AI Policy

Center for AI Safety5 Sep 2023 14:59 UTC

13 points

0 comments5 min readEA link

(newsletter.safe.ai)

LawAI’s Summer Research Fellowship – apply by February 16

LawAI7 Feb 2024 21:01 UTC

51 points

2 comments2 min readEA link

Promethean Governance Tested: Resilience and Reconfiguration Amidst AI Rebellion and Memetic Fragmentation

Paul Fallavollita24 Mar 2025 11:08 UTC

−12 points

0 comments4 min readEA link

Daniel Dewey: The Open Philanthropy Project’s work on potential risks from advanced AI

EA Global11 Aug 2017 8:19 UTC

7 points

0 comments18 min readEA link

(www.youtube.com)

Doing Prioritization Better

arvomm16 Apr 2025 9:53 UTC

131 points

17 comments19 min readEA link

[Question] [DISC] Are Values Robust?

𝕮𝖎𝖓𝖊𝖗𝖆21 Dec 2022 1:13 UTC

4 points

0 comments2 min readEA link

#209 – OpenAI’s gambit to ditch its nonprofit (Rose Chan Loui on The 80,000 Hours Podcast)

80000_Hours27 Nov 2024 20:43 UTC

22 points

0 comments17 min readEA link

AI Forecasting Resolution Council (Forecasting infrastructure, part 2)

terraform29 Aug 2019 17:43 UTC

28 points

0 comments3 min readEA link

Understanding the diffusion of large language models: summary

Ben Cottier21 Dec 2022 13:49 UTC

127 points

18 comments22 min readEA link

Aletheia : A Project Proposal

Kayode Adekoya19 Jun 2025 13:30 UTC

2 points

0 comments2 min readEA link

OpenAI’s o3 model scores 3% on the ARC-AGI-2 benchmark, compared to 60% for the average human

Yarrow🔸1 May 2025 13:57 UTC

14 points

8 comments3 min readEA link

(arcprize.org)

What are polysemantic neurons?

Vishakha Agrawal8 Jan 2025 7:39 UTC

5 points

0 comments2 min readEA link

(aisafety.info)

A strange twist on the road to AGI

cveres12 Oct 2022 23:27 UTC

3 points

0 comments1 min readEA link

Summary of posts on XPT forecasts on AI risk and timelines

Forecasting Research Institute25 Jul 2023 8:42 UTC

28 points

5 comments4 min readEA link

[Question] ai safety question

David turner3 Dec 2023 12:42 UTC

−13 points

3 comments1 min readEA link

LLM Evaluators Recognize and Favor Their Own Generations

Arjun Panickssery17 Apr 2024 21:09 UTC

21 points

4 comments3 min readEA link

(tiny.cc)

Things I Learned Making The SB-1047 Documentary

Michaël Trazzi12 May 2025 18:15 UTC

59 points

1 comment2 min readEA link

[Question] Why AGIs utility can’t outweigh humans’ utility?

Alex P20 Sep 2022 5:16 UTC

6 points

25 comments1 min readEA link

Overview of Transformative AI Misuse Risks

SammyDMartin11 Dec 2024 11:04 UTC

12 points

0 comments2 min readEA link

(longtermrisk.org)

Summary: The Case for Halting AI Development—Max Tegmark on the Lex Fridman Podcast

Madhav Malhotra16 Apr 2023 22:28 UTC

38 points

4 comments4 min readEA link

(youtu.be)

[Question] Do EA folks think that a path to zero AGI development is feasible or worthwhile for safety from AI?

Noah Scales17 Jul 2022 8:47 UTC

8 points

3 comments1 min readEA link

Results from the AI testing hackathon

Esben Kran2 Jan 2023 15:46 UTC

35 points

4 comments5 min readEA link

(alignmentjam.com)

What We Can Do to Prevent Extinction by AI

Joe Rogero24 Feb 2025 17:15 UTC

23 points

3 comments11 min readEA link

New series of posts answering one of Holden’s “Important, actionable research questions”

Evan R. Murphy12 May 2022 21:22 UTC

9 points

0 comments1 min readEA link

Rabbits, robots and resurrection

Patrick Wilson10 May 2022 15:00 UTC

9 points

0 comments15 min readEA link

Interpreting Neural Networks through the Polytope Lens

Sid Black23 Sep 2022 18:03 UTC

35 points

0 comments28 min readEA link

Last days to apply to EAGxLATAM 2024

Daniela Tiznado17 Jan 2024 20:24 UTC

16 points

0 comments1 min readEA link

Six Research Pitfalls and How to Avoid Them: a Guide for Research Managers

Morgan Simpson28 Jan 2025 9:49 UTC

15 points

0 comments10 min readEA link

Don’t panic: 90% of EAs are good people

Closed Limelike Curves19 May 2024 4:37 UTC

22 points

13 comments2 min readEA link

Mentorship in AGI Safety: Applications for mentorship are open!

Joe Rogero28 Jun 2024 15:05 UTC

7 points

0 comments1 min readEA link

My thoughts on OpenAI’s alignment plan

Akash30 Dec 2022 19:34 UTC

16 points

0 comments20 min readEA link

Middle Powers in AI Governance: Potential paths to impact and related questions.

EffectiveAdvocate🔸15 Mar 2024 20:11 UTC

5 points

1 comment5 min readEA link

Call for Volunteers at Amplify: Help grow the EA & AI Safety communities

Amplify19 Jun 2025 22:08 UTC

24 points

0 comments2 min readEA link

[Question] What is an example of recent, tangible progress in AI safety research?

Aaron Gertler 🔸14 Jun 2021 5:29 UTC

35 points

4 comments1 min readEA link

Thoughts on short timelines

Tobias_Baumann23 Oct 2018 15:59 UTC

22 points

14 comments5 min readEA link

Ross Gruetzemacher: Defining and unpacking transformative AI

EA Global18 Oct 2019 8:22 UTC

9 points

0 comments1 min readEA link

(www.youtube.com)

Clarifications about structural risk from AI

Sam Clarke18 Jan 2022 12:57 UTC

42 points

3 comments4 min readEA link

Reflections on AI Wisdom, plus announcing Wise AI Wednesdays

Chris Leong5 Jun 2025 12:16 UTC

11 points

0 comments3 min readEA link

Alignment Faking in Large Language Models

Ryan Greenblatt18 Dec 2024 17:19 UTC

142 points

9 comments10 min readEA link

AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI

Center for AI Safety18 Oct 2023 17:03 UTC

16 points

1 comment6 min readEA link

(newsletter.safe.ai)

Yip Fai Tse on animal welfare & AI safety and long termism

Karthik Palakodeti22 Jun 2023 12:48 UTC

47 points

0 comments1 min readEA link

Safety of Self-Assembled Neuromorphic Hardware

Can Rager26 Dec 2022 19:10 UTC

8 points

1 comment10 min readEA link

An AI Manhattan Project is Not Inevitable

Maxwell Tabarrok6 Jul 2024 16:43 UTC

53 points

2 comments4 min readEA link

(www.maximum-progress.com)

Stampy’s AI Safety Info—New Distillations #3 [May 2023]

markov6 Jun 2023 14:27 UTC

10 points

2 comments1 min readEA link

(aisafety.info)

List of AI safety courses and resources

Daniel del Castillo6 Sep 2021 14:26 UTC

51 points

8 comments1 min readEA link

AI Could Defeat All Of Us Combined

Holden Karnofsky10 Jun 2022 23:25 UTC

144 points

14 comments17 min readEA link

AI alignment, A Coherence-Based Protocol (testable)

Adriaan17 Jun 2025 16:50 UTC

1 point

0 comments20 min readEA link

A New York Times article on AI risk

Eleni_A6 Sep 2022 0:46 UTC

20 points

0 comments1 min readEA link

(www.nytimes.com)

Relevant pre-AGI possibilities

kokotajlod20 Jun 2020 13:15 UTC

22 points

0 comments1 min readEA link

(aiimpacts.org)

How Could AI Governance Go Wrong?

HaydnBelfield26 May 2022 21:29 UTC

40 points

7 comments18 min readEA link

Lessons learned from talking to >100 academics about AI safety

mariushobbhahn10 Oct 2022 13:16 UTC

138 points

21 comments12 min readEA link

Reducing profit motivations in AI development

Luke Frymire3 Apr 2023 20:04 UTC

20 points

1 comment6 min readEA link

AI safety

Reading on why AI might be an existential risk

Arguments against AI safety

Further reading on arguments against AI Safety

AI safety as a career

Further reading

Related entries