AI safety

Last edit: 17 Jun 2022 9:38 UTC by Leo

AI safety is the study of ways to reduce risks posed by artificial intelligence.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research a “priority path”—among the most promising career opportunities the organization has identified so far.[1][2]

Further reading

Gates, Vael (2022) Resources I send to AI researchers about AI safety, Effective Altruism Forum, June 13.

Krakovna, Victoria (2017) Introductory resources on AI safety research, Victoria Krakovna’s Blog, October 19.
A list of readings on AI safety.

Ngo, Richard (2019) Disentangling arguments for the importance of AI safety, Effective Altruism Forum, January 21.

Related entries

AI alignment | AI interpretability | AI risk | cooperative AI

  1. Todd, Benjamin (2018) The highest impact career paths our research has identified so far, 80,000 Hours, August 12.

  2. Todd, Benjamin (2021) AI safety technical research, 80,000 Hours, October.

2021 AI Alignment Literature Review and Charity Comparison

Larks · 23 Dec 2021 14:06 UTC
160 points
18 comments · 73 min read · EA link

Sentience Institute 2021 End of Year Summary

Ali · 26 Nov 2021 14:40 UTC
65 points
5 comments · 6 min read · EA link

A Viral License for AI Safety

IvanVendrov · 5 Jun 2021 2:00 UTC
24 points
6 comments · 5 min read · EA link

Ask AI companies about what they are doing for AI safety?

mic · 8 Mar 2022 21:54 UTC
39 points
1 comment · 2 min read · EA link

Help us find pain points in AI safety

Esben Kran · 12 Apr 2022 18:43 UTC
21 points
4 comments · 9 min read · EA link

“Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments

Andrew Critch · 19 Apr 2022 20:24 UTC
63 points
10 comments · 7 min read · EA link

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

ThomasWoodside · 9 May 2022 17:02 UTC
67 points
0 comments · 6 min read · EA link

Digital people could make AI safer

GMcGowan · 10 Jun 2022 15:29 UTC
21 points
12 comments · 4 min read · EA link

[Question] What harm could AI safety do?

SeanEngelhart · 15 May 2021 1:11 UTC
12 points
9 comments · 1 min read · EA link

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan · 21 Sep 2021 0:41 UTC
62 points
0 comments · 1 min read · EA link

Saying ‘AI safety research is a Pascal’s Mugging’ isn’t a strong response

Robert_Wiblin · 15 Dec 2015 13:48 UTC
14 points
16 comments · EA link

13 Very Different Stances on AGI

Ozzie Gooen · 27 Dec 2021 23:30 UTC
69 points
27 comments · 3 min read · EA link

Emerging Technologies: More to explore

EA Handbook · 1 Jan 2021 11:06 UTC
4 points
0 comments · 2 min read · EA link

Action: Help expand funding for AI Safety by coordinating on NSF response

Evan R. Murphy · 20 Jan 2022 20:48 UTC
20 points
7 comments · 3 min read · EA link

How I Formed My Own Views About AI Safety

Neel Nanda · 27 Feb 2022 18:52 UTC
117 points
12 comments · 14 min read · EA link

Disentangling arguments for the importance of AI safety

richard_ngo · 23 Jan 2019 14:58 UTC
63 points
14 comments · 8 min read · EA link

AI safety starter pack

mariushobbhahn · 28 Mar 2022 16:05 UTC
92 points
7 comments · 5 min read · EA link

How I failed to form views on AI safety

Ada-Maaria Hyvärinen · 17 Apr 2022 11:05 UTC
184 points
72 comments · 40 min read · EA link

Chaining Retroactive Funders to Borrow Against Unlikely Utopias

Denis Drescher · 19 Apr 2022 18:25 UTC
24 points
4 comments · 7 min read · EA link

Calling for Student Submissions: AI Safety Distillation Contest

Aris Richardson · 23 Apr 2022 20:24 UTC
101 points
30 comments · 3 min read · EA link

EA, Psychology & AI Safety Research

Altruist · 26 May 2022 23:46 UTC
16 points
2 comments · 6 min read · EA link

Is the time crunch for AI Safety Movement Building now?

Chris Leong · 8 Jun 2022 12:19 UTC
12 points
10 comments · 3 min read · EA link

20 Critiques of AI Safety That I Found on Twitter

Daniel Kirmani · 23 Jun 2022 15:11 UTC
13 points
13 comments · 1 min read · EA link

AI Safety Career Bottlenecks Survey Responses

Linda Linsefors · 28 May 2021 10:41 UTC
34 points
1 comment · 5 min read · EA link

Making of #IAN

kirchner.jan · 29 Aug 2021 16:24 UTC
9 points
0 comments · 1 min read · EA link

Chris Olah on what the hell is going on inside neural networks

80000_Hours · 4 Aug 2021 15:13 UTC
5 points
0 comments · 133 min read · EA link

List of AI safety courses and resources

Daniel del Castillo · 6 Sep 2021 14:26 UTC
47 points
3 comments · 1 min read · EA link

Contribute by facilitating the AGI Safety Fundamentals Programme

Jamie Bernardi · 6 Dec 2021 11:50 UTC
27 points
0 comments · 2 min read · EA link

What role should evolutionary analogies play in understanding AI takeoff speeds?

anson.ho · 11 Dec 2021 1:16 UTC
12 points
0 comments · 42 min read · EA link

What is the role of Bayesian ML for AI alignment/safety?

mariushobbhahn · 11 Jan 2022 8:07 UTC
37 points
6 comments · 3 min read · EA link

AI acceleration from a safety perspective: Trade-offs and considerations

mariushobbhahn · 19 Jan 2022 9:44 UTC
12 points
1 comment · 7 min read · EA link

My personal cruxes for working on AI safety

Buck · 13 Feb 2020 7:11 UTC
133 points
35 comments · 44 min read · EA link

University community building seems like the wrong model for AI safety

George Stiffman · 26 Feb 2022 6:23 UTC
24 points
8 comments · 2 min read · EA link

[Question] AI Ethical Committee

eaaicommittee · 1 Mar 2022 23:35 UTC
8 points
0 comments · 1 min read · EA link

AI Safety Overview: CERI Summer Research Fellowship

Jamie Bernardi · 24 Mar 2022 15:12 UTC
29 points
0 comments · 2 min read · EA link

Data Publication for the 2021 Artificial Intelligence, Morality, and Sentience (AIMS) Survey

Janet Pauketat · 24 Mar 2022 15:43 UTC
21 points
0 comments · 3 min read · EA link

Scenario Mapping Advanced AI Risk: Request for Participation with Data Collection

Kiliank · 27 Mar 2022 11:44 UTC
14 points
1 comment · 5 min read · EA link

Community Building for Graduate Students: A Targeted Approach

Neil Crawford · 29 Mar 2022 19:47 UTC
12 points
0 comments · 3 min read · EA link

How Josiah became an AI safety researcher

Neil Crawford · 29 Mar 2022 19:47 UTC
9 points
0 comments · 1 min read · EA link

[Question] A dataset for AI/superintelligence stories and other media?

Harrison Durland · 29 Mar 2022 21:41 UTC
18 points
2 comments · 1 min read · EA link

Pitching AI Safety in 3 sentences

PabloAMC · 30 Mar 2022 18:50 UTC
7 points
0 comments · 1 min read · EA link

[Question] Why does (any particular) AI safety work reduce s-risks more than it increases them?

MichaelStJules · 3 Oct 2021 16:55 UTC
33 points
18 comments · 1 min read · EA link

[Question] Is it valuable to the field of AI Safety to have a neuroscience background?

Samuel Nellessen · 3 Apr 2022 19:44 UTC
17 points
3 comments · 1 min read · EA link

Is GPT3 a Good Rationalist? - InstructGPT3 [2/2]

simeon_c · 7 Apr 2022 13:54 UTC
22 points
0 comments · 7 min read · EA link

A tough career decision

PabloAMC · 9 Apr 2022 0:46 UTC
65 points
13 comments · 4 min read · EA link

A visualization of some orgs in the AI Safety Pipeline

Aaron_Scher · 10 Apr 2022 16:52 UTC
11 points
8 comments · 1 min read · EA link

AI Ethics non-profit is looking for an investor

sergia · 12 Apr 2022 8:47 UTC
−4 points
0 comments · 1 min read · EA link

How to become an AI safety researcher

peterbarnett · 12 Apr 2022 11:33 UTC
98 points
14 comments · 14 min read · EA link

[Question] Please Share Your Perspectives on the Degree of Societal Impact from Transformative AI Outcomes

Kiliank · 15 Apr 2022 1:23 UTC
3 points
3 comments · 1 min read · EA link

Begging, Pleading AI Orgs to Comment on NIST AI Risk Management Framework

Bridges · 15 Apr 2022 19:35 UTC
86 points
4 comments · 2 min read · EA link

Information security considerations for AI and the long term future

Jeffrey Ladish · 2 May 2022 20:53 UTC
105 points
6 comments · 11 min read · EA link

When is AI safety research harmful?

Nathan_Barnard · 9 May 2022 10:36 UTC
13 points
6 comments · 9 min read · EA link

New series of posts answering one of Holden’s “Important, actionable research questions”

Evan R. Murphy · 12 May 2022 21:22 UTC
9 points
0 comments · 1 min read · EA link

Fermi estimation of the impact you might have working on AI safety

frib · 13 May 2022 13:30 UTC
22 points
13 comments · 1 min read · EA link

[Link post] Promising Paths to Alignment—Connor Leahy | Talk

frances_lorenz · 14 May 2022 15:58 UTC
16 points
0 comments · 1 min read · EA link

[Question] What does the Project Management role look like in AI safety?

Gaurav Sett · 14 May 2022 19:29 UTC
8 points
1 comment · 1 min read · EA link

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

Tony Barrett · 17 May 2022 15:27 UTC
7 points
0 comments · 3 min read · EA link

“Intro to brain-like-AGI safety” series—just finished!

Steven Byrnes · 17 May 2022 15:35 UTC
13 points
0 comments · 1 min read · EA link

It’s not obvious to me that according to the EA framework, AI Safety is helpful

oh54321 · 17 May 2022 21:34 UTC
8 points
8 comments · 1 min read · EA link

Complex Systems for AI Safety [Pragmatic AI Safety #3]

ThomasWoodside · 24 May 2022 0:04 UTC
30 points
4 comments · 21 min read · EA link

Introducing spirit hazards

brb243 · 27 May 2022 22:16 UTC
9 points
2 comments · 2 min read · EA link

Perform Tractable Research While Avoiding Capabilities Externalities [Pragmatic AI Safety #4]

ThomasWoodside · 30 May 2022 20:37 UTC
27 points
0 comments · 25 min read · EA link

Advice on Pursuing Technical AI Safety Research

frances_lorenz · 31 May 2022 17:48 UTC
22 points
2 comments · 4 min read · EA link

New 80k career review—Data collection for AI alignment

Benjamin Hilton · 3 Jun 2022 11:44 UTC
34 points
1 comment · 5 min read · EA link

Grokking “Forecasting TAI with biological anchors”

anson.ho · 6 Jun 2022 18:56 UTC
41 points
0 comments · 14 min read · EA link

Open Problems in AI X-Risk [PAIS #5]

ThomasWoodside · 10 Jun 2022 2:22 UTC
29 points
0 comments · 35 min read · EA link

Resources I send to AI researchers about AI safety

Vael Gates · 14 Jun 2022 2:23 UTC
44 points
0 comments · 9 min read · EA link

Expected ethical value of a career in AI safety

Jordan Taylor · 14 Jun 2022 14:25 UTC
33 points
16 comments · 13 min read · EA link

Refer the Cooperative AI Foundation’s New COO, Receive $5000

Lewis Hammond · 16 Jun 2022 13:27 UTC
41 points
0 comments · 3 min read · EA link

FYI: I’m working on a book about the threat of AGI/ASI for a general audience. I hope it will be of value to the cause and the community

Darren McKee · 17 Jun 2022 11:52 UTC
24 points
0 comments · 2 min read · EA link

Pivotal outcomes and pivotal processes

Andrew Critch · 17 Jun 2022 23:43 UTC
40 points
1 comment · 4 min read · EA link

Technical AI safety in the United Arab Emirates

ea nyuad · 21 Jun 2022 3:11 UTC
10 points
0 comments · 11 min read · EA link

A Quick List of Some Problems in AI Alignment As A Field

NicholasKross · 21 Jun 2022 17:09 UTC
15 points
10 comments · 6 min read · EA link

Half-baked ideas thread (EA / AI Safety)

Aryeh Englander · 23 Jun 2022 16:05 UTC
18 points
8 comments · 1 min read · EA link