
AI safety


AI safety is the study of ways to reduce risks posed by artificial intelligence.

Interventions that aim to reduce these risks can be broadly split into technical AI safety research and work on AI governance.

Reading on why AI might be an existential risk

Hilton, Benjamin (2023) Preventing an AI-related catastrophe, 80,000 Hours, March.

Cotra, Ajeya (2022) Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover, Effective Altruism Forum, July 18.

Carlsmith, Joseph (2022) Is Power-Seeking AI an Existential Risk?, arXiv, June 16.

Yudkowsky, Eliezer (2022) AGI Ruin: A List of Lethalities, LessWrong, June 5.

Ngo, Richard et al. (2023) The alignment problem from a deep learning perspective, arXiv, February 23.

Arguments against AI safety

AI safety and AI risk are sometimes dismissed as a Pascal’s Mugging[1], the implication being that the risks are tiny and that, for any stated level of ignorable risk, the payoffs could be exaggerated to force it to remain a top priority. One response is that in a survey of 700 ML researchers, the median answer to the question about “the probability that the long-run effect of advanced AI on humanity will be ‘extremely bad (e.g., human extinction)’” was 5%, with 48% of respondents giving 10% or higher.[2] These probabilities are too high (by at least five orders of magnitude) to be considered Pascalian.

Further reading on arguments against AI safety

Grace, Katja (2022) Counterarguments to the basic AI x-risk case, EA Forum, October 14.

Garfinkel, Ben (2020) Scrutinising classic AI risk arguments, 80,000 Hours Podcast, July 9.

AI safety as a career

80,000 Hours’ medium-depth investigation rates technical AI safety research a “priority path”, placing it among the most promising career opportunities the organization has identified so far.[3][4] Richard Ngo and Holden Karnofsky also offer advice for those interested in working on AI safety.[5][6]

Further reading

Gates, Vael (2022) Resources I send to AI researchers about AI safety, Effective Altruism Forum, June 13.

Krakovna, Victoria (2017) Introductory resources on AI safety research, Victoria Krakovna’s Blog, October 19.

Ngo, Richard (2019) Disentangling arguments for the importance of AI safety, Effective Altruism Forum, January 21.

Rice, Issa; Naik, Vipul (2024) Timeline of AI safety, Timelines Wiki.

Related entries

AI alignment | AI governance | AI forecasting | AI takeoff | AI race | Economics of artificial intelligence | AI interpretability | AI risk | cooperative AI | building the field of AI safety

  1. ^

    https://twitter.com/amasad/status/1632121317146361856 (a tweet by the CEO of Replit, a coding company involved in ML tools)

  2. ^
  3. ^

    Todd, Benjamin (2023) The highest impact career paths our research has identified so far, 80,000 Hours, May 12.

  4. ^

    Hilton, Benjamin (2023) AI safety technical research, 80,000 Hours, June 19.

  5. ^

    Ngo, Richard (2023) AGI safety career advice, EA Forum, May 2.

  6. ^

    Karnofsky, Holden (2023) Jobs that can help with the most important century, EA Forum, February 12.

Announcing the Winners of the 2023 Open Philanthropy AI Worldviews Contest

Jason SchukraftSep 30, 2023, 3:51 AM
74 points
30 comments2 min readEA link

High-level hopes for AI alignment

Holden KarnofskyDec 20, 2022, 2:11 AM
123 points
14 comments19 min readEA link
(www.cold-takes.com)

Resources I send to AI researchers about AI safety

Vael GatesJan 11, 2023, 1:24 AM
43 points
0 comments1 min readEA link

AI safety needs to scale, and here’s how you can do it

Esben KranFeb 2, 2024, 7:17 AM
32 points
2 comments5 min readEA link
(apartresearch.com)

Chilean AIS Hackathon Retrospective

Agustín Covarrubias 🔸May 9, 2023, 1:34 AM
67 points
0 comments5 min readEA link

FLI open letter: Pause giant AI experiments

Zach Stein-PerlmanMar 29, 2023, 4:04 AM
220 points
38 comments1 min readEA link

Fill out this census of everyone interested in reducing catastrophic AI risks

Alex HTMay 18, 2024, 3:53 PM
105 points
1 comment1 min readEA link

Katja Grace: Let’s think about slowing down AI

peterhartreeDec 23, 2022, 12:57 AM
84 points
6 comments2 min readEA link
(worldspiritsockpuppet.substack.com)

Announcing the European Network for AI Safety (ENAIS)

Esben KranMar 22, 2023, 5:57 PM
124 points
3 comments3 min readEA link

Launching applications for AI Safety Careers Course India 2024

varun_agrMay 1, 2024, 5:30 AM
23 points
1 comment1 min readEA link

Metaculus Launches Future of AI Series, Based on Research Questions by Arb

christianMar 13, 2024, 9:14 PM
34 points
0 comments1 min readEA link
(www.metaculus.com)

Announcing AI Safety Bulgaria

Aleksandar N. AngelovMar 3, 2024, 5:53 PM
15 points
0 comments1 min readEA link

AI Safety Europe Retreat 2023 Retrospective

Magdalena WacheApr 14, 2023, 9:05 AM
41 points
10 comments1 min readEA link

Predictable updating about AI risk

Joe_CarlsmithMay 8, 2023, 10:05 PM
134 points
12 comments36 min readEA link

The Shutdown Problem: Incomplete Preferences as a Solution

EJTFeb 23, 2024, 4:01 PM
26 points
0 comments1 min readEA link

A Qualitative Case for LTFF: Filling Critical Ecosystem Gaps

LinchDec 3, 2024, 9:57 PM
89 points
26 comments9 min readEA link

Long list of AI questions

NunoSempereDec 6, 2023, 11:12 AM
124 points
14 comments86 min readEA link

Evolution provides no evidence for the sharp left turn

Quintin PopeApr 11, 2023, 6:48 PM
43 points
2 comments1 min readEA link

Please vote for PauseAI US in the Donation Election!

Holly Elmore ⏸️ 🔸Nov 22, 2024, 4:12 AM
21 points
3 comments2 min readEA link

My cover story in Jacobin on AI capitalism and the x-risk debates

GarrisonFeb 12, 2024, 11:34 PM
154 points
10 comments6 min readEA link
(jacobin.com)

“Near Midnight in Suicide City”

Greg_ColbournDec 6, 2024, 7:54 PM
5 points
0 comments1 min readEA link
(www.youtube.com)

AI for Animals 2025 Conference—Get Early Bird Tickets Now

Constance LiNov 20, 2024, 12:53 AM
47 points
0 comments1 min readEA link

Navigating the New Reality in DC: An EIP Primer

IanDavidMossDec 20, 2024, 4:59 PM
20 points
1 comment13 min readEA link
(effectiveinstitutionsproject.substack.com)

Donation recommendations for xrisk + ai safety

vincentweisserFeb 6, 2023, 9:25 PM
17 points
11 comments1 min readEA link

We are not alone: many communities want to stop Big Tech from scaling unsafe AI

RemmeltSep 22, 2023, 5:38 PM
28 points
30 comments4 min readEA link

Winners of the Essay competition on the Automation of Wisdom and Philosophy

Owen Cotton-BarrattOct 29, 2024, 12:02 AM
37 points
2 comments30 min readEA link
(blog.aiimpacts.org)

A case for donating to AI risk reduction (including if you work in AI)

tlevinDec 2, 2024, 7:05 PM
118 points
5 comments3 min readEA link

Four mindset disagreements behind existential risk disagreements in ML

RobBensingerApr 11, 2023, 4:53 AM
61 points
2 comments9 min readEA link

Why Simulator AIs want to be Active Inference AIs

Jan_KulveitApr 11, 2023, 9:06 AM
22 points
0 comments8 min readEA link
(www.lesswrong.com)

Announcing the Q1 2025 Long-Term Future Fund grant round

LinchDec 20, 2024, 2:17 AM
43 points
8 comments2 min readEA link

Cosmic AI safety

Magnus VindingDec 6, 2024, 10:32 PM
22 points
5 comments6 min readEA link

How AI Takeover Might Happen in Two Years

JoshcFeb 7, 2025, 11:51 PM
20 points
6 comments29 min readEA link
(x.com)

Here’s how The Midas Project could use additional funding.

Tyler JohnstonNov 17, 2024, 10:15 PM
20 points
0 comments2 min readEA link

Symbiosis, not alignment, as the goal for liberal democracies in the transition to artificial general intelligence

simonfriederichMar 17, 2023, 1:04 PM
18 points
2 comments24 min readEA link
(rdcu.be)

Preventing an AI-related catastrophe—Problem profile

Benjamin HiltonAug 29, 2022, 6:49 PM
138 points
18 comments4 min readEA link
(80000hours.org)

Consider granting AIs freedom

Matthew_BarnettDec 6, 2024, 12:55 AM
80 points
22 comments5 min readEA link

Where I Am Donating in 2024

MichaelDickensNov 19, 2024, 12:09 AM
179 points
73 comments46 min readEA link

The Choice Transition

Owen Cotton-BarrattNov 18, 2024, 12:32 PM
42 points
1 comment15 min readEA link
(strangecities.substack.com)

Deceptive Alignment is <1% Likely by Default

DavidWFeb 21, 2023, 3:07 PM
54 points
26 comments14 min readEA link

MIRI’s 2024 End-of-Year Update

RobBensingerDec 3, 2024, 4:33 AM
32 points
7 comments1 min readEA link

Funding case: AI Safety Camp 10

RemmeltDec 12, 2023, 9:05 AM
45 points
13 comments5 min readEA link
(manifund.org)

Vael Gates: Risks from Highly-Capable AI (March 2023)

Vael GatesApr 1, 2023, 8:54 PM
31 points
4 comments1 min readEA link
(docs.google.com)

AISN #45: Center for AI Safety 2024 Year in Review

Center for AI SafetyDec 19, 2024, 6:14 PM
11 points
0 comments4 min readEA link
(newsletter.safe.ai)

[Question] Seeking suggested readings & videos for a new course on ‘AI and Psychology’

Geoffrey MillerMay 20, 2024, 5:45 PM
32 points
7 comments1 min readEA link

Against Aschenbrenner: How ‘Situational Awareness’ constructs a narrative that undermines safety and threatens humanity

Gideon FutermanJul 15, 2024, 4:21 PM
240 points
22 comments21 min readEA link

[Linkpost] Statement from Scarlett Johansson on OpenAI’s use of the “Sky” voice, that was shockingly similar to her own voice.

LinchMay 20, 2024, 11:50 PM
46 points
8 comments1 min readEA link
(variety.com)

AI alignment researchers may have a comparative advantage in reducing s-risks

Lukas_GloorFeb 15, 2023, 1:01 PM
79 points
5 comments13 min readEA link

To the Bat Mobile!! My Mid-Career Transition into AI Safety

MoneerNov 7, 2024, 3:59 PM
12 points
0 comments3 min readEA link

[SEE NEW EDITS] No, *You* Need to Write Clearer

Nicholas / Heather KrossApr 29, 2023, 5:04 AM
71 points
8 comments1 min readEA link
(www.thinkingmuchbetter.com)

Current UK government levers on AI development

rosehadsharApr 10, 2023, 1:16 PM
82 points
3 comments4 min readEA link

Preventing AI Misuse: State of the Art Research and its Flaws

Madhav MalhotraApr 23, 2023, 10:50 AM
24 points
2 comments11 min readEA link

Should AI X-Risk Worriers Short the Market?

postlibertarianNov 4, 2024, 4:16 PM
14 points
1 comment6 min readEA link

But why would the AI kill us?

So8resApr 17, 2023, 7:38 PM
45 points
3 comments1 min readEA link

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

evhubJan 12, 2024, 7:51 PM
65 points
0 comments1 min readEA link
(arxiv.org)

AI Can Help Animal Advocacy More Than It Can Help Industrial Farming

Wladimir J. AlonsoNov 26, 2024, 9:55 AM
21 points
10 comments4 min readEA link

AGI safety career advice

richard_ngoMay 2, 2023, 7:36 AM
211 points
20 comments1 min readEA link

Agentic Alignment: Navigating between Harm and Illegitimacy

LennardZNov 26, 2024, 9:27 PM
2 points
1 comment9 min readEA link

[Linkpost] AI Alignment, Explained in 5 Points (updated)

Daniel_EthApr 18, 2023, 8:09 AM
31 points
2 comments1 min readEA link
(medium.com)

AI alignment, human alignment, oh my

MilesWOct 31, 2024, 3:23 AM
−12 points
0 comments2 min readEA link

INTELLECT-1 Release: The First Globally Trained 10B Parameter Model

Matrice JacobineNov 29, 2024, 11:03 PM
2 points
1 comment1 min readEA link
(www.primeintellect.ai)

NIST Seeks Comments On “Safety Considerations for Chemical and/or Biological AI Models”

Dylan RichardsonOct 26, 2024, 6:28 PM
15 points
0 comments1 min readEA link
(www.federalregister.gov)

[Question] Can we train AI so that future philanthropy is more effective?

Ricardo PimentelNov 3, 2024, 3:08 PM
3 points
0 comments1 min readEA link

AIS Hungary is hiring a part-time Technical Lead! (Deadline: Dec 31st)

gergoDec 17, 2024, 2:08 PM
9 points
0 comments2 min readEA link

The Compendium, A full argument about extinction risk from AGI

adamShimiOct 31, 2024, 12:02 PM
9 points
1 comment2 min readEA link
(www.thecompendium.ai)

Operationalizing timelines

Zach Stein-PerlmanMar 10, 2023, 5:30 PM
30 points
2 comments1 min readEA link

2021 AI Alignment Literature Review and Charity Comparison

LarksDec 23, 2021, 2:06 PM
176 points
18 comments73 min readEA link

Which incentives should be used to encourage compliance with UK AI legislation?

jcwNov 18, 2024, 6:13 PM
12 points
0 comments12 min readEA link

1-year update on impactRIO, the first AI Safety group in Brazil

João Lucas DuimJun 28, 2024, 10:59 AM
56 points
2 comments10 min readEA link

Cyborg Periods: There will be multiple AI transitions

Jan_KulveitFeb 22, 2023, 4:09 PM
68 points
1 comment1 min readEA link

Manifund x AI Worldviews

AustinMar 31, 2023, 3:32 PM
32 points
2 comments2 min readEA link
(manifund.org)

Merger of DeepMind and Google Brain

Greg_ColbournApr 20, 2023, 8:16 PM
11 points
12 comments1 min readEA link
(blog.google)

Partial value takeover without world takeover

Katja_GraceApr 18, 2024, 3:00 AM
24 points
2 comments1 min readEA link

Filling the Void: A Comprehensive Database for AI Risks Materials

J.A.M.May 28, 2024, 4:03 PM
10 points
1 comment4 min readEA link

Funding AI Safety political advocacy in the US: Individual donors and small donations may be especially helpful

Holly Elmore ⏸️ 🔸Nov 14, 2023, 11:14 PM
64 points
8 comments1 min readEA link

Project ideas: Backup plans & Cooperative AI

Lukas FinnvedenJan 4, 2024, 7:26 AM
25 points
2 comments13 min readEA link
(lukasfinnveden.substack.com)

Videos on the world’s most pressing problems, by 80,000 Hours

BellaMar 21, 2024, 8:18 PM
63 points
5 comments2 min readEA link

Shaping Policies for Ethical AI Development in Africa

KuiyakiMay 16, 2024, 2:15 PM
3 points
0 comments1 min readEA link

Details on how an IAEA-style AI regulator would function?

freedomandutilityJun 3, 2023, 12:03 PM
12 points
5 comments1 min readEA link

AI-nuclear integration: evidence of automation bias from humans and LLMs [research summary]

TaoApr 27, 2024, 9:59 PM
17 points
2 comments12 min readEA link

Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope

Greg_ColbournOct 12, 2023, 11:24 AM
73 points
85 comments9 min readEA link

New Business Wars podcast season on Sam Altman and OpenAI

Eevee🔹Apr 2, 2024, 6:22 AM
10 points
0 comments1 min readEA link
(wondery.com)

The Leeroy Jenkins principle: How faulty AI could guarantee “warning shots”

titotalJan 14, 2024, 3:03 PM
54 points
2 comments21 min readEA link
(titotal.substack.com)

Public Weights?

Jeff Kaufman 🔸Nov 2, 2023, 2:51 AM
20 points
7 comments1 min readEA link

Bounty for Evidence on Some of Palisade Research’s Beliefs

bwrSep 23, 2024, 8:05 PM
5 points
0 comments1 min readEA link

Critiques of prominent AI safety labs: Redwood Research

OmegaMar 31, 2023, 8:58 AM
339 points
91 comments20 min readEA link

Project ideas: Epistemics

Lukas FinnvedenJan 4, 2024, 7:26 AM
43 points
1 comment17 min readEA link
(lukasfinnveden.substack.com)

Drexler’s Nanosystems is now available online

MikhailSaminJun 1, 2024, 2:41 PM
32 points
4 comments1 min readEA link
(nanosyste.ms)

The “low-hanging fruits” of AI safety

Julian NalenzDec 19, 2024, 1:38 PM
−1 points
0 comments6 min readEA link
(blog.hermesloom.org)

A short conversation I had with Google Gemini on the dangers of unregulated LLM API use, while mildly drunk in an airport.

EvanMcCormickDec 17, 2024, 12:25 PM
1 point
0 comments8 min readEA link

How to Address EA Dilemmas – What is Missing from EA Values?

alexis schoenlaubOct 13, 2024, 9:33 AM
6 points
4 comments6 min readEA link

AISafety.info “How can I help?” FAQ

StevenKaasJun 5, 2023, 10:09 PM
48 points
1 comment1 min readEA link

OpenAI introduces function calling for GPT-4

micJun 20, 2023, 1:58 AM
26 points
0 comments1 min readEA link

AI Risk & Policy Forecasts from Metaculus & FLI’s AI Pathways Workshop

Will AldredMay 16, 2023, 8:53 AM
41 points
0 comments8 min readEA link

My favorite AI governance research this year so far

Zach Stein-PerlmanJul 23, 2023, 10:00 PM
81 points
4 comments7 min readEA link
(blog.aiimpacts.org)

Interactive AI Governance Map

Hamish McDoodlesMar 12, 2024, 10:02 AM
66 points
8 comments1 min readEA link

The Cruel Trade-Off Between AI Misuse and AI X-risk Concerns

simeon_cApr 22, 2023, 1:49 PM
21 points
17 comments1 min readEA link

How I failed to form views on AI safety

Ada-Maaria HyvärinenApr 17, 2022, 11:05 AM
213 points
72 comments40 min readEA link

Executive Director for AIS Brussels—Expression of interest

gergoDec 19, 2024, 9:15 AM
28 points
0 comments4 min readEA link

Proposing the Conditional AI Safety Treaty (linkpost TIME)

OttoNov 15, 2024, 1:56 PM
12 points
6 comments3 min readEA link
(time.com)

AI Risk US Presidential Candidate

Simon BerensApr 11, 2023, 8:18 PM
12 points
8 comments1 min readEA link

AI Safety Camp 10

Robert KralischOct 26, 2024, 11:36 AM
15 points
0 comments18 min readEA link
(www.lesswrong.com)

AI safety starter pack

mariushobbhahnMar 28, 2022, 4:05 PM
126 points
13 comments6 min readEA link

Request to AGI organizations: Share your views on pausing AI progress

AkashApr 11, 2023, 5:30 PM
85 points
1 comment1 min readEA link

Anti-‘FOOM’ (stop trying to make your cute pet name the thing)

david_reinsteinApr 14, 2023, 4:05 PM
41 points
17 comments2 min readEA link

[Question] What is the current most representative EA AI x-risk argument?

Matthew_BarnettDec 15, 2023, 10:04 PM
117 points
50 comments3 min readEA link

AI strategy given the need for good reflection

Owen Cotton-BarrattMar 18, 2024, 12:48 AM
40 points
1 comment5 min readEA link

[Linkpost] Given Extinction Worries, Why Don’t AI Researchers Quit? Well, Several Reasons

Daniel_EthJun 6, 2023, 7:31 AM
25 points
6 comments1 min readEA link
(medium.com)

Coordination by common knowledge to prevent uncontrollable AI

Karl von WendtMay 14, 2023, 1:37 PM
14 points
0 comments1 min readEA link

2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact)

yanni kyriacosAug 1, 2024, 1:14 AM
19 points
11 comments8 min readEA link

Counting arguments provide no evidence for AI doom

Nora BelroseFeb 27, 2024, 11:03 PM
84 points
15 comments1 min readEA link

Train for incorrigibility, then reverse it (Shutdown Problem Contest Submission)

Daniel_EthJul 18, 2023, 8:26 AM
16 points
0 comments2 min readEA link

Problem-solving tasks in Graph Theory for language models

Bruno López OrozcoOct 1, 2024, 12:36 PM
21 points
1 comment9 min readEA link

Announcing the AI Fables Writing Contest!

Daystar EldJul 12, 2023, 3:04 AM
76 points
52 comments3 min readEA link

Breakthrough in AI agents? (On Devin—The Zvi, linkpost)

SiebeRozendalMar 20, 2024, 9:43 AM
16 points
9 comments1 min readEA link
(thezvi.substack.com)

Among the A.I. Doomsayers—The New Yorker

Agustín Covarrubias 🔸Mar 11, 2024, 9:12 PM
66 points
0 comments1 min readEA link
(www.newyorker.com)

Two important recent AI Talks- Gebru and Lazar

Gideon FutermanMar 6, 2023, 1:30 AM
−7 points
5 comments1 min readEA link

The Tech Industry is the Biggest Blocker to Meaningful AI Safety Regulations

GarrisonAug 16, 2024, 7:37 PM
139 points
8 comments8 min readEA link
(garrisonlovely.substack.com)

Arkose: Organizational Updates & Ways to Get Involved

ArkoseAug 1, 2024, 1:03 PM
28 points
1 comment1 min readEA link

In favour of exploring nagging doubts about x-risk

Owen Cotton-BarrattJun 25, 2024, 11:52 PM
89 points
15 comments2 min readEA link

Announcing the CLR Foundations Course and CLR S-Risk Seminars

James FavilleNov 19, 2024, 1:18 AM
52 points
2 comments3 min readEA link

AI doing philosophy = AI generating hands?

Wei DaiJan 15, 2024, 9:04 AM
67 points
6 comments3 min readEA link

Disrupting malicious uses of AI by state-affiliated threat actors

Agustín Covarrubias 🔸Feb 14, 2024, 9:28 PM
22 points
1 comment1 min readEA link
(openai.com)

[Question] What’s the best way to get a sense of the day-to-day activities of different researchers/research directions? (AI Governance)

LuiseMay 27, 2024, 12:48 PM
15 points
1 comment1 min readEA link

Ten arguments that AI is an existential risk

Katja_GraceAug 14, 2024, 9:51 PM
30 points
0 comments7 min readEA link

AI stocks could crash. And that could have implications for AI safety

Benjamin_ToddMay 9, 2024, 7:23 AM
173 points
41 comments4 min readEA link
(benjamintodd.substack.com)

Hooray for stepping out of the limelight

So8resApr 1, 2023, 2:45 AM
103 points
0 comments1 min readEA link

The market plausibly expects AI software to create trillions of dollars of value by 2027

Benjamin_ToddMay 6, 2024, 5:16 AM
88 points
19 comments1 min readEA link
(benjamintodd.substack.com)

Jan Leike: “I’m excited to join @AnthropicAI to continue the superalignment mission!”

defun 🔸May 28, 2024, 6:08 PM
35 points
11 comments1 min readEA link
(x.com)

Apply to the Cavendish Labs Fellowship (by 4/15)

Derik KApr 3, 2023, 11:06 PM
35 points
2 comments1 min readEA link

Mentorship in AGI Safety (MAGIS)

Joe RogeroMay 23, 2024, 6:34 PM
11 points
1 comment2 min readEA link

Analogy Bank for AI Safety

utilistrutilJan 29, 2024, 2:35 AM
14 points
5 comments1 min readEA link

[MLSN #8]: Mechanistic interpretability, using law to inform AI alignment, scaling laws for proxy gaming

TW123Feb 20, 2023, 4:06 PM
25 points
0 comments4 min readEA link
(newsletter.mlsafety.org)

Deconfusing Pauses: Long Term Moratorium vs Slowing AI

Gideon FutermanAug 4, 2024, 11:32 AM
17 points
3 comments5 min readEA link

[Question] Why hasn’t there been any significant AI protest

sammyboizMay 17, 2024, 2:59 AM
21 points
14 comments1 min readEA link

AI Winter Season at EA Hotel

CEEALARSep 25, 2024, 1:36 PM
57 points
2 comments1 min readEA link

My lab’s small AI safety agenda

Jobst Heitzig (vodle.it)Jun 18, 2023, 12:29 PM
59 points
26 comments3 min readEA link

FLI report: Policymaking in the Pause

Zach Stein-PerlmanApr 15, 2023, 5:01 PM
29 points
4 comments1 min readEA link

Why some people disagree with the CAIS statement on AI

David_MossAug 15, 2023, 1:39 PM
144 points
15 comments16 min readEA link

AI Safety Impact Markets: Your Charity Evaluator for AI Safety

Dawn DrescherOct 1, 2023, 10:47 AM
28 points
4 comments6 min readEA link
(impactmarkets.substack.com)

Sentience Institute 2021 End of Year Summary

AliNov 26, 2021, 2:40 PM
66 points
5 comments6 min readEA link
(www.sentienceinstitute.org)

Dario Amodei — Machines of Loving Grace

Matrice JacobineOct 11, 2024, 9:39 PM
66 points
0 comments1 min readEA link
(darioamodei.com)

Non-alignment project ideas for making transformative AI go well

Lukas FinnvedenJan 4, 2024, 7:23 AM
66 points
1 comment3 min readEA link
(lukasfinnveden.substack.com)

Announcing ForecastBench, a new benchmark for AI and human forecasting abilities

Forecasting Research InstituteOct 1, 2024, 12:31 PM
20 points
1 comment3 min readEA link
(arxiv.org)

EU policymakers reach an agreement on the AI Act

tlevinDec 15, 2023, 6:03 AM
109 points
13 comments1 min readEA link

Slim overview of work one could do to make AI go better (and a grab-bag of other career considerations)

ChiMar 20, 2024, 11:17 PM
34 points
1 comment3 min readEA link

Claude Doesn’t Want to Die

GarrisonMar 5, 2024, 6:00 AM
22 points
14 comments10 min readEA link
(garrisonlovely.substack.com)

When “human-level” is the wrong threshold for AI

Ben Millwood🔸Jun 22, 2024, 2:34 PM
38 points
3 comments7 min readEA link

NYT: Google will ‘recalibrate’ the risk of releasing AI due to competition with OpenAI

Michael HuangJan 22, 2023, 2:13 AM
173 points
8 comments1 min readEA link
(www.nytimes.com)

Future Matters #8: Bing Chat, AI labs on safety, and pausing Future Matters

PabloMar 21, 2023, 2:50 PM
81 points
5 comments24 min readEA link

AGI Catastrophe and Takeover: Some Reference Class-Based Priors

zdgroffMay 24, 2023, 7:14 PM
103 points
10 comments6 min readEA link

Large Language Models as Fiduciaries to Humans

johnjnayJan 24, 2023, 7:53 PM
25 points
0 comments34 min readEA link
(papers.ssrn.com)

Shutting down all competing AI projects might not buy a lot of time due to Internal Time Pressure

ThomasCederborgOct 3, 2024, 12:05 AM
6 points
1 comment12 min readEA link

How to Give Coming AGI’s the Best Chance of Figuring Out Ethics for Us

Sean SweeneyMay 23, 2024, 7:44 PM
1 point
1 comment10 min readEA link

Please wonder about the hard parts of the alignment problem

MikhailSaminJul 11, 2023, 5:02 PM
8 points
0 comments1 min readEA link

Joining the Carnegie Endowment for International Peace

Holden KarnofskyApr 29, 2024, 3:45 PM
228 points
14 comments2 min readEA link

2024: a year of consolidation for ORCG

JorgeTorresCDec 18, 2024, 5:47 PM
33 points
0 comments7 min readEA link
(www.orcg.info)

[Linkpost] 538 Politics Podcast on AI risk & politics

jackvaApr 11, 2023, 5:03 PM
64 points
5 comments1 min readEA link
(fivethirtyeight.com)

Why AGI systems will not be fanatical maximisers (unless trained by fanatical humans)

titotalMay 17, 2023, 11:58 AM
43 points
3 comments15 min readEA link

Partial Transcript of Recent Senate Hearing Discussing AI X-Risk

Daniel_EthJul 27, 2023, 9:16 AM
150 points
2 comments22 min readEA link
(medium.com)

Trendlines in AIxBio evals

ljustenOct 31, 2024, 12:09 AM
39 points
2 comments11 min readEA link
(www.lennijusten.com)

Solving adversarial attacks in computer vision as a baby version of general AI alignment

Stanislav FortAug 31, 2024, 4:15 PM
3 points
1 comment7 min readEA link

Corporate campaigns work: a key learning for AI Safety

Jamie_HarrisAug 17, 2023, 9:35 PM
72 points
12 comments6 min readEA link

AISC 2024 - Project Summaries

Nicky PochinkovNov 27, 2023, 10:35 PM
13 points
1 comment18 min readEA link

Data Taxation: A Proposal for Slowing Down AGI Progress

Per Ivar FriborgApr 11, 2023, 5:27 PM
42 points
6 comments12 min readEA link

Sam Altman returning as OpenAI CEO “in principle”

Fermi–Dirac DistributionNov 22, 2023, 6:15 AM
55 points
37 comments1 min readEA link

AI Safety Action Plan—A report commissioned by the US State Department

Agustín Covarrubias 🔸Mar 11, 2024, 10:13 PM
25 points
1 comment1 min readEA link
(www.gladstone.ai)

The ‘Neglected Approaches’ Approach: AE Studio’s Alignment Agenda

Marc CarauleanuDec 18, 2023, 9:13 PM
21 points
0 comments12 min readEA link

Did Bengio and Tegmark lose a debate about AI x-risk against LeCun and Mitchell?

Karl von WendtJun 25, 2023, 4:59 PM
80 points
24 comments1 min readEA link

Please don’t criticize EAs who “sell out” to OpenAI and Anthropic

Eevee🔹Mar 5, 2023, 9:17 PM
−4 points
21 comments2 min readEA link

Brain-computer interfaces and brain organoids in AI alignment?

freedomandutilityApr 15, 2023, 10:28 PM
8 points
2 comments1 min readEA link

“Aligned with who?” Results of surveying 1,000 US participants on AI values

Holly MorganMar 21, 2023, 10:07 PM
41 points
0 comments2 min readEA link
(www.lesswrong.com)

[Question] Concrete, existing examples of high-impact risks from AI?

freedomandutilityApr 15, 2023, 10:19 PM
9 points
1 comment1 min readEA link

Some Things I Heard about AI Governance at EAG

utilistrutilFeb 28, 2023, 9:27 PM
35 points
5 comments6 min readEA link

A freshman year during the AI midgame: my approach to the next year

BuckApr 14, 2023, 12:38 AM
179 points
30 comments7 min readEA link

How to help crucial AI safety legislation pass with 10 minutes of effort

ThomasWSep 11, 2024, 7:14 PM
258 points
33 comments3 min readEA link

Whether you should do a PhD doesn’t depend much on timelines.

alex lawsenMar 22, 2023, 12:25 PM
67 points
7 comments4 min readEA link

Success without dignity: a nearcasting story of avoiding catastrophe by luck

Holden KarnofskyMar 15, 2023, 8:17 PM
113 points
3 comments1 min readEA link

List of AI safety newsletters and other resources

LizkaMay 1, 2023, 5:24 PM
49 points
5 comments4 min readEA link

A Roundtable for Safe AI (RSAI)?

Lara_THMar 9, 2023, 12:11 PM
9 points
0 comments4 min readEA link

Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public

OttoMar 9, 2023, 10:40 AM
97 points
11 comments4 min readEA link

Reframing the burden of proof: Companies should prove that models are safe (rather than expecting auditors to prove that models are dangerous)

AkashApr 25, 2023, 6:49 PM
35 points
1 comment1 min readEA link

How we could stumble into AI catastrophe

Holden KarnofskyJan 16, 2023, 2:52 PM
83 points
0 comments31 min readEA link
(www.cold-takes.com)

Exploring Metaculus’s AI Track Record

Peter ScoblicMay 1, 2023, 9:02 PM
52 points
5 comments5 min readEA link

A great talk for AI noobs (according to an AI noob)

DovApr 23, 2023, 5:32 AM
8 points
0 comments1 min readEA link
(www.youtube.com)

AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now

Greg_ColbournMay 2, 2023, 10:17 AM
68 points
35 comments13 min readEA link

Misalignment Museum opens in San Francisco: ‘Sorry for killing most of humanity’

Michael HuangMar 4, 2023, 7:09 AM
99 points
6 comments1 min readEA link
(www.misalignmentmuseum.com)

Orthogonal: A new agent foundations alignment organization

Tamsin LeakeApr 19, 2023, 8:17 PM
38 points
0 comments1 min readEA link

AI Progress: The Game Show

Alex ArnettApr 21, 2023, 4:47 PM
3 points
0 comments2 min readEA link

Research agenda: Supervising AIs improving AIs

Quintin PopeApr 29, 2023, 5:09 PM
16 points
0 comments1 min readEA link

World and Mind in Artificial Intelligence: arguments against the AI pause

Arturo MaciasApr 18, 2023, 2:35 PM
6 points
3 comments5 min readEA link

Risk of AI deceleration.

Micah ZoltuApr 18, 2023, 11:19 AM
9 points
14 comments3 min readEA link

The basic reasons I expect AGI ruin

RobBensingerApr 18, 2023, 3:37 AM
58 points
13 comments1 min readEA link

“Risk Awareness Moments” (Rams): A concept for thinking about AI governance interventions

oegApr 14, 2023, 5:40 PM
53 points
0 comments9 min readEA link

ChatGPT not so clever or not so artificial as hyped to be?

Haris ShekerisMar 2, 2023, 6:16 AM
−7 points
2 comments1 min readEA link

[linkpost] “What Are Reasonable AI Fears?” by Robin Hanson, 2023-04-23

Arjun PanicksseryApr 14, 2023, 11:26 PM
41 points
3 comments4 min readEA link
(quillette.com)

[Question] Who is testing AI Safety public outreach messaging?

yanni kyriacosApr 15, 2023, 12:53 AM
20 points
2 comments1 min readEA link

AGI Safety Needs People With All Skillsets!

SeverinJul 25, 2022, 1:30 PM
33 points
7 comments2 min readEA link

UK Government announces £100 million in funding for Foundation Model Taskforce.

Jordan Pieters 🔸Apr 25, 2023, 11:29 AM
10 points
1 comment1 min readEA link
(www.gov.uk)

Sentinel minutes for week #52/2024

NunoSempereDec 30, 2024, 6:25 PM
61 points
0 comments6 min readEA link
(blog.sentinel-team.org)

Navigating the Open-Source AI Landscape: Data, Funding, and Safety

AndreFerrettiApr 12, 2023, 10:30 AM
23 points
3 comments10 min readEA link

The standard case for delaying AI appears to rest on non-utilitarian assumptions

Matthew_BarnettFeb 11, 2025, 4:04 AM
18 points
34 comments10 min readEA link

Measuring artificial intelligence on human benchmarks is naive

Ward AApr 11, 2023, 11:28 AM
3 points
2 comments1 min readEA link

Existential risk x Crypto: An unconference at Zuzalu

YeshApr 11, 2023, 1:31 PM
6 points
0 comments1 min readEA link

How major governments can help with the most important century

Holden KarnofskyFeb 24, 2023, 7:37 PM
56 points
4 comments4 min readEA link
(www.cold-takes.com)

How can OSINT be used for the enforcement of the EU AI Act?

KristinaJun 7, 2024, 11:07 AM
8 points
1 comment1 min readEA link

How to pursue a career in technical AI alignment

Charlie Rogers-SmithJun 4, 2022, 9:36 PM
265 points
9 comments39 min readEA link

Is fear productive when communicating AI x-risk? [Study results]

Johanna RonigerJan 22, 2024, 5:38 AM
73 points
10 comments5 min readEA link

Gaia Network: An Illustrated Primer

Roman LeventovJan 26, 2024, 11:55 AM
4 points
4 comments15 min readEA link

Standard policy frameworks for AI governance

Nathan_BarnardJan 30, 2024, 6:14 PM
26 points
2 comments3 min readEA link

Mapping How Alliances, Acquisitions, and Antitrust are Shaping the Frontier AI Industry

t6aguirreJun 3, 2024, 9:43 AM
24 points
1 comment2 min readEA link

AI Safety Arguments: An Interactive Guide

Lukas Trötzmüller🔸Feb 1, 2023, 7:21 PM
32 points
5 comments3 min readEA link

Help us find pain points in AI safety

Esben KranApr 12, 2022, 6:43 PM
31 points
4 comments9 min readEA link

[US] NTIA: AI Accountability Policy Request for Comment

Kyle J. LuccheseApr 13, 2023, 4:12 PM
47 points
4 comments1 min readEA link
(ntia.gov)

[Question] Do you worry about totalitarian regimes using AI Alignment technology to create AGI that subscribe to their values?

diodio_yangFeb 28, 2023, 6:12 PM
25 points
12 comments2 min readEA link

What does Bing Chat tell us about AI risk?

Holden KarnofskyFeb 28, 2023, 6:47 PM
99 points
8 comments2 min readEA link
(www.cold-takes.com)

This might be the last AI Safety Camp

RemmeltJan 24, 2024, 9:29 AM
87 points
32 comments1 min readEA link

AI and Work: Summarising a New Literature Review

cpeppiattJul 15, 2024, 10:27 AM
13 points
0 comments2 min readEA link
(arxiv.org)

Prospects for AI safety agreements between countries

oegApr 14, 2023, 5:41 PM
104 points
3 comments22 min readEA link

Overview of introductory resources in AI Governance

Lucie Philippon 🔸May 27, 2024, 4:22 PM
26 points
1 comment6 min readEA link
(www.lesswrong.com)

Pay to get AI safety info from behind NDA wall?

louisbarclayJun 5, 2024, 10:19 AM
2 points
2 comments1 min readEA link

What can we do now to prepare for AI sentience, in order to protect them from the global scale of human sadism?

rimeApr 18, 2023, 9:58 AM
44 points
0 comments2 min readEA link

New Artificial Intelligence quiz: can you beat ChatGPT?

AndreFerrettiMar 3, 2023, 3:46 PM
29 points
3 comments1 min readEA link

AI Safety Newsletter #2: ChaosGPT, Natural Selection, and AI Safety in the Media

Oliver ZApr 18, 2023, 6:36 PM
56 points
1 comment4 min readEA link
(newsletter.safe.ai)

There are no coherence theorems

EJTFeb 20, 2023, 9:52 PM
107 points
49 comments19 min readEA link

What AI companies can do today to help with the most important century

Holden KarnofskyFeb 20, 2023, 5:40 PM
104 points
8 comments11 min readEA link
(www.cold-takes.com)

Reasons to have hope

Jordan Pieters 🔸Apr 20, 2023, 10:19 AM
53 points
4 comments1 min readEA link

[Closed] MIT FutureTech are hiring for a Head of Operations role

PeterSlatteryOct 2, 2024, 4:51 PM
8 points
0 comments4 min readEA link

Draghi’s report signals a less safety-focused European Union on AI

t6aguirreSep 9, 2024, 6:39 PM
17 points
3 comments1 min readEA link

[Question] Should people get neuroscience phD to work in AI safety field?

jackchang110Mar 7, 2023, 4:21 PM
9 points
11 comments1 min readEA link

PhD Position: AI Interpretability in Berlin, Germany

Martian MoonshineApr 22, 2023, 6:57 PM
24 points
0 comments1 min readEA link
(stephanw.net)

Discussion about AI Safety funding (FB transcript)

AkashApr 30, 2023, 7:05 PM
104 points
10 comments6 min readEA link

[Question] Predictions for future AI governance?

jackchang110Apr 2, 2023, 4:43 PM
4 points
1 comment1 min readEA link

Designing Artificial Wisdom: Decision Forecasting AI & Futarchy

Jordan ArelJul 14, 2024, 5:10 AM
5 points
1 comment6 min readEA link

World’s first major law for artificial intelligence gets final EU green light

Dane ValerieMay 24, 2024, 2:57 PM
3 points
1 comment2 min readEA link
(www.cnbc.com)

[Linkpost] Scott Alexander reacts to OpenAI’s latest post

AkashMar 11, 2023, 10:24 PM
105 points
4 comments1 min readEA link

Pausing AI Developments Isn’t Enough. We Need to Shut it All Down

EliezerYudkowskyApr 9, 2023, 3:53 PM
50 points
3 comments1 min readEA link

Pillars to Convergence

PhlobtonApr 1, 2023, 1:04 PM
1 point
0 comments8 min readEA link

Pessimism about AI Safety

Max_He-HoApr 2, 2023, 7:57 AM
5 points
0 comments25 min readEA link
(www.lesswrong.com)

Updates from Campaign for AI Safety

Jolyn KhooSep 27, 2023, 2:44 AM
16 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

What’s new at FAR AI

AdamGleaveDec 4, 2023, 9:18 PM
68 points
0 comments1 min readEA link
(far.ai)

Mitigating extreme AI risks amid rapid progress [Linkpost]

AkashMay 21, 2024, 8:04 PM
36 points
1 comment1 min readEA link

Episode: Austin vs Linch on OpenAI

AustinMay 25, 2024, 4:15 PM
22 points
2 comments44 min readEA link
(manifund.substack.com)

Chaining the evil genie: why “outer” AI safety is probably easy

titotalAug 30, 2022, 1:55 PM
40 points
12 comments10 min readEA link

The two-tiered society

Roman LeventovMay 13, 2024, 7:53 AM
14 points
5 comments1 min readEA link

Worrisome misunderstanding of the core issues with AI transition

Roman LeventovJan 18, 2024, 10:05 AM
4 points
3 comments1 min readEA link

Apply to the Cambridge ML for Alignment Bootcamp (CaMLAB) [26 March – 8 April]

hannahFeb 9, 2023, 4:32 PM
62 points
1 comment5 min readEA link

Now THIS is forecasting: understanding Epoch’s Direct Approach

Elliot MckernonMay 4, 2024, 12:06 PM
52 points
2 comments19 min readEA link

In DC, a new wave of AI lobbyists gains the upper hand

Chris LeongMay 13, 2024, 7:31 AM
97 points
7 comments1 min readEA link
(www.politico.com)

MIT FutureTech are hiring for an Operations and Project Management role.

PeterSlatteryMay 17, 2024, 1:29 AM
12 points
0 comments3 min readEA link

Weekly newsletter for AI safety events and training programs

Bryce RobertsonMay 3, 2024, 12:37 AM
15 points
0 comments1 min readEA link
(www.lesswrong.com)

Open-Source AI: A Regulatory Review

Elliot MckernonApr 29, 2024, 10:10 AM
14 points
1 comment8 min readEA link

Technology is Power: Raising Awareness Of Technological Risks

Marc WongFeb 9, 2023, 3:13 PM
3 points
0 comments2 min readEA link

Cybersecurity of Frontier AI Models: A Regulatory Review

Deric ChengApr 25, 2024, 2:51 PM
9 points
1 comment8 min readEA link

Speedrun: AI Alignment Prizes

joeFeb 9, 2023, 11:55 AM
27 points
0 comments18 min readEA link

Research Summary: Forecasting with Large Language Models

Damien LairdApr 2, 2023, 10:52 AM
4 points
0 comments7 min readEA link
(damienlaird.substack.com)

DeepMind: Frontier Safety Framework

Zach Stein-PerlmanMay 17, 2024, 5:30 PM
23 points
0 comments1 min readEA link
(deepmind.google)

14+ AI Safety Advisors You Can Speak to – New AISafety.com Resource

Bryce RobertsonJan 21, 2025, 5:34 PM
18 points
2 comments1 min readEA link

List of projects that seem impactful for AI Governance

JaimeRVJan 14, 2024, 4:52 PM
35 points
2 comments13 min readEA link

How evals might (or might not) prevent catastrophic risks from AI

AkashFeb 7, 2023, 8:16 PM
28 points
0 comments1 min readEA link

Dangerous capability tests should be harder

Luca Righetti 🔸Aug 20, 2024, 4:11 PM
23 points
1 comment5 min readEA link
(www.planned-obsolescence.org)

The new UK government’s stance on AI safety

Elliot MckernonJul 31, 2024, 3:23 PM
19 points
0 comments1 min readEA link

Discussing AI-Human Collaboration Through Fiction: The Story of Laika and GPT-∞

LaikaJul 27, 2023, 6:04 AM
1 point
0 comments1 min readEA link

Conscious AI & Public Perception: Four futures

nicoleta-kJul 3, 2024, 11:06 PM
12 points
1 comment16 min readEA link

Poster Session on AI Safety

Neil CrawfordNov 12, 2022, 3:50 AM
8 points
0 comments4 min readEA link

‘The AI Dilemma: Growth vs Existential Risk’: An Extension for EAs and a Summary for Non-economists

TomHouldenApr 21, 2024, 4:28 PM
65 points
1 comment16 min readEA link

Introduction to Pragmatic AI Safety [Pragmatic AI Safety #1]

TW123May 9, 2022, 5:02 PM
68 points
0 comments6 min readEA link

My Feedback to the UN Advisory Body on AI

Heramb PodarApr 4, 2024, 11:39 PM
7 points
1 comment4 min readEA link

Report: Evaluating an AI Chip Registration Policy

Deric ChengApr 12, 2024, 4:40 AM
15 points
0 comments5 min readEA link
(www.convergenceanalysis.org)

Apply to Aether—Independent LLM Agent Safety Research Group

RohanSAug 21, 2024, 9:40 AM
47 points
13 comments8 min readEA link

An even deeper atheism

Joe_CarlsmithJan 11, 2024, 5:28 PM
26 points
2 comments1 min readEA link

Reza Negarestani’s Intelligence & Spirit

ukc10014Jun 27, 2024, 6:17 PM
7 points
1 comment4 min readEA link

What do XPT forecasts tell us about AI risk?

Forecasting Research InstituteJul 19, 2023, 7:43 AM
97 points
21 comments14 min readEA link

MIRI 2024 Mission and Strategy Update

MaloJan 5, 2024, 1:10 AM
154 points
38 comments1 min readEA link

Counterarguments to the basic AI risk case

Katja_GraceOct 14, 2022, 8:30 PM
284 points
23 comments34 min readEA link

Is effective altruism really to blame for the OpenAI debacle?

GarrisonNov 23, 2023, 12:44 AM
13 points
0 comments1 min readEA link
(garrisonlovely.substack.com)

An AI crash is our best bet for restricting AI

RemmeltOct 11, 2024, 2:12 AM
20 points
3 comments1 min readEA link

UNGA Resolution on AI: 5 Key Takeaways Looking to Future Policy

Heramb PodarMar 24, 2024, 12:03 PM
17 points
1 comment3 min readEA link

Towards evidence gap-maps for AI safety

dEAsignJul 25, 2023, 8:13 AM
6 points
1 comment2 min readEA link

Risk-averse Batch Active Inverse Reward Design

Panagiotis LiampasOct 7, 2023, 8:56 AM
11 points
0 comments15 min readEA link

Help the UN design global governance structures for AI

Joanna (Asia) WiaterekJan 12, 2024, 8:44 AM
72 points
2 comments1 min readEA link

UK AI Bill Analysis & Opinion

CAISIDFeb 5, 2024, 12:12 AM
18 points
0 comments15 min readEA link

I am unable to get any AI safety related fellowships or internships.

AavishkarMar 11, 2024, 5:00 AM
5 points
6 comments1 min readEA link

Claude 3.5 Sonnet

Zach Stein-PerlmanJun 20, 2024, 6:00 PM
31 points
0 comments1 min readEA link
(www.anthropic.com)

AI Incident Reporting: A Regulatory Review

Deric ChengMar 11, 2024, 9:02 PM
10 points
1 comment6 min readEA link

Assessment of AI safety agendas: think about the downside risk

Roman LeventovDec 19, 2023, 9:02 AM
6 points
0 comments1 min readEA link

“The Universe of Minds”—call for reviewers (Seeds of Science)

rogersbacon1Jul 25, 2023, 4:55 PM
4 points
0 comments1 min readEA link

Claude 3 claims it’s conscious, doesn’t want to die or be modified

MikhailSaminMar 4, 2024, 11:05 PM
8 points
3 comments1 min readEA link

Literature review of Transformative Artificial Intelligence timelines

Jaime SevillaJan 27, 2023, 8:36 PM
148 points
10 comments1 min readEA link

OpenAI’s Superalignment team has opened Fast Grants

YadavDec 16, 2023, 3:41 PM
31 points
2 comments1 min readEA link
(openai.com)

Bringing about animal-inclusive AI

Max TaylorDec 18, 2023, 11:49 AM
121 points
9 comments16 min readEA link

Join the AI Evaluation Tasks Bounty Hackathon

Esben KranMar 18, 2024, 8:15 AM
20 points
0 comments4 min readEA link

[Question] Why haven’t we been destroyed by a power-seeking AGI from elsewhere in the universe?

Jadon SchmittJul 22, 2023, 7:21 AM
35 points
14 comments1 min readEA link

AISN#15: China and the US take action to regulate AI, results from a tournament forecasting AI risk, updates on xAI’s plan, and Meta releases its open-source and commercially available Llama 2

Center for AI SafetyJul 19, 2023, 1:40 AM
5 points
0 comments6 min readEA link
(newsletter.safe.ai)

An Introduction to Critiques of prominent AI safety organizations

OmegaJul 19, 2023, 6:53 AM
87 points
2 comments5 min readEA link

(Even) More Early-Career EAs Should Try AI Safety Technical Research

tlevinJun 30, 2022, 9:14 PM
86 points
40 comments11 min readEA link

AI-Relevant Regulation: Insurance in Safety-Critical Industries

SWKJul 22, 2023, 5:52 PM
5 points
0 comments6 min readEA link

AI Policy Insights from the AIMS Survey

Janet PauketatFeb 22, 2024, 7:17 PM
10 points
1 comment18 min readEA link
(www.sentienceinstitute.org)

Apply to MATS 7.0!

Ryan KiddSep 21, 2024, 12:23 AM
27 points
0 comments1 min readEA link

AI Risk and Survivorship Bias—How Andreessen and LeCun got it wrong

stepanlosJul 14, 2023, 5:10 PM
5 points
1 comment6 min readEA link

A fictional AI law laced w/ alignment theory

MiguelJul 17, 2023, 3:26 AM
3 points
0 comments2 min readEA link

Help us seed AI Safety Brussels

gergoAug 7, 2024, 6:17 AM
50 points
2 comments3 min readEA link

An economist’s perspective on AI safety

David StinsonJun 7, 2024, 7:55 AM
7 points
1 comment9 min readEA link

Cambridge AI Safety Hub is looking for full- or part-time organisers

hannahJul 15, 2023, 2:31 PM
12 points
0 comments1 min readEA link

Advocating for Public Ownership of Future AGI: Preserving Humanity’s Collective Heritage

George_A (Digital Intelligence Rights Initiative) Jul 14, 2023, 4:01 PM
−10 points
2 comments4 min readEA link

Updates from Campaign for AI Safety

Jolyn KhooJul 19, 2023, 8:15 AM
5 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

Non-trivial Fellowship Project: Towards a Unified Dangerous Capabilities Benchmark

Jord Mar 4, 2024, 9:24 AM
2 points
1 comment9 min readEA link

Current paths to impact in EU AI Policy (Feb ’24)

JOMG_MonnetFeb 12, 2024, 3:57 PM
47 points
0 comments5 min readEA link

[Question] How independent is the research coming out of OpenAI’s preparedness team?

EarthlingFeb 10, 2024, 4:59 PM
18 points
0 comments1 min readEA link

[Linkpost] A Narrow Path—How to Secure our Future

MathiasKB🔸Oct 2, 2024, 10:50 PM
63 points
0 comments1 min readEA link
(www.narrowpath.co)

An argument for accelerating international AI governance research (part 1)

MattThinksAug 16, 2023, 5:40 AM
9 points
0 comments3 min readEA link

Sam Altman’s Chip Ambitions Undercut OpenAI’s Safety Strategy

GarrisonFeb 10, 2024, 7:52 PM
286 points
20 comments3 min readEA link
(garrisonlovely.substack.com)

Modelling large-scale cyber attacks from advanced AI systems with Advanced Persistent Threats

Iyngkarran KumarOct 2, 2023, 9:54 AM
28 points
2 comments30 min readEA link

Thoughts on the AI Safety Summit company policy requests and responses

So8resOct 31, 2023, 11:54 PM
42 points
3 comments1 min readEA link

(How) Is technical AI Safety research being evaluated?

JohnSnowJul 11, 2023, 9:37 AM
27 points
1 comment1 min readEA link

Beginner’s guide to reducing s-risks [link-post]

Center on Long-Term RiskOct 17, 2023, 12:51 AM
129 points
3 comments3 min readEA link
(longtermrisk.org)

Tort Law Can Play an Important Role in Mitigating AI Risk

Gabriel WeilFeb 12, 2024, 5:11 PM
99 points
6 comments5 min readEA link

AI-Relevant Regulation: IAEA

SWKJul 15, 2023, 6:20 PM
10 points
0 comments5 min readEA link

Paradigms and Theory Choice in AI: Adaptivity, Economy and Control

particlemaniaAug 28, 2023, 10:44 PM
3 points
0 comments16 min readEA link

AI-Relevant Regulation: CERN

SWKJul 15, 2023, 6:40 PM
12 points
0 comments6 min readEA link

[Question] What am I missing re. open-source LLM’s?

another-anon-do-gooderDec 4, 2023, 4:48 AM
1 point
2 comments1 min readEA link

Deep Deceptiveness

So8resMar 21, 2023, 2:51 AM
40 points
1 comment1 min readEA link

A simple way of exploiting AI’s coming economic impact may be highly-impactful

kuiraJul 16, 2023, 10:30 AM
5 points
0 comments2 min readEA link
(www.lesswrong.com)

AI Wellbeing

Simon Jul 11, 2023, 12:34 AM
11 points
0 comments9 min readEA link

Ask AI companies about what they are doing for AI safety?

micMar 8, 2022, 9:54 PM
44 points
1 comment2 min readEA link

Updates from Campaign for AI Safety

Jolyn KhooOct 31, 2023, 5:46 AM
14 points
1 comment2 min readEA link
(www.campaignforaisafety.org)

Assessing the Dangerousness of Malevolent Actors in AGI Governance: A Preliminary Exploration

Callum HinchcliffeOct 14, 2023, 9:18 PM
28 points
4 comments9 min readEA link

UK Foundation Model Task Force—Expression of Interest

ojorgensenJun 18, 2023, 9:40 AM
111 points
3 comments1 min readEA link
(twitter.com)

Should you work at a leading AI lab? (including in non-safety roles)

Benjamin HiltonJul 25, 2023, 4:28 PM
38 points
13 comments12 min readEA link

[Question] Could someone help me understand why it’s so difficult to solve the alignment problem?

Jadon SchmittJul 22, 2023, 4:39 AM
35 points
21 comments1 min readEA link

Announcing Athena—Women in AI Alignment Research

Claire ShortNov 7, 2023, 10:02 PM
180 points
28 comments3 min readEA link

My Objections to “We’re All Gonna Die with Eliezer Yudkowsky”

Quintin PopeMar 21, 2023, 1:23 AM
166 points
21 comments39 min readEA link

News: Spanish AI image outcry + US AI workforce “regulation”

Benevolent_RainSep 26, 2023, 7:43 AM
9 points
0 comments1 min readEA link

Australians are concerned about AI risks and expect strong government action

Alexander SaeriMar 8, 2024, 6:39 AM
38 points
12 comments5 min readEA link
(aigovernance.org.au)

Biological superintelligence: a solution to AI safety

YarrowDec 4, 2023, 1:09 PM
0 points
6 comments1 min readEA link

Dr Altman or: How I Learned to Stop Worrying and Love the Killer AI

Barak GilaMar 11, 2024, 5:01 AM
−7 points
0 comments2 min readEA link

[Question] Know a grad student studying AI’s economic impacts?

Madhav MalhotraJul 5, 2023, 12:07 AM
7 points
0 comments1 min readEA link

The Multidisciplinary Approach to Alignment (MATA) and Archetypal Transfer Learning (ATL)

MiguelJun 19, 2023, 3:23 AM
4 points
0 comments7 min readEA link

We Should Talk About This More. Epistemic World Collapse as Imminent Safety Risk of Generative AI.

Jörg WeißNov 16, 2023, 8:34 AM
4 points
0 comments29 min readEA link

Potential employees have a unique lever to influence the behaviors of AI labs

oxalisMar 18, 2023, 8:58 PM
139 points
1 comment5 min readEA link

Neuronpedia—AI Safety Game

johnnylinOct 16, 2023, 9:35 AM
9 points
2 comments4 min readEA link
(neuronpedia.org)

Hashmarks: Privacy-Preserving Benchmarks for High-Stakes AI Evaluation

Paul BricmanDec 4, 2023, 7:41 AM
4 points
0 comments16 min readEA link
(arxiv.org)

Aligning the Aligners: Ensuring Aligned AI acts for the common good of all mankind

timunderwoodJan 16, 2023, 11:13 AM
40 points
2 comments4 min readEA link

LLMs won’t lead to AGI—Francois Chollet

tobycrisford 🔸Jun 11, 2024, 8:19 PM
37 points
23 comments1 min readEA link
(www.youtube.com)

[Question] Why is learning economics, psychology, sociology important for preventing AI risks?

jackchang110Nov 3, 2023, 9:48 PM
3 points
0 comments1 min readEA link

Announcing New Beginner-friendly Book on AI Safety and Risk

Darren McKeeNov 25, 2023, 3:57 PM
114 points
9 comments1 min readEA link

Podcast: Interview series featuring Dr. Peter Park

Jacob-HaimesMar 26, 2024, 12:35 AM
1 point
0 comments2 min readEA link
(into-ai-safety.github.io)

If you are too stressed, walk away from the front lines

Neil WarrenJun 12, 2023, 9:01 PM
7 points
2 comments4 min readEA link

All Tech is Human <-> EA

tae 🔸Dec 3, 2023, 9:01 PM
29 points
0 comments2 min readEA link

There is only one goal or drive—only self-perpetuation counts

freest oneJun 13, 2023, 1:37 AM
2 points
4 comments8 min readEA link

Critiques of prominent AI safety labs: Conjecture

OmegaJun 12, 2023, 5:52 AM
150 points
83 comments32 min readEA link

[Question] How does AI progress affect other EA cause areas?

Luis Mota FreitasJun 9, 2023, 12:43 PM
95 points
13 comments1 min readEA link

The Bar for Contributing to AI Safety is Lower than You Think

Chris LeongAug 17, 2024, 10:52 AM
14 points
5 comments2 min readEA link

Have your say on the future of AI regulation: Deadline approaching for your feedback on UN High-Level Advisory Body on AI Interim Report ‘Governing AI for Humanity’

Deborah W.A. FoulkesMar 29, 2024, 6:37 AM
17 points
1 comment1 min readEA link

What can superintelligent ANI tell us about superintelligent AGI?

Ted SandersJun 12, 2023, 6:32 AM
81 points
20 comments5 min readEA link

Fixing Insider Threats in the AI Supply Chain

Madhav MalhotraOct 7, 2023, 10:49 AM
9 points
2 comments5 min readEA link

The AI Endgame: A counterfactual to AI alignment by an AI Safety newcomer

Andreas PDec 1, 2023, 5:49 AM
2 points
5 comments3 min readEA link

Raising the voices that actually count

Kim HolderJun 13, 2023, 7:21 PM
2 points
3 comments2 min readEA link

A summary of current work in AI governance

constructiveJun 17, 2023, 4:58 PM
87 points
4 comments11 min readEA link

Observations on the funding landscape of EA and AI safety

Vilhelm SkoglundOct 2, 2023, 9:45 AM
136 points
12 comments15 min readEA link

The current alignment plan, and how we might improve it | EAG Bay Area 23

BuckJun 7, 2023, 9:03 PM
66 points
0 comments33 min readEA link

The Risks of AI-Generated Content on the EA Forum

WobblyPanda2Jun 4, 2023, 5:33 AM
−1 points
0 comments1 min readEA link

Epoch is hiring a Product and Data Visualization Designer

merilalamaNov 25, 2023, 12:14 AM
21 points
0 comments4 min readEA link
(careers.rethinkpriorities.org)

Does AI risk “other” the AIs?

Joe_CarlsmithJan 9, 2024, 5:51 PM
23 points
3 comments1 min readEA link

Cooperative AI: Three things that confused me as a beginner (and my current understanding)

C TilliApr 16, 2024, 7:06 AM
56 points
10 comments6 min readEA link

Muddling Along Is More Likely Than Dystopia

Jeffrey HeningerOct 21, 2023, 9:30 AM
87 points
3 comments8 min readEA link
(blog.aiimpacts.org)

OpenAI board received letter warning of powerful AI

JordanStoneNov 23, 2023, 12:16 AM
26 points
2 comments1 min readEA link
(www.reuters.com)

AI companies are not on track to secure model weights

Jeffrey LadishJul 18, 2024, 3:13 PM
73 points
3 comments19 min readEA link

The Game of Dominance

Karl von WendtAug 27, 2023, 11:23 AM
5 points
0 comments6 min readEA link

Automated Parliaments — A Solution to Decision Uncertainty and Misalignment in Language Models

Shak RagolerOct 2, 2023, 9:47 AM
8 points
0 comments17 min readEA link

Catastrophic Risks from Unsafe AI: Navigating a Tightrope Scenario (Ben Garfinkel, EAG London 2023)

Alexander SaeriJun 2, 2023, 9:59 AM
19 points
1 comment10 min readEA link

A compute-based framework for thinking about the future of AI

Matthew_BarnettMay 31, 2023, 10:00 PM
96 points
36 comments19 min readEA link

Safe AI and moral AI

William D'AlessandroJun 1, 2023, 9:18 PM
3 points
0 comments11 min readEA link

AI Safety Newsletter #8: Rogue AIs, how to screen for AI risks, and grants for research on democratic governance of AI

Center for AI SafetyMay 30, 2023, 11:44 AM
16 points
3 comments6 min readEA link
(newsletter.safe.ai)

Explorers in a virtual country: Navigating the knowledge landscape of large language models

Alexander SaeriMar 28, 2023, 9:32 PM
17 points
1 comment6 min readEA link

ChatGPT: towards AI subjectivity

KrisDAmatoMay 1, 2024, 10:13 AM
3 points
0 comments1 min readEA link
(link.springer.com)

Digital people could make AI safer

GMcGowanJun 10, 2022, 3:29 PM
24 points
15 comments4 min readEA link
(www.mindlessalgorithm.com)

Key takeaways from our EA and alignment research surveys

Cameron BergMay 4, 2024, 3:51 PM
64 points
21 comments21 min readEA link

Primitive Global Discourse Framework, Constitutional AI using legal frameworks, and Monoculture—A loss of control over the role of AGI in society

broptrossJun 1, 2023, 5:12 AM
2 points
0 comments12 min readEA link

Without a trajectory change, the development of AGI is likely to go badly

Max HMay 30, 2023, 12:21 AM
1 point
0 comments13 min readEA link

AISN #35: Lob­by­ing on AI Reg­u­la­tion Plus, New Models from OpenAI and Google, and Le­gal Regimes for Train­ing on Copy­righted Data

Center for AI SafetyMay 16, 2024, 2:26 PM
14 points
0 comments6 min readEA link
(newsletter.safe.ai)

Boomerang—pro­to­col to dis­solve some com­mit­ment races

Filip SondejMay 30, 2023, 4:24 PM
20 points
0 comments8 min readEA link
(www.lesswrong.com)

AI, Cy­ber­se­cu­rity, and Malware: A Shal­low Re­port [Gen­eral]

Madhav MalhotraMar 31, 2023, 12:01 PM
5 points
0 comments8 min readEA link

We are fight­ing a shared bat­tle (a call for a differ­ent ap­proach to AI Strat­egy)

Gideon FutermanMar 16, 2023, 2:37 PM
59 points
11 comments15 min readEA link

AI, Cy­ber­se­cu­rity, and Malware: A Shal­low Re­port [Tech­ni­cal]

Madhav MalhotraMar 31, 2023, 12:03 PM
4 points
0 comments9 min readEA link

AGI de­vel­op­ment role-play­ing game

rekahalaszDec 11, 2023, 10:22 AM
4 points
0 comments1 min readEA link

My Proven AI Safety Ex­pla­na­tion (as a com­put­ing stu­dent)

Mica WhiteFeb 6, 2024, 3:58 AM
8 points
4 comments6 min readEA link

Sta­tus Quo Eng­ines—AI essay

Ilana_Goldowitz_JimenezMay 28, 2023, 2:33 PM
1 point
0 comments15 min readEA link

I de­signed an AI safety course (for a philos­o­phy de­part­ment)

Eleni_ASep 23, 2023, 9:56 PM
27 points
3 comments2 min readEA link

Pos­si­ble OpenAI’s Q* break­through and Deep­Mind’s AlphaGo-type sys­tems plus LLMs

BurnydelicNov 23, 2023, 7:02 AM
13 points
4 comments2 min readEA link

An­nounc­ing: Mechanism De­sign for AI Safety—Read­ing Group

Rubi J. HudsonAug 9, 2022, 4:25 AM
36 points
1 comment4 min readEA link

What to think when a lan­guage model tells you it’s sentient

rgbFeb 20, 2023, 2:59 AM
112 points
18 comments6 min readEA link

[Question] Would an An­thropic/​OpenAI merger be good for AI safety?

MNov 22, 2023, 8:21 PM
6 points
1 comment1 min readEA link

It’s not ob­vi­ous that get­ting dan­ger­ous AI later is better

Aaron_ScherSep 23, 2023, 5:35 AM
23 points
9 comments16 min readEA link

AGI mis­al­ign­ment x-risk may be lower due to an over­looked goal speci­fi­ca­tion technology

johnjnayOct 21, 2022, 2:03 AM
20 points
1 comment1 min readEA link

Why Would AI “Aim” To Defeat Hu­man­ity?

Holden KarnofskyNov 29, 2022, 6:59 PM
24 points
0 comments32 min readEA link
(www.cold-takes.com)

[Linkpost] Be­ware the Squir­rel by Ver­ity Harding

EarthlingSep 3, 2023, 9:04 PM
1 point
1 comment2 min readEA link
(samf.substack.com)

An­nounc­ing Hu­man-al­igned AI Sum­mer School

Jan_KulveitMay 22, 2024, 8:55 AM
33 points
0 comments1 min readEA link
(humanaligned.ai)

[Linkpost] Longter­mists Are Push­ing a New Cold War With China

Radical Empath IsmamMay 27, 2023, 6:53 AM
37 points
16 comments1 min readEA link
(jacobin.com)

Oc­to­ber 2022 AI Risk Com­mu­nity Sur­vey Results

FroolowMay 24, 2023, 10:37 AM
19 points
0 comments7 min readEA link

A Viral Li­cense for AI Safety

IvanVendrovJun 5, 2021, 2:00 AM
30 points
6 comments5 min readEA link

AI Safety & Risk Din­ner w/​ En­trepreneur First CEO & ARIA Chair, Matt Clifford in New York

SimonPastorNov 28, 2023, 7:45 PM
2 points
0 comments1 min readEA link

[CFP] NeurIPS work­shop: AI meets Mo­ral Philos­o­phy and Mo­ral Psychology

jaredlcmSep 4, 2023, 6:21 AM
10 points
1 comment4 min readEA link

What we’re miss­ing: the case for struc­tural risks from AI

Justin OliveNov 9, 2023, 5:52 AM
31 points
3 comments6 min readEA link

MATS Sum­mer 2023 Retrospective

utilistrutilDec 2, 2023, 12:12 AM
28 points
3 comments1 min readEA link

You Can’t Prove Aliens Aren’t On Their Way To De­stroy The Earth (A Com­pre­hen­sive Take­down Of The Doomer View Of AI)

MurphyApr 7, 2023, 1:37 PM
−31 points
7 comments9 min readEA link

[Linkpost] OpenAI lead­ers call for reg­u­la­tion of “su­per­in­tel­li­gence” to re­duce ex­is­ten­tial risk.

Lowe LundinMay 25, 2023, 2:14 PM
5 points
0 comments1 min readEA link

Diminish­ing Re­turns in Ma­chine Learn­ing Part 1: Hard­ware Devel­op­ment and the Phys­i­cal Frontier

Brian ChauMay 27, 2023, 12:39 PM
16 points
3 comments12 min readEA link
(www.fromthenew.world)

In­trin­sic limi­ta­tions of GPT-4 and other large lan­guage mod­els, and why I’m not (very) wor­ried about GPT-n

James FodorJun 3, 2023, 1:09 PM
28 points
3 comments11 min readEA link

The case for more am­bi­tious lan­guage model evals

JozdienJan 30, 2024, 9:24 AM
7 points
0 comments5 min readEA link

Bi­den-Har­ris Ad­minis­tra­tion An­nounces First-Ever Con­sor­tium Ded­i­cated to AI Safety

ben.smithFeb 9, 2024, 6:40 AM
15 points
1 comment1 min readEA link
(www.nist.gov)

Trans­for­ma­tive AI and Com­pute—Read­ing List

Frederik BergSep 4, 2023, 6:21 AM
24 points
0 comments1 min readEA link
(docs.google.com)

AI Safety Camp 2024

Linda LinseforsNov 18, 2023, 10:37 AM
21 points
1 comment1 min readEA link
(aisafety.camp)

Unions for AI safety?

dEAsignSep 24, 2023, 12:13 AM
7 points
12 comments2 min readEA link

Five ne­glected work ar­eas that could re­duce AI risk

Aaron_ScherSep 24, 2023, 2:09 AM
22 points
0 comments9 min readEA link

AI Align­ment in The New Yorker

Eleni_AMay 17, 2023, 9:19 PM
23 points
0 comments1 min readEA link
(www.newyorker.com)

GovAI: Towards best prac­tices in AGI safety and gov­er­nance: A sur­vey of ex­pert opinion

Zach Stein-PerlmanMay 15, 2023, 1:42 AM
68 points
3 comments1 min readEA link

Up­dates from Cam­paign for AI Safety

Jolyn KhooAug 30, 2023, 5:36 AM
7 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

“The Race to the End of Hu­man­ity” – Struc­tural Uncer­tainty Anal­y­sis in AI Risk Models

FroolowMay 19, 2023, 12:03 PM
48 points
4 comments21 min readEA link

AI safety and con­scious­ness re­search: A brainstorm

Daniel_FriedrichMar 15, 2023, 2:33 PM
11 points
1 comment9 min readEA link

A note of cau­tion on be­liev­ing things on a gut level

Nathan_BarnardMay 9, 2023, 12:20 PM
41 points
5 comments2 min readEA link

[Question] Would a su­per-in­tel­li­gent AI nec­es­sar­ily sup­port its own ex­is­tence?

Porque?Jun 25, 2023, 10:39 AM
8 points
2 comments2 min readEA link

You don’t need to be a ge­nius to be in AI safety research

Claire ShortMay 10, 2023, 10:23 PM
28 points
4 comments6 min readEA link

Align­ment, Goals, & The Gut-Head Gap: A Re­view of Ngo. et al

Violet HourMay 11, 2023, 5:16 PM
26 points
0 comments13 min readEA link

Sum­mary of Si­tu­a­tional Aware­ness—The Decade Ahead

OscarD🔸Jun 8, 2024, 11:29 AM
143 points
5 comments18 min readEA link

Aim for con­di­tional pauses

AnonResearcherMajorAILabSep 25, 2023, 1:05 AM
100 points
42 comments12 min readEA link

“Pivotal Act” In­ten­tions: Nega­tive Con­se­quences and Fal­la­cious Arguments

Andrew CritchApr 19, 2022, 8:24 PM
80 points
10 comments7 min readEA link

How quickly AI could trans­form the world (Tom David­son on The 80,000 Hours Pod­cast)

80000_HoursMay 8, 2023, 1:23 PM
82 points
3 comments17 min readEA link

AI policy & gov­er­nance in Aus­tralia: notes from an ini­tial discussion

Alexander SaeriMay 15, 2023, 12:00 AM
31 points
1 comment3 min readEA link

De­com­pos­ing al­ign­ment to take ad­van­tage of paradigms

Christopher KingJun 4, 2023, 2:26 PM
2 points
0 comments4 min readEA link

[Question] Is work­ing on AI to help democ­racy a good idea?

WillPearsonFeb 17, 2024, 11:15 PM
5 points
3 comments1 min readEA link

Risk Align­ment in Agen­tic AI Systems

Hayley ClatterbuckOct 1, 2024, 10:51 PM
31 points
1 comment3 min readEA link
(static1.squarespace.com)

Peter Eck­er­sley (1979-2022)

technicalitiesSep 3, 2022, 10:45 AM
497 points
9 comments1 min readEA link

How MATS ad­dresses “mass move­ment build­ing” concerns

Ryan KiddMay 4, 2023, 12:55 AM
79 points
4 comments1 min readEA link

We’re all in this together

Tamsin LeakeDec 5, 2023, 1:57 PM
15 points
1 comment1 min readEA link
(carado.moe)

Four ques­tions I ask AI safety researchers

AkashJul 17, 2022, 5:25 PM
30 points
3 comments1 min readEA link

Giv­ing away copies of Un­con­trol­lable by Dar­ren McKee

Greg_ColbournDec 14, 2023, 5:00 PM
39 points
2 comments1 min readEA link

[Link Post: New York Times] White House Un­veils Ini­ti­a­tives to Re­duce Risks of A.I.

RockwellMay 4, 2023, 2:04 PM
50 points
1 comment2 min readEA link

AI welfare vs. AI rights

Matthew_BarnettFeb 4, 2025, 6:28 PM
33 points
20 comments3 min readEA link

AI gov­er­nance tal­ent pro­files I’d like to see ap­ply for OP funding

JulianHazellDec 19, 2023, 12:34 PM
118 points
4 comments3 min readEA link
(www.openphilanthropy.org)

AI Views Snapshots

RobBensingerDec 13, 2023, 12:45 AM
25 points
0 comments1 min readEA link

Owain Evans on LLMs, Truth­ful AI, AI Com­po­si­tion, and More

Ozzie GooenMay 2, 2023, 1:20 AM
21 points
0 comments1 min readEA link
(quri.substack.com)

Yud­kowsky on AGI risk on the Ban­kless podcast

RobBensingerMar 13, 2023, 12:42 AM
54 points
2 comments75 min readEA link

P(doom|AGI) is high: why the de­fault out­come of AGI is doom

Greg_ColbournMay 2, 2023, 10:40 AM
13 points
28 comments3 min readEA link

How CISA can Sup­port the Se­cu­rity of Large AI Models Against Theft [Grad School As­sign­ment]

Marcel DMay 3, 2023, 3:36 PM
7 points
0 comments13 min readEA link

My cur­rent take on ex­is­ten­tial AI risk [FB post]

Aryeh EnglanderMay 1, 2023, 4:22 PM
10 points
0 comments3 min readEA link

Planes are still decades away from dis­plac­ing most bird jobs

guzeyNov 25, 2022, 4:49 PM
27 points
2 comments1 min readEA link

Apoca­lypse in­surance, and the hardline liber­tar­ian take on AI risk

So8resNov 28, 2023, 2:09 AM
21 points
0 comments1 min readEA link

AI safety logo de­sign con­test, due end of May (ex­tended)

Adrian CiprianiApr 28, 2023, 2:53 AM
13 points
23 comments2 min readEA link

New open let­ter on AI — “In­clude Con­scious­ness Re­search”

Jamie_HarrisApr 28, 2023, 7:50 AM
55 points
1 comment3 min readEA link
(amcs-community.org)

A Guide to Fore­cast­ing AI Science Capabilities

Eleni_AApr 29, 2023, 6:51 AM
19 points
1 comment4 min readEA link

Briefly how I’ve up­dated since ChatGPT

rimeApr 25, 2023, 7:39 PM
29 points
8 comments2 min readEA link
(www.lesswrong.com)

An­nounc­ing the Open Philan­thropy AI Wor­ld­views Contest

Jason SchukraftMar 10, 2023, 2:33 AM
137 points
33 comments3 min readEA link
(www.openphilanthropy.org)

Emerg­ing Tech­nolo­gies: More to explore

EA HandbookJan 1, 2021, 11:06 AM
4 points
0 comments2 min readEA link

A Wind­fall Clause for CEO could worsen AI race dynamics

LarksMar 9, 2023, 6:02 PM
69 points
12 comments7 min readEA link

AI Rights for Hu­man Safety

Matthew_BarnettAug 3, 2024, 12:47 AM
54 points
1 comment1 min readEA link
(papers.ssrn.com)

A Bare­bones Guide to Mechanis­tic In­ter­pretabil­ity Prerequisites

Neel NandaNov 29, 2022, 6:43 PM
54 points
1 comment3 min readEA link
(neelnanda.io)

Max Teg­mark’s new Time ar­ti­cle on how we’re in a Don’t Look Up sce­nario [Linkpost]

Jonas HallgrenApr 25, 2023, 3:47 PM
41 points
0 comments1 min readEA link

The AI in­dus­try turns against its fa­vorite philosophy

Jonathan YanNov 22, 2023, 12:11 AM
14 points
2 comments1 min readEA link
(www.semafor.com)

Archety­pal Trans­fer Learn­ing: a Pro­posed Align­ment Solu­tion that solves the In­ner x Outer Align­ment Prob­lem while adding Cor­rigible Traits to GPT-2-medium

MiguelApr 26, 2023, 12:40 AM
13 points
0 comments10 min readEA link

[Linkpost] ‘The God­father of A.I.’ Leaves Google and Warns of Danger Ahead

imp4rtial 🔸May 1, 2023, 7:54 PM
43 points
3 comments3 min readEA link
(www.nytimes.com)

Questions about Conjecture's CoEm proposal

AkashMar 9, 2023, 7:32 PM
19 points
0 comments1 min readEA link

AI Safety in a World of Vuln­er­a­ble Ma­chine Learn­ing Systems

AdamGleaveMar 8, 2023, 2:40 AM
20 points
0 comments1 min readEA link

Two con­trast­ing mod­els of “in­tel­li­gence” and fu­ture growth

Magnus VindingNov 24, 2022, 11:54 AM
74 points
32 comments22 min readEA link

Stu­dent com­pe­ti­tion for draft­ing a treaty on mora­to­rium of large-scale AI ca­pa­bil­ities R&D

NayanikaApr 24, 2023, 1:15 PM
36 points
4 comments2 min readEA link

“Who Will You Be After ChatGPT Takes Your Job?”

Stephen ThomasApr 21, 2023, 9:31 PM
23 points
4 comments2 min readEA link
(www.wired.com)

Be­fore Alt­man’s Ouster, OpenAI’s Board Was Di­vided and Feuding

Jonathan YanNov 22, 2023, 1:01 AM
25 points
1 comment1 min readEA link
(www.nytimes.com)

Is the time crunch for AI Safety Move­ment Build­ing now?

Chris LeongJun 8, 2022, 12:19 PM
14 points
10 comments3 min readEA link

“Can We Sur­vive Tech­nol­ogy?” by John von Neumann

Eli RoseMar 13, 2023, 2:26 AM
51 points
0 comments1 min readEA link
(geosci.uchicago.edu)

Who Aligns the Align­ment Re­searchers?

ben.smithMar 5, 2023, 11:22 PM
23 points
4 comments1 min readEA link

Power laws in Speedrun­ning and Ma­chine Learning

Jaime SevillaApr 24, 2023, 10:06 AM
48 points
0 comments1 min readEA link

How bad a fu­ture do ML re­searchers ex­pect?

Katja_GraceMar 13, 2023, 5:47 AM
165 points
20 comments1 min readEA link

Play Re­grantor: Move up to $250,000 to Your Top High-Im­pact Pro­jects!

Dawn DrescherMay 17, 2023, 4:51 PM
58 points
2 comments2 min readEA link
(impactmarkets.substack.com)

Paper­clip Club (AI Safety Meetup)

Luke ThorburnApr 20, 2023, 4:04 PM
2 points
0 comments1 min readEA link

Deep­Mind and Google Brain are merg­ing [Linkpost]

AkashApr 20, 2023, 6:47 PM
32 points
1 comment1 min readEA link

[Question] If your AGI x-risk es­ti­mates are low, what sce­nar­ios make up the bulk of your ex­pec­ta­tions for an OK out­come?

Greg_ColbournApr 21, 2023, 11:15 AM
62 points
55 comments1 min readEA link

Quick takes on “AI is easy to con­trol”

So8resDec 2, 2023, 10:33 PM
−12 points
4 comments1 min readEA link

12 ten­ta­tive ideas for US AI policy (Luke Muehlhauser)

LizkaApr 19, 2023, 9:05 PM
117 points
12 comments4 min readEA link
(www.openphilanthropy.org)

Com­ments on OpenAI’s “Plan­ning for AGI and be­yond”

So8resMar 3, 2023, 11:01 PM
115 points
7 comments1 min readEA link

[Video] - How does the EU AI Act Work?

YadavSep 11, 2024, 2:16 PM
10 points
0 comments5 min readEA link

Pivotal Re­search is Hiring Re­search Managers

Tobias HäberliSep 25, 2024, 7:11 PM
8 points
0 comments3 min readEA link

Notes on risk compensation

trammellMay 12, 2024, 6:40 PM
136 points
14 comments21 min readEA link

De­com­pos­ing Agency — ca­pa­bil­ities with­out desires

Owen Cotton-BarrattJul 11, 2024, 9:38 AM
37 points
2 comments12 min readEA link
(strangecities.substack.com)

Pod­cast with Yoshua Ben­gio on Why AI Labs are “Play­ing Dice with Hu­man­ity’s Fu­ture”

GarrisonMay 10, 2024, 5:23 PM
29 points
3 comments2 min readEA link
(garrisonlovely.substack.com)

Brand­ing AI Safety Groups: A Field Guide

Agustín Covarrubias 🔸May 13, 2024, 5:17 PM
44 points
6 comments1 min readEA link

Safety tax functions

Owen Cotton-BarrattOct 20, 2024, 2:13 PM
23 points
1 comment6 min readEA link
(strangecities.substack.com)

GDP per cap­ita in 2050

Hauke HillebrandtMay 6, 2024, 3:14 PM
130 points
11 comments16 min readEA link
(hauke.substack.com)

Epoch AI is Hiring an Oper­a­tions Associate

merilalamaMay 3, 2024, 12:16 AM
5 points
1 comment3 min readEA link
(careers.rethinkpriorities.org)

Biorisk is an Un­helpful Anal­ogy for AI Risk

DavidmanheimMay 6, 2024, 6:18 AM
22 points
4 comments3 min readEA link

Up­dates on the EA catas­trophic risk land­scape

Benjamin_ToddMay 6, 2024, 4:52 AM
194 points
46 comments2 min readEA link

ML4Good is seek­ing part­ner or­gani­sa­tions, in­di­vi­d­ual or­ganisers and TAs

NiaMay 13, 2024, 1:43 PM
22 points
0 comments3 min readEA link

The In­ten­tional Stance, LLMs Edition

Eleni_AMay 1, 2024, 3:22 PM
8 points
2 comments8 min readEA link

Les­sons from the FDA for AI

RemmeltAug 2, 2024, 12:52 AM
6 points
2 comments1 min readEA link
(ainowinstitute.org)

Risks I am Con­cerned About

HappyBunnyApr 29, 2024, 11:41 PM
1 point
1 comment1 min readEA link

AISN #38: Supreme Court De­ci­sion Could Limit Fed­eral Abil­ity to Reg­u­late AI Plus, “Cir­cuit Break­ers” for AI sys­tems, and up­dates on China’s AI industry

Center for AI SafetyJul 9, 2024, 7:29 PM
8 points
0 comments5 min readEA link
(newsletter.safe.ai)

Aspira­tion-based, non-max­i­miz­ing AI agent designs

Bob Jacobs 🔸May 7, 2024, 4:13 PM
12 points
1 comment38 min readEA link

AI Safety is Some­times a Model Property

Cullen 🔸May 2, 2024, 3:38 PM
18 points
1 comment1 min readEA link
(open.substack.com)

Re­lease of UN’s draft re­lated to the gov­er­nance of AI (a sum­mary of the Si­mon In­sti­tute’s re­sponse)

SebastianSchmidtApr 27, 2024, 6:27 PM
22 points
0 comments1 min readEA link

AISC9 has ended and there will be an AISC10

Linda LinseforsApr 29, 2024, 10:53 AM
36 points
0 comments1 min readEA link

AI Safety Newslet­ter #42: New­som Ve­toes SB 1047 Plus, OpenAI’s o1, and AI Gover­nance Summary

Center for AI SafetyOct 1, 2024, 8:33 PM
10 points
0 comments6 min readEA link
(newsletter.safe.ai)

List #2: Why co­or­di­nat­ing to al­ign as hu­mans to not de­velop AGI is a lot eas­ier than, well… co­or­di­nat­ing as hu­mans with AGI co­or­di­nat­ing to be al­igned with humans

RemmeltDec 24, 2022, 9:53 AM
3 points
0 comments1 min readEA link

AI Gover­nance & Strat­egy: Pri­ori­ties, tal­ent gaps, & opportunities

AkashMar 3, 2023, 6:09 PM
21 points
0 comments1 min readEA link

Re­sults of an in­for­mal sur­vey on AI grantmaking

Scott AlexanderAug 21, 2024, 1:19 PM
127 points
28 comments1 min readEA link

Scal­ing of AI train­ing runs will slow down af­ter GPT-5

Maxime_RicheApr 26, 2024, 4:06 PM
10 points
2 comments3 min readEA link

In­tro­duc­ing Align­ment Stress-Test­ing at Anthropic

evhubJan 12, 2024, 11:51 PM
80 points
0 comments1 min readEA link

Is AI fore­cast­ing a waste of effort on the mar­gin?

EmrikNov 5, 2022, 12:41 AM
10 points
6 comments3 min readEA link

Staged release

Zach Stein-PerlmanApr 20, 2024, 1:00 AM
16 points
0 comments1 min readEA link

80,000 hours should re­move OpenAI from the Job Board (and similar EA orgs should do similarly)

RaemonJul 3, 2024, 8:34 PM
263 points
79 comments3 min readEA link

Fron­tier AI sys­tems have sur­passed the self-repli­cat­ing red line

Greg_ColbournDec 10, 2024, 4:33 PM
25 points
14 comments1 min readEA link
(github.com)

Law-Fol­low­ing AI 4: Don’t Rely on Vi­car­i­ous Liability

Cullen 🔸Aug 2, 2022, 11:23 PM
13 points
0 comments3 min readEA link

[Video] Why SB-1047 de­serves a fairer debate

YadavAug 20, 2024, 10:38 AM
15 points
1 comment7 min readEA link

Es­say com­pe­ti­tion on the Au­toma­tion of Wis­dom and Philos­o­phy — $25k in prizes

Owen Cotton-BarrattApr 16, 2024, 10:08 AM
80 points
15 comments8 min readEA link
(blog.aiimpacts.org)

A Gen­tle In­tro­duc­tion to Risk Frame­works Beyond Forecasting

pending_survivalApr 11, 2024, 9:15 AM
81 points
4 comments27 min readEA link

CEA seeks co-founder for AI safety group sup­port spin-off

Agustín Covarrubias 🔸Apr 8, 2024, 3:42 PM
62 points
0 comments4 min readEA link

Imi­ta­tion Learn­ing is Prob­a­bly Ex­is­ten­tially Safe

Vasco Grilo🔸Apr 30, 2024, 5:06 PM
19 points
7 comments3 min readEA link
(www.openphilanthropy.org)

The ar­gu­ment for near-term hu­man dis­em­pow­er­ment through AI

Chris LeongApr 16, 2024, 3:07 AM
31 points
12 comments1 min readEA link
(link.springer.com)

Women in AI Safety Lon­don Meetup

NiaAug 1, 2024, 9:48 AM
2 points
0 comments1 min readEA link

[Question] If AI is in a bub­ble and the bub­ble bursts, what would you do?

RemmeltAug 19, 2024, 10:56 AM
28 points
7 comments1 min readEA link

What suc­cess looks like

mariushobbhahnJun 28, 2022, 2:30 PM
112 points
20 comments19 min readEA link

List #1: Why stop­ping the de­vel­op­ment of AGI is hard but doable

RemmeltDec 24, 2022, 9:52 AM
24 points
2 comments1 min readEA link

Want to work on US emerg­ing tech policy? Con­sider the Hori­zon Fel­low­ship.

ESJul 30, 2024, 11:46 AM
32 points
0 comments1 min readEA link

Scal­ing Laws and Likely Limits to AI

DavidmanheimAug 18, 2024, 5:19 PM
19 points
0 comments3 min readEA link

De­cod­ing Repub­li­can AI Policy: In­sights from 10 Key Ar­ti­cles from Mid-2024

anonymous007Aug 18, 2024, 9:48 AM
5 points
0 comments6 min readEA link

[Question] Suggested read­ings & videos for a new col­lege course on ‘Psy­chol­ogy and AI’?

Geoffrey MillerJan 11, 2024, 10:26 PM
12 points
3 comments1 min readEA link

Com­mu­nity Build­ing for Grad­u­ate Stu­dents: A Tar­geted Approach

Neil CrawfordMar 29, 2022, 7:47 PM
13 points
0 comments3 min readEA link

Cog­ni­tive as­sets and defen­sive acceleration

JulianHazellApr 3, 2024, 2:55 PM
13 points
3 comments4 min readEA link
(muddyclothes.substack.com)

Ap­ply to the 2024 PIBBSS Sum­mer Re­search Fellowship

noraJan 12, 2024, 4:06 AM
37 points
1 comment1 min readEA link

New Me­tac­u­lus Space for AI and X-Risk Re­lated Questions

David Mathers🔸Sep 6, 2024, 11:37 AM
16 points
0 comments1 min readEA link

How do AI welfare and AI safety in­ter­act?

Lucius CaviolaJul 1, 2024, 10:39 AM
77 points
21 comments7 min readEA link
(outpaced.substack.com)

Bryan John­son seems more EA al­igned than I expected

PeterSlatteryApr 22, 2024, 9:38 AM
13 points
27 comments2 min readEA link
(www.youtube.com)

Reflec­tions on my first year of AI safety research

Jay BaileyJan 8, 2024, 7:49 AM
63 points
2 comments12 min readEA link

2023: news on AI safety, an­i­mal welfare, global health, and more

LizkaJan 5, 2024, 9:57 PM
54 points
1 comment12 min readEA link

Sur­vey on in­ter­me­di­ate goals in AI governance

MichaelA🔸Mar 17, 2023, 12:44 PM
155 points
4 comments1 min readEA link

A new­comer’s guide to the tech­ni­cal AI safety field

zeshenNov 4, 2022, 2:29 PM
16 points
0 comments1 min readEA link

Against most, but not all, AI risk analogies

Matthew_BarnettJan 14, 2024, 7:13 PM
43 points
9 comments1 min readEA link

[Question] What is MIRI cur­rently do­ing?

RokoDec 14, 2024, 2:55 AM
9 points
2 comments1 min readEA link

Pri­ori­tis­ing be­tween ex­tinc­tion risks: Ev­i­dence Quality

freedomandutilityDec 30, 2023, 12:25 PM
11 points
0 comments2 min readEA link

Pro­ject ideas: Gover­nance dur­ing ex­plo­sive tech­nolog­i­cal growth

Lukas FinnvedenJan 4, 2024, 7:25 AM
33 points
1 comment16 min readEA link
(lukasfinnveden.substack.com)

AI, cen­tral­iza­tion, and the One Ring

Owen Cotton-BarrattSep 13, 2024, 1:56 PM
18 points
0 comments8 min readEA link
(strangecities.substack.com)

An Ar­gu­ment for Fo­cus­ing on Mak­ing AI go Well

Chris LeongDec 28, 2023, 1:25 PM
13 points
4 comments3 min readEA link

Eric Sch­midt’s blueprint for US tech­nol­ogy strategy

OscarD🔸Oct 15, 2024, 7:54 PM
29 points
4 comments9 min readEA link

Pro­ject ideas: Sen­tience and rights of digi­tal minds

Lukas FinnvedenJan 4, 2024, 7:26 AM
33 points
1 comment20 min readEA link
(lukasfinnveden.substack.com)

Po­si­tions at MITFutureTech

PeterSlatteryDec 19, 2023, 8:28 PM
21 points
1 comment4 min readEA link

En­hanc­ing biose­cu­rity with lan­guage mod­els: defin­ing re­search directions

micMar 26, 2024, 12:30 PM
11 points
1 comment13 min readEA link
(papers.ssrn.com)

The Fu­ture of Work: How Can Poli­cy­mak­ers Pre­pare for AI’s Im­pact on La­bor Mar­kets?

DavidConradJun 24, 2024, 9:43 PM
4 points
1 comment3 min readEA link
(www.lesswrong.com)

[Question] Best giv­ing mul­ti­plier for X-risk/​AI safety?

SiebeRozendalDec 27, 2023, 10:51 AM
7 points
0 comments1 min readEA link

Talk: AI safety field­build­ing at MATS

Ryan KiddJun 23, 2024, 11:06 PM
14 points
1 comment1 min readEA link

More peo­ple get­ting into AI safety should do a PhD

AdamGleaveMar 14, 2024, 10:14 PM
50 points
4 comments1 min readEA link
(gleave.me)

[Question] Who should we give books on AI X-risk to?

yanniDec 18, 2023, 11:57 PM
13 points
1 comment1 min readEA link

AI gov­er­nance and strat­egy: a list of re­search agen­das and work that could be done.

Nathan_BarnardMar 12, 2024, 11:21 AM
33 points
4 comments17 min readEA link

Disen­tan­gling ar­gu­ments for the im­por­tance of AI safety

richard_ngoJan 23, 2019, 2:58 PM
63 points
14 comments8 min readEA link

Ret­ro­spec­tive on the 2022 Con­jec­ture AI Discussions

Andrea_MiottiFeb 24, 2023, 10:41 PM
12 points
1 comment1 min readEA link

Nav­i­gat­ing Risks from Ad­vanced Ar­tifi­cial In­tel­li­gence: A Guide for Philan­thropists [Founders Pledge]

Tom Barnes🔸Jun 21, 2024, 9:48 AM
101 points
7 comments1 min readEA link
(www.founderspledge.com)

On the fu­ture of lan­guage models

Owen Cotton-BarrattDec 20, 2023, 4:58 PM
125 points
3 comments36 min readEA link

“Ar­tifi­cial Gen­eral In­tel­li­gence”: an ex­tremely brief FAQ

Steven ByrnesMar 11, 2024, 5:49 PM
12 points
0 comments1 min readEA link

Chris­ti­ano (ARC) and GA (Con­jec­ture) Dis­cuss Align­ment Cruxes

Andrea_MiottiFeb 24, 2023, 11:03 PM
16 points
1 comment1 min readEA link

De­con­struct­ing Bostrom’s Clas­sic Ar­gu­ment for AI Doom

Nora BelroseMar 11, 2024, 6:03 AM
25 points
0 comments1 min readEA link
(www.youtube.com)

Case stud­ies on so­cial-welfare-based stan­dards in var­i­ous industries

Holden KarnofskyJun 20, 2024, 1:33 PM
73 points
2 comments1 min readEA link

Fif­teen Law­suits against OpenAI

RemmeltMar 9, 2024, 12:22 PM
55 points
5 comments1 min readEA link

[Question] What should the EA/​AI safety com­mu­nity change, in re­sponse to Sam Alt­man’s re­vealed pri­ori­ties?

SiebeRozendalMar 8, 2024, 12:35 PM
30 points
16 comments1 min readEA link

Chain­ing Retroac­tive Fun­ders to Bor­row Against Un­likely Utopias

Dawn DrescherApr 19, 2022, 6:25 PM
24 points
4 comments9 min readEA link
(impactmarkets.substack.com)

AI, An­i­mals, and Digi­tal Minds 2024 - Retrospective

Constance LiJun 19, 2024, 2:56 PM
80 points
8 comments8 min readEA link

The last era of hu­man mistakes

Owen Cotton-BarrattJul 24, 2024, 9:56 AM
23 points
4 comments7 min readEA link
(strangecities.substack.com)

[Question] Any tips on ap­ply­ing for EA fund­ing?

Eevee🔹Sep 22, 2024, 5:11 AM
18 points
4 comments1 min readEA link

AI Safety Newslet­ter #37: US Launches An­titrust In­ves­ti­ga­tions Plus, re­cent crit­i­cisms of OpenAI and An­thropic, and a sum­mary of Si­tu­a­tional Awareness

Center for AI SafetyJun 18, 2024, 6:08 PM
15 points
0 comments5 min readEA link
(newsletter.safe.ai)

Pal­isade is hiring: Exec As­sis­tant, Con­tent Lead, Ops Lead, and Policy Lead

Charlie Rogers-SmithOct 9, 2024, 12:04 AM
15 points
2 comments1 min readEA link

[Question] Has An­thropic already made the ex­ter­nally leg­ible com­mit­ments that it planned to make?

OferMar 12, 2024, 1:45 PM
21 points
3 comments1 min readEA link

AI things that are per­haps as im­por­tant as hu­man-con­trol­led AI

ChiMar 3, 2024, 6:07 PM
113 points
9 comments21 min readEA link

Tak­ing a leave of ab­sence from Open Philan­thropy to work on AI safety

Holden KarnofskyFeb 23, 2023, 7:05 PM
420 points
31 comments2 min readEA link

[Question] Why won’t nan­otech kill us all?

YarrowDec 16, 2023, 11:27 PM
19 points
5 comments1 min readEA link

My ar­ti­cle in The Na­tion — Cal­ifor­nia’s AI Safety Bill Is a Mask-Off Mo­ment for the Industry

GarrisonAug 15, 2024, 7:25 PM
134 points
0 comments1 min readEA link
(www.thenation.com)

Video and tran­script of pre­sen­ta­tion on Oth­er­ness and con­trol in the age of AGI

Joe_CarlsmithOct 8, 2024, 10:30 PM
18 points
1 comment1 min readEA link

Offer­ing AI safety sup­port calls for ML professionals

Vael GatesFeb 15, 2024, 11:48 PM
52 points
1 comment1 min readEA link

In­cu­bat­ing AI x-risk pro­jects: some per­sonal reflections

Ben SnodinDec 19, 2023, 5:03 PM
84 points
10 comments9 min readEA link

List #3: Why not to as­sume on prior that AGI-al­ign­ment workarounds are available

RemmeltDec 24, 2022, 9:54 AM
6 points
0 comments1 min readEA link

The case for more Align­ment Tar­get Anal­y­sis (ATA)

ChiSep 20, 2024, 1:14 AM
21 points
0 comments1 min readEA link

Can the AI af­ford to wait?

Ben Millwood🔸Mar 20, 2024, 7:45 PM
48 points
11 comments7 min readEA link

A tale of 2.5 or­thog­o­nal­ity theses

ArepoMay 1, 2022, 1:53 PM
140 points
31 comments11 min readEA link

On the Dwarkesh/​Chol­let Pod­cast, and the cruxes of scal­ing to AGI

JWS 🔸Jun 15, 2024, 8:24 PM
72 points
49 comments17 min readEA link

[Question] Dan Hendrycks and EA

CarusoAug 3, 2024, 1:49 PM
−1 points
6 comments1 min readEA link

Thoughts on “The Offense-Defense Balance Rarely Changes”

Cullen 🔸Feb 12, 2024, 3:26 AM
42 points
4 comments5 min readEA link

The benefits and risks of op­ti­mism (about AI safety)

Karl von WendtDec 3, 2023, 12:45 PM
3 points
5 comments1 min readEA link

Ar­tifi­cial In­tel­li­gence, Con­scious Machines, and An­i­mals: Broad­en­ing AI Ethics

Group OrganizerSep 21, 2023, 8:58 PM
4 points
0 comments1 min readEA link

FLI pod­cast se­ries, “Imag­ine A World”, about as­pira­tional fu­tures with AGI

Jackson WagnerOct 13, 2023, 4:03 PM
18 points
0 comments4 min readEA link

Does schem­ing lead to ad­e­quate fu­ture em­pow­er­ment? (Sec­tion 2.3.1.2 of “Schem­ing AIs”)

Joe_CarlsmithDec 3, 2023, 6:32 PM
6 points
1 comment1 min readEA link

Think­ing-in-limits about TAI from the de­mand per­spec­tive. De­mand sat­u­ra­tion, re­source wars, new debt.

Ivan MadanNov 7, 2023, 10:44 PM
2 points
0 comments4 min readEA link

An­nounc­ing The Mi­das Pro­ject — and our first cam­paign (which you can help with!)

Tyler JohnstonJun 13, 2024, 6:41 PM
98 points
15 comments4 min readEA link

An­nounc­ing the Lon­don Ini­ti­a­tive for Safe AI (LISA)

JamesFoxFeb 5, 2024, 10:36 AM
65 points
3 comments9 min readEA link

RP’s AI Gover­nance & Strat­egy team—June 2023 in­terim overview

MichaelA🔸Jun 22, 2023, 1:45 PM
68 points
1 comment7 min readEA link

Up­com­ing speaker se­ries on emerg­ing tech, na­tional se­cu­rity & US policy careers

kuhanjJun 21, 2023, 4:49 AM
42 points
0 comments2 min readEA link

[Question] How good/​bad is the new Bing AI for the world?

Nathan YoungFeb 17, 2023, 4:31 PM
21 points
14 comments1 min readEA link

A Friendly Face (Another Failure Story)

Karl von WendtJun 20, 2023, 10:31 AM
22 points
8 comments1 min readEA link

The Hub­inger lec­tures on AGI safety: an in­tro­duc­tory lec­ture series

evhubJun 22, 2023, 12:59 AM
44 points
0 comments1 min readEA link

ML4G Ger­many—AI Align­ment Camp

Evander H. 🔸Jun 19, 2023, 7:24 AM
17 points
1 comment1 min readEA link

An­nounc­ing FAR Labs, an AI safety cowork­ing space

ghabsOct 2, 2023, 8:15 PM
63 points
0 comments1 min readEA link
(www.lesswrong.com)

Up­dat­ing Drexler’s CAIS model

Matthew_BarnettJun 17, 2023, 1:57 AM
59 points
0 comments1 min readEA link

Ra­tional An­i­ma­tions is look­ing for an AI Safety scriptwriter, a lead com­mu­nity man­ager, and other roles.

WriterJun 16, 2023, 9:41 AM
40 points
4 comments1 min readEA link

[Question] What would it look like for AIS to no longer be ne­glected?

RockwellJun 16, 2023, 3:59 PM
100 points
15 comments1 min readEA link

Si­mu­lat­ing Shut­down Code Ac­ti­va­tions in an AI Virus Lab

MiguelJun 20, 2023, 5:27 AM
4 points
0 comments6 min readEA link

ai-plans.com De­cem­ber Cri­tique-a-Thon

Kabir_KumarDec 4, 2023, 9:27 AM
1 point
0 comments2 min readEA link

Safety isn’t safety with­out a so­cial model (or: dis­pel­ling the myth of per se tech­ni­cal safety)

Andrew CritchJun 14, 2024, 12:16 AM
95 points
3 comments1 min readEA link

The “tech­nol­ogy” bucket error

Holly Elmore ⏸️ 🔸Sep 21, 2023, 12:59 AM
33 points
10 comments4 min readEA link
(open.substack.com)

Hy­po­thet­i­cal grants that the Long-Term Fu­ture Fund nar­rowly rejected

calebpNov 15, 2023, 7:39 PM
95 points
12 comments6 min readEA link

Global Pause AI Protest 10/​21

Holly Elmore ⏸️ 🔸Oct 14, 2023, 3:17 AM
22 points
0 comments1 min readEA link

M&A in AI

Hauke HillebrandtOct 30, 2023, 5:43 PM
9 points
1 comment6 min readEA link

An­nounc­ing the Vi­talik Bu­terin Fel­low­ships in AI Ex­is­ten­tial Safety!

DanielFilanSep 21, 2021, 12:41 AM
62 points
0 comments1 min readEA link
(grants.futureoflife.org)

[Question] Pros and cons of set­ting up a com­pany to do in­de­pen­dent AIS re­search?

Eevee🔹Aug 13, 2024, 12:11 AM
15 points
0 comments1 min readEA link

Brief thoughts on Data, Re­port­ing, and Re­sponse for AI Risk Mitigation

DavidmanheimJun 15, 2023, 7:53 AM
18 points
3 comments8 min readEA link

Some tal­ent needs in AI governance

Sam ClarkeJun 13, 2023, 1:53 PM
133 points
10 comments8 min readEA link

ARC is hiring the­o­ret­i­cal researchers

Jacob_HiltonJun 12, 2023, 7:11 PM
78 points
0 comments4 min readEA link
(www.lesswrong.com)

Ap­ti­tudes for AI gov­er­nance work

Sam ClarkeJun 13, 2023, 1:54 PM
68 points
0 comments7 min readEA link

Us­ing Con­sen­sus Mechanisms as an ap­proach to Alignment

PrometheusJun 11, 2023, 1:24 PM
14 points
0 comments1 min readEA link

Mesa-Op­ti­miza­tion: Ex­plain it like I’m 10 Edition

brookAug 26, 2023, 11:06 PM
7 points
0 comments6 min readEA link
(www.lesswrong.com)

12 ca­reer ad­vis­ing ques­tions that may (or may not) be helpful for peo­ple in­ter­ested in al­ign­ment research

AkashDec 12, 2022, 10:36 PM
14 points
0 comments1 min readEA link

UN Sec­re­tary-Gen­eral recog­nises ex­is­ten­tial threat from AI

Greg_ColbournJun 15, 2023, 5:03 PM
58 points
1 comment1 min readEA link

Care­less talk on US-China AI com­pe­ti­tion? (and crit­i­cism of CAIS cov­er­age)

Oliver SourbutSep 20, 2023, 12:46 PM
52 points
19 comments1 min readEA link
(www.oliversourbut.net)

UK gov­ern­ment to host first global sum­mit on AI Safety

DavidNashJun 8, 2023, 1:24 PM
78 points
1 comment5 min readEA link
(www.gov.uk)

[Question] Are we con­fi­dent that su­per­in­tel­li­gent ar­tifi­cial in­tel­li­gence dis­em­pow­er­ing hu­mans would be bad?

Vasco Grilo🔸Jun 10, 2023, 9:24 AM
24 points
27 comments1 min readEA link

AI take­off and nu­clear war

Owen Cotton-BarrattJun 11, 2024, 7:33 PM
72 points
5 comments11 min readEA link
(strangecities.substack.com)

An­nounc­ing the In­tro­duc­tion to ML Safety Course

TW123Aug 6, 2022, 2:50 AM
136 points
4 comments7 min readEA link

Be­ware pop­u­lar dis­cus­sions of AI “sen­tience”

David Mathers🔸Jun 8, 2023, 8:57 AM
42 points
6 comments9 min readEA link

New re­port: “Schem­ing AIs: Will AIs fake al­ign­ment dur­ing train­ing in or­der to get power?”

Joe_CarlsmithNov 15, 2023, 5:16 PM
71 points
4 comments1 min readEA link

Protest against Meta’s ir­re­versible pro­lifer­a­tion (Sept 29, San Fran­cisco)

Holly Elmore ⏸️ 🔸Sep 19, 2023, 11:40 PM
114 points
32 comments1 min readEA link

AI Safety Newslet­ter #41: The Next Gen­er­a­tion of Com­pute Scale Plus, Rank­ing Models by Sus­cep­ti­bil­ity to Jailbreak­ing, and Ma­chine Ethics

Center for AI SafetySep 11, 2024, 7:11 PM
12 points
0 comments5 min readEA link
(newsletter.safe.ai)

RSPs are pauses done right

evhubOct 14, 2023, 4:06 AM
93 points
7 comments1 min readEA link

Trans­for­ma­tive AGI by 2043 is <1% likely

Ted SandersJun 6, 2023, 3:51 PM
98 points
92 comments5 min readEA link
(arxiv.org)

Ap­pli­ca­tions are now open for In­tro to ML Safety Spring 2023

JoshcNov 4, 2022, 10:45 PM
49 points
1 comment2 min readEA link

Which ML skills are use­ful for find­ing a new AIS re­search agenda?

Yonatan CaleFeb 9, 2023, 1:09 PM
7 points
3 comments1 min readEA link

Cri­tiques of non-ex­is­tent AI safety labs: Yours

AnnealJun 16, 2023, 6:50 AM
117 points
12 comments3 min readEA link

AI Safety Newslet­ter #39: Im­pli­ca­tions of a Trump Ad­minis­tra­tion for AI Policy Plus, Safety Engineering

Center for AI SafetyJul 29, 2024, 5:48 PM
6 points
0 comments6 min readEA link
(newsletter.safe.ai)

In­tro­duc­ing Kairos: a new AI safety field­build­ing or­ga­ni­za­tion (the new home for SPAR and FSP)

Agustín Covarrubias 🔸Oct 25, 2024, 9:59 PM
71 points
2 comments2 min readEA link

Some thoughts on “AI could defeat all of us com­bined”

Milan GriffesJun 2, 2023, 3:03 PM
23 points
0 comments4 min readEA link

AI Safety Hub Ser­bia Offi­cial Opening

Dušan D. Nešić (Dushan)Oct 28, 2023, 5:10 PM
26 points
3 comments1 min readEA link
(forum.effectivealtruism.org)

Ac­tion: Help ex­pand fund­ing for AI Safety by co­or­di­nat­ing on NSF response

Evan R. MurphyJan 20, 2022, 8:48 PM
20 points
7 comments3 min readEA link

An­nounce­ment: You can now listen to the “AI Safety Fun­da­men­tals” courses

peterhartreeJun 9, 2023, 4:32 PM
101 points
8 comments1 min readEA link

Will scal­ing work?

Vasco Grilo🔸Feb 4, 2024, 9:29 AM
19 points
1 comment12 min readEA link
(www.dwarkeshpatel.com)

In­tro­duc­ing Fu­ture Mat­ters – a strat­egy consultancy

KyleGraceySep 30, 2023, 2:06 AM
59 points
2 comments5 min readEA link

State­ment on AI Ex­tinc­tion—Signed by AGI Labs, Top Aca­demics, and Many Other Notable Figures

Center for AI SafetyMay 30, 2023, 9:06 AM
427 points
28 comments1 min readEA link
(www.safe.ai)

AXRP: Store, Pa­treon, Video

DanielFilanFeb 7, 2023, 5:12 AM
7 points
0 comments1 min readEA link

The bul­ls­eye frame­work: My case against AI doom

titotalMay 30, 2023, 11:52 AM
71 points
15 comments17 min readEA link

A moral back­lash against AI will prob­a­bly slow down AGI development

Geoffrey MillerMay 31, 2023, 9:31 PM
142 points
22 comments14 min readEA link

Why and When In­ter­pretabil­ity Work is Dangerous

Nicholas / Heather KrossMay 28, 2023, 12:27 AM
6 points
0 comments1 min readEA link

Cal­ifor­ni­ans, tell your reps to vote yes on SB 1047!

Holly Elmore ⏸️ 🔸Aug 12, 2024, 7:49 PM
106 points
6 comments1 min readEA link

List of Masters Pro­grams in Tech Policy, Public Policy and Se­cu­rity (Europe)

sbergMay 29, 2023, 10:23 AM
47 points
0 comments3 min readEA link

Biomimetic al­ign­ment: Align­ment be­tween an­i­mal genes and an­i­mal brains as a model for al­ign­ment be­tween hu­mans and AI sys­tems.

Geoffrey MillerMay 26, 2023, 9:25 PM
32 points
1 comment16 min readEA link

Seek­ing (Paid) Case Stud­ies on Standards

Holden KarnofskyMay 26, 2023, 5:58 PM
99 points
14 comments1 min readEA link

[Job Ad] SERI MATS is hiring for our sum­mer program

annashiveMay 26, 2023, 4:51 AM
8 points
1 comment7 min readEA link

On the cor­re­spon­dence be­tween AI-mis­al­ign­ment and cog­ni­tive dis­so­nance us­ing a be­hav­ioral eco­nomics model

StijnNov 1, 2022, 9:15 AM
11 points
0 comments6 min readEA link

[Linkpost] OpenAI is award­ing ten 100k grants for build­ing pro­to­types of a demo­cratic pro­cess for steer­ing AI

pseudonymMay 26, 2023, 12:49 PM
36 points
2 comments1 min readEA link
(openai.com)

[Linkpost] “Gover­nance of su­per­in­tel­li­gence” by OpenAI

Daniel_EthMay 22, 2023, 8:15 PM
51 points
6 comments2 min readEA link
(openai.com)

Box in­ver­sion revisited

Jan_KulveitNov 7, 2023, 11:09 AM
13 points
1 comment1 min readEA link

[Question] AI strat­egy ca­reer pipeline

Zach Stein-PerlmanMay 22, 2023, 12:00 AM
72 points
23 comments1 min readEA link

Bandgaps, Brains, and Bioweapons: The limi­ta­tions of com­pu­ta­tional sci­ence and what it means for AGI

titotalMay 26, 2023, 3:57 PM
59 points
0 comments18 min readEA link

Please, some­one make a dataset of sup­posed cases of “tech panic”

Marcel DNov 7, 2023, 2:49 AM
4 points
2 comments2 min readEA link

Google in­vests $300mn in ar­tifi­cial in­tel­li­gence start-up An­thropic | FT

𝕮𝖎𝖓𝖊𝖗𝖆Feb 3, 2023, 7:43 PM
155 points
5 comments1 min readEA link
(www.ft.com)

A Study of AI Science Models

Eleni_AMay 13, 2023, 7:14 PM
12 points
4 comments24 min readEA link

Yann LeCun on AGI and AI Safety

Chris LeongAug 8, 2023, 11:43 PM
23 points
4 comments1 min readEA link
(drive.google.com)

“Di­a­mon­doid bac­te­ria” nanobots: deadly threat or dead-end? A nan­otech in­ves­ti­ga­tion

titotalSep 29, 2023, 2:01 PM
102 points
33 comments20 min readEA link
(titotal.substack.com)

Pod­cast (+tran­script): Nathan Barnard on how US fi­nan­cial reg­u­la­tion can in­form AI governance

Aaron BergmanAug 8, 2023, 9:46 PM
12 points
0 comments23 min readEA link
(www.aaronbergman.net)

A re­cent write-up of the case for AI (ex­is­ten­tial) risk

TimseyMay 18, 2023, 1:07 PM
17 points
0 comments19 min readEA link

Will AI Avoid Ex­ploita­tion? (Adam Bales)

Global Priorities InstituteDec 13, 2023, 11:37 AM
38 points
0 comments2 min readEA link

Stu­art J. Rus­sell on “should we press pause on AI?”

KaleemSep 18, 2023, 1:19 PM
32 points
3 comments1 min readEA link
(podcasts.apple.com)

Some quotes from Tues­day’s Se­nate hear­ing on AI

Daniel_EthMay 17, 2023, 12:13 PM
105 points
7 comments4 min readEA link

Cul­ture and Pro­gram­ming Ret­ro­spec­tive: ERA Fel­low­ship 2023

Gideon FutermanSep 28, 2023, 4:45 PM
16 points
0 comments10 min readEA link

Trends in the dol­lar train­ing cost of ma­chine learn­ing systems

Ben CottierFeb 1, 2023, 2:48 PM
63 points
3 comments1 min readEA link

The state of AI in differ­ent coun­tries — an overview

LizkaSep 14, 2023, 10:37 AM
68 points
6 comments13 min readEA link
(aisafetyfundamentals.com)

SPAR seeks ad­vi­sors and stu­dents for AI safety pro­jects (Se­cond Wave)

micSep 14, 2023, 11:09 PM
14 points
0 comments1 min readEA link

AI safety field-build­ing sur­vey: Ta­lent needs, in­fras­truc­ture needs, and re­la­tion­ship to EA

michelOct 27, 2023, 9:08 PM
67 points
3 comments9 min readEA link

[Question] Ask­ing for on­line calls on AI s-risks discussions

jackchang110May 14, 2023, 1:58 PM
26 points
3 comments1 min readEA link

What does it mean for an AGI to be ‘safe’?

So8resOct 7, 2022, 4:43 AM
53 points
21 comments1 min readEA link

Law & AI Din­ner—EAG Bos­ton 2023

Alfredo Parra 🔸Oct 12, 2023, 8:32 AM
8 points
0 comments1 min readEA link

How “AGI” could end up be­ing many differ­ent spe­cial­ized AI’s stitched together

titotalMay 8, 2023, 12:32 PM
31 points
2 comments9 min readEA link

Ap­ply to lead a pro­ject dur­ing the next vir­tual AI Safety Camp

Linda LinseforsSep 13, 2023, 1:29 PM
16 points
0 comments1 min readEA link
(aisafety.camp)

Ag­gre­gat­ing Utilities for Cor­rigible AI [Feed­back Draft]

Dan HMay 12, 2023, 8:57 PM
12 points
0 comments1 min readEA link

How much do mar­kets value Open AI?

Ben_West🔸May 14, 2023, 7:28 PM
39 points
13 comments4 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [May 2023]

StevenKaasMay 8, 2023, 10:30 PM
19 points
11 comments1 min readEA link

ARC Evals: Re­spon­si­ble Scal­ing Policies

Zach Stein-PerlmanSep 28, 2023, 4:30 AM
16 points
1 comment1 min readEA link
(evals.alignment.org)

Re­minder: AI Wor­ld­views Con­test Closes May 31

Jason SchukraftMay 8, 2023, 5:40 PM
20 points
0 comments1 min readEA link

An Anal­ogy for Un­der­stand­ing Transformers

TheMcDouglasMay 13, 2023, 12:20 PM
7 points
0 comments1 min readEA link

Sam Alt­man /​ Open AI Dis­cus­sion Thread

John SalterNov 20, 2023, 9:21 AM
40 points
36 comments1 min readEA link

My model of how differ­ent AI risks fit together

Stephen ClareJan 31, 2024, 5:09 PM
63 points
4 comments7 min readEA link
(unfoldingatlas.substack.com)

Un­veiling the Amer­i­can Public Opinion on AI Mo­ra­to­rium and Govern­ment In­ter­ven­tion: The Im­pact of Me­dia Exposure

OttoMay 8, 2023, 10:49 AM
28 points
5 comments6 min readEA link

OpenAI’s new Pre­pared­ness team is hiring

leopoldOct 26, 2023, 8:41 PM
85 points
13 comments1 min readEA link

Pro­jects I would like to see (pos­si­bly at AI Safety Camp)

Linda LinseforsSep 27, 2023, 9:27 PM
9 points
0 comments1 min readEA link

New re­port on the state of AI safety in China

Geoffrey MillerOct 27, 2023, 8:20 PM
22 points
0 comments3 min readEA link
(concordia-consulting.com)

The Parable of the Boy Who Cried 5% Chance of Wolf

Kat WoodsAug 15, 2022, 2:22 PM
80 points
8 comments2 min readEA link

Re­grant up to $600,000 to AI safety pro­jects with GiveWiki

Dawn DrescherOct 28, 2023, 7:56 PM
22 points
0 comments3 min readEA link

AI Safety Seems Hard to Measure

Holden KarnofskyDec 11, 2022, 1:31 AM
90 points
4 comments14 min readEA link
(www.cold-takes.com)

[Question] Ask­ing for on­line re­sources why AI now is near AGI

jackchang110May 18, 2023, 12:04 AM
6 points
4 comments1 min readEA link

Many AI gov­er­nance pro­pos­als have a trade­off be­tween use­ful­ness and feasibility

AkashFeb 3, 2023, 6:49 PM
22 points
0 comments1 min readEA link

Thread: Reflec­tions on the AGI Safety Fun­da­men­tals course?

CliffordMay 18, 2023, 1:11 PM
27 points
7 comments1 min readEA link

AI risk/​re­ward: A sim­ple model

Nathan YoungMay 4, 2023, 7:12 PM
37 points
5 comments7 min readEA link

Re­think Pri­ori­ties is hiring a Com­pute Gover­nance Re­searcher or Re­search Assistant

MichaelA🔸Jun 7, 2023, 1:22 PM
36 points
2 comments8 min readEA link
(careers.rethinkpriorities.org)

Are there enough op­por­tu­ni­ties for AI safety spe­cial­ists?

mhint199May 13, 2023, 9:18 PM
8 points
2 comments3 min readEA link

Un-un­plug­ga­bil­ity—can’t we just un­plug it?

Oliver SourbutMay 15, 2023, 1:23 PM
15 points
0 comments1 min readEA link
(www.oliversourbut.net)

Order Mat­ters for De­cep­tive Alignment

DavidWFeb 15, 2023, 8:12 PM
20 points
1 comment1 min readEA link
(www.lesswrong.com)

I don’t want to talk about ai

KirstenMay 22, 2023, 9:19 PM
7 points
0 comments1 min readEA link
(ealifestyles.substack.com)

The Po­lar­ity Prob­lem [Draft]

Dan HMay 23, 2023, 9:05 PM
11 points
0 comments1 min readEA link

[Link post] Michael Niel­sen’s “Notes on Ex­is­ten­tial Risk from Ar­tifi­cial Su­per­in­tel­li­gence”

Joel BeckerSep 19, 2023, 1:31 PM
38 points
1 comment6 min readEA link
(michaelnotebook.com)

Quick sur­vey on AI al­ign­ment resources

frances_lorenzJun 30, 2022, 7:08 PM
15 points
0 comments1 min readEA link

The Retroac­tive Fund­ing Land­scape: In­no­va­tions for Donors and Grantmakers

Dawn DrescherSep 29, 2023, 5:39 PM
17 points
2 comments19 min readEA link
(impactmarkets.substack.com)

My AI Align­ment Re­search Agenda and Threat Model, right now (May 2023)

Nicholas / Heather KrossMay 28, 2023, 3:23 AM
6 points
0 comments1 min readEA link

[Question] How to hedge in­vest­ment port­fo­lio against AI risk?

Timothy_LiptrotJan 31, 2023, 8:04 AM
8 points
0 comments1 min readEA link

Cal­ling for Stu­dent Sub­mis­sions: AI Safety Distil­la­tion Contest

a_e_rApr 23, 2022, 8:24 PM
102 points
28 comments3 min readEA link

EA, Psy­chol­ogy & AI Safety Research

Sam EllisMay 26, 2022, 11:46 PM
28 points
3 comments6 min readEA link

Re­think Pri­ori­ties’ 2023 Sum­mary, 2024 Strat­egy, and Fund­ing Gaps

kierangreig🔸Nov 15, 2023, 8:56 PM
86 points
7 comments3 min readEA link

Align­ment is mostly about mak­ing cog­ni­tion aimable at all

So8resJan 30, 2023, 3:22 PM
57 points
3 comments1 min readEA link

Fram­ing AI strategy

Zach Stein-PerlmanFeb 7, 2023, 8:03 PM
16 points
0 comments1 min readEA link
(www.lesswrong.com)

Com­pendium of prob­lems with RLHF

Raphaël SJan 30, 2023, 8:48 AM
18 points
0 comments1 min readEA link

Value frag­ility and AI takeover

Joe_CarlsmithAug 5, 2024, 9:28 PM
38 points
3 comments1 min readEA link

Tech­nolog­i­cal de­vel­op­ments that could in­crease risks from nu­clear weapons: A shal­low review

MichaelA🔸Feb 9, 2023, 3:41 PM
79 points
3 comments5 min readEA link
(bit.ly)

Up­date on cause area fo­cus work­ing group

Bastian_SternAug 10, 2023, 1:21 AM
140 points
18 comments5 min readEA link

Jobs that can help with the most im­por­tant century

Holden KarnofskyFeb 12, 2023, 6:19 PM
57 points
2 comments32 min readEA link
(www.cold-takes.com)

A note of cau­tion about re­cent AI risk coverage

Sean_o_hJun 7, 2023, 5:05 PM
283 points
29 comments3 min readEA link

Qual­ities that al­ign­ment men­tors value in ju­nior researchers

AkashFeb 14, 2023, 11:27 PM
31 points
1 comment1 min readEA link

An Ex­er­cise to Build In­tu­itions on AGI Risk

Lauro LangoscoJun 8, 2023, 11:20 AM
4 points
0 comments8 min readEA link
(www.alignmentforum.org)

Have your say on the Aus­tralian Govern­ment’s AI Policy [Bris­bane]

Michael Noetel 🔸Jun 9, 2023, 12:15 AM
6 points
0 comments1 min readEA link

Fo­cus­ing your im­pact on short vs long TAI timelines

kuhanjSep 30, 2023, 7:23 PM
44 points
0 comments10 min readEA link

In­tent al­ign­ment should not be the goal for AGI x-risk reduction

johnjnayOct 26, 2022, 1:24 AM
7 points
1 comment1 min readEA link

Ideas for AI labs: Read­ing list

Zach Stein-PerlmanApr 24, 2023, 7:00 PM
28 points
2 comments1 min readEA link

Join AISafety.info’s Distil­la­tion Hackathon (Oct 6-9th)

leillustrations🔸Oct 1, 2023, 6:42 PM
27 points
2 comments2 min readEA link
(www.lesswrong.com)

<$750k grants for Gen­eral Pur­pose AI As­surance/​Safety Research

PhosphorousJun 13, 2023, 4:51 AM
37 points
0 comments1 min readEA link
(cset.georgetown.edu)

Will re­leas­ing the weights of large lan­guage mod­els grant wide­spread ac­cess to pan­demic agents?

Jeff Kaufman 🔸Oct 30, 2023, 5:42 PM
56 points
18 comments1 min readEA link
(arxiv.org)

The Im­por­tance of AI Align­ment, ex­plained in 5 points

Daniel_EthFeb 11, 2023, 2:56 AM
50 points
4 comments13 min readEA link

What is it like do­ing AI safety work?

Kat WoodsFeb 21, 2023, 7:24 PM
99 points
2 comments10 min readEA link

Linkpost: Dwarkesh Pa­tel in­ter­view­ing Carl Shulman

Stefan_SchubertJun 14, 2023, 3:30 PM
110 points
5 comments1 min readEA link
(podcastaddict.com)

4 ways to think about de­moc­ra­tiz­ing AI [GovAI Linkpost]

AkashFeb 13, 2023, 6:06 PM
35 points
0 comments1 min readEA link

What Does a Marginal Grant at LTFF Look Like? Fund­ing Pri­ori­ties and Grant­mak­ing Thresh­olds at the Long-Term Fu­ture Fund

LinchAug 10, 2023, 8:11 PM
175 points
22 comments8 min readEA link

11 heuris­tics for choos­ing (al­ign­ment) re­search projects

AkashJan 27, 2023, 12:36 AM
30 points
1 comment1 min readEA link

‘AI Emer­gency Eject Cri­te­ria’ Survey

tcelferactApr 19, 2023, 9:55 PM
5 points
3 comments1 min readEA link

AI Risk Man­age­ment Frame­work | NIST

𝕮𝖎𝖓𝖊𝖗𝖆Jan 26, 2023, 3:27 PM
50 points
0 comments1 min readEA link

5 Rea­sons Why Govern­ments/​Mili­taries Already Want AI for In­for­ma­tion Warfare

trevor1Nov 12, 2023, 6:24 PM
5 points
0 comments1 min readEA link

AI policy ideas: Read­ing list

Zach Stein-PerlmanApr 17, 2023, 7:00 PM
60 points
3 comments1 min readEA link

Ge­offrey Miller on Cross-Cul­tural Un­der­stand­ing Between China and Western Coun­tries as a Ne­glected Con­sid­er­a­tion in AI Alignment

Evan_GaensbauerApr 17, 2023, 3:26 AM
25 points
2 comments4 min readEA link

Vir­tual AI Safety Un­con­fer­ence (VAISU)

NguyênJun 20, 2023, 9:47 AM
14 points
0 comments1 min readEA link

2023 Align­ment Re­search Up­dates from FAR AI

AdamGleaveDec 4, 2023, 10:32 PM
14 points
0 comments1 min readEA link
(far.ai)

AI Takeover Sce­nario with Scaled LLMs

simeon_cApr 16, 2023, 11:28 PM
29 points
1 comment1 min readEA link

Or­ga­niz­ing a de­bate with ex­perts and MPs to raise AI xrisk aware­ness: a pos­si­ble blueprint

OttoApr 19, 2023, 10:50 AM
75 points
1 comment4 min readEA link

[Question] What harm could AI safety do?

SeanEngelhartMay 15, 2021, 1:11 AM
12 points
7 comments1 min readEA link

AGI in sight: our look at the game board

Andrea_MiottiFeb 18, 2023, 10:17 PM
25 points
18 comments1 min readEA link

Next steps af­ter AGISF at UMich

JakubKJan 25, 2023, 8:57 PM
18 points
1 comment1 min readEA link

Ex­cerpts from “Do­ing EA Bet­ter” on x-risk methodology

Eevee🔹Jan 26, 2023, 1:04 AM
22 points
5 comments6 min readEA link
(forum.effectivealtruism.org)

[Linkpost] The A.I. Dilemma—March 9, 2023, with Tris­tan Har­ris and Aza Raskin

PeterSlatteryApr 14, 2023, 8:00 AM
38 points
3 comments41 min readEA link
(youtu.be)

Spread­ing mes­sages to help with the most im­por­tant century

Holden KarnofskyJan 25, 2023, 8:35 PM
128 points
21 comments18 min readEA link
(www.cold-takes.com)

Nav­i­gat­ing AI Risks (NAIR) #1: Slow­ing Down AI

simeon_cApr 14, 2023, 2:35 PM
12 points
1 comment1 min readEA link

Don’t Call It AI Alignment

GilFeb 20, 2023, 5:27 AM
16 points
7 comments2 min readEA link

[Question] Ev­i­dence to pri­ori­tize or work­ing on AI as the most im­pact­ful thing?

VaipanSep 22, 2023, 8:43 AM
9 points
6 comments1 min readEA link

An­nounc­ing Epoch’s dash­board of key trends and figures in Ma­chine Learning

Jaime SevillaApr 13, 2023, 7:33 AM
127 points
4 comments1 min readEA link

AIs ac­cel­er­at­ing AI research

AjeyaApr 12, 2023, 11:41 AM
84 points
7 comments4 min readEA link

[MLSN #9] Ver­ify­ing large train­ing runs, se­cu­rity risks from LLM ac­cess to APIs, why nat­u­ral se­lec­tion may fa­vor AIs over humans

TW123Apr 11, 2023, 4:05 PM
18 points
0 comments6 min readEA link
(newsletter.mlsafety.org)

kpurens’s Quick takes

kpurensApr 11, 2023, 2:10 PM
9 points
2 comments2 min readEA link

Why peo­ple want to work on AI safety (but don’t)

Emily GrundyJan 24, 2023, 6:41 AM
70 points
10 comments7 min readEA link

AI Safety Newslet­ter #1 [CAIS Linkpost]

AkashApr 10, 2023, 8:18 PM
38 points
0 comments1 min readEA link

An EA used de­cep­tive mes­sag­ing to ad­vance her pro­ject; we need mechanisms to avoid de­on­tolog­i­cally du­bi­ous plans

MikhailSaminFeb 13, 2024, 11:11 PM
22 points
39 comments5 min readEA link

CEEALAR: 2024 Update

CEEALARJul 19, 2024, 11:14 AM
116 points
7 comments4 min readEA link

Me­tac­u­lus’ pre­dic­tions are much bet­ter than low-in­for­ma­tion priors

Vasco Grilo🔸Apr 11, 2023, 8:36 AM
53 points
0 comments6 min readEA link

Sur­vey on the ac­cel­er­a­tion risks of our new RFPs to study LLM capabilities

AjeyaNov 10, 2023, 11:59 PM
38 points
1 comment8 min readEA link

Ap­ply for men­tor­ship in AI Safety field-building

AkashSep 17, 2022, 7:03 PM
21 points
0 comments1 min readEA link

Cruxes on US lead for some do­mes­tic AI regulation

Zach Stein-PerlmanSep 10, 2023, 6:00 PM
20 points
6 comments2 min readEA link

[Question] Which stocks or ETFs should you in­vest in to take ad­van­tage of a pos­si­ble AGI ex­plo­sion, and why?

Eevee🔹Apr 10, 2023, 5:55 PM
19 points
16 comments1 min readEA link

Hu­mans are not pre­pared to op­er­ate out­side their moral train­ing distribution

PrometheusApr 10, 2023, 9:44 PM
12 points
0 comments1 min readEA link

Ap­pli­ca­tions open: Sup­port for tal­ent work­ing on in­de­pen­dent learn­ing, re­search or en­trepreneurial pro­jects fo­cused on re­duc­ing global catas­trophic risks

CEEALARFeb 9, 2024, 1:04 PM
63 points
1 comment2 min readEA link

My highly per­sonal skep­ti­cism brain­dump on ex­is­ten­tial risk from ar­tifi­cial in­tel­li­gence.

NunoSempereJan 23, 2023, 8:08 PM
435 points
116 comments14 min readEA link
(nunosempere.com)

[Question] Why might AI be a x-risk? Suc­cinct ex­pla­na­tions please

SanjayApr 4, 2023, 12:46 PM
20 points
9 comments1 min readEA link

Mis­gen­er­al­iza­tion as a misnomer

So8resApr 6, 2023, 8:43 PM
48 points
0 comments1 min readEA link

Beren’s “De­con­fus­ing Direct vs Amor­tised Op­ti­mi­sa­tion”

𝕮𝖎𝖓𝖊𝖗𝖆Apr 7, 2023, 8:57 AM
9 points
0 comments1 min readEA link

[Question] Imag­ine AGI kil­led us all in three years. What would have been our biggest mis­takes?

yanni kyriacosApr 7, 2023, 12:06 AM
17 points
6 comments1 min readEA link

Re­cur­sive Mid­dle Man­ager Hell

RaemonJan 17, 2023, 7:02 PM
73 points
3 comments1 min readEA link

Is it time for a pause?

Kelsey PiperApr 6, 2023, 11:48 AM
103 points
6 comments5 min readEA link

EA In­fosec: skill up in or make a tran­si­tion to in­fosec via this book club

Jason ClintonMar 5, 2023, 9:02 PM
170 points
16 comments2 min readEA link

Orthog­o­nal­ity is Expensive

𝕮𝖎𝖓𝖊𝖗𝖆Apr 3, 2023, 1:57 AM
18 points
4 comments1 min readEA link

Say­ing ‘AI safety re­search is a Pas­cal’s Mug­ging’ isn’t a strong response

Robert_WiblinDec 15, 2015, 1:48 PM
15 points
16 comments2 min readEA link

An ‘AGI Emer­gency Eject Cri­te­ria’ con­sen­sus could be re­ally use­ful.

tcelferactApr 7, 2023, 4:21 PM
27 points
3 comments1 min readEA link

OpenAI o1

Zach Stein-PerlmanSep 12, 2024, 6:54 PM
38 points
0 comments1 min readEA link

In­ves­ti­gat­ing an in­surance-for-AI startup

L Rudolf LSep 21, 2024, 3:29 PM
40 points
1 comment1 min readEA link
(www.strataoftheworld.com)

GPTs are Pre­dic­tors, not Imitators

EliezerYudkowskyApr 8, 2023, 7:59 PM
74 points
12 comments1 min readEA link

Race to the Top: Bench­marks for AI Safety

isaduanDec 4, 2022, 10:50 PM
51 points
8 comments1 min readEA link

AISafety.world is a map of the AIS ecosystem

Hamish McDoodlesApr 6, 2023, 11:47 AM
191 points
8 comments1 min readEA link

Distinc­tions when Dis­cussing Utility Functions

Ozzie GooenMar 8, 2024, 6:43 PM
15 points
5 comments8 min readEA link

We might get lucky with AGI warn­ing shots. Let’s be ready!

tcelferactMar 31, 2023, 9:37 PM
22 points
2 comments1 min readEA link

[Question] Can we eval­u­ate the “tool ver­sus agent” AGI pre­dic­tion?

Ben_West🔸Apr 8, 2023, 6:35 PM
63 points
7 comments1 min readEA link

New sur­vey: 46% of Amer­i­cans are con­cerned about ex­tinc­tion from AI; 69% sup­port a six-month pause in AI development

AkashApr 5, 2023, 1:26 AM
143 points
34 comments1 min readEA link

Wi­den­ing Over­ton Win­dow—Open Thread

PrometheusMar 31, 2023, 10:06 AM
12 points
5 comments1 min readEA link
(www.lesswrong.com)

Defer­ence on AI timelines: sur­vey results

Sam ClarkeMar 30, 2023, 11:03 PM
68 points
3 comments2 min readEA link

Re­cruit the World’s best for AGI Alignment

Greg_ColbournMar 30, 2023, 4:41 PM
34 points
8 comments22 min readEA link

Nu­clear brinks­man­ship is not a good AI x-risk strategy

titotalMar 30, 2023, 10:07 PM
19 points
8 comments5 min readEA link

AI and Evolution

Dan HMar 30, 2023, 1:09 PM
41 points
1 comment2 min readEA link
(arxiv.org)

How LDT helps re­duce the AI arms race

Tamsin LeakeDec 10, 2023, 4:21 PM
8 points
1 comment1 min readEA link
(carado.moe)

[Draft] The hum­ble cos­mol­o­gist’s P(doom) paradox

titotalMar 16, 2024, 11:13 AM
38 points
6 comments10 min readEA link

No­body’s on the ball on AGI alignment

leopoldMar 29, 2023, 2:26 PM
327 points
65 comments9 min readEA link
(www.forourposterity.com)

[TIME mag­a­z­ine] Deep­Mind’s CEO Helped Take AI Main­stream. Now He’s Urg­ing Cau­tion (Per­rigo, 2023)

Will AldredJan 20, 2023, 8:37 PM
93 points
0 comments1 min readEA link
(time.com)

Want to win the AGI race? Solve al­ign­ment.

leopoldMar 29, 2023, 3:19 PM
56 points
6 comments5 min readEA link
(www.forourposterity.com)

“Dangers of AI and the End of Hu­man Civ­i­liza­tion” Yud­kowsky on Lex Fridman

𝕮𝖎𝖓𝖊𝖗𝖆Mar 30, 2023, 3:44 PM
28 points
0 comments1 min readEA link

Deep­Mind: Eval­u­at­ing Fron­tier Models for Danger­ous Capabilities

Zach Stein-PerlmanMar 21, 2024, 11:00 PM
28 points
0 comments1 min readEA link
(arxiv.org)

A rough and in­com­plete re­view of some of John Went­worth’s research

So8resMar 28, 2023, 6:52 PM
27 points
0 comments1 min readEA link

Semi-con­duc­tor /​ AI stocks dis­cus­sion.

sapphireNov 25, 2022, 11:35 PM
10 points
3 comments1 min readEA link

What would a com­pute mon­i­tor­ing plan look like? [Linkpost]

AkashMar 26, 2023, 7:33 PM
61 points
1 comment1 min readEA link

A stylized di­alogue on John Went­worth’s claims about mar­kets and optimization

So8resMar 25, 2023, 10:32 PM
18 points
0 comments1 min readEA link

[Question] Please help me sense-check my as­sump­tions about the needs of the AI Safety com­mu­nity and re­lated ca­reer plans

PeterSlatteryMar 27, 2023, 8:11 AM
23 points
27 comments2 min readEA link

Suc­ces­sif: Join our AI pro­gram to help miti­gate the catas­trophic risks of AI

ClaireBOct 25, 2023, 4:51 PM
15 points
0 comments5 min readEA link

My at­tempt at ex­plain­ing the case for AI risk in a straight­for­ward way

JulianHazellMar 25, 2023, 4:32 PM
25 points
7 comments18 min readEA link
(muddyclothes.substack.com)

[Question] AI+bio can­not be half of AI catas­tro­phe risk, right?

Benevolent_RainOct 10, 2023, 3:17 AM
23 points
11 comments2 min readEA link

AI al­ign­ment shouldn’t be con­flated with AI moral achievement

Matthew_BarnettDec 30, 2023, 3:08 AM
114 points
15 comments5 min readEA link

Guardrails vs Goal-di­rect­ed­ness in AI Alignment

freedomandutilityDec 30, 2023, 12:58 PM
13 points
2 comments1 min readEA link

13 Very Differ­ent Stances on AGI

Ozzie GooenDec 27, 2021, 11:30 PM
84 points
23 comments3 min readEA link

EA Wins 2023

Shakeel HashimDec 31, 2023, 2:07 PM
357 points
9 comments3 min readEA link

Civil di­s­obe­di­ence op­por­tu­nity—a way to help re­duce chance of hard take­off from re­cur­sive self im­prove­ment of code

JonCefaluMar 25, 2023, 10:37 PM
−5 points
0 comments1 min readEA link
(codegencodepoisoningcontest.cargo.site)

Truth and Ad­van­tage: Re­sponse to a draft of “AI safety seems hard to mea­sure”

So8resMar 22, 2023, 3:36 AM
11 points
0 comments1 min readEA link

[Question] What is AI Safety’s line of re­treat?

RemmeltJul 28, 2024, 5:43 AM
4 points
2 comments1 min readEA link

Ideas for im­prov­ing epistemics in AI safety outreach

micAug 21, 2023, 7:56 PM
31 points
0 comments3 min readEA link
(www.lesswrong.com)

Col­lin Burns on Align­ment Re­search And Dis­cov­er­ing La­tent Knowl­edge Without Supervision

Michaël TrazziJan 17, 2023, 5:21 PM
21 points
3 comments1 min readEA link

[Linkpost] Prospect Magaz­ine—How to save hu­man­ity from extinction

jackvaSep 26, 2023, 7:16 PM
32 points
2 comments1 min readEA link
(www.prospectmagazine.co.uk)

Mea­sur­ing AI-Driven Risk with Stock Prices (Su­sana Cam­pos-Mart­ins)

Global Priorities InstituteDec 12, 2024, 2:22 PM
10 points
1 comment4 min readEA link
(globalprioritiesinstitute.org)

[Question] Will AI Wor­ld­view Prize Fund­ing Be Re­placed?

Jordan ArelNov 13, 2022, 5:10 PM
26 points
4 comments1 min readEA link

Ap­ply to CEEALAR to do AGI mora­to­rium work

Greg_ColbournJul 26, 2023, 9:24 PM
62 points
0 comments1 min readEA link

Shal­low re­view of live agen­das in al­ign­ment & safety

technicalitiesNov 27, 2023, 11:33 AM
76 points
8 comments29 min readEA link

Me­tac­u­lus Pre­dicts Weak AGI in 2 Years and AGI in 10

Chris LeongMar 24, 2023, 7:43 PM
27 points
12 comments1 min readEA link

An­nounc­ing the ITAM AI Fu­tures Fel­low­ship

AmAristizabalJul 28, 2023, 4:44 PM
43 points
3 comments2 min readEA link

An­nounc­ing the Pivotal Re­search Fel­low­ship – Ap­ply Now!

Tobias HäberliApr 3, 2024, 5:30 PM
51 points
5 comments2 min readEA link

Paul Chris­ti­ano on Dwarkesh Podcast

ESRogsNov 3, 2023, 10:13 PM
5 points
0 comments1 min readEA link
(www.dwarkeshpatel.com)

The Nav­i­ga­tion Fund launched + is hiring a pro­gram officer to lead the dis­tri­bu­tion of $20M an­nu­ally for AI safety! Full-time, fully re­mote, pay starts at $200k

vincentweisserNov 3, 2023, 9:53 PM
120 points
3 comments1 min readEA link

An­nounc­ing Epoch’s newly ex­panded Pa­ram­e­ters, Com­pute and Data Trends in Ma­chine Learn­ing database

Robi RahmanOct 25, 2023, 3:03 AM
38 points
1 comment1 min readEA link
(epochai.org)

AGI and the EMH: mar­kets are not ex­pect­ing al­igned or un­al­igned AI in the next 30 years

basil.halperinJan 10, 2023, 4:05 PM
342 points
177 comments26 min readEA link

The Top AI Safety Bets for 2023: GiveWiki’s Lat­est Recommendations

Dawn DrescherNov 11, 2023, 9:04 AM
11 points
4 comments8 min readEA link

Call for Papers on Global AI Gover­nance from the UN

Chris LeongAug 20, 2023, 8:56 AM
36 points
1 comment1 min readEA link
(www.linkedin.com)

On “slack” in train­ing (Sec­tion 1.5 of “Schem­ing AIs”)

Joe_CarlsmithNov 25, 2023, 5:51 PM
14 points
1 comment1 min readEA link

Su­per­vised Pro­gram for Align­ment Re­search (SPAR) at UC Berkeley: Spring 2023 summary

micAug 19, 2023, 2:32 AM
18 points
1 comment6 min readEA link
(www.lesswrong.com)

Defin­ing al­ign­ment research

richard_ngoAug 19, 2024, 10:49 PM
48 points
1 comment1 min readEA link

Be­ware safety-washing

LizkaJan 13, 2023, 10:39 AM
143 points
7 comments4 min readEA link

[Question] Game the­ory work on AI al­ign­ment with di­verse AI sys­tems, hu­man in­di­vi­d­u­als, & hu­man groups?

Geoffrey MillerMar 2, 2023, 4:50 PM
22 points
2 comments1 min readEA link

How ARENA course ma­te­rial gets made

TheMcDouglasJul 2, 2024, 7:27 AM
12 points
0 comments1 min readEA link

Longter­mism Fund: Au­gust 2023 Grants Report

Michael Townsend🔸Aug 20, 2023, 5:34 AM
81 points
3 comments5 min readEA link

[Question] What is the coun­ter­fac­tual value of differ­ent AI Safety pro­fes­sion­als?

PabloAMC 🔸Jul 3, 2024, 2:38 PM
6 points
2 comments1 min readEA link

Sili­con Valley’s Rab­bit Hole Problem

MandelbrotOct 8, 2023, 12:25 PM
34 points
44 comments11 min readEA link
(medium.com)

The AI Boom Mainly Benefits Big Firms, but long-term, mar­kets will concentrate

Hauke HillebrandtOct 29, 2023, 8:38 AM
12 points
0 comments1 min readEA link

AI Timelines: The Proposed Arguments and Where the "Experts" Stand

EA JapanAug 17, 2023, 2:59 PM
2 points
0 comments1 min readEA link

Vic­to­ria Krakovna on AGI Ruin, The Sharp Left Turn and Paradigms of AI Alignment

Michaël TrazziJan 12, 2023, 5:09 PM
16 points
0 comments1 min readEA link

What is au­ton­omy, and how does it lead to greater risk from AI?

DavidmanheimAug 1, 2023, 8:06 AM
10 points
0 comments6 min readEA link
(www.lesswrong.com)

Linkpost: 7 A.I. Com­pa­nies Agree to Safe­guards After Pres­sure From the White House

MHR🔸Jul 21, 2023, 1:23 PM
61 points
4 comments1 min readEA link
(www.nytimes.com)

VIRTUA: a novel about AI alignment

Karl von WendtJan 12, 2023, 9:37 AM
23 points
0 comments1 min readEA link

Si­tu­a­tional aware­ness (Sec­tion 2.1 of “Schem­ing AIs”)

Joe_CarlsmithNov 26, 2023, 11:00 PM
12 points
1 comment1 min readEA link

The Over­ton Win­dow widens: Ex­am­ples of AI risk in the media

AkashMar 23, 2023, 5:10 PM
112 points
11 comments1 min readEA link

In­tro­duc­ing the new Ries­gos Catas­trófi­cos Globales team

Jaime SevillaMar 3, 2023, 11:04 PM
74 points
3 comments5 min readEA link
(riesgoscatastroficosglobales.com)

Tran­script: NBC Nightly News: AI ‘race to reck­less­ness’ w/​ Tris­tan Har­ris, Aza Raskin

WilliamKielyMar 23, 2023, 3:45 AM
47 points
1 comment1 min readEA link

Ex­cerpts from “Ma­jor­ity Leader Schumer De­liv­ers Re­marks To Launch SAFE In­no­va­tion Frame­work For Ar­tifi­cial In­tel­li­gence At CSIS”

Chris LeongJul 21, 2023, 11:15 PM
19 points
0 comments1 min readEA link
(www.democrats.senate.gov)

AGI Take­off dy­nam­ics—In­tel­li­gence vs Quan­tity ex­plo­sion

EdoAradJul 26, 2023, 9:20 AM
14 points
0 comments2 min readEA link
(github.com)

The US-China Re­la­tion­ship and Catas­trophic Risk (EAG Bos­ton tran­script)

EA GlobalJul 9, 2024, 1:50 PM
30 points
1 comment19 min readEA link

AI Safety Newslet­ter #40: Cal­ifor­nia AI Leg­is­la­tion Plus, NVIDIA De­lays Chip Pro­duc­tion, and Do AI Safety Bench­marks Ac­tu­ally Mea­sure Safety?

Center for AI SafetyAug 21, 2024, 6:10 PM
17 points
0 comments6 min readEA link
(newsletter.safe.ai)

Cost-effec­tive­ness of pro­fes­sional field-build­ing pro­grams for AI safety research

Center for AI SafetyJul 10, 2023, 5:26 PM
38 points
2 comments18 min readEA link

US Congress in­tro­duces CREATE AI Act for es­tab­lish­ing Na­tional AI Re­search Resource

Daniel_EthJul 28, 2023, 11:27 PM
9 points
1 comment1 min readEA link
(eshoo.house.gov)

White House pub­lishes frame­work for Nu­cleic Acid Screening

Agustín Covarrubias 🔸Apr 30, 2024, 12:44 AM
30 points
1 comment1 min readEA link
(www.whitehouse.gov)

The­o­ries of Change for Track II Di­plo­macy [Founders Pledge]

christian.rJul 9, 2024, 1:31 PM
20 points
2 comments33 min readEA link

Cost-effec­tive­ness of stu­dent pro­grams for AI safety research

Center for AI SafetyJul 10, 2023, 5:23 PM
53 points
7 comments15 min readEA link

[Question] Strongest real-world ex­am­ples sup­port­ing AI risk claims?

rosehadsharSep 5, 2023, 3:11 PM
52 points
9 comments1 min readEA link

We Did AGISF’s 8-week Course in 3 Days. Here’s How it Went

ag4000Jul 24, 2022, 4:46 PM
26 points
7 comments6 min readEA link

[Question] What is the eas­iest/​funnest way to build up a com­pre­hen­sive un­der­stand­ing of AI and AI Safety?

Jordan ArelApr 30, 2024, 6:39 PM
14 points
0 comments1 min readEA link

Have your say on the Aus­tralian Govern­ment’s AI Policy

Nathan SherburnJul 17, 2023, 11:02 AM
3 points
1 comment1 min readEA link

Thoughts on yes­ter­day’s UN Se­cu­rity Coun­cil meet­ing on AI

Greg_ColbournJul 19, 2023, 4:46 PM
31 points
2 comments1 min readEA link

Model­ing the im­pact of AI safety field-build­ing programs

Center for AI SafetyJul 10, 2023, 5:22 PM
83 points
0 comments7 min readEA link

Some rea­sons to start a pro­ject to stop harm­ful AI

RemmeltAug 22, 2024, 4:23 PM
5 points
0 comments1 min readEA link

De­bate se­ries: should we push for a pause on the de­vel­op­ment of AI?

Ben_West🔸Sep 8, 2023, 4:29 PM
252 points
58 comments1 min readEA link

We need non-cy­ber­se­cu­rity peo­ple [too]

JarrahMay 5, 2024, 12:11 AM
32 points
0 comments2 min readEA link

Five Years of Re­think Pri­ori­ties: Im­pact, Fu­ture Plans, Fund­ing Needs (July 2023)

Rethink PrioritiesJul 18, 2023, 3:59 PM
110 points
3 comments16 min readEA link

How I Formed My Own Views About AI Safety

Neel NandaFeb 27, 2022, 6:52 PM
134 points
12 comments14 min readEA link
(www.neelnanda.io)

Lev­el­ling Up in AI Safety Re­search Engineering

GabeMSep 2, 2022, 4:59 AM
165 points
21 comments17 min readEA link

Read­ing list on AI agents and as­so­ci­ated policy

Peter WildefordAug 9, 2024, 5:40 PM
79 points
2 comments1 min readEA link

Is this com­mu­nity over-em­pha­siz­ing AI al­ign­ment?

LixiangJan 8, 2023, 6:23 AM
1 point
5 comments1 min readEA link

New Deep­Mind re­port on in­sti­tu­tions for global AI governance

finmJul 14, 2023, 4:05 PM
10 points
0 comments1 min readEA link
(www.deepmind.com)

An­nounc­ing the Ex­is­ten­tial In­foSec Forum

calebpJul 7, 2023, 9:08 PM
90 points
1 comment2 min readEA link

[Linkpost] Jan Leike on three kinds of al­ign­ment taxes

AkashJan 6, 2023, 11:57 PM
29 points
0 comments1 min readEA link

20 Cri­tiques of AI Safety That I Found on Twitter

Daniel KirmaniJun 23, 2022, 3:11 PM
14 points
13 comments1 min readEA link

AI Safety Camp, Vir­tual Edi­tion 2023

Linda LinseforsJan 6, 2023, 12:55 AM
24 points
0 comments1 min readEA link

[Question] What did AI Safety’s spe­cific fund­ing of AGI R&D labs lead to?

RemmeltJul 5, 2023, 3:51 PM
24 points
17 comments1 min readEA link

Have your say on the Aus­tralian Govern­ment’s AI Policy

Nathan SherburnJul 11, 2023, 1:12 AM
3 points
0 comments1 min readEA link

A Wind­fall Clause for CEO could worsen AI race dynamics

LarksMar 9, 2023, 6:02 PM
69 points
12 comments7 min readEA link

AI Rights for Hu­man Safety

Matthew_BarnettAug 3, 2024, 12:47 AM
54 points
1 comment1 min readEA link
(papers.ssrn.com)

Questions about Conjecture's CoEm proposal

AkashMar 9, 2023, 7:32 PM
19 points
0 comments1 min readEA link

An overview of some promis­ing work by ju­nior al­ign­ment researchers

AkashDec 26, 2022, 5:23 PM
10 points
0 comments1 min readEA link

Views on when AGI comes and on strat­egy to re­duce ex­is­ten­tial risk

TsviBTJul 8, 2023, 9:00 AM
31 points
3 comments1 min readEA link

Trans­for­ma­tive AI is­sues (not just mis­al­ign­ment): an overview

Holden KarnofskyJan 6, 2023, 2:19 AM
36 points
0 comments22 min readEA link
(www.cold-takes.com)

An­nounc­ing the Open Philan­thropy AI Wor­ld­views Contest

Jason SchukraftMar 10, 2023, 2:33 AM
137 points
33 comments3 min readEA link
(www.openphilanthropy.org)

Wash­ing­ton Post ar­ti­cle about EA uni­ver­sity groups

LizkaJul 5, 2023, 12:58 PM
35 points
5 comments1 min readEA link

New ‘South Park’ epi­sode on AI & Chat GPT

Geoffrey MillerMar 21, 2023, 8:06 PM
13 points
1 comment1 min readEA link

Part 3: A Pro­posed Ap­proach for AI Safety Move­ment Build­ing: Pro­jects, Pro­fes­sions, Skills, and Ideas for the Fu­ture [long post][bounty for feed­back]

PeterSlatteryMar 22, 2023, 12:54 AM
22 points
8 comments32 min readEA link

Pause For Thought: The AI Pause De­bate (As­tral Codex Ten)

David MOct 5, 2023, 9:32 AM
37 points
0 comments1 min readEA link
(www.astralcodexten.com)

Where I’m at with AI risk: con­vinced of dan­ger but not (yet) of doom

Amber DawnMar 21, 2023, 1:23 PM
62 points
16 comments6 min readEA link

Sam Alt­man fired from OpenAI

LarksNov 17, 2023, 9:07 PM
133 points
90 comments1 min readEA link
(openai.com)

Are al­ign­ment re­searchers de­vot­ing enough time to im­prov­ing their re­search ca­pac­ity?

Carson JonesNov 4, 2022, 12:58 AM
11 points
1 comment1 min readEA link

The Wizard of Oz Prob­lem: How in­cen­tives and nar­ra­tives can skew our per­cep­tion of AI developments

AkashMar 20, 2023, 10:36 PM
16 points
0 comments1 min readEA link

An­nounc­ing Man­i­fund Regrants

AustinJul 5, 2023, 7:42 PM
217 points
51 comments4 min readEA link
(manifund.org)

Paus­ing AI might be good policy, but it’s bad politics

Stephen ClareOct 23, 2023, 1:36 PM
162 points
20 comments2 min readEA link
(unfoldingatlas.substack.com)

Solv­ing al­ign­ment isn’t enough for a flour­ish­ing future

micFeb 2, 2024, 6:22 PM
27 points
0 comments22 min readEA link
(papers.ssrn.com)

NIMBYism as an AI gov­er­nance tool?

freedomandutilityJun 9, 2024, 6:40 AM
10 points
2 comments1 min readEA link

Mis­nam­ing and Other Is­sues with OpenAI’s “Hu­man Level” Su­per­in­tel­li­gence Hierarchy

DavidmanheimJul 15, 2024, 5:50 AM
14 points
0 comments1 min readEA link

Still no strong ev­i­dence that LLMs in­crease bioter­ror­ism risk

freedomandutilityNov 2, 2023, 9:23 PM
58 points
9 comments1 min readEA link

Man­i­fund: 2023 in Review

AustinJan 18, 2024, 11:50 PM
29 points
1 comment23 min readEA link
(manifund.substack.com)

A Sim­ple Model of AGI De­ploy­ment Risk

djbinderJul 9, 2021, 9:44 AM
30 points
0 comments5 min readEA link

Thoughts on SB-1047

Ryan GreenblattMay 30, 2024, 12:19 AM
53 points
4 comments1 min readEA link

An­nounc­ing Timaeus

Stan van WingerdenOct 22, 2023, 1:32 PM
79 points
0 comments5 min readEA link
(www.lesswrong.com)

Towards more co­op­er­a­tive AI safety strategies

richard_ngoJul 16, 2024, 4:36 AM
62 points
5 comments1 min readEA link

The GiveWiki’s Top Picks in AI Safety for the Giv­ing Sea­son of 2023

Dawn DrescherDec 7, 2023, 9:23 AM
26 points
0 comments3 min readEA link
(impactmarkets.substack.com)

An­nounc­ing Open Philan­thropy’s AI gov­er­nance and policy RFP

JulianHazellJul 17, 2024, 12:25 AM
73 points
2 comments1 min readEA link
(www.openphilanthropy.org)

Notes on nukes, IR, and AI from “Arse­nals of Folly” (and other books)

tlevinSep 4, 2023, 7:02 PM
21 points
2 comments6 min readEA link

Pros and Cons of boy­cotting paid Chat GPT

NickLaingMar 18, 2023, 8:50 AM
14 points
11 comments2 min readEA link

Yud­kowsky on AGI risk on the Ban­kless podcast

RobBensingerMar 13, 2023, 12:42 AM
54 points
2 comments75 min readEA link

How Re­think Pri­ori­ties’ Re­search could in­form your grantmaking

kierangreig🔸Oct 4, 2023, 6:24 PM
59 points
0 comments2 min readEA link

Ex­plor­ing Me­tac­u­lus’ com­mu­nity predictions

Vasco Grilo🔸Mar 24, 2023, 7:59 AM
95 points
17 comments10 min readEA link

Have your say on the Aus­tralian Govern­ment’s AI Policy [On­line #1]

Nathan SherburnJul 11, 2023, 12:35 AM
3 points
0 comments1 min readEA link

Ways to buy time

AkashNov 12, 2022, 7:31 PM
47 points
1 comment1 min readEA link

Ap­ply to fall policy in­tern­ships (we can help)

ESJul 2, 2023, 9:37 PM
57 points
4 comments1 min readEA link

[Question] Do AI com­pa­nies make their safety re­searchers sign a non-dis­par­age­ment clause?

OferSep 5, 2022, 1:40 PM
73 points
3 comments1 min readEA link

Un­jour­nal: Eval­u­a­tions of “Ar­tifi­cial In­tel­li­gence and Eco­nomic Growth”, and new host­ing space

david_reinsteinMar 17, 2023, 8:20 PM
47 points
0 comments2 min readEA link
(unjournal.pubpub.org)

CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram RachumDec 29, 2022, 4:09 PM
4 points
0 comments1 min readEA link

[Question] What should I ask Ezra Klein about AI policy pro­pos­als?

Robert_WiblinJun 23, 2023, 4:36 PM
21 points
4 comments1 min readEA link

In­sights from an ex­pert sur­vey about in­ter­me­di­ate goals in AI governance

Sebastian SchwieckerMar 17, 2023, 2:59 PM
11 points
2 comments1 min readEA link

De Dicto and De Se Refer­ence Mat­ters for Alignment

philgoetzOct 3, 2023, 9:57 PM
5 points
2 comments9 min readEA link

Bio-x-AI policy: call for ideas from the Fed­er­a­tion of Amer­i­can Scientists

Ben StewartAug 15, 2023, 3:21 AM
8 points
0 comments1 min readEA link

Crises re­veal centralisation

Vasco Grilo🔸Mar 26, 2024, 6:00 PM
31 points
2 comments5 min readEA link
(stefanschubert.substack.com)

Up­com­ing Feed­back Op­por­tu­nity on Dual-Use Foun­da­tion Models

Chris LeongNov 2, 2023, 4:30 AM
9 points
0 comments1 min readEA link

Munk AI de­bate: con­fu­sions and pos­si­ble cruxes

Steven ByrnesJun 27, 2023, 3:01 PM
142 points
10 comments1 min readEA link

“X dis­tracts from Y” as a thinly-dis­guised fight over group sta­tus /​ politics

Steven ByrnesSep 25, 2023, 3:29 PM
89 points
9 comments8 min readEA link

[Question] Why isn’t there a Char­ity En­trepreneur­ship pro­gram for AI Safety?

yanniOct 4, 2023, 2:12 AM
11 points
13 comments1 min readEA link

Re­think Pri­ori­ties: Seek­ing Ex­pres­sions of In­ter­est for Spe­cial Pro­jects Next Year

kierangreig🔸Nov 29, 2023, 1:44 PM
57 points
0 comments5 min readEA link

Dona­tion offsets for ChatGPT Plus subscriptions

Jeffrey LadishMar 16, 2023, 11:11 PM
76 points
10 comments3 min readEA link

AI Safety Re­search Or­ga­ni­za­tion In­cu­ba­tion Pro­gram—Ex­pres­sion of Interest

kaykozaronekNov 20, 2023, 10:25 PM
70 points
0 comments1 min readEA link

Catas­trophic Risks from AI #6: Dis­cus­sion and FAQ

Center for AI SafetyJun 27, 2023, 11:23 PM
10 points
0 comments1 min readEA link

AI Safety Field Build­ing vs. EA CB

kuhanjJun 26, 2023, 11:21 PM
80 points
16 comments6 min readEA link

Let’s set new AI safety ac­tors up for success

michelJun 26, 2023, 9:17 PM
33 points
1 comment9 min readEA link

Catas­trophic Risks from AI #5: Rogue AIs

Center for AI SafetyJun 27, 2023, 10:06 PM
16 points
1 comment1 min readEA link

ai-plans.com De­cem­ber Cri­tique-a-Thon

Kabir_KumarDec 4, 2023, 9:27 AM
1 point
0 comments2 min readEA link

China Hawks are Man­u­fac­tur­ing an AI Arms Race

GarrisonNov 20, 2024, 6:17 PM
95 points
3 comments5 min readEA link
(garrisonlovely.substack.com)

But ex­actly how com­plex and frag­ile?

Katja_GraceDec 13, 2019, 7:05 AM
37 points
3 comments3 min readEA link
(meteuphoric.com)

A stub­born un­be­liever fi­nally gets the depth of the AI al­ign­ment problem

aelwoodOct 13, 2022, 3:16 PM
32 points
7 comments1 min readEA link

Ques­tions for fur­ther in­ves­ti­ga­tion of AI diffusion

Ben CottierDec 21, 2022, 1:50 PM
28 points
0 comments11 min readEA link

Take­aways from safety by de­fault interviews

AI ImpactsApr 7, 2020, 2:01 AM
25 points
2 comments13 min readEA link
(aiimpacts.org)

Nat­u­ral­ism and AI alignment

Michele CampoloApr 24, 2021, 4:20 PM
17 points
3 comments7 min readEA link

The Wind­fall Clause has a reme­dies problem

John Bridge 🔸May 23, 2022, 10:31 AM
40 points
0 comments17 min readEA link

Con­fused about AI re­search as a means of ad­dress­ing AI risk

Eli RoseFeb 21, 2019, 12:07 AM
31 points
15 comments1 min readEA link

Emer­gent Ven­tures AI

technicalitiesApr 8, 2022, 10:08 PM
22 points
0 comments1 min readEA link
(marginalrevolution.com)

CFP for the Largest An­nual Meet­ing of Poli­ti­cal Science: Get Help With Your Re­search Submission

Mahendra PrasadDec 22, 2020, 11:39 PM
13 points
0 comments2 min readEA link

De­sir­able? AI qualities

brb243Mar 21, 2022, 10:05 PM
7 points
0 comments2 min readEA link

Idea: an AI gov­er­nance group colo­cated with ev­ery AI re­search group!

capybaraletDec 7, 2020, 11:41 PM
8 points
1 comment2 min readEA link

Four rea­sons I find AI safety emo­tion­ally compelling

Kat WoodsJun 28, 2022, 2:01 PM
32 points
5 comments4 min readEA link

Take­aways from a sur­vey on AI al­ign­ment resources

DanielFilanNov 5, 2022, 11:45 PM
20 points
9 comments6 min readEA link
(www.lesswrong.com)

13 back­ground claims about EA

AkashSep 7, 2022, 3:54 AM
70 points
16 comments3 min readEA link

Gen­eral ad­vice for tran­si­tion­ing into The­o­ret­i­cal AI Safety

Martín SotoSep 15, 2022, 5:23 AM
25 points
0 comments10 min readEA link

FLI launches Wor­ld­build­ing Con­test with $100,000 in prizes

ggilgallonJan 17, 2022, 1:54 PM
87 points
55 comments6 min readEA link

Crit­i­cal Re­view of ‘The Precipice’: A Re­assess­ment of the Risks of AI and Pandemics

James FodorMay 11, 2020, 11:11 AM
111 points
32 comments26 min readEA link

Sha­har Avin on How to Strate­gi­cally Reg­u­late Ad­vanced AI Systems

Michaël TrazziSep 23, 2022, 3:49 PM
48 points
2 comments4 min readEA link
(theinsideview.ai)

“In­tro to brain-like-AGI safety” se­ries—just finished!

Steven ByrnesMay 17, 2022, 3:35 PM
15 points
0 comments1 min readEA link

[Link] Thiel on GCRs

Milan GriffesJul 22, 2019, 8:47 PM
28 points
11 comments1 min readEA link

In­tro to car­ing about AI al­ign­ment as an EA cause

So8resApr 14, 2017, 12:42 AM
28 points
10 comments25 min readEA link

[Question] How should we in­vest in “long-term short-ter­mism” given the like­li­hood of trans­for­ma­tive AI?

James_BanksJan 12, 2021, 11:54 PM
8 points
0 comments1 min readEA link

Eric Drexler: Pare­to­topian goal alignment

EA GlobalMar 15, 2019, 2:51 PM
14 points
0 comments10 min readEA link
(www.youtube.com)

AI Risk in Africa

Claude FormanekOct 12, 2021, 2:28 AM
18 points
0 comments10 min readEA link

Win­ners of the AI Safety Nudge Competition

Marc CarauleanuNov 15, 2022, 1:06 AM
22 points
0 comments1 min readEA link

On AI and Compute

johncroxApr 3, 2019, 9:26 PM
39 points
12 comments8 min readEA link

An­nounc­ing AXRP, the AI X-risk Re­search Podcast

DanielFilanDec 23, 2020, 8:10 PM
32 points
1 comment1 min readEA link

UK policy and poli­tics careers

weeatquinceSep 28, 2019, 4:18 PM
28 points
10 comments7 min readEA link

AI Fore­cast­ing Ques­tion Database (Fore­cast­ing in­fras­truc­ture, part 3)

terraformSep 3, 2019, 2:57 PM
23 points
2 comments4 min readEA link

EA Berkeley Pre­sents: Univer­sal Own­er­ship: Is In­dex In­vest­ing the New So­cially Re­spon­si­ble In­vest­ing?

Mahendra PrasadMar 10, 2022, 6:58 AM
7 points
0 comments1 min readEA link

AGI Safety Fun­da­men­tals cur­ricu­lum and application

richard_ngoOct 20, 2021, 9:45 PM
123 points
20 comments8 min readEA link
(docs.google.com)

Three new re­ports re­view­ing re­search and con­cepts in ad­vanced AI governance

MMMaasNov 28, 2023, 9:21 AM
32 points
0 comments2 min readEA link
(www.legalpriorities.org)

[3-hour pod­cast]: Joseph Car­l­smith on longter­mism, utopia, the com­pu­ta­tional power of the brain, meta-ethics, illu­sion­ism and meditation

Gus DockerJul 27, 2021, 1:18 PM
34 points
2 comments1 min readEA link

[Question] How much EA anal­y­sis of AI safety as a cause area ex­ists?

richard_ngoSep 6, 2019, 11:15 AM
94 points
20 comments2 min readEA link

Who or­dered al­ign­ment’s ap­ple?

Eleni_AAug 28, 2022, 2:24 PM
5 points
0 comments3 min readEA link

Ro­hin Shah: What’s been hap­pen­ing in AI al­ign­ment?

EA GlobalJul 29, 2020, 8:15 PM
18 points
0 comments14 min readEA link
(www.youtube.com)

In­fer­ence-Only De­bate Ex­per­i­ments Us­ing Math Problems

Arjun PanicksseryAug 6, 2024, 5:44 PM
3 points
1 comment1 min readEA link

fic­tion about AI risk

Ann Garth 🔸Nov 12, 2020, 10:36 PM
8 points
1 comment1 min readEA link

On Solv­ing Prob­lems Be­fore They Ap­pear: The Weird Episte­molo­gies of Alignment

adamShimiOct 11, 2021, 8:21 AM
28 points
0 comments15 min readEA link

In­tro­duc­ing the Fund for Align­ment Re­search (We’re Hiring!)

AdamGleaveJul 6, 2022, 2:00 AM
74 points
3 comments4 min readEA link

Max Teg­mark: Risks and benefits of ad­vanced ar­tifi­cial intelligence

EA GlobalAug 5, 2016, 9:19 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

5th IEEE In­ter­na­tional Con­fer­ence on Ar­tifi­cial In­tel­li­gence Test­ing (AITEST 2023)

surabhi guptaMar 12, 2023, 9:06 AM
−5 points
0 comments1 min readEA link

My Most Likely Rea­son to Die Young is AI X-Risk

AISafetyIsNotLongtermistJul 4, 2022, 3:34 PM
237 points
62 comments4 min readEA link
(www.lesswrong.com)

Align­ment 201 curriculum

richard_ngoOct 12, 2022, 7:17 PM
94 points
9 comments1 min readEA link

[Question] 1h-vol­un­teers needed for a small AI Safety-re­lated re­search pro­ject

PabloAMC 🔸Aug 16, 2021, 5:51 PM
4 points
0 comments1 min readEA link

Anti-squat­ted AI x-risk do­mains index

plexAug 12, 2022, 12:00 PM
56 points
9 comments1 min readEA link

Dis­con­tin­u­ous progress in his­tory: an update

AI ImpactsApr 17, 2020, 4:28 PM
69 points
3 comments24 min readEA link

Ver­ifi­ca­tion meth­ods for in­ter­na­tional AI agreements

AkashAug 31, 2024, 2:58 PM
20 points
0 comments1 min readEA link
(arxiv.org)

What is the role of Bayesian ML for AI al­ign­ment/​safety?

mariushobbhahnJan 11, 2022, 8:07 AM
39 points
6 comments3 min readEA link

AI Safety Ex­ec­u­tive Summary

Sean OsierSep 6, 2022, 8:26 AM
20 points
2 comments5 min readEA link
(seanosier.notion.site)

Ques­tions about AI that bother me

Eleni_AJan 31, 2023, 6:50 AM
33 points
6 comments2 min readEA link

AGI x-risk timelines: 10% chance (by year X) es­ti­mates should be the head­line, not 50%.

Greg_ColbournMar 1, 2022, 12:02 PM
69 points
22 comments2 min readEA link

Our Cur­rent Direc­tions in Mechanis­tic In­ter­pretabil­ity Re­search (AI Align­ment Speaker Series)

Group OrganizerApr 8, 2022, 5:08 PM
3 points
0 comments1 min readEA link

[Question] De­sign­ing user au­then­ti­ca­tion pro­to­cols

Kinoshita Yoshikazu (pseudonym)Mar 13, 2023, 3:56 PM
−1 points
2 comments1 min readEA link

[Dis­cus­sion] Best in­tu­ition pumps for AI safety

mariushobbhahnNov 6, 2021, 8:11 AM
10 points
8 comments1 min readEA link

[Question] How can I bet on short timelines?

kokotajlodNov 7, 2020, 12:45 PM
33 points
12 comments2 min readEA link

An­nounc­ing the AIPoli­cyIdeas.com Database

abiolveraJun 23, 2023, 4:09 PM
50 points
3 comments2 min readEA link
(www.aipolicyideas.com)

[Question] Does the idea of AGI that benevolently controls us appeal to EA folks?

Noah ScalesJul 16, 2022, 7:17 PM
6 points
20 comments1 min readEA link

How to Diver­sify Con­cep­tual AI Align­ment: the Model Be­hind Refine

adamShimiJul 20, 2022, 10:44 AM
43 points
0 comments9 min readEA link
(www.alignmentforum.org)

The re­li­gion prob­lem in AI alignment

Geoffrey MillerSep 16, 2022, 1:24 AM
54 points
28 comments11 min readEA link

The Me­taethics and Nor­ma­tive Ethics of AGI Value Align­ment: Many Ques­tions, Some Implications

Eleos Arete CitriniSep 15, 2021, 7:05 PM
25 points
0 comments8 min readEA link

On tak­ing AI risk se­ri­ously

Eleni_AMar 13, 2023, 5:44 AM
51 points
4 comments1 min readEA link
(www.nytimes.com)

ARIA is look­ing for top­ics for roundtables

Nathan_BarnardAug 26, 2022, 7:14 PM
34 points
11 comments1 min readEA link

Key ques­tions about ar­tifi­cial sen­tience: an opinionated guide

rgbApr 25, 2022, 1:42 PM
91 points
3 comments1 min readEA link

Crypto ‘or­a­cle pro­to­cols’ for AI al­ign­ment with real-world data?

Geoffrey MillerSep 22, 2022, 11:05 PM
9 points
3 comments1 min readEA link

[Question] What kind of event, tar­geted to un­der­grad­u­ate CS ma­jors, would be most effec­tive at get­ting peo­ple to work on AI safety?

CBiddulphSep 19, 2021, 4:19 PM
9 points
1 comment1 min readEA link

My Un­der­stand­ing of Paul Chris­ti­ano’s Iter­ated Am­plifi­ca­tion AI Safety Re­search Agenda

ChiAug 15, 2020, 7:59 PM
38 points
3 comments39 min readEA link

The Power of In­tel­li­gence—The Animation

WriterMar 11, 2023, 4:15 PM
59 points
0 comments1 min readEA link

AI safety uni­ver­sity groups: a promis­ing op­por­tu­nity to re­duce ex­is­ten­tial risk

micJun 30, 2022, 6:37 PM
53 points
1 comment11 min readEA link

AI Fore­cast­ing Re­s­olu­tion Coun­cil (Fore­cast­ing in­fras­truc­ture, part 2)

terraformAug 29, 2019, 5:43 PM
28 points
0 comments3 min readEA link

He­len Toner: The Open Philan­thropy Pro­ject’s work on AI risk

EA GlobalNov 3, 2017, 7:43 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

AI Align­ment YouTube Playlists

jacquesthibsMay 9, 2022, 9:31 PM
16 points
2 comments1 min readEA link

Ques­tion­able Nar­ra­tives of “Si­tu­a­tional Aware­ness”

fergusqJun 16, 2024, 5:09 PM
23 points
10 comments14 min readEA link

AI Safety Overview: CERI Sum­mer Re­search Fellowship

Jamie BMar 24, 2022, 3:12 PM
29 points
0 comments2 min readEA link

Fermi es­ti­ma­tion of the im­pact you might have work­ing on AI safety

fribMay 13, 2022, 1:30 PM
24 points
13 comments1 min readEA link

#177 – Re­cent AI break­throughs and nav­i­gat­ing the grow­ing rift be­tween AI safety and ac­cel­er­a­tionist camps (Nathan Labenz on the 80,000 Hours Pod­cast)

80000_HoursJan 31, 2024, 7:37 PM
15 points
0 comments16 min readEA link

There should be an AI safety pro­ject board

mariushobbhahnMar 14, 2022, 4:08 PM
24 points
3 comments1 min readEA link

In­tro­duc­ing The Non­lin­ear Fund: AI Safety re­search, in­cu­ba­tion, and funding

Kat WoodsMar 18, 2021, 2:07 PM
71 points
32 comments5 min readEA link

Thoughts on the OpenAI al­ign­ment plan: will AI re­search as­sis­tants be net-pos­i­tive for AI ex­is­ten­tial risk?

Jeffrey LadishMar 10, 2023, 8:20 AM
12 points
0 comments9 min readEA link

[Question] Is this a good way to bet on short timelines?

kokotajlodNov 28, 2020, 2:31 PM
17 points
16 comments1 min readEA link

[Question] Should the EA com­mu­nity have a DL en­g­ineer­ing fel­low­ship?

PabloAMC 🔸Dec 24, 2021, 1:43 PM
26 points
6 comments1 min readEA link

Ja­pan AI Align­ment Conference

ChrisScammellMar 10, 2023, 9:23 AM
17 points
2 comments1 min readEA link
(www.conjecture.dev)

[Link post] Promis­ing Paths to Align­ment—Con­nor Leahy | Talk

frances_lorenzMay 14, 2022, 3:58 PM
17 points
0 comments1 min readEA link

AI Risk: In­creas­ing Per­sua­sion Power

kewlcatsAug 3, 2020, 8:25 PM
4 points
0 comments1 min readEA link

Crit­i­cism of the main frame­work in AI alignment

Michele CampoloAug 31, 2022, 9:44 PM
42 points
4 comments7 min readEA link

Are Hu­mans ‘Hu­man Com­pat­i­ble’?

Matt BoydDec 6, 2019, 5:49 AM
23 points
8 comments4 min readEA link

Re: Some thoughts on veg­e­tar­i­anism and veganism

FaiFeb 25, 2022, 8:43 PM
46 points
3 comments8 min readEA link

Every­thing’s nor­mal un­til it’s not

Eleni_AMar 10, 2023, 1:42 AM
6 points
0 comments3 min readEA link

In­tro to Safety Engineering

Madhav MalhotraOct 19, 2022, 11:44 PM
4 points
0 comments1 min readEA link

Scru­ti­niz­ing AI Risk (80K, #81) - v. quick summary

BenJul 23, 2020, 7:02 PM
10 points
1 comment3 min readEA link

AI al­ign­ment with hu­mans… but with which hu­mans?

Geoffrey MillerSep 8, 2022, 11:43 PM
51 points
20 comments3 min readEA link

Three Bi­ases That Made Me Believe in AI Risk

beth​Feb 13, 2019, 11:22 PM
41 points
20 comments3 min readEA link

In­tro­duc­ing the Prin­ci­ples of In­tel­li­gent Be­havi­our in Biolog­i­cal and So­cial Sys­tems (PIBBSS) Fellowship

adamShimiDec 18, 2021, 3:25 PM
37 points
5 comments10 min readEA link

[Cause Ex­plo­ra­tion Prizes] Ex­pand­ing com­mu­ni­ca­tion about AGI risks

InesSep 22, 2022, 5:30 AM
13 points
0 comments11 min readEA link

AI and im­pact opportunities

brb243Mar 31, 2022, 8:23 PM
−2 points
6 comments1 min readEA link

2016 AI Risk Liter­a­ture Re­view and Char­ity Comparison

LarksDec 13, 2016, 4:36 AM
57 points
12 comments28 min readEA link

An­thropic: Core Views on AI Safety: When, Why, What, and How

jonmenasterMar 9, 2023, 5:30 PM
107 points
6 comments22 min readEA link
(www.anthropic.com)

How could we know that an AGI sys­tem will have good con­se­quences?

So8resNov 7, 2022, 10:42 PM
25 points
0 comments1 min readEA link

AI Safety in a Vuln­er­a­ble World: Re­quest­ing Feed­back on Pre­limi­nary Thoughts

Jordan ArelDec 6, 2022, 10:36 PM
5 points
4 comments3 min readEA link

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony BarrettOct 14, 2022, 8:18 PM
13 points
0 comments2 min readEA link

Re­silience Via Frag­mented Power

steve6320Jul 14, 2022, 3:37 PM
2 points
0 comments6 min readEA link

Draft re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_CarlsmithApr 28, 2021, 9:41 PM
88 points
34 comments1 min readEA link

Soares, Tal­linn, and Yud­kowsky dis­cuss AGI cognition

EliezerYudkowskyNov 29, 2021, 5:28 PM
15 points
0 comments40 min readEA link

Co­her­ence ar­gu­ments im­ply a force for goal-di­rected behavior

Katja_GraceApr 6, 2021, 9:44 PM
19 points
1 comment11 min readEA link
(worldspiritsockpuppet.com)

We Are Con­jec­ture, A New Align­ment Re­search Startup

Connor LeahyApr 9, 2022, 3:07 PM
31 points
0 comments1 min readEA link

[Question] Ca­reer Ad­vice: Philos­o­phy + Pro­gram­ming → AI Safety

tcelferactMar 18, 2022, 3:09 PM
30 points
11 comments2 min readEA link

Ar­tifi­cial in­tel­li­gence ca­reer stories

EA GlobalOct 25, 2020, 6:56 AM
12 points
0 comments1 min readEA link
(www.youtube.com)

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yud­kowsky and the Dangers of AI (Please RSVP)

David NMar 8, 2023, 8:40 PM
11 points
2 comments1 min readEA link

[Closed] Hiring a math­e­mat­i­cian to work on the learn­ing-the­o­retic AI al­ign­ment agenda

VanessaApr 19, 2022, 6:49 AM
53 points
4 comments2 min readEA link

Sin­ga­pore’s Tech­ni­cal AI Align­ment Re­search Ca­reer Guide

Yi-YangAug 26, 2020, 8:09 AM
34 points
7 comments8 min readEA link

[Cross­post] Why Un­con­trol­lable AI Looks More Likely Than Ever

OttoMar 8, 2023, 3:33 PM
49 points
6 comments4 min readEA link
(time.com)

Dis­cov­er­ing Lan­guage Model Be­hav­iors with Model-Writ­ten Evaluations

evhubDec 20, 2022, 8:09 PM
25 points
0 comments1 min readEA link

[Question] I’m in­ter­view­ing pro­lific AI safety re­searcher Richard Ngo (now at OpenAI and pre­vi­ously Deep­Mind). What should I ask him?

Robert_WiblinSep 29, 2022, 12:00 AM
45 points
11 comments1 min readEA link

Key Papers in Lan­guage Model Safety

aogaraJun 20, 2022, 2:59 PM
20 points
0 comments22 min readEA link

Some global catas­trophic risk estimates

TamayFeb 10, 2021, 7:32 PM
106 points
15 comments1 min readEA link

Katja Grace: AI safety

EA GlobalAug 11, 2017, 8:19 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

Red­wood Re­search is hiring for sev­eral roles (Oper­a­tions and Tech­ni­cal)

JJXWangApr 14, 2022, 3:23 PM
45 points
0 comments1 min readEA link

My plan for a “Most Im­por­tant Cen­tury” read­ing group

Jack O'BrienJan 19, 2022, 9:32 AM
12 points
1 comment2 min readEA link

Cortés, Pizarro, and Afonso as Prece­dents for Takeover

AI ImpactsMar 2, 2020, 12:25 PM
27 points
17 comments11 min readEA link
(aiimpacts.org)

Tan Zhi Xuan: AI al­ign­ment, philo­soph­i­cal plu­ral­ism, and the rele­vance of non-Western philosophy

EA GlobalNov 21, 2020, 8:12 AM
19 points
1 comment1 min readEA link
(www.youtube.com)

Jesse Clif­ton: Open-source learn­ing — a bar­gain­ing approach

EA GlobalOct 18, 2019, 6:05 PM
10 points
0 comments1 min readEA link
(www.youtube.com)

Pivotal out­comes and pivotal processes

Andrew CritchJun 17, 2022, 11:43 PM
49 points
1 comment4 min readEA link

Strate­gic Direc­tions for a Digi­tal Con­scious­ness Model

Derek ShillerDec 10, 2024, 7:33 PM
41 points
1 comment12 min readEA link

[Question] Can we con­vince peo­ple to work on AI safety with­out con­vinc­ing them about AGI hap­pen­ing this cen­tury?

BrianTanNov 26, 2020, 2:46 PM
8 points
3 comments2 min readEA link

Why AI al­ign­ment could be hard with mod­ern deep learning

AjeyaSep 21, 2021, 3:35 PM
153 points
17 comments14 min readEA link
(www.cold-takes.com)

AGI Predictions

PabloNov 21, 2020, 12:02 PM
36 points
0 comments1 min readEA link
(www.lesswrong.com)

On pre­sent­ing the case for AI risk

Aryeh EnglanderMar 8, 2022, 9:37 PM
114 points
12 comments4 min readEA link

[Question] How to nav­i­gate po­ten­tial infohazards

more better Mar 4, 2023, 9:28 PM
16 points
7 comments1 min readEA link

Part 1: The AI Safety com­mu­nity has four main work groups, Strat­egy, Gover­nance, Tech­ni­cal and Move­ment Building

PeterSlatteryNov 25, 2022, 3:45 AM
72 points
7 comments6 min readEA link

Amanda Askell: AI safety needs so­cial scientists

EA GlobalMar 4, 2019, 3:50 PM
27 points
0 comments18 min readEA link
(www.youtube.com)

[Question] I’m interviewing Max Tegmark about AI safety and more. What should I ask him?

Robert_WiblinMay 13, 2022, 3:32 PM
18 points
2 comments1 min readEA link

The Benefits of Distil­la­tion in Research

Jonas HallgrenMar 4, 2023, 7:19 PM
45 points
2 comments5 min readEA link

What can the prin­ci­pal-agent liter­a­ture tell us about AI risk?

acFeb 10, 2020, 10:10 AM
26 points
1 comment16 min readEA link

What Should We Op­ti­mize—A Conversation

Johannes C. MayerApr 7, 2022, 2:48 PM
1 point
0 comments14 min readEA link

What are the “no free lunch” the­o­rems?

Vishakha AgrawalFeb 4, 2025, 2:02 AM
3 points
0 comments1 min readEA link
(aisafety.info)

Grokking “Fore­cast­ing TAI with biolog­i­cal an­chors”

ansonJun 6, 2022, 6:56 PM
43 points
0 comments14 min readEA link

AGI Safety Com­mu­ni­ca­tions Initiative

InesJun 11, 2022, 4:30 PM
35 points
6 comments1 min readEA link

Long-Term Fu­ture Fund: May 2021 grant recommendations

abergalMay 27, 2021, 6:44 AM
110 points
17 comments57 min readEA link

Acausal normalcy

Andrew CritchMar 3, 2023, 11:35 PM
21 points
4 comments8 min readEA link

Database of ex­is­ten­tial risk estimates

MichaelA🔸Apr 15, 2020, 12:43 PM
130 points
37 comments5 min readEA link

Pre­serv­ing and con­tin­u­ing al­ign­ment re­search through a se­vere global catastrophe

A_donorMar 6, 2022, 6:43 PM
40 points
11 comments5 min readEA link

How Do AI Timelines Affect Giv­ing Now vs. Later?

MichaelDickensAug 3, 2021, 3:36 AM
36 points
8 comments8 min readEA link

[Question] Donat­ing against Short Term AI risks

Jan-WillemNov 16, 2020, 12:23 PM
6 points
10 comments1 min readEA link

[Question] What con­sid­er­a­tions in­fluence whether I have more in­fluence over short or long timelines?

kokotajlodNov 5, 2020, 7:57 PM
18 points
0 comments1 min readEA link

Ought’s the­ory of change

stuhlmuellerApr 12, 2022, 12:09 AM
43 points
4 comments3 min readEA link

Align­ing AI with Hu­mans by Lev­er­ag­ing Le­gal Informatics

johnjnaySep 18, 2022, 7:43 AM
20 points
11 comments3 min readEA link

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni_ADec 28, 2022, 11:45 PM
10 points
4 comments1 min readEA link

Why AI is Harder Than We Think—Me­lanie Mitchell

Eevee🔹Apr 28, 2021, 8:19 AM
45 points
7 comments2 min readEA link
(arxiv.org)

Sum­maries: Align­ment Fun­da­men­tals Curriculum

Leon_LangSep 19, 2022, 3:43 PM
25 points
1 comment1 min readEA link
(docs.google.com)

Prob­lems of peo­ple new to AI safety and my pro­ject ideas to miti­gate them

Igor IvanovMar 3, 2023, 5:35 PM
19 points
0 comments7 min readEA link

LW4EA: Some cruxes on im­pact­ful al­ter­na­tives to AI policy work

JeremyMay 17, 2022, 3:05 AM
11 points
1 comment1 min readEA link
(www.lesswrong.com)

E.A. Me­gapro­ject Ideas

Tomer_GoloboyMar 21, 2022, 1:23 AM
15 points
4 comments4 min readEA link

Me­tac­u­lus is build­ing a team ded­i­cated to AI forecasting

christianOct 18, 2022, 4:08 PM
35 points
0 comments1 min readEA link
(apply.workable.com)

Cri­tique of Su­per­in­tel­li­gence Part 3

James FodorDec 13, 2018, 5:13 AM
3 points
5 comments7 min readEA link

Ngo and Yud­kowsky on sci­en­tific rea­son­ing and pivotal acts

EliezerYudkowskyFeb 21, 2022, 5:00 PM
33 points
1 comment35 min readEA link

A con­cern­ing ob­ser­va­tion from me­dia cov­er­age of AI in­dus­try dynamics

Justin OliveMar 2, 2023, 11:56 PM
48 points
5 comments3 min readEA link

Prov­ably Hon­est—A First Step

Srijanak DeNov 5, 2022, 9:49 PM
1 point
0 comments1 min readEA link

[Question] Is trans­for­ma­tive AI the biggest ex­is­ten­tial risk? Why or why not?

Eevee🔹Mar 5, 2022, 3:54 AM
9 points
10 comments1 min readEA link

Stu­dent pro­ject for en­gag­ing with AI alignment

Per Ivar FriborgMay 9, 2022, 10:44 AM
35 points
1 comment1 min readEA link

AI al­ign­ment prize win­ners and next round [link]

RyanCareyJan 20, 2018, 12:07 PM
7 points
1 comment1 min readEA link

Work­ing at EA or­ga­ni­za­tions se­ries: Ma­chine In­tel­li­gence Re­search Institute

SoerenMindNov 1, 2015, 12:49 PM
8 points
0 comments4 min readEA link

Can we simu­late hu­man evolu­tion to cre­ate a some­what al­igned AGI?

Thomas KwaMar 29, 2022, 1:23 AM
19 points
0 comments7 min readEA link

How to build a safe ad­vanced AI (Evan Hub­inger) | What’s up in AI safety? (Asya Ber­gal)

EA GlobalOct 25, 2020, 5:48 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

Distil­la­tion of The Offense-Defense Balance of Scien­tific Knowledge

Arjun YadavAug 12, 2022, 7:01 AM
17 points
0 comments2 min readEA link

FLI AI Align­ment pod­cast: Evan Hub­inger on In­ner Align­ment, Outer Align­ment, and Pro­pos­als for Build­ing Safe Ad­vanced AI

evhubJul 1, 2020, 8:59 PM
13 points
2 comments1 min readEA link
(futureoflife.org)

AI Safety: Ap­ply­ing to Grad­u­ate Studies

frances_lorenzDec 15, 2021, 10:56 PM
23 points
0 comments12 min readEA link

Joscha Bach on Syn­thetic In­tel­li­gence [an­no­tated]

Roman LeventovMar 2, 2023, 11:21 AM
8 points
0 comments9 min readEA link
(www.jimruttshow.com)

I’m In­ter­view­ing Kat Woods, EA Pow­er­house. What Should I Ask?

SereneDesireeSep 20, 2022, 9:49 AM
4 points
2 comments1 min readEA link

My Overview of the AI Align­ment Land­scape: A Bird’s Eye View

Neel NandaDec 15, 2021, 11:46 PM
45 points
15 comments16 min readEA link
(www.alignmentforum.org)

[Question] Why aren’t you freak­ing out about OpenAI? At what point would you start?

AppliedDivinityStudiesOct 10, 2021, 1:06 PM
80 points
22 comments2 min readEA link

[Question] What are some sources re­lated to big-pic­ture AI strat­egy?

Jacob Watts🔸Mar 2, 2023, 5:04 AM
9 points
4 comments1 min readEA link

I’m Buck Sh­legeris, I do re­search and out­reach at MIRI, AMA

BuckNov 15, 2019, 10:44 PM
123 points
228 comments2 min readEA link

Scor­ing fore­casts from the 2016 “Ex­pert Sur­vey on Progress in AI”

PatrickLMar 1, 2023, 2:39 PM
204 points
21 comments9 min readEA link

Atari early

AI ImpactsApr 2, 2020, 11:28 PM
34 points
2 comments5 min readEA link
(aiimpacts.org)

There are two fac­tions work­ing to pre­vent AI dan­gers. Here’s why they’re deeply di­vided.

SharmakeAug 10, 2022, 7:52 PM
10 points
0 comments4 min readEA link
(www.vox.com)

Con­ver­sa­tion on AI risk with Adam Gleave

AI ImpactsDec 27, 2019, 9:43 PM
18 points
3 comments4 min readEA link
(aiimpacts.org)

Med­i­ta­tions on ca­reers in AI Safety

PabloAMC 🔸Mar 23, 2022, 10:00 PM
88 points
30 comments2 min readEA link

How Josiah be­came an AI safety researcher

Neil CrawfordMar 29, 2022, 7:47 PM
10 points
0 comments1 min readEA link

Pro­mot­ing com­pas­sion­ate longtermism

jonleightonDec 7, 2022, 2:26 PM
117 points
5 comments12 min readEA link

Buck Sh­legeris: How I think stu­dents should ori­ent to AI safety

EA GlobalOct 25, 2020, 5:48 AM
11 points
0 comments1 min readEA link
(www.youtube.com)

We should ex­pect to worry more about spec­u­la­tive risks

bgarfinkelMay 29, 2022, 9:08 PM
120 points
14 comments3 min readEA link

Ma­hen­dra Prasad: Ra­tional group de­ci­sion-making

EA GlobalJul 8, 2020, 3:06 PM
15 points
0 comments16 min readEA link
(www.youtube.com)

What I’m doing

Chris LeongJul 19, 2022, 11:31 AM
28 points
0 comments4 min readEA link

From lan­guage to ethics by au­to­mated reasoning

Michele CampoloNov 21, 2021, 3:16 PM
8 points
0 comments6 min readEA link

AI al­ign­ment as a trans­la­tion problem

Roman LeventovFeb 5, 2024, 2:14 PM
3 points
1 comment1 min readEA link

AI Align­ment 2018-2019 Review

HabrykaJan 28, 2020, 9:14 PM
28 points
0 comments6 min readEA link
(www.lesswrong.com)

[Closed] Prize and fast track to al­ign­ment re­search at ALTER

VanessaSep 18, 2022, 9:15 AM
38 points
0 comments3 min readEA link

In­tro­duc­ing Leap Labs, an AI in­ter­pretabil­ity startup

Jessica RumbelowMar 6, 2023, 5:37 PM
11 points
0 comments1 min readEA link
(www.lesswrong.com)

Es­ti­mat­ing the Cur­rent and Fu­ture Num­ber of AI Safety Researchers

Stephen McAleeseSep 28, 2022, 8:58 PM
64 points
34 comments9 min readEA link

Safety timelines: How long will it take to solve al­ign­ment?

Esben KranSep 19, 2022, 12:51 PM
45 points
9 comments6 min readEA link

Perform Tractable Re­search While Avoid­ing Ca­pa­bil­ities Ex­ter­nal­ities [Prag­matic AI Safety #4]

TW123May 30, 2022, 8:37 PM
33 points
1 comment25 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 1

James FodorDec 13, 2018, 5:10 AM
22 points
13 comments8 min readEA link

New Speaker Series on AI Align­ment Start­ing March 3

Zechen ZhangFeb 26, 2022, 10:58 AM
5 points
0 comments1 min readEA link

Disagree­ments about Align­ment: Why, and how, we should try to solve them

ojorgensenAug 8, 2022, 10:32 PM
16 points
6 comments16 min readEA link

2017 AI Safety Liter­a­ture Re­view and Char­ity Comparison

LarksDec 20, 2017, 9:54 PM
43 points
17 comments23 min readEA link

Con­crete ac­tions to im­prove AI gov­er­nance: the be­havi­our sci­ence approach

Alexander SaeriDec 1, 2022, 9:34 PM
31 points
0 comments11 min readEA link

What Should the Aver­age EA Do About AI Align­ment?

RaemonFeb 25, 2017, 8:07 PM
42 points
39 comments7 min readEA link

Fore­cast­ing Trans­for­ma­tive AI: What Kind of AI?

Holden KarnofskyAug 10, 2021, 9:38 PM
62 points
3 comments10 min readEA link

Draft re­port on AI timelines

AjeyaDec 15, 2020, 12:10 PM
35 points
0 comments1 min readEA link
(alignmentforum.org)

Con­sider try­ing Vivek Heb­bar’s al­ign­ment exercises

AkashOct 24, 2022, 7:46 PM
16 points
0 comments1 min readEA link

The case for be­com­ing a black-box in­ves­ti­ga­tor of lan­guage models

BuckMay 6, 2022, 2:37 PM
90 points
7 comments3 min readEA link

‘Force mul­ti­pli­ers’ for EA research

Craig DraytonJun 18, 2022, 1:39 PM
18 points
7 comments4 min readEA link

Red­wood Re­search is hiring for sev­eral roles

Jack RNov 29, 2021, 12:18 AM
75 points
0 comments1 min readEA link

[Question] An eco­nomics of AI gov—best re­sources for

LivFeb 26, 2023, 11:11 AM
10 points
4 comments1 min readEA link

Why The Fo­cus on Ex­pected Utility Max­imisers?

𝕮𝖎𝖓𝖊𝖗𝖆Dec 27, 2022, 3:51 PM
11 points
1 comment1 min readEA link

En­abling more feedback

JJ HepburnDec 10, 2021, 6:52 AM
41 points
3 comments3 min readEA link

My cur­rent thoughts on MIRI’s “highly re­li­able agent de­sign” work

Daniel_DeweyJul 7, 2017, 1:17 AM
60 points
59 comments19 min readEA link

Safe Sta­sis Fallacy

DavidmanheimFeb 5, 2024, 10:54 AM
23 points
4 comments1 min readEA link

[Link] How un­der­stand­ing valence could help make fu­ture AIs safer

Milan GriffesOct 8, 2020, 6:53 PM
22 points
2 comments3 min readEA link

[Question] Why not to solve al­ign­ment by mak­ing su­per­in­tel­li­gent hu­mans?

PatoOct 16, 2022, 9:26 PM
9 points
12 comments1 min readEA link

Very Briefly: The CHIPS Act

YadavFeb 26, 2023, 1:53 PM
40 points
3 comments1 min readEA link
(www.y1d2.com)

“Clean” vs. “messy” goal-di­rect­ed­ness (Sec­tion 2.2.3 of “Schem­ing AIs”)

Joe_CarlsmithNov 29, 2023, 4:32 PM
7 points
0 comments1 min readEA link

A mod­est case for hope

xavier rgOct 17, 2022, 6:03 AM
28 points
0 comments1 min readEA link

[Question] Is a ca­reer in mak­ing AI sys­tems more se­cure a mean­ingful way to miti­gate the X-risk posed by AGI?

Kyle O’BrienFeb 13, 2022, 7:05 AM
14 points
4 comments1 min readEA link

Owain Evans and Vic­to­ria Krakovna: Ca­reers in tech­ni­cal AI safety

EA GlobalNov 3, 2017, 7:43 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

Why I think it’s im­por­tant to work on AI forecasting

Matthew_BarnettFeb 27, 2023, 9:24 PM
179 points
10 comments10 min readEA link

On Defer­ence and Yud­kowsky’s AI Risk Estimates

bgarfinkelJun 19, 2022, 2:35 PM
285 points
194 comments17 min readEA link

Steer­ing AI to care for an­i­mals, and soon

Andrew CritchJun 14, 2022, 1:13 AM
224 points
37 comments1 min readEA link

Seek­ing in­put on a list of AI books for broader audience

Darren McKeeFeb 27, 2023, 10:40 PM
49 points
14 comments5 min readEA link

Im­proved Se­cu­rity to Prevent Hacker-AI and Digi­tal Ghosts

Erland WittkotterOct 21, 2022, 10:11 AM
1 point
0 comments1 min readEA link

[Question] Why not offer a multi-million/billion dollar prize for solving the Alignment Problem?

Aryeh EnglanderApr 17, 2022, 4:08 PM
15 points
9 comments1 min readEA link

Anal­y­sis of AI Safety sur­veys for field-build­ing insights

Ash JafariDec 5, 2022, 5:37 PM
30 points
7 comments5 min readEA link

Pre­dict re­sponses to the “ex­is­ten­tial risk from AI” survey

RobBensingerMay 28, 2021, 1:38 AM
36 points
8 comments2 min readEA link

AMA or dis­cuss my 80K pod­cast epi­sode: Ben Garfinkel, FHI researcher

bgarfinkelJul 13, 2020, 4:17 PM
87 points
140 comments1 min readEA link

UK AI Policy Re­port: Con­tent, Sum­mary, and its Im­pact on EA Cause Areas

Algo_LawJul 21, 2022, 5:32 PM
9 points
1 comment9 min readEA link

The role of academia in AI Safety.

PabloAMC 🔸Mar 28, 2022, 12:04 AM
71 points
19 comments3 min readEA link

How to ‘troll for good’: Lev­er­ag­ing IP for AI governance

Michael HuangFeb 26, 2023, 6:34 AM
26 points
3 comments1 min readEA link
(www.science.org)

[Question] Are so­cial me­dia al­gorithms an ex­is­ten­tial risk?

Barry GrimesSep 15, 2020, 8:52 AM
24 points
13 comments1 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

a_e_rSep 20, 2022, 6:13 PM
70 points
18 comments1 min readEA link

A re­sponse to Matthews on AI Risk

RyanCareyAug 11, 2015, 12:58 PM
11 points
16 comments6 min readEA link

You Un­der­stand AI Align­ment and How to Make Soup

Leen ArmoushMay 28, 2022, 6:22 AM
0 points
2 comments5 min readEA link

[Question] Do EA folks want AGI at all?

Noah ScalesJul 16, 2022, 5:44 AM
8 points
10 comments1 min readEA link

On how var­i­ous plans miss the hard bits of the al­ign­ment challenge

So8resJul 12, 2022, 5:35 AM
126 points
13 comments29 min readEA link

Ad­vice on Pur­su­ing Tech­ni­cal AI Safety Research

frances_lorenzMay 31, 2022, 5:48 PM
29 points
2 comments4 min readEA link

Univer­sity com­mu­nity build­ing seems like the wrong model for AI safety

George StiffmanFeb 26, 2022, 6:23 AM
24 points
8 comments2 min readEA link

aisafety.com­mu­nity—A liv­ing doc­u­ment of AI safety communities

zeshenOct 20, 2022, 10:08 PM
24 points
13 comments1 min readEA link

An­drew Critch: Log­i­cal in­duc­tion — progress in AI alignment

EA GlobalAug 6, 2016, 12:40 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

[Question] Why should we *not* put effort into AI safety re­search?

Ben ThompsonMay 16, 2021, 5:11 AM
15 points
5 comments1 min readEA link

[Question] Which is more im­por­tant for re­duc­ing s-risks, re­search­ing on AI sen­tience or an­i­mal welfare?

jackchang110Feb 25, 2023, 2:20 AM
9 points
0 comments1 min readEA link

“Tak­ing AI Risk Se­ri­ously” – Thoughts by An­drew Critch

RaemonNov 19, 2018, 2:21 AM
26 points
9 comments1 min readEA link
(www.lesswrong.com)

AGI safety from first principles

richard_ngoOct 21, 2020, 5:42 PM
77 points
10 comments3 min readEA link
(www.alignmentforum.org)

A Cri­tique of AI Takeover Scenarios

James FodorAug 31, 2022, 1:49 PM
53 points
4 comments12 min readEA link

Changes in fund­ing in the AI safety field

Sebastian_FarquharFeb 3, 2017, 1:09 PM
34 points
10 comments7 min readEA link

Deep­Mind is hiring for the Scal­able Align­ment and Align­ment Teams

Rohin ShahMay 13, 2022, 12:19 PM
102 points
0 comments9 min readEA link

Shah and Yud­kowsky on al­ign­ment failures

EliezerYudkowskyFeb 28, 2022, 7:25 PM
38 points
7 comments92 min readEA link

How much should gov­ern­ments pay to pre­vent catas­tro­phes? Longter­mism’s limited role

EJTMar 19, 2023, 4:50 PM
258 points
35 comments35 min readEA link
(philpapers.org)

AI Safety Endgame Stories

IvanVendrovSep 28, 2022, 5:12 PM
31 points
1 comment1 min readEA link

[Question] How would a lan­guage model be­come goal-di­rected?

David MJul 16, 2022, 2:50 PM
113 points
20 comments1 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 4

James FodorDec 13, 2018, 5:14 AM
4 points
2 comments4 min readEA link

Ap­ply to a small iter­a­tion of MLAB to be run in Oxford

Rio PAug 29, 2023, 7:39 PM
11 points
0 comments1 min readEA link

[Ex­tended Dead­line: Jan 23rd] An­nounc­ing the PIBBSS Sum­mer Re­search Fellowship

noraDec 18, 2021, 4:54 PM
36 points
1 comment1 min readEA link

Euro­pean Master’s Pro­grams in Ma­chine Learn­ing, Ar­tifi­cial In­tel­li­gence, and re­lated fields

Master Programs ML/AIJan 17, 2021, 8:09 PM
17 points
4 comments1 min readEA link

A con­ver­sa­tion with Ro­hin Shah

AI ImpactsNov 12, 2019, 1:31 AM
27 points
8 comments33 min readEA link
(aiimpacts.org)

Public-fac­ing Cen­sor­ship Is Safety Theater, Caus­ing Rep­u­ta­tional Da­m­age

YitzSep 23, 2022, 5:08 AM
49 points
7 comments1 min readEA link

[Creative Writ­ing Con­test] The Puppy Problem

LouisOct 13, 2021, 2:01 PM
13 points
0 comments7 min readEA link

Visi­ble Thoughts Pro­ject and Bounty Announcement

So8resNov 30, 2021, 12:35 AM
35 points
2 comments13 min readEA link

2023 Stan­ford Ex­is­ten­tial Risks Conference

elizabethcooperFeb 24, 2023, 5:49 PM
29 points
5 comments1 min readEA link

Good Fu­tures Ini­ti­a­tive: Win­ter Pro­ject In­tern­ship

a_e_rNov 27, 2022, 11:27 PM
67 points
7 comments3 min readEA link

Fund­ing for hu­man­i­tar­ian non-prof­its to re­search re­spon­si­ble AI

Deborah W.A. FoulkesDec 10, 2024, 8:08 AM
4 points
0 comments2 min readEA link
(www.gov.uk)

Skil­ling-up in ML Eng­ineer­ing for Align­ment: re­quest for comments

TheMcDouglasApr 24, 2022, 6:40 AM
8 points
0 comments1 min readEA link

EA megapro­jects continued

mariushobbhahnDec 3, 2021, 10:33 AM
183 points
48 comments7 min readEA link

Cri­tique of Su­per­in­tel­li­gence Part 2

James FodorDec 13, 2018, 5:12 AM
10 points
12 comments7 min readEA link

A mesa-op­ti­miza­tion per­spec­tive on AI valence and moral patienthood

jacobpfauSep 9, 2021, 10:23 PM
10 points
18 comments17 min readEA link

An­nounc­ing AI Align­ment Awards: $100k re­search con­tests about goal mis­gen­er­al­iza­tion & corrigibility

AkashNov 22, 2022, 10:19 PM
60 points
1 comment1 min readEA link

2019 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

LarksDec 19, 2019, 2:58 AM
147 points
28 comments62 min readEA link

Quan­tify­ing the Far Fu­ture Effects of Interventions

MichaelDickensMay 18, 2016, 2:15 AM
8 points
0 comments11 min readEA link

De­fus­ing AGI Danger

Mark XuDec 24, 2020, 11:08 PM
23 points
0 comments2 min readEA link
(www.alignmentforum.org)

*New* Canada AI Safety & Gover­nance community

Wyatt Tessari L'AlliéAug 29, 2022, 3:58 PM
32 points
2 comments1 min readEA link

[Creative Writ­ing Con­test] Me­tal or Mortal

LouisOct 16, 2021, 4:24 PM
7 points
0 comments7 min readEA link

Chris­ti­ano and Yud­kowsky on AI pre­dic­tions and hu­man intelligence

EliezerYudkowskyFeb 23, 2022, 4:51 PM
31 points
0 comments42 min readEA link

Cal­ifor­nia AI Bill, SB 1047, cov­ered in to­day’s WSJ.

EmersonAug 8, 2024, 12:27 PM
5 points
0 comments1 min readEA link
(www.wsj.com)

Shar­ing the World with Digi­tal Minds

Aaron Gertler 🔸Dec 1, 2020, 8:00 AM
12 points
1 comment1 min readEA link
(www.nickbostrom.com)

The het­ero­gene­ity of hu­man value types: Im­pli­ca­tions for AI alignment

Geoffrey MillerSep 16, 2022, 9:21 PM
27 points
2 comments10 min readEA link

How should norms of aca­demic writ­ing and pub­lish­ing be changed once AI sys­tems be­come su­per­hu­man in more re­spects?

simonfriederichNov 24, 2023, 1:35 PM
10 points
0 comments1 min readEA link
(link.springer.com)

[Question] AI Eth­i­cal Committee

eaaicommitteeMar 1, 2022, 11:35 PM
8 points
0 comments1 min readEA link

Con­sider pay­ing me to do AI safety re­search work

RupertNov 5, 2020, 8:09 AM
11 points
3 comments2 min readEA link

[AN #80]: Why AI risk might be solved with­out ad­di­tional in­ter­ven­tion from longtermists

Rohin ShahJan 3, 2020, 7:52 AM
58 points
12 comments10 min readEA link
(www.alignmentforum.org)

AI al­ign­ment re­searchers don’t (seem to) stack

So8resFeb 21, 2023, 12:48 AM
47 points
3 comments1 min readEA link

2018 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

LarksDec 18, 2018, 4:48 AM
118 points
28 comments63 min readEA link

2020 AI Align­ment Liter­a­ture Re­view and Char­ity Comparison

LarksDec 21, 2020, 3:25 PM
155 points
16 comments68 min readEA link

Does most of your im­pact come from what you do soon?

JoshcFeb 21, 2023, 5:12 AM
38 points
1 comment5 min readEA link

Beg­ging, Plead­ing AI Orgs to Com­ment on NIST AI Risk Man­age­ment Framework

BridgesApr 15, 2022, 7:35 PM
87 points
3 comments2 min readEA link

Data col­lec­tion for AI al­ign­ment—Ca­reer review

Benjamin HiltonJun 3, 2022, 11:44 AM
34 points
1 comment5 min readEA link
(80000hours.org)

Pre­sump­tive Listen­ing: stick­ing to fa­mil­iar con­cepts and miss­ing the outer rea­son­ing paths

RemmeltDec 27, 2022, 3:40 PM
3 points
0 comments1 min readEA link

Pod­cast: Tam­era Lan­ham on AI risk, threat mod­els, al­ign­ment pro­pos­als, ex­ter­nal­ized rea­son­ing over­sight, and work­ing at Anthropic

AkashDec 20, 2022, 9:39 PM
14 points
1 comment1 min readEA link

Con­nor Leahy on Con­jec­ture and Dy­ing with Dignity

Michaël TrazziJul 22, 2022, 7:30 PM
34 points
0 comments10 min readEA link
(theinsideview.ai)

The first AI Safety Camp & onwards

RemmeltJun 7, 2018, 6:49 PM
25 points
2 comments8 min readEA link

[Linkpost] How To Get Into Independent Research On Alignment/Agency

Jackson WagnerFeb 14, 2022, 9:40 PM
10 points
0 comments1 min readEA link

7 traps that (we think) new al­ign­ment re­searchers of­ten fall into

AkashSep 27, 2022, 11:13 PM
73 points
8 comments1 min readEA link

A Quick List of Some Prob­lems in AI Align­ment As A Field

Nicholas / Heather KrossJun 21, 2022, 5:09 PM
16 points
10 comments6 min readEA link
(www.thinkingmuchbetter.com)

“In­tro to brain-like-AGI safety” se­ries—halfway point!

Steven ByrnesMar 9, 2022, 3:21 PM
8 points
0 comments2 min readEA link

The aca­demic con­tri­bu­tion to AI safety seems large

technicalitiesJul 30, 2020, 10:30 AM
117 points
28 comments9 min readEA link

Jan Leike, He­len Toner, Malo Bour­gon, and Miles Brundage: Work­ing in AI

EA GlobalAug 11, 2017, 8:19 AM
7 points
0 comments1 min readEA link
(www.youtube.com)

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [~monthly thread]

robertskmilesNov 1, 2022, 11:21 PM
75 points
83 comments1 min readEA link

Syd­ney AI Safety Fellowship

Chris LeongDec 2, 2021, 7:35 AM
16 points
0 comments2 min readEA link

[Question] Is it crunch time yet? If so, who can help?

Nicholas / Heather KrossOct 13, 2021, 4:11 AM
29 points
9 comments1 min readEA link

[Question] Is work­ing on AI safety as dan­ger­ous as ig­nor­ing it?

jkmhSep 20, 2021, 11:06 PM
10 points
5 comments1 min readEA link

Where does Re­spon­si­ble Ca­pa­bil­ities Scal­ing take AI gov­er­nance?

ZacRichardsonJun 9, 2024, 10:25 PM
17 points
1 comment16 min readEA link

How do take­off speeds af­fect the prob­a­bil­ity of bad out­comes from AGI?

KRJul 7, 2020, 5:53 PM
18 points
0 comments8 min readEA link

[Question] What is most con­fus­ing to you about AI stuff?

Sam ClarkeNov 23, 2021, 4:00 PM
25 points
15 comments1 min readEA link

[Question] How do you talk about AI safety?

Eevee🔹Apr 19, 2020, 4:15 PM
10 points
5 comments1 min readEA link

In­ter­view with Ro­man Yam­polskiy about AGI on The Real­ity Check

Darren McKeeFeb 18, 2023, 11:29 PM
27 points
0 comments1 min readEA link
(www.trcpodcast.com)

Math­e­mat­i­cal Cir­cuits in Neu­ral Networks

Sean OsierSep 22, 2022, 2:32 AM
23 points
2 comments1 min readEA link
(www.youtube.com)

[Question] Benefits/Risks of Scott Aaronson’s Orthodox/Reform Framing for AI Alignment

JeremyNov 21, 2022, 5:47 PM
15 points
5 comments1 min readEA link
(scottaaronson.blog)

Be­ing an in­di­vi­d­ual al­ign­ment grantmaker

A_donorFeb 28, 2022, 4:39 PM
34 points
20 comments2 min readEA link

Messy per­sonal stuff that af­fected my cause pri­ori­ti­za­tion (or: how I started to care about AI safety)

Julia_Wise🔸May 5, 2022, 5:59 PM
265 points
14 comments2 min readEA link

SERI ML ap­pli­ca­tion dead­line is ex­tended un­til May 22.

Viktoria MalyasovaMay 22, 2022, 12:13 AM
13 points
3 comments1 min readEA link

A cen­tral AI al­ign­ment prob­lem: ca­pa­bil­ities gen­er­al­iza­tion, and the sharp left turn

So8resJun 15, 2022, 2:19 PM
53 points
2 comments10 min readEA link

Tech­ni­cal AGI safety re­search out­side AI

richard_ngoOct 18, 2019, 3:02 PM
91 points
5 comments3 min readEA link

Some promis­ing ca­reer ideas be­yond 80,000 Hours’ pri­or­ity paths

Arden KoehlerJun 26, 2020, 10:34 AM
142 points
28 comments15 min readEA link

AI Safety Info Distil­la­tion Fellowship

robertskmilesFeb 17, 2023, 4:16 PM
80 points
1 comment1 min readEA link

7 es­says on Build­ing a Bet­ter Future

Jamie_HarrisJun 24, 2022, 2:28 PM
21 points
0 comments2 min readEA link

Align­ment’s phlo­gis­ton

Eleni_AAug 18, 2022, 1:41 AM
18 points
1 comment2 min readEA link

Newslet­ter for Align­ment Re­search: The ML Safety Updates

Esben KranOct 22, 2022, 4:17 PM
30 points
0 comments7 min readEA link

Distil­la­tion of “How Likely is De­cep­tive Align­ment?”

NickGabsDec 1, 2022, 8:22 PM
10 points
1 comment10 min readEA link

“Ex­is­ten­tial risk from AI” sur­vey results

RobBensingerJun 1, 2021, 8:19 PM
80 points
35 comments11 min readEA link

In­creased Availa­bil­ity and Willing­ness for De­ploy­ment of Re­sources for Effec­tive Altru­ism and Long-Termism

Evan_GaensbauerDec 29, 2021, 8:20 PM
46 points
1 comment2 min readEA link

LessWrong is now a book, available for pre-or­der!

terraformDec 4, 2020, 8:42 PM
48 points
1 comment10 min readEA link

An ML safety in­surance com­pany—shower thoughts

EdoAradOct 18, 2021, 7:45 AM
15 points
4 comments1 min readEA link

[Question] Huh. Bing thing got me real anx­ious about AI. Re­sources to help with that please?

ArvinFeb 15, 2023, 4:55 PM
2 points
7 comments1 min readEA link

Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey: 2021

Janet PauketatJul 1, 2022, 7:47 AM
36 points
0 comments2 min readEA link
(www.sentienceinstitute.org)

AI Safety Needs Great Engineers

Andy JonesNov 23, 2021, 9:03 PM
98 points
13 comments4 min readEA link

Align­ment is hard. Com­mu­ni­cat­ing that, might be harder

Eleni_ASep 1, 2022, 11:45 AM
17 points
1 comment3 min readEA link

High im­pact job op­por­tu­nity at ARIA (UK)

RasoolFeb 12, 2023, 10:35 AM
80 points
0 comments1 min readEA link

New re­port on how much com­pu­ta­tional power it takes to match the hu­man brain (Open Philan­thropy)

Aaron Gertler 🔸Sep 15, 2020, 1:06 AM
45 points
1 comment18 min readEA link
(www.openphilanthropy.org)

Is GPT-3 the death of the pa­per­clip max­i­mizer?

matthias_samwaldAug 3, 2020, 11:34 AM
4 points
1 comment1 min readEA link

$1,000 bounty for an AI Pro­gramme Lead recommendation

Cillian_Aug 14, 2023, 1:11 PM
11 points
1 comment2 min readEA link

Con­crete Ad­vice for Form­ing In­side Views on AI Safety

Neel NandaAug 17, 2022, 11:26 PM
58 points
4 comments10 min readEA link
(www.alignmentforum.org)

Ap­ply for the ML Win­ter Camp in Cam­bridge, UK [2-10 Jan]

Nathan_BarnardDec 2, 2022, 7:33 PM
50 points
11 comments2 min readEA link

Should AI fo­cus on prob­lem-solv­ing or strate­gic plan­ning? Why not both?

oliver_siegelNov 1, 2022, 9:53 AM
1 point
0 comments1 min readEA link

Why some peo­ple be­lieve in AGI, but I don’t.

cveresOct 26, 2022, 3:09 AM
13 points
2 comments4 min readEA link

A list of good heuris­tics that the case for AI X-risk fails

Aaron Gertler 🔸Jul 16, 2020, 9:56 AM
25 points
9 comments2 min readEA link
(www.alignmentforum.org)

The Tree of Life: Stan­ford AI Align­ment The­ory of Change

GabeMJul 2, 2022, 6:32 PM
69 points
5 comments14 min readEA link

We Ran an AI Timelines Retreat

Lenny McClineMay 17, 2022, 4:40 AM
46 points
6 comments3 min readEA link

[Question] Does China have AI al­ign­ment re­sources/​in­sti­tu­tions? How can we pri­ori­tize cre­at­ing more?

JakubKAug 4, 2022, 7:23 PM
18 points
9 comments1 min readEA link

AMA: Ajeya Co­tra, re­searcher at Open Phil

AjeyaJan 28, 2021, 5:38 PM
84 points
105 comments1 min readEA link

Align­ment Newslet­ter One Year Retrospective

Rohin ShahApr 10, 2019, 7:00 AM
62 points
22 comments21 min readEA link

Short-Term AI Align­ment as a Pri­or­ity Cause

len.hoang.lnhFeb 11, 2020, 4:22 PM
17 points
11 comments7 min readEA link

Twit­ter-length re­sponses to 24 AI al­ign­ment arguments

RobBensingerMar 14, 2022, 7:34 PM
67 points
17 comments8 min readEA link

Overview | An Eval­u­a­tive Evolu­tion

Matt KeeneFeb 10, 2023, 6:15 PM
−9 points
0 comments5 min readEA link
(www.creatingafuturewewant.com)

Refer the Co­op­er­a­tive AI Foun­da­tion’s New COO, Re­ceive $5000

Lewis HammondJun 16, 2022, 1:27 PM
42 points
0 comments3 min readEA link

De­cep­tion as the op­ti­mal: mesa-op­ti­miz­ers and in­ner al­ign­ment

Eleni_AAug 16, 2022, 3:45 AM
19 points
0 comments5 min readEA link

AGI in a vuln­er­a­ble world

AI ImpactsApr 2, 2020, 3:43 AM
17 points
0 comments1 min readEA link
(aiimpacts.org)

[Question] Brief sum­mary of key dis­agree­ments in AI Risk

Aryeh EnglanderDec 26, 2019, 7:40 PM
31 points
3 comments1 min readEA link

It’s (not) how you use it

Eleni_ASep 7, 2022, 1:28 PM
6 points
3 comments2 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

MichaelStJulesOct 3, 2021, 4:55 PM
48 points
19 comments1 min readEA link

The Vi­talik Bu­terin Fel­low­ship in AI Ex­is­ten­tial Safety is open for ap­pli­ca­tions!

Cynthia ChenOct 14, 2022, 3:23 AM
38 points
0 comments2 min readEA link

PIBBSS Fel­low­ship: Bounty for Refer­rals & Dead­line Extension

Anna_GajdovaJan 17, 2022, 4:23 PM
17 points
7 comments1 min readEA link

[Question] Sur­vey about Copy­right and gen­er­a­tive AI al­lowed here ?

Lee O'BrienAug 9, 2024, 12:27 PM
0 points
1 comment1 min readEA link

Ter­minol­ogy sug­ges­tion: stan­dard­ize terms for prob­a­bil­ity ranges

Egg SyntaxAug 30, 2024, 4:05 PM
2 points
0 comments1 min readEA link

Co­op­er­a­tion and Align­ment in Del­e­ga­tion Games: You Need Both!

Oliver SourbutAug 3, 2024, 10:16 AM
4 points
1 comment1 min readEA link
(www.oliversourbut.net)

Con­jec­ture: In­ter­nal In­fo­haz­ard Policy

Connor LeahyJul 29, 2022, 7:35 PM
34 points
3 comments19 min readEA link

Long-Term Fu­ture Fund: April 2019 grant recommendations

HabrykaApr 23, 2019, 7:00 AM
142 points
242 comments46 min readEA link

How to cre­ate a “good” AGI

mreichertDec 8, 2023, 10:47 AM
1 point
0 comments10 min readEA link

[Question] Can in­de­pen­dent re­searchers get a spon­sored visa for the US or UK?

jacquesthibsMar 25, 2023, 3:05 AM
20 points
2 comments1 min readEA link

[Question] How many peo­ple are neart­er­mist and have high P(doom)?

SanjayAug 2, 2023, 2:24 PM
52 points
13 comments1 min readEA link

We are already in a per­sua­sion-trans­formed world and must take precautions

trevor1Nov 4, 2023, 3:53 PM
1 point
0 comments1 min readEA link

3 lev­els of threat obfuscation

Holden KarnofskyAug 2, 2023, 5:09 PM
31 points
0 comments6 min readEA link
(www.alignmentforum.org)

The 6D effect: When com­pa­nies take risks, one email can be very pow­er­ful.

stecasNov 4, 2023, 8:08 PM
40 points
1 comment1 min readEA link

Train­ing for Good is hiring (and why you should join us): AI Pro­gramme Lead and Oper­a­tions Associate

Cillian_Aug 3, 2023, 4:50 PM
9 points
1 comment6 min readEA link

Non-clas­sic sto­ries about schem­ing (Sec­tion 2.3.2 of “Schem­ing AIs”)

Joe_CarlsmithDec 4, 2023, 6:44 PM
12 points
1 comment1 min readEA link

Miti­gat­ing ex­is­ten­tial risks as­so­ci­ated with hu­man na­ture and AI: Thoughts on se­ri­ous mea­sures.

LinyphiaMar 25, 2023, 7:10 PM
2 points
2 comments3 min readEA link

Apollo Re­search is hiring evals and in­ter­pretabil­ity en­g­ineers & scientists

mariushobbhahnAug 4, 2023, 10:56 AM
19 points
1 comment2 min readEA link

ChatGPT bug leaked users’ con­ver­sa­tion histories

Ian TurnerMar 27, 2023, 12:17 AM
15 points
2 comments1 min readEA link
(www.bbc.com)

[Linkpost] Shorter ver­sion of re­port on ex­is­ten­tial risk from power-seek­ing AI

Joe_CarlsmithMar 22, 2023, 6:06 PM
49 points
1 comment1 min readEA link

Ex­plor­ing Tacit Linked Premises with GPT

RomeoStevensMar 24, 2023, 10:50 PM
5 points
0 comments1 min readEA link

[Question] Are there cause pri­or­ti­za­tions es­ti­mates for s-risks sup­port­ers?

jackchang110Mar 27, 2023, 10:32 AM
33 points
6 comments1 min readEA link

New blog: Planned Obsolescence

AjeyaMar 27, 2023, 7:46 PM
198 points
9 comments1 min readEA link
(www.planned-obsolescence.org)

Some of My Cur­rent Im­pres­sions En­ter­ing AI Safety

PhibMar 28, 2023, 5:18 AM
5 points
0 comments2 min readEA link

[Question] Half-baked al­ign­ment idea

ozbMar 28, 2023, 5:18 AM
9 points
2 comments1 min readEA link

Govern­ing High-Im­pact AI Sys­tems: Un­der­stand­ing Canada’s Pro­posed AI Bill. April 15, Car­leton Univer­sity, Ottawa

Liav.KorenMar 27, 2023, 11:11 PM
3 points
0 comments1 min readEA link
(www.eventbrite.com)

Longter­mism and short­ter­mism can dis­agree on nu­clear war to stop ad­vanced AI

David JohnstonMar 30, 2023, 11:22 PM
2 points
0 comments1 min readEA link

When Will We Spend Enough to Train Trans­for­ma­tive AI

snMar 28, 2023, 12:41 AM
3 points
0 comments9 min readEA link

The Prospect of an AI Winter

Erich_Grunewald 🔸Mar 27, 2023, 8:55 PM
56 points
13 comments1 min readEA link

An A.I. Safety Pre­sen­ta­tion at RIT

Nicholas / Heather KrossMar 27, 2023, 11:49 PM
5 points
0 comments1 min readEA link

Keep Chas­ing AI Safety Press Coverage

GilApr 4, 2023, 8:40 PM
106 points
16 comments5 min readEA link

Eric Sch­midt on re­cur­sive self-improvement

NikolaNov 5, 2023, 7:05 PM
11 points
0 comments1 min readEA link
(www.youtube.com)

[Question] What longter­mist pro­jects would you like to see im­ple­mented?

BuhlMar 28, 2023, 6:41 PM
55 points
6 comments1 min readEA link

De­sen­si­tiz­ing Deepfakes

PhibMar 29, 2023, 1:20 AM
22 points
10 comments1 min readEA link

The space of sys­tems and the space of maps

Jan_KulveitMar 22, 2023, 4:05 PM
12 points
0 comments5 min readEA link
(www.lesswrong.com)

[Question] What are the ar­gu­ments that sup­port China build­ing AGI+ if Western com­pa­nies de­lay/​pause AI de­vel­op­ment?

DMMFMar 29, 2023, 6:53 PM
32 points
9 comments1 min readEA link

Paus­ing AI Devel­op­ments Isn’t Enough. We Need to Shut it All Down by Eliezer Yudkowsky

jacquesthibsMar 29, 2023, 11:30 PM
211 points
75 comments3 min readEA link
(time.com)

The fun­da­men­tal hu­man value is power.

LinyphiaMar 30, 2023, 3:15 PM
−1 points
5 comments1 min readEA link

[Event] Join Me­tac­u­lus To­mor­row, March 31st, for Fore­cast Fri­day!

christianMar 30, 2023, 8:58 PM
29 points
1 comment1 min readEA link
(www.metaculus.com)

ChatGPT is ca­pa­ble of cog­ni­tive em­pa­thy!

Miquel Banchs-Piqué (prev. mikbp)Mar 30, 2023, 8:42 PM
3 points
0 comments1 min readEA link
(nonzero.substack.com)

[Question] Can AI safely ex­ist at all?

Hayven FrienbyNov 27, 2023, 5:33 PM
6 points
7 comments2 min readEA link

Why build­ing ven­tures in AI Safety is par­tic­u­larly challeng­ing

Heramb PodarNov 6, 2023, 12:16 AM
16 points
2 comments4 min readEA link

CHAI in­tern­ship ap­pli­ca­tions are open (due Nov 13)

Erik JennerOct 26, 2023, 12:48 AM
6 points
1 comment3 min readEA link

Ap­ply to the Con­stel­la­tion Visit­ing Re­searcher Pro­gram and As­tra Fel­low­ship, in Berkeley this Winter

Anjay FOct 26, 2023, 3:14 AM
61 points
4 comments1 min readEA link

[Question] What are the biggest ob­sta­cles on AI safety re­search ca­reer?

jackchang110Mar 31, 2023, 2:53 PM
2 points
1 comment1 min readEA link

Keep Mak­ing AI Safety News

GilMar 31, 2023, 8:11 PM
67 points
4 comments1 min readEA link

Pal­isade is hiring Re­search Engineers

Charlie Rogers-SmithNov 11, 2023, 3:09 AM
23 points
0 comments3 min readEA link

Gover­nance of AI, Break­fast Ce­real, Car Fac­to­ries, Etc.

Jeff MartinNov 6, 2023, 1:44 AM
2 points
0 comments3 min readEA link

Hu­man Values and AGI Risk | William James

William JamesMar 31, 2023, 10:30 PM
1 point
0 comments12 min readEA link

Gaming the Algorithms: Large Language Models as Mirrors

Haris ShekerisApr 1, 2023, 2:14 AM
5 points
3 comments4 min readEA link

De-em­pha­sise al­ign­ment, em­pha­sise restraint

EuanMcLeanFeb 4, 2025, 5:43 PM
14 points
2 comments7 min readEA link

Two con­cepts of an “epi­sode” (Sec­tion 2.2.1 of “Schem­ing AIs”)

Joe_CarlsmithNov 27, 2023, 6:01 PM
11 points
1 comment1 min readEA link

[Question] How to per­suade a non-CS back­ground per­son to be­lieve AGI is 50% pos­si­ble in 2040?

jackchang110Apr 1, 2023, 3:27 PM
1 point
7 comments1 min readEA link

Seek­ing ad­vice on im­pact­ful ca­reer paths given my unique ca­pa­bil­ities and interests

Grateful4PathTipsMar 31, 2023, 11:30 PM
32 points
5 comments1 min readEA link

Podcast/video/transcript: Eliezer Yudkowsky—Why AI Will Kill Us, Aligning LLMs, Nature of Intelligence, SciFi, & Rationality

PeterSlatteryApr 9, 2023, 10:37 AM
32 points
2 comments137 min readEA link
(www.youtube.com)

Stuxnet, not Skynet: Hu­man­ity’s dis­em­pow­er­ment by AI

RokoApr 4, 2023, 11:46 AM
11 points
0 comments7 min readEA link

Some Pre­limi­nary Opinions on AI Safety Problems

yonxinzhangApr 6, 2023, 12:42 PM
5 points
0 comments6 min readEA link

Re­duc­ing profit mo­ti­va­tions in AI development

Luke FrymireApr 3, 2023, 8:04 PM
20 points
1 comment6 min readEA link

What to sug­gest com­pa­nies & en­trepreneurs do to use AI safely?

AlfalfaBloomApr 5, 2023, 10:36 PM
11 points
1 comment1 min readEA link

Pre­limi­nary in­ves­ti­ga­tions on if STEM and EA com­mu­ni­ties could benefit from more overlap

elteerkersApr 11, 2023, 4:08 PM
31 points
17 comments8 min readEA link

If in­ter­pretabil­ity re­search goes well, it may get dangerous

So8resApr 3, 2023, 9:48 PM
33 points
0 comments1 min readEA link

2023 Vi­sion Week­end, San Francisco

elteerkersApr 6, 2023, 2:33 PM
3 points
0 comments1 min readEA link

[Question] How much should states in­vest in con­tin­gency plans for wide­spread in­ter­net out­age?

Kinoshita Yoshikazu (pseudonym)Apr 7, 2023, 4:05 PM
2 points
0 comments1 min readEA link

Stampy’s AI Safety Info—New Distil­la­tions #1 [March 2023]

markovApr 7, 2023, 11:35 AM
19 points
0 comments2 min readEA link
(aisafety.info)

The Orthog­o­nal­ity Th­e­sis is Not Ob­vi­ously True

OmnizoidApr 5, 2023, 9:08 PM
18 points
12 comments9 min readEA link

[Question] How does a com­pany like In­stadeep fit into the cur­rent AI land­scape?

Tom AApr 8, 2023, 5:49 AM
6 points
0 comments1 min readEA link

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [April 2023]

StevenKaasApr 8, 2023, 4:21 AM
111 points
174 comments1 min readEA link

Reli­a­bil­ity, Se­cu­rity, and AI risk: Notes from in­fosec text­book chap­ter 1

AkashApr 7, 2023, 3:47 PM
15 points
0 comments1 min readEA link

[Question] Should we pub­lish ar­gu­ments for the preser­va­tion of hu­man­ity?

JeremyApr 7, 2023, 1:51 PM
8 points
4 comments1 min readEA link

AI Con­trol idea: Give an AGI the pri­mary ob­jec­tive of delet­ing it­self, but con­struct ob­sta­cles to this as best we can. All other ob­jec­tives are sec­ondary to this pri­mary goal.

JustausernameApr 3, 2023, 2:32 PM
7 points
4 comments1 min readEA link

[Question] Plat­form for Pro­ject Spit­bal­ling? (e.g., for AI field build­ing)

Marcel DApr 3, 2023, 3:45 PM
7 points
2 comments1 min readEA link

SERI MATS—Sum­mer 2023 Cohort

a_e_rApr 8, 2023, 3:32 PM
36 points
2 comments1 min readEA link

Open Phil re­leases RFPs on LLM Bench­marks and Forecasting

Lawrence ChanNov 11, 2023, 3:01 AM
12 points
0 comments1 min readEA link
(www.openphilanthropy.org)

Pause For Thought: The AI Pause Debate

Scott AlexanderOct 10, 2023, 3:34 PM
109 points
20 comments14 min readEA link
(www.astralcodexten.com)

AI as a sci­ence, and three ob­sta­cles to al­ign­ment strategies

So8resOct 25, 2023, 9:02 PM
41 points
1 comment1 min readEA link

A case study of reg­u­la­tion done well? Cana­dian biorisk regulations

rosehadsharSep 8, 2023, 5:10 PM
31 points
1 comment16 min readEA link

Panel dis­cus­sion on AI con­scious­ness with Rob Long and Jeff Sebo

Aaron BergmanSep 9, 2023, 3:38 AM
31 points
6 comments42 min readEA link
(www.youtube.com)

[Question] How should tech­ni­cal AI re­searchers best tran­si­tion into AI gov­er­nance and policy?

GabeMSep 10, 2023, 5:29 AM
12 points
5 comments1 min readEA link

How teams went about their re­search at AI Safety Camp edi­tion 8

RemmeltSep 9, 2023, 4:34 PM
13 points
1 comment1 min readEA link

Scale, schlep, and systems

AjeyaOct 10, 2023, 4:59 PM
59 points
3 comments6 min readEA link

EA Ex­plorer GPT: A New Tool to Ex­plore Effec­tive Altruism

Vlad_TislenkoNov 12, 2023, 3:36 PM
12 points
1 comment1 min readEA link

A New Model for Com­pute Cen­ter Verification

Damin Curtis🔹Oct 10, 2023, 7:23 PM
21 points
2 comments5 min readEA link

Re­spon­si­ble Scal­ing Poli­cies Are Risk Man­age­ment Done Wrong

simeon_cOct 25, 2023, 11:46 PM
42 points
1 comment1 min readEA link
(www.navigatingrisks.ai)

Scal­able And Trans­fer­able Black-Box Jailbreaks For Lan­guage Models Via Per­sona Modulation

soroushjpNov 7, 2023, 6:00 PM
10 points
0 comments2 min readEA link
(arxiv.org)

Well-Be­ing In­dex (WBI): Redefin­ing So­cietal Progress Together

Max KusmierekDec 1, 2023, 3:23 PM
5 points
1 comment6 min readEA link

Ar­tifi­cial In­tel­li­gence as exit strat­egy from the age of acute ex­is­ten­tial risk

Arturo MaciasApr 12, 2023, 2:41 PM
11 points
11 comments7 min readEA link

Ap­ply to >50 AI safety fun­ders in one ap­pli­ca­tion with the Non­lin­ear Net­work [Round Closed]

Drew SpartzApr 12, 2023, 9:06 PM
157 points
18 comments2 min readEA link

[linkpost] AI NOW In­sti­tute’s 2023 An­nual Re­port & Roadmap

Tristan WilliamsApr 12, 2023, 8:00 PM
9 points
0 comments2 min readEA link
(ainowinstitute.org)

AGI—al­ign­ment—pa­per­clip max­i­mizer—pause—defec­tion—incentives

Mars RobertsonApr 13, 2023, 10:38 AM
1 point
2 comments1 min readEA link

Open-source LLMs may prove Bostrom’s vuln­er­a­ble world hypothesis

Roope AhvenharjuApr 14, 2023, 9:25 AM
14 points
2 comments1 min readEA link

Join AISafety.info’s Writ­ing & Edit­ing Hackathon (Aug 25-28) (Prizes to be won!)

leillustrations🔸Aug 5, 2023, 2:06 PM
15 points
0 comments1 min readEA link

In­tro­duc­ing the Men­tal Health Roadmap Series

EmilyApr 11, 2023, 10:26 PM
18 points
2 comments2 min readEA link

An ap­peal to peo­ple who are smarter than me: please help me clar­ify my think­ing about AI

bethhwAug 5, 2023, 4:38 PM
42 points
21 comments3 min readEA link

On run­ning a city-wide uni­ver­sity group

gergoNov 6, 2023, 9:43 AM
26 points
3 comments9 min readEA link

Sum­mary: The Case for Halt­ing AI Devel­op­ment—Max Teg­mark on the Lex Frid­man Podcast

Madhav MalhotraApr 16, 2023, 10:28 PM
38 points
4 comments4 min readEA link
(youtu.be)

No, the EMH does not im­ply that mar­kets have long AGI timelines

JakobApr 24, 2023, 8:27 AM
83 points
21 comments8 min readEA link

AI Im­pacts Quar­terly Newslet­ter, Jan-Mar 2023

HarlanApr 17, 2023, 11:07 PM
20 points
1 comment3 min readEA link
(blog.aiimpacts.org)

Preventing an AI-related catastrophe

EA ItalyJan 17, 2023, 11:07 AM
1 point
0 comments3 min readEA link

AI Align­ment Re­search Eng­ineer Ac­cel­er­a­tor (ARENA): call for applicants

TheMcDouglasApr 17, 2023, 8:30 PM
41 points
2 comments1 min readEA link

AISafety.info’s Writ­ing & Edit­ing Hackathon

leillustrations🔸Aug 5, 2023, 5:12 PM
4 points
2 comments1 min readEA link

Scien­tism vs. people

Roman LeventovApr 18, 2023, 5:52 PM
0 points
0 comments11 min readEA link

The importance of AI as a possible threat to humanity

EA ItalyJan 17, 2023, 10:24 PM
1 point
0 comments1 min readEA link
(www.vox.com)

Why modern deep learning might make AI alignment difficult

EA ItalyJan 17, 2023, 11:29 PM
1 point
0 comments16 min readEA link

AI Timelines: the debate and the point of view of the “experts”

EA ItalyJan 17, 2023, 11:30 PM
1 point
0 comments11 min readEA link

[Optional] The longtermist AI governance landscape

EA ItalyJan 17, 2023, 11:03 AM
1 point
0 comments10 min readEA link

[Optional] AI safety research: career overview

EA ItalyJan 17, 2023, 11:06 AM
1 point
0 comments7 min readEA link

[Optional] Further reading on AI risks (materials in English)

EA ItalyJan 18, 2023, 11:16 AM
1 point
0 comments2 min readEA link

The Economist fea­ture ar­ti­cles on LLMs

Dr Dan EpsteinApr 20, 2023, 12:29 AM
12 points
0 comments1 min readEA link
(www.economist.com)

OPEC for a slow AGI takeoff

vyraxApr 21, 2023, 10:53 AM
4 points
0 comments3 min readEA link

Notes on “the hot mess the­ory of AI mis­al­ign­ment”

JakubKApr 21, 2023, 10:07 AM
44 points
3 comments1 min readEA link

Safety-First Agents/​Ar­chi­tec­tures Are a Promis­ing Path to Safe AGI

Brendon_WongAug 6, 2023, 8:00 AM
6 points
0 comments12 min readEA link

The Case For Civil Di­sobe­di­ence For The AI Movement

Murali ThoppilApr 24, 2023, 1:07 PM
16 points
3 comments4 min readEA link
(murali42e.substack.com)

FT: We must slow down the race to God-like AI

Angelina LiApr 24, 2023, 11:57 AM
33 points
2 comments2 min readEA link
(www.ft.com)

AGI ruin mostly rests on strong claims about al­ign­ment and de­ploy­ment, not about society

RobBensingerApr 24, 2023, 1:07 PM
16 points
4 comments1 min readEA link

X-Risk Re­searchers Sur­vey

NitaSanghaApr 24, 2023, 8:06 AM
12 points
1 comment1 min readEA link

20+ tips, tricks, les­sons and thoughts on host­ing hackathons

gergoNov 6, 2023, 10:59 AM
14 points
0 comments11 min readEA link

AI Safety Newslet­ter #3: AI policy pro­pos­als and a new challenger approaches

Oliver ZApr 25, 2023, 4:15 PM
35 points
1 comment4 min readEA link
(newsletter.safe.ai)

Creat­ing a “Con­science Calcu­la­tor” to Guard-Rail an AGI

Sean SweeneyAug 12, 2024, 3:58 PM
1 point
11 comments17 min readEA link

The­ory: “WAW might be of higher im­pact than x-risk pre­ven­tion based on util­i­tar­i­anism”

Jens Aslaug 🔸Sep 12, 2023, 1:11 PM
51 points
20 comments17 min readEA link

Join a ‘learn­ing by writ­ing’ group

Jordan Pieters 🔸Apr 26, 2023, 11:36 AM
26 points
1 comment1 min readEA link

Is China Be­com­ing a Science and Tech­nol­ogy Su­per­power? Jeffrey Ding’s In­sight on China’s Diffu­sion Deficit

Wyman KwokApr 25, 2023, 5:00 PM
10 points
0 comments1 min readEA link

[Question] Why does an AI have to have speci­fied goals?

Luke EureAug 22, 2023, 8:15 PM
8 points
4 comments1 min readEA link

[Question] Is there EA dis­cus­sion on non-x-risk trans­for­ma­tive AI?

Franziska FischerApr 26, 2023, 1:50 PM
5 points
0 comments1 min readEA link

UK Prime Minister Rishi Su­nak’s Speech on AI

Tobias HäberliOct 26, 2023, 10:34 AM
112 points
6 comments8 min readEA link
(www.gov.uk)

Pro­pos­als for the AI Reg­u­la­tory Sand­box in Spain

Guillem BasApr 27, 2023, 10:33 AM
55 points
2 comments11 min readEA link
(riesgoscatastroficosglobales.com)

[Question] How come there isn’t that much focus in EA on research into whether/when AIs are likely to be sentient?

callumApr 27, 2023, 10:09 AM
83 points
23 comments1 min readEA link

The AI guide I’m send­ing my grandparents

James MartinApr 27, 2023, 8:04 PM
41 points
3 comments30 min readEA link

An ar­gu­ment for ac­cel­er­at­ing in­ter­na­tional AI gov­er­nance re­search (part 2)

MattThinksAug 22, 2023, 10:40 PM
3 points
0 comments10 min readEA link

Call for sub­mis­sions: Choice of Fu­tures sur­vey questions

c.troutApr 30, 2023, 6:59 AM
11 points
0 comments1 min readEA link

Ca­reer un­cer­tainty: Medicine vs. AI

Markus KöthApr 30, 2023, 8:41 AM
20 points
9 comments1 min readEA link

Con­nec­tomics seems great from an AI x-risk perspective

Steven ByrnesApr 30, 2023, 2:38 PM
10 points
0 comments1 min readEA link

Global com­put­ing capacity

Vasco Grilo🔸May 1, 2023, 6:09 AM
12 points
0 comments1 min readEA link
(aiimpacts.org)

Ret­ro­spec­tive on re­cent ac­tivity of Ries­gos Catas­trófi­cos Globales

Jaime SevillaMay 1, 2023, 6:35 PM
45 points
0 comments5 min readEA link

The costs of caution

Kelsey PiperMay 1, 2023, 8:04 PM
112 points
17 comments4 min readEA link

Call for Pythia-style foun­da­tion model suite for al­ign­ment research

LucretiaMay 1, 2023, 8:26 PM
10 points
0 comments1 min readEA link

How use­ful for al­ign­ment-rele­vant work are AIs with short-term goals? (Sec­tion 2.2.4.3 of “Schem­ing AIs”)

Joe_CarlsmithDec 1, 2023, 2:51 PM
6 points
0 comments1 min readEA link

Up­dates from Cam­paign for AI Safety

Jolyn KhooAug 7, 2023, 6:09 AM
32 points
2 comments2 min readEA link
(www.campaignforaisafety.org)

Si­mu­lat­ing a pos­si­ble al­ign­ment solu­tion in GPT2-medium us­ing Archety­pal Trans­fer Learning

MiguelMay 2, 2023, 4:23 PM
4 points
0 comments18 min readEA link

AI Safety Newslet­ter #4: AI and Cy­ber­se­cu­rity, Per­sua­sive AIs, Weaponiza­tion, and Ge­offrey Hin­ton talks AI risks

Center for AI SafetyMay 2, 2023, 4:51 PM
35 points
2 comments5 min readEA link
(newsletter.safe.ai)

RA Bounty: Look­ing for feed­back on screen­play about AI Risk

WriterOct 26, 2023, 2:27 PM
8 points
0 comments1 min readEA link

My choice of AI mis­al­ign­ment in­tro­duc­tion for a gen­eral audience

BillMay 3, 2023, 12:15 AM
7 points
2 comments1 min readEA link
(youtu.be)

The Eth­i­cal Basilisk Thought Experiment

KyrtinAug 23, 2023, 1:24 PM
1 point
6 comments1 min readEA link

The Hid­den Com­plex­ity of Wishes—The Animation

WriterSep 27, 2023, 5:59 PM
7 points
0 comments1 min readEA link
(youtu.be)

We don’t need AGI for an amaz­ing future

Karl von WendtMay 4, 2023, 12:11 PM
57 points
2 comments1 min readEA link

AI X-risk in the News: How Effec­tive are Re­cent Me­dia Items and How is Aware­ness Chang­ing? Our New Sur­vey Re­sults.

OttoMay 4, 2023, 2:04 PM
49 points
1 comment9 min readEA link

Clar­ify­ing and pre­dict­ing AGI

richard_ngoMay 4, 2023, 3:56 PM
69 points
2 comments1 min readEA link

Most Lead­ing AI Ex­perts Believe That Ad­vanced AI Could Be Ex­tremely Danger­ous to Humanity

jaiMay 4, 2023, 4:19 PM
31 points
1 comment1 min readEA link
(laneless.substack.com)

An Up­date On The Cam­paign For AI Safety Dot Org

yanni kyriacosMay 5, 2023, 12:19 AM
26 points
4 comments1 min readEA link

Tak­ing Into Ac­count Sen­tient Non-Hu­mans in AI Am­bi­tious Value Learn­ing: Sen­tien­tist Co­her­ent Ex­trap­o­lated Volition

Adrià MoretDec 1, 2023, 6:01 PM
39 points
2 comments42 min readEA link

In­tro to ML Safety vir­tual pro­gram: 12 June − 14 August

jamesMay 5, 2023, 10:04 AM
26 points
0 comments2 min readEA link

In­tro­duc­ing the AI Ob­jec­tives In­sti­tute’s Re­search: Differ­en­tial Paths to­ward Safe and Benefi­cial AI

cmckMay 5, 2023, 8:26 PM
43 points
1 comment8 min readEA link

[Question] Rank best uni­ver­si­ties for AI Saftey

Parker_WhitfillMay 6, 2023, 1:20 PM
8 points
4 comments1 min readEA link

Im­pli­ca­tions of the White­house meet­ing with AI CEOs for AI su­per­in­tel­li­gence risk—a first-step to­wards evals?

Jamie BMay 7, 2023, 5:33 PM
78 points
3 comments7 min readEA link

Orthog­o­nal’s For­mal-Goal Align­ment the­ory of change

Tamsin LeakeMay 5, 2023, 10:36 PM
21 points
0 comments1 min readEA link

Graph­i­cal Rep­re­sen­ta­tions of Paul Chris­ti­ano’s Doom Model

Nathan YoungMay 7, 2023, 1:03 PM
48 points
2 comments1 min readEA link

Why “just make an agent which cares only about bi­nary re­wards” doesn’t work.

Lysandre TerrisseMay 9, 2023, 4:51 PM
4 points
1 comment3 min readEA link

Crises Re­veal Cen­tral­i­sa­tion (Ste­fan Schu­bert)

Will Howard🔹May 10, 2023, 9:45 AM
9 points
0 comments1 min readEA link
(web.archive.org)

An­nounc­ing “Key Phenom­ena in AI Risk” (fa­cil­i­tated read­ing group)

noraMay 9, 2023, 4:52 PM
28 points
0 comments2 min readEA link

[Question] Ben Horow­itz and oth­ers are spread­ing a “reg­u­la­tion is bad” view. Would it be use­ful to have a pub­lic bet on “would Ben up­date his view if he had 1-1 with X-Risk re­searcher?”, and urge Ben to run such an ex­per­i­ment?

AntonOsikaAug 8, 2023, 6:36 AM
2 points
0 comments1 min readEA link

US pub­lic opinion of AI policy and risk

Jamie EMay 12, 2023, 1:22 PM
111 points
7 comments15 min readEA link

Con­tin­u­ous doesn’t mean slow

Tom_DavidsonMay 10, 2023, 12:17 PM
64 points
1 comment4 min readEA link

Stampy’s AI Safety Info—New Distil­la­tions #2 [April 2023]

markovMay 9, 2023, 1:34 PM
13 points
1 comment1 min readEA link
(aisafety.info)

A re­quest to keep pes­simistic AI posts ac­tion­able.

tcelferactMay 11, 2023, 3:35 PM
27 points
9 comments1 min readEA link

🏜️ EA is in Albu­querque!

Alex LongMay 12, 2023, 10:09 PM
18 points
2 comments1 min readEA link

How The EthiSizer Almost Broke ‘Story’

Velikovsky_of_NewcastleMay 8, 2023, 4:58 PM
1 point
0 comments5 min readEA link

Can AI solve cli­mate change?

VivianMay 13, 2023, 8:44 PM
2 points
2 comments1 min readEA link

AI Safety Newslet­ter #5: Ge­offrey Hin­ton speaks out on AI risk, the White House meets with AI labs, and Tro­jan at­tacks on lan­guage models

Center for AI SafetyMay 9, 2023, 3:26 PM
60 points
0 comments4 min readEA link
(newsletter.safe.ai)

Open call: AI Act Stan­dard for Dev. Phase Risk Assess­ment

miller-maxDec 8, 2023, 7:57 PM
5 points
1 comment1 min readEA link

[Question] Who should we in­ter­view for The 80,000 Hours Pod­cast?

Luisa_RodriguezSep 13, 2023, 12:23 PM
87 points
136 comments2 min readEA link

AI Ex­is­ten­tial Safety Fellowships

mmfliOct 27, 2023, 12:14 PM
15 points
1 comment1 min readEA link

Public Call for In­ter­est in Math­e­mat­i­cal Alignment

DavidmanheimNov 22, 2023, 1:22 PM
27 points
3 comments1 min readEA link

AI-Risk in the State of the Euro­pean Union Address

Sam BogerdSep 13, 2023, 1:27 PM
25 points
0 comments3 min readEA link
(state-of-the-union.ec.europa.eu)

Tar­bell Fel­low­ship 2024 - Ap­pli­ca­tions Open (AI Jour­nal­ism)

Cillian_Sep 28, 2023, 10:38 AM
58 points
1 comment3 min readEA link

Why Is No One Try­ing To Align Profit In­cen­tives With Align­ment Re­search?

PrometheusAug 23, 2023, 1:19 PM
17 points
2 comments4 min readEA link
(www.lesswrong.com)

EA and AI Safety Schism: AGI, the last tech hu­mans will (soon*) build

PhibMay 15, 2023, 2:05 AM
6 points
6 comments5 min readEA link

Best prac­tices for risk com­mu­ni­ca­tion from the aca­demic literature

Existential Risk Communication ProjectAug 12, 2024, 6:54 PM
9 points
3 comments23 min readEA link

Speci­fi­ca­tion Gam­ing: How AI Can Turn Your Wishes Against You [RA Video]

WriterDec 1, 2023, 7:30 PM
8 points
1 comment1 min readEA link
(youtu.be)

[Question] In­tel­lec­tual prop­erty of AI and ex­is­ten­tial risk in gen­eral?

WillPearsonJun 11, 2024, 1:50 PM
3 points
3 comments1 min readEA link

Con­di­tional Trees: Gen­er­at­ing In­for­ma­tive Fore­cast­ing Ques­tions (FRI) -- AI Risk Case Study

Forecasting Research InstituteAug 12, 2024, 4:24 PM
43 points
2 comments8 min readEA link
(forecastingresearch.org)

AI Safety Bounties

PatrickLAug 24, 2023, 2:30 PM
37 points
2 comments7 min readEA link
(rethinkpriorities.org)

Ac­ci­den­tally teach­ing AI mod­els to de­ceive us (Ajeya Co­tra on The 80,000 Hours Pod­cast)

80000_HoursMay 15, 2023, 8:58 PM
37 points
2 comments18 min readEA link

Ori­gin and al­ign­ment of goals, mean­ing, and morality

FalseCogsAug 24, 2023, 2:05 PM
1 point
2 comments35 min readEA link

ENAIS has launched a newslet­ter for AIS fieldbuilders

gergoNov 22, 2024, 10:45 AM
25 points
0 comments1 min readEA link

Fun­da­men­tals of Global Pri­ori­ties Re­search in Eco­nomics Syllabus

poliboniAug 8, 2023, 12:16 PM
74 points
1 comment8 min readEA link

[Question] What AI Posts Do You Want Distil­led?

brookAug 25, 2023, 9:00 AM
15 points
3 comments1 min readEA link

Ap­pli­ca­tions for EU Tech Policy Fel­low­ship 2024 now open

Jan-WillemSep 13, 2023, 4:17 PM
22 points
2 comments1 min readEA link

Tyler Cowen’s challenge to de­velop an ‘ac­tual math­e­mat­i­cal model’ for AI X-Risk

Joe BrentonMay 16, 2023, 4:55 PM
20 points
4 comments1 min readEA link

The In­ter­na­tional PauseAI Protest: Ac­tivism un­der uncertainty

Joseph MillerOct 12, 2023, 5:36 PM
129 points
3 comments4 min readEA link

How to think about slow­ing AI

Zach Stein-PerlmanSep 17, 2023, 11:23 AM
74 points
9 comments3 min readEA link

Microdooms averted by work­ing on AI Safety

NikolaSep 17, 2023, 9:51 PM
39 points
6 comments3 min readEA link
(www.lesswrong.com)

A model-based ap­proach to AI Ex­is­ten­tial Risk

SammyDMartinAug 25, 2023, 10:44 AM
17 points
0 comments1 min readEA link
(www.lesswrong.com)

What’s in a Pause?

DavidmanheimSep 16, 2023, 10:13 AM
73 points
10 comments9 min readEA link

AI Safety Newslet­ter #6: Ex­am­ples of AI safety progress, Yoshua Ben­gio pro­poses a ban on AI agents, and les­sons from nu­clear arms control

Center for AI SafetyMay 16, 2023, 3:14 PM
32 points
1 comment6 min readEA link
(newsletter.safe.ai)

Safety-con­cerned EAs should pri­ori­tize AI gov­er­nance over alignment

sammyboizJun 11, 2024, 3:47 PM
50 points
20 comments1 min readEA link

AI Pause Will Likely Backfire

Nora BelroseSep 16, 2023, 10:21 AM
141 points
167 comments13 min readEA link

Ex­is­ten­tial Cy­ber­se­cu­rity Risks & AI (A Re­search Agenda)

Madhav MalhotraSep 20, 2023, 12:03 PM
7 points
0 comments8 min readEA link

[Cross­post] AI Reg­u­la­tion May Be More Im­por­tant Than AI Align­ment For Ex­is­ten­tial Safety

OttoAug 24, 2023, 4:01 PM
14 points
2 comments5 min readEA link

MLSN: #10 Ad­ver­sar­ial At­tacks Against Lan­guage and Vi­sion Models, Im­prov­ing LLM Hon­esty, and Trac­ing the In­fluence of LLM Train­ing Data

Center for AI SafetySep 13, 2023, 6:02 PM
7 points
0 comments5 min readEA link
(newsletter.mlsafety.org)

Policy ideas for miti­gat­ing AI risk

Thomas LarsenSep 16, 2023, 10:31 AM
121 points
16 comments10 min readEA link

2024 S-risk In­tro Fellowship

Center on Long-Term RiskOct 12, 2023, 7:14 PM
90 points
2 comments1 min readEA link

Com­ments on Man­heim’s “What’s in a Pause?”

RobBensingerSep 18, 2023, 12:16 PM
74 points
11 comments6 min readEA link

We’re Not Ready: thoughts on “paus­ing” and re­spon­si­ble scal­ing policies

Holden KarnofskyOct 27, 2023, 3:19 PM
150 points
23 comments1 min readEA link

Effi­cacy of AI Ac­tivism: Have We Ever Said No?

Charlie HarrisonOct 27, 2023, 4:52 PM
78 points
25 comments20 min readEA link

AISN #18: Challenges of Re­in­force­ment Learn­ing from Hu­man Feed­back, Microsoft’s Se­cu­rity Breach, and Con­cep­tual Re­search on AI Safety

Center for AI SafetyAug 8, 2023, 3:52 PM
12 points
0 comments5 min readEA link
(newsletter.safe.ai)

Two sources of be­yond-epi­sode goals (Sec­tion 2.2.2 of “Schem­ing AIs”)

Joe_CarlsmithNov 28, 2023, 1:49 PM
8 points
0 comments1 min readEA link

Op­por­tu­ni­ties for Im­pact Beyond the EU AI Act

Cillian_Oct 12, 2023, 3:06 PM
27 points
2 comments4 min readEA link

Re­sources & op­por­tu­ni­ties for ca­reers in Euro­pean AI Policy

Cillian_Oct 12, 2023, 3:02 PM
13 points
1 comment2 min readEA link

OpenAI’s mas­sive push to make su­per­in­tel­li­gence safe in 4 years or less (Jan Leike on the 80,000 Hours Pod­cast)

80000_HoursAug 8, 2023, 6:00 PM
32 points
1 comment19 min readEA link
(80000hours.org)

[Question] How can I best use my ca­reer to pass im­pact­ful AI and Biose­cu­rity policy?

maxgOct 13, 2023, 5:14 AM
4 points
1 comment1 min readEA link

Eisen­hower’s Atoms for Peace Speech

AkashMay 17, 2023, 4:10 PM
17 points
1 comment1 min readEA link

Jan Kul­veit’s Cor­rigi­bil­ity Thoughts Distilled

brookAug 25, 2023, 1:42 PM
16 points
0 comments5 min readEA link
(www.lesswrong.com)

AI Align­ment Re­search Eng­ineer Ac­cel­er­a­tor (ARENA): call for applicants

TheMcDouglasNov 7, 2023, 9:43 AM
46 points
3 comments10 min readEA link

Les­sons on pro­ject man­age­ment from “How Big Things Get Done”

Cristina Schmidt IbáñezMay 17, 2023, 7:15 PM
35 points
3 comments9 min readEA link

Re­la­tion­ship be­tween EA Com­mu­nity and AI safety

Tom Barnes🔸Sep 18, 2023, 1:49 PM
157 points
15 comments1 min readEA link

Tech­ni­cal AI Safety Re­search Land­scape [Slides]

Magdalena WacheSep 18, 2023, 1:56 PM
29 points
0 comments4 min readEA link

EA is good, actually

Amy LabenzNov 28, 2023, 3:59 PM
272 points
15 comments4 min readEA link

A tax­on­omy of non-schemer mod­els (Sec­tion 1.2 of “Schem­ing AIs”)

Joe_CarlsmithNov 22, 2023, 3:24 PM
6 points
0 comments1 min readEA link

At Our World in Data we’re hiring our first Com­mu­ni­ca­tions & Outreach Manager

Charlie GiattinoOct 13, 2023, 1:12 PM
25 points
0 comments1 min readEA link
(ourworldindata.org)

Ar­gu­ments for/​against schem­ing that fo­cus on the path SGD takes (Sec­tion 3 of “Schem­ing AIs”)

Joe_CarlsmithDec 5, 2023, 6:48 PM
7 points
1 comment1 min readEA link

G7 Sum­mit—Co­op­er­a­tion on AI Policy

Leonard_BarrettMay 19, 2023, 10:10 AM
22 points
2 comments1 min readEA link
(www.japantimes.co.jp)

On whether AI will soon cause job loss, lower in­comes, and higher in­equal­ity — or the op­po­site (Michael Webb on the 80,000 Hours Pod­cast)

80000_HoursAug 25, 2023, 2:59 PM
11 points
2 comments18 min readEA link

Anki deck for learn­ing the main AI safety orgs, pro­jects, and programs

Bryce RobertsonSep 29, 2023, 6:42 PM
17 points
5 comments1 min readEA link

La­bor Par­ti­ci­pa­tion is a High-Pri­or­ity AI Align­ment Risk

alxAug 12, 2024, 6:48 PM
16 points
3 comments16 min readEA link

Case study: Safety stan­dards on Cal­ifor­nia util­ities to pre­vent wildfires

Coby JosephDec 6, 2023, 10:32 AM
7 points
1 comment26 min readEA link

In­vi­ta­tion to par­ti­ci­pate in AGI global gov­er­nance Real-Time Delphi ques­tion­naire—The Millen­nium Project

Miquel Banchs-Piqué (prev. mikbp)Dec 13, 2023, 1:35 PM
6 points
0 comments1 min readEA link

Ilya: The AI sci­en­tist shap­ing the world

David VargaNov 20, 2023, 12:43 PM
6 points
1 comment4 min readEA link

Effec­tive Altru­ism Florida’s AI Ex­pert Panel—Record­ing and Slides Available

Sam_E_24May 19, 2023, 7:15 PM
2 points
0 comments1 min readEA link

What he’s learned as an AI policy in­sider (Tan­tum Col­lins on the 80,000 Hours Pod­cast)

80000_HoursOct 13, 2023, 3:01 PM
11 points
2 comments15 min readEA link

Con­fu­sions and up­dates on STEM AI

Eleni_AMay 19, 2023, 9:34 PM
7 points
0 comments1 min readEA link

Model-Based Policy Anal­y­sis un­der Deep Uncertainty

UtilonMar 6, 2023, 2:24 PM
103 points
31 comments21 min readEA link

Re­boot­ing AI Gover­nance: An AI-Driven Ap­proach to AI Governance

UtilonMay 20, 2023, 7:06 PM
38 points
4 comments30 min readEA link

US pub­lic opinion on AI, Septem­ber 2023

Zach Stein-PerlmanSep 18, 2023, 6:00 PM
29 points
0 comments1 min readEA link
(blog.aiimpacts.org)

What AI could mean for al­ter­na­tive proteins

Max TaylorFeb 9, 2024, 10:13 AM
33 points
3 comments16 min readEA link

Former Is­raeli Prime Minister Speaks About AI X-Risk

Yonatan CaleMay 20, 2023, 12:09 PM
73 points
6 comments1 min readEA link

Em­piri­cal work that might shed light on schem­ing (Sec­tion 6 of “Schem­ing AIs”)

Joe_CarlsmithDec 11, 2023, 4:30 PM
7 points
1 comment1 min readEA link

A pro­gres­sive AI, not a threat­en­ing one

Violette Dec 12, 2023, 5:19 PM
−17 points
0 comments4 min readEA link

How I learned to stop wor­ry­ing and love skill trees

Clark UrzoMay 23, 2023, 8:03 AM
22 points
3 comments1 min readEA link
(www.lesswrong.com)

4 types of AGI se­lec­tion, and how to con­strain them

RemmeltAug 9, 2023, 3:02 PM
7 points
0 comments3 min readEA link

Is AI Safety drop­ping the ball on pri­vacy?

markovSep 19, 2023, 8:17 AM
10 points
0 comments7 min readEA link

Some gov­er­nance re­search ideas to pre­vent malev­olent con­trol over AGI and why this might mat­ter a hell of a lot

Jim BuhlerMay 23, 2023, 1:07 PM
63 points
5 comments16 min readEA link

A Differ­ent Ap­proach to Com­mu­nity Build­ing: The Spiral Path to Im­pact

ezrahMay 23, 2023, 6:41 PM
46 points
4 comments8 min readEA link

The pos­si­bil­ity of an in­definite AI pause

Matthew_BarnettSep 19, 2023, 12:28 PM
90 points
73 comments15 min readEA link

AI Safety Newslet­ter #7: Dis­in­for­ma­tion, Gover­nance Recom­men­da­tions for AI labs, and Se­nate Hear­ings on AI

Center for AI SafetyMay 23, 2023, 9:42 PM
23 points
0 comments6 min readEA link
(newsletter.safe.ai)

[Question] I’m in­ter­view­ing Carl Shul­man — what should I ask him?

Robert_WiblinDec 8, 2023, 4:48 PM
53 points
16 comments1 min readEA link

New s-risks au­dio­book available now

Alistair WebsterMay 24, 2023, 8:27 PM
87 points
3 comments1 min readEA link
(centerforreducingsuffering.org)

Di­a­gram with Com­men­tary for AGI as an X-Risk

Jared LeibowichMay 24, 2023, 10:27 PM
20 points
4 comments8 min readEA link

Dutch AI Safety Co­or­di­na­tion Fo­rum: An Experiment

HenningBNov 21, 2023, 4:18 PM
21 points
0 comments4 min readEA link

Rishi Su­nak men­tions “ex­is­ten­tial threats” in talk with OpenAI, Deep­Mind, An­thropic CEOs

Arjun PanicksseryMay 24, 2023, 9:06 PM
44 points
2 comments1 min readEA link

[Question] How can we se­cure more re­search po­si­tions at our uni­ver­si­ties for x-risk re­searchers?

Neil CrawfordSep 6, 2022, 2:41 PM
3 points
2 comments1 min readEA link

Deep­Mind: Model eval­u­a­tion for ex­treme risks

Zach Stein-PerlmanMay 25, 2023, 3:00 AM
49 points
3 comments1 min readEA link

Why I’m Not (Yet) A Full-Time Tech­ni­cal Align­ment Researcher

Nicholas / Heather KrossMay 25, 2023, 1:26 AM
11 points
1 comment1 min readEA link

Will AI end ev­ery­thing? A guide to guess­ing | EAG Bay Area 23

Katja_GraceMay 25, 2023, 5:01 PM
74 points
1 comment21 min readEA link

By failing to take se­ri­ous AI ac­tion, the US could be in vi­o­la­tion of its in­ter­na­tional law obligations

Cecil Abungu May 27, 2023, 4:25 AM
44 points
1 comment10 min readEA link

Sum­mary: Ex­is­ten­tial risk from power-seek­ing AI by Joseph Carlsmith

rileyharrisOct 28, 2023, 3:05 PM
11 points
0 comments6 min readEA link
(www.millionyearview.com)

Lan­guage Agents Re­duce the Risk of Ex­is­ten­tial Catastrophe

cdkgMay 29, 2023, 9:59 AM
29 points
6 comments26 min readEA link

AISN #22: The Land­scape of US AI Leg­is­la­tion - Hear­ings, Frame­works, Bills, and Laws

Center for AI SafetySep 19, 2023, 2:43 PM
15 points
1 comment5 min readEA link
(newsletter.safe.ai)

Me­tac­u­lus Pre­sents: Does Gen­er­a­tive AI In­fringe Copy­right?

christianNov 6, 2023, 11:41 PM
5 points
0 comments1 min readEA link

AI Safety Hub Ser­bia Offi­cial Opening

Dušan D. Nešić (Dushan)Oct 28, 2023, 5:03 PM
20 points
1 comment1 min readEA link
(forum.effectivealtruism.org)

Ex­po­nen­tial AI take­off is a myth

Christoph Hartmann 🔸May 31, 2023, 11:47 AM
47 points
11 comments9 min readEA link

[Question] Is any­one work­ing on safe se­lec­tion pres­sure for digi­tal minds?

WillPearsonDec 12, 2023, 6:17 PM
10 points
9 comments1 min readEA link

Im­pli­ca­tions of AGI on Sub­jec­tive Hu­man Experience

Erica S. May 30, 2023, 6:47 PM
2 points
0 comments19 min readEA link
(docs.google.com)

[Question] ai safety ques­tion

David turnerDec 3, 2023, 12:42 PM
−13 points
3 comments1 min readEA link

An­thropic’s Re­spon­si­ble Scal­ing Policy & Long-Term Benefit Trust

Zach Stein-PerlmanSep 19, 2023, 5:00 PM
25 points
4 comments9 min readEA link
(www.lesswrong.com)

The Case for AI Adap­ta­tion: The Per­ils of Liv­ing in a World with Aligned and Well-De­ployed Trans­for­ma­tive Ar­tifi­cial Intelligence

HTCMay 30, 2023, 6:29 PM
3 points
1 comment7 min readEA link

Ad­vice for new al­ign­ment peo­ple: Info Max

Jonas HallgrenMay 30, 2023, 3:42 PM
10 points
0 comments1 min readEA link

An­nounc­ing Apollo Research

mariushobbhahnMay 30, 2023, 4:17 PM
158 points
5 comments1 min readEA link

Con­sid­er­a­tions on trans­for­ma­tive AI and ex­plo­sive growth from a semi­con­duc­tor-in­dus­try per­spec­tive

MuireallMay 31, 2023, 1:11 AM
23 points
1 comment2 min readEA link
(muireall.space)

Ab­strac­tion is Big­ger than Nat­u­ral Abstraction

Nicholas / Heather KrossMay 31, 2023, 12:00 AM
2 points
0 comments1 min readEA link

The EU AI Act needs a defi­ni­tion of high-risk foun­da­tion mod­els to avoid reg­u­la­tory over­reach and backlash

matthias_samwaldMay 31, 2023, 3:34 PM
17 points
0 comments4 min readEA link

Sam Alt­man gives me bad vibes

throwaway790May 31, 2023, 5:15 PM
−11 points
3 comments1 min readEA link

Cen­ter on Long-Term Risk: An­nual re­view and fundraiser 2023

Center on Long-Term RiskDec 13, 2023, 4:42 PM
78 points
3 comments4 min readEA link

Some for-profit AI al­ign­ment org ideas

Eric HoDec 14, 2023, 3:52 PM
32 points
1 comment9 min readEA link

Up­date from Cam­paign for AI Safety

Nik SamoylovJun 1, 2023, 10:46 AM
22 points
0 comments2 min readEA link
(www.campaignforaisafety.org)

AI Man­u­fac­tured Cri­sis (don’t trust AI to pro­tect us from AI)

WobblyPanda2Jun 1, 2023, 11:12 AM
4 points
0 comments1 min readEA link

A sur­vey of con­crete risks de­rived from Ar­tifi­cial Intelligence

Guillem BasJun 8, 2023, 10:09 PM
36 points
2 comments6 min readEA link
(riesgoscatastroficosglobales.com)

The Con­trol Prob­lem: Un­solved or Un­solv­able?

RemmeltJun 2, 2023, 3:42 PM
4 points
9 comments14 min readEA link

Ad­vice for En­ter­ing AI Safety Research

stecasJun 2, 2023, 8:46 PM
14 points
1 comment1 min readEA link

HeAr­tifi­cial In­tel­li­gence ~ Open Philan­thropy AI Wor­ld­views Contest

Da Kim SanJun 2, 2023, 8:19 PM
−7 points
0 comments20 min readEA link

An­nounc­ing AISafety.info’s Write-a-thon (June 16-18) and Se­cond Distil­la­tion Fel­low­ship (July 3-Oc­to­ber 2)

StevenKaasJun 3, 2023, 2:03 AM
12 points
1 comment1 min readEA link

Pres­i­dent Bi­den Is­sues Ex­ec­u­tive Order on Safe, Se­cure, and Trust­wor­thy Ar­tifi­cial Intelligence

Tristan WilliamsOct 30, 2023, 11:15 AM
143 points
8 comments3 min readEA link
(www.whitehouse.gov)

Fund­ing for work that builds ca­pac­ity to ad­dress risks from trans­for­ma­tive AI

GCR Capacity Building team (Open Phil)Aug 13, 2024, 1:13 PM
40 points
1 comment5 min readEA link

AI Safety Fun­da­men­tals: An In­for­mal Co­hort Start­ing Soon! (cross-posted to less­wrong.com)

TiagoJun 4, 2023, 6:21 PM
6 points
0 comments1 min readEA link
(www.lesswrong.com)

UN Public Call for Nom­i­na­tions For High-level Ad­vi­sory Body on Ar­tifi­cial Intelligence

vincentweisserAug 10, 2023, 10:34 AM
15 points
1 comment1 min readEA link

Uncer­tainty about the fu­ture does not im­ply that AGI will go well

Lauro LangoscoJun 5, 2023, 3:02 PM
8 points
11 comments7 min readEA link
(www.alignmentforum.org)

Agen­tic Mess (A Failure Story)

Karl von WendtJun 6, 2023, 1:16 PM
30 points
3 comments1 min readEA link

Mo­ral Spillover in Hu­man-AI Interaction

Katerina ManoliJun 5, 2023, 3:20 PM
17 points
1 comment13 min readEA link

AISN #9: State­ment on Ex­tinc­tion Risks, Com­pet­i­tive Pres­sures, and When Will AI Reach Hu­man-Level?

Center for AI SafetyJun 6, 2023, 3:56 PM
12 points
2 comments7 min readEA link
(newsletter.safe.ai)

Stampy’s AI Safety Info—New Distil­la­tions #3 [May 2023]

markovJun 6, 2023, 2:27 PM
10 points
2 comments1 min readEA link
(aisafety.info)

Tim Cook was asked about ex­tinc­tion risks from AI

Saul MunnJun 6, 2023, 6:46 PM
8 points
1 comment1 min readEA link

A Play­book for AI Risk Re­duc­tion (fo­cused on mis­al­igned AI)

Holden KarnofskyJun 6, 2023, 6:05 PM
81 points
17 comments1 min readEA link

Re­sponse to “Co­or­di­nated paus­ing: An eval­u­a­tion-based co­or­di­na­tion scheme for fron­tier AI de­vel­op­ers”

Matthew WeardenOct 30, 2023, 12:49 PM
7 points
1 comment6 min readEA link
(matthewwearden.co.uk)

Ar­ti­cle Sum­mary: Cur­rent and Near-Term AI as a Po­ten­tial Ex­is­ten­tial Risk Factor

AndreFerrettiJun 7, 2023, 1:53 PM
12 points
1 comment1 min readEA link
(dl.acm.org)

What AI could mean for animals

Max TaylorOct 6, 2023, 8:36 AM
119 points
8 comments17 min readEA link

The Case for AI Safety Ad­vo­cacy to the Public

Holly Elmore ⏸️ 🔸Sep 20, 2023, 12:03 PM
257 points
58 comments14 min readEA link

The Offense-Defense Balance Rarely Changes

Maxwell TabarrokDec 9, 2023, 3:22 PM
80 points
16 comments3 min readEA link
(maximumprogress.substack.com)

AI Safety Strat­egy—A new or­ga­ni­za­tion for bet­ter timelines

PrometheusJun 14, 2023, 8:41 PM
8 points
0 comments2 min readEA link

En­gag­ing with AI in a Per­sonal Way

Spyder RexDec 4, 2023, 9:23 AM
−9 points
0 comments1 min readEA link

Un­der­stand­ing how hard al­ign­ment is may be the most im­por­tant re­search di­rec­tion right now

AronJun 7, 2023, 7:05 PM
26 points
3 comments6 min readEA link
(coordinationishard.substack.com)

Could AI ac­cel­er­ate eco­nomic growth?

Tom_DavidsonJun 7, 2023, 7:07 PM
28 points
0 comments6 min readEA link

Wild An­i­mal Welfare Sce­nar­ios for AI Doom

utilistrutilJun 8, 2023, 7:41 PM
52 points
2 comments3 min readEA link

The con­ver­gent dy­namic we missed

RemmeltDec 12, 2023, 10:50 PM
2 points
0 comments3 min readEA link

#172 – Why you should stop read­ing the news (Bryan Ca­plan on the 80,000 Hours Pod­cast)

80000_HoursNov 22, 2023, 6:29 PM
20 points
1 comment20 min readEA link

If AGI is im­mi­nent, why can’t I hail a rob­o­taxi?

YarrowDec 9, 2023, 8:50 PM
26 points
4 comments1 min readEA link

The NAIRR Ini­ti­a­tive: Assess­ing its Po­ten­tial for De­moc­ra­tiz­ing AI

Jose GelvesAug 29, 2024, 12:30 PM
22 points
1 comment11 min readEA link

Miti­gat­ing Eth­i­cal Con­cerns and Risks in the US Ap­proach to Au­tonomous Weapons Sys­tems through Effec­tive Altruism

VeeJun 11, 2023, 10:37 AM
5 points
2 comments4 min readEA link

Epoch and FRI Men­tor­ship Pro­gram Sum­mer 2023

merilalamaJun 13, 2023, 2:27 PM
38 points
1 comment1 min readEA link
(epochai.org)

Sum­ming up “Schem­ing AIs” (Sec­tion 5)

Joe_CarlsmithDec 9, 2023, 3:48 PM
9 points
1 comment1 min readEA link

Hiring a CEO & EU Tech Policy Lead to launch an AI policy ca­reer org in Europe

Cillian_Dec 6, 2023, 1:52 PM
50 points
0 comments7 min readEA link

Speed ar­gu­ments against schem­ing (Sec­tion 4.4-4.7 of “Schem­ing AIs”)

Joe_CarlsmithDec 8, 2023, 9:10 PM
6 points
0 comments1 min readEA link

On Deep­Mind and Try­ing to Fairly Hear Out Both AI Doomers and Doubters (Ro­hin Shah on The 80,000 Hours Pod­cast)

80000_HoursJun 12, 2023, 12:53 PM
28 points
1 comment15 min readEA link

A Man­i­fold Mar­ket “Leaked” the AI Ex­tinc­tion State­ment and CAIS Wanted it Deleted

David CheeJun 12, 2023, 3:57 PM
24 points
9 comments12 min readEA link
(news.manifold.markets)

TED talk on Moloch and AI

LivBoereeNov 15, 2023, 7:28 PM
72 points
7 comments1 min readEA link

Tony Blair In­sti­tute AI Safety Work

TomWestgarthJun 13, 2023, 1:16 PM
88 points
2 comments6 min readEA link
(www.institute.global)

AISN #26: Na­tional In­sti­tu­tions for AI Safety, Re­sults From the UK Sum­mit, and New Re­leases From OpenAI and xAI

Center for AI SafetyNov 15, 2023, 4:03 PM
11 points
0 comments6 min readEA link
(newsletter.safe.ai)

Bounty: ex­am­ple de­bug­ging tasks for evals

ElizabethBarnesDec 10, 2023, 5:45 AM
20 points
1 comment2 min readEA link
(www.lesswrong.com)

Seek­ing In­put to AI Safety Book for non-tech­ni­cal audience

Darren McKeeAug 10, 2023, 6:03 PM
11 points
4 comments1 min readEA link

[Question] What’s the ex­act way you pre­dict prob­a­bil­ity of AI ex­tinc­tion?

jackchang110Jun 13, 2023, 3:11 PM
18 points
7 comments1 min readEA link

Nav­i­gat­ing the Fu­ture: A Guide on How to Stay Safe with AI | Em­manuel Katto Uganda

emmanuelkattoAug 28, 2023, 11:38 AM
2 points
0 comments2 min readEA link

In­tro­duc­ing Col­lec­tive Ac­tion for Ex­is­ten­tial Safety: 80+ ac­tions in­di­vi­d­u­als, or­ga­ni­za­tions, and na­tions can take to im­prove our ex­is­ten­tial safety

James NorrisFeb 5, 2025, 3:58 PM
9 points
0 comments1 min readEA link

[Con­gres­sional Hear­ing] Over­sight of A.I.: Leg­is­lat­ing on Ar­tifi­cial Intelligence

Tristan WilliamsNov 1, 2023, 6:15 PM
5 points
1 comment7 min readEA link
(www.judiciary.senate.gov)

Re­port: Ar­tifi­cial In­tel­li­gence Risk Man­age­ment in Spain

JorgeTorresCJun 15, 2023, 4:08 PM
22 points
0 comments3 min readEA link
(riesgoscatastroficosglobales.com)

EU AI Act passed vote, and x-risk was a main topic

Ariel G.Jun 15, 2023, 1:16 PM
43 points
2 comments1 min readEA link
(www.euractiv.com)

AI Safety Con­cepts Wri­teup: WebGPT

JustisAug 11, 2023, 1:31 AM
14 points
0 comments7 min readEA link

PhD stu­dent and post­doc po­si­tions philos­o­phy of AI in Er­lan­gen (Ger­many)

LeonardDungJun 15, 2023, 9:03 PM
13 points
0 comments1 min readEA link

Up­dates from Cam­paign for AI Safety

Jolyn KhooJun 16, 2023, 9:45 AM
15 points
3 comments2 min readEA link
(www.campaignforaisafety.org)

Safety eval­u­a­tions and stan­dards for AI | Beth Barnes | EAG Bay Area 23

Beth BarnesJun 16, 2023, 2:15 PM
28 points
0 comments17 min readEA link

Con­jec­ture: A stand­ing offer for pub­lic de­bates on AI

Andrea_MiottiJun 16, 2023, 2:33 PM
8 points
1 comment1 min readEA link

AI is cen­tral­iz­ing by de­fault; let’s not make it worse

Quintin PopeSep 21, 2023, 1:35 PM
53 points
16 comments15 min readEA link

Sce­nario plan­ning for AI x-risk

Corin KatzkeFeb 10, 2024, 12:07 AM
40 points
0 comments15 min readEA link
(www.convergenceanalysis.org)

Per­sonal Pri­vacy—Workshop

Milli🔸Aug 28, 2023, 8:46 PM
6 points
4 comments1 min readEA link

There should be more AI safety orgs

mariushobbhahnSep 21, 2023, 2:53 PM
117 points
20 comments1 min readEA link

Against Anony­mous Hit Pieces

Anti-OmegaJun 18, 2023, 7:36 PM
−25 points
3 comments1 min readEA link

In­tro­duc­ing the Cen­ter for AI Policy (& we’re hiring!)

Thomas LarsenAug 28, 2023, 9:27 PM
53 points
1 comment2 min readEA link
(www.aipolicy.us)

Com­mu­ni­ca­tion by ex­is­ten­tial risk or­ga­ni­za­tions: State of the field and sug­ges­tions for improvement

Existential Risk Communication ProjectAug 13, 2024, 7:06 AM
10 points
3 comments13 min readEA link

Google Deep­Mind re­leases Gemini

YarrowDec 6, 2023, 5:39 PM
21 points
7 comments1 min readEA link
(deepmind.google)

[Question] How Can Com­pu­ta­tion Cause Con­scious­ness?

Matthew BarberNov 21, 2023, 11:33 PM
2 points
1 comment3 min readEA link

[Closed] Agent Foun­da­tions track in MATS

VanessaOct 31, 2023, 8:14 AM
19 points
0 comments1 min readEA link
(www.matsprogram.org)

New refer­ence stan­dard on LLM Ap­pli­ca­tion se­cu­rity started by OWASP

QuantumForestJun 19, 2023, 7:56 PM
5 points
0 comments1 min readEA link

Sum­mary of the AI Bill of Rights and Policy Implications

Tristan WilliamsJun 20, 2023, 9:28 AM
16 points
0 comments22 min readEA link

Light­ning Post: Things peo­ple in AI Safety should stop talk­ing about

PrometheusJun 20, 2023, 3:00 PM
5 points
3 comments1 min readEA link

LPP Sum­mer Re­search Fel­low­ship in Law & AI 2023: Ap­pli­ca­tions Open

Legal Priorities ProjectJun 20, 2023, 2:31 PM
43 points
4 comments4 min readEA link

Join the Vir­tual AI Safety Un­con­fer­ence (VAISU)!

NguyênJun 21, 2023, 4:46 AM
23 points
0 comments1 min readEA link
(vaisu.ai)

Me­tac­u­lus Pre­sents — View From the En­ter­prise Suite: How Ap­plied AI Gover­nance Works Today

christianJun 20, 2023, 10:24 PM
4 points
0 comments1 min readEA link

20 con­crete pro­jects for re­duc­ing ex­is­ten­tial risk

BuhlJun 21, 2023, 3:54 PM
132 points
27 comments20 min readEA link
(rethinkpriorities.org)

The count­ing ar­gu­ment for schem­ing (Sec­tions 4.1 and 4.2 of “Schem­ing AIs”)

Joe_CarlsmithDec 6, 2023, 7:28 PM
9 points
1 comment1 min readEA link

Yip Fai Tse on an­i­mal welfare & AI safety and long termism

Karthik PalakodetiJun 22, 2023, 12:48 PM
47 points
0 comments1 min readEA link

You can run more than one fel­low­ship per semester if you want to

gergoDec 12, 2023, 8:49 AM
6 points
1 comment3 min readEA link

US pub­lic per­cep­tion of CAIS state­ment and the risk of extinction

Jamie EJun 22, 2023, 4:39 PM
126 points
4 comments9 min readEA link

[Question] Is there much need for fron­tend en­g­ineers in AI al­ign­ment?

Michael GSep 21, 2023, 8:48 PM
11 points
1 comment1 min readEA link

Catas­trophic Risks from AI #1: Introduction

Dan HJun 22, 2023, 5:09 PM
28 points
1 comment1 min readEA link
(arxiv.org)

Catas­trophic Risks from AI #2: Mal­i­cious Use

Dan HJun 22, 2023, 5:10 PM
19 points
0 comments1 min readEA link

An­nounc­ing the EA Pro­ject Ideas Database

Joe RogeroJun 22, 2023, 8:20 PM
14 points
4 comments1 min readEA link

OpenAI’s grant pro­gram for demo­cratic pro­cess for de­cid­ing what rules AI sys­tems should follow

Ronen BarJun 23, 2023, 10:46 AM
7 points
0 comments1 min readEA link

Sum­mary of “The Precipice” (2 of 4): We are a dan­ger to ourselves

rileyharrisAug 13, 2023, 11:53 PM
5 points
0 comments8 min readEA link
(www.millionyearview.com)

The UK AI Safety Sum­mit tomorrow

SebastianSchmidtOct 31, 2023, 7:09 PM
17 points
2 comments2 min readEA link

AISN #25: White House Ex­ec­u­tive Order on AI, UK AI Safety Sum­mit, and Progress on Vol­un­tary Eval­u­a­tions of AI Risks

Center for AI SafetyOct 31, 2023, 7:24 PM
21 points
0 comments6 min readEA link
(newsletter.safe.ai)

AI-Rele­vant Reg­u­la­tion: CPSC

SWKAug 13, 2023, 3:44 PM
3 points
0 comments6 min readEA link

AISN #20: LLM Pro­lifer­a­tion, AI De­cep­tion, and Con­tin­u­ing Drivers of AI Capabilities

Center for AI SafetyAug 29, 2023, 3:03 PM
12 points
0 comments8 min readEA link
(newsletter.safe.ai)

The the­o­ret­i­cal com­pu­ta­tional limit of the So­lar Sys­tem is 1.47x10^49 bits per sec­ond.

William the KiwiOct 17, 2023, 2:52 AM
12 points
7 comments1 min readEA link

[Question] What do we know about Mustafa Suley­man’s po­si­tion on AI Safety?

Chris LeongAug 13, 2023, 7:41 PM
14 points
3 comments1 min readEA link

The Bletch­ley Dec­la­ra­tion on AI Safety

Hauke HillebrandtNov 1, 2023, 11:44 AM
60 points
3 comments4 min readEA link
(www.gov.uk)

[Question] Should some peo­ple start work­ing to in­fluence the peo­ple who are most likely to shape the val­ues of the first AGIs, so that they take into ac­count the in­ter­ests of wild and farmed an­i­mals and sen­tient digi­tal minds?

Keyvan MostafaviAug 31, 2023, 12:08 PM
16 points
1 comment1 min readEA link

New Book: ‘Nexus’ by Yu­val Noah Harari

timfarkasOct 3, 2024, 1:54 PM
14 points
2 comments5 min readEA link

An­nounc­ing Su­per­in­tel­li­gence Imag­ined: A cre­ative con­test on the risks of superintelligence

TaylorJnsJun 12, 2024, 3:20 PM
17 points
0 comments1 min readEA link

Be­ware of the new scal­ing paradigm

Johan de KockSep 19, 2024, 5:03 PM
9 points
2 comments3 min readEA link

Four Fu­tures For Cog­ni­tive Labor

Maxwell TabarrokJun 13, 2024, 12:58 PM
27 points
11 comments4 min readEA link
(www.maximum-progress.com)

[Paper] AI Sand­bag­ging: Lan­guage Models can Strate­gi­cally Un­der­perform on Evaluations

Teun van der WeijJun 13, 2024, 10:04 AM
22 points
2 comments1 min readEA link
(arxiv.org)

As­ter­isk Magaz­ine Is­sue 06

Clara CollierJul 19, 2024, 1:34 PM
13 points
0 comments1 min readEA link
(asteriskmag.com)

The Best Ar­gu­ment is not a Sim­ple English Yud Essay

Jonathan BostockSep 19, 2024, 3:29 PM
74 points
3 comments5 min readEA link
(www.lesswrong.com)

En­hanc­ing Bio­met­ric Data Pro­tec­tion in Latin Amer­ica Based on the Euro­pean Experience

Ana Sofía Jiménez Aug 13, 2024, 1:13 PM
13 points
1 comment4 min readEA link

AUKUS Mili­tary AI Trial

CAISIDFeb 14, 2024, 2:52 PM
10 points
0 comments2 min readEA link

Trump talk­ing about AI risks

defun 🔸Jun 14, 2024, 12:24 PM
43 points
2 comments1 min readEA link
(x.com)

Some thoughts on Leopold Aschen­bren­ner’s Si­tu­a­tional Aware­ness paper

Luke DawesJun 14, 2024, 1:50 PM
14 points
1 comment3 min readEA link

Call for Re­search Par­ti­ci­pants—EU/​China AI regulation

Jamie O'DonnellJun 14, 2024, 5:30 PM
3 points
0 comments1 min readEA link

I have some ques­tions for the peo­ple at 80,000 Hours

yanni kyriacosFeb 14, 2024, 11:07 PM
25 points
17 comments1 min readEA link

“AI Align­ment” is a Danger­ously Over­loaded Term

RokoDec 15, 2023, 3:06 PM
20 points
2 comments3 min readEA link

Recom­men­da­tion to Ap­ply ISIC and NAICS to AI In­ci­dent Database

Ben TurseJul 21, 2024, 7:25 AM
3 points
0 comments2 min readEA link

Re­sults from the AI x Democ­racy Re­search Sprint

Esben KranJun 14, 2024, 4:40 PM
19 points
1 comment1 min readEA link

At Our World in Data we’re hiring a Se­nior Full-stack Engineer

Charlie GiattinoDec 15, 2023, 3:51 PM
16 points
0 comments1 min readEA link
(ourworldindata.org)

“No-one in my org puts money in their pen­sion”

tobyjFeb 16, 2024, 3:04 PM
156 points
11 comments9 min readEA link
(seekingtobejolly.substack.com)

Ret­ro­spec­tive: PIBBSS Fel­low­ship 2023

Dušan D. Nešić (Dushan)Feb 16, 2024, 5:48 PM
17 points
2 comments1 min readEA link

[Question] Is there a pub­lic tracker de­pict­ing at what dates AI has been able to au­to­mate x% of cog­ni­tive tasks (weighted by 2020 eco­nomic value)?

Mitchell Laughlin🔸Feb 17, 2024, 4:52 AM
12 points
4 comments1 min readEA link

In­tro­duc­ing In­ter­na­tional AI Gover­nance Alli­ance (IAIGA)

James NorrisFeb 5, 2025, 3:59 PM
12 points
0 comments1 min readEA link

Ra­tional An­i­ma­tions’ in­tro to mechanis­tic interpretability

WriterJun 14, 2024, 4:10 PM
21 points
1 comment1 min readEA link
(youtu.be)

How Tech­ni­cal AI Safety Re­searchers Can Help Im­ple­ment Pu­ni­tive Da­m­ages to Miti­gate Catas­trophic AI Risk

Gabriel WeilFeb 19, 2024, 5:43 PM
28 points
2 comments4 min readEA link

Look­ing for a Doc­u­ment to In­tro­duce AI Risks to Newbies

Jr22Jun 17, 2024, 1:02 PM
2 points
3 comments1 min readEA link

Big Pic­ture AI Safety: teaser

EuanMcLeanFeb 20, 2024, 1:09 PM
18 points
0 comments1 min readEA link

In­tro­duc­ing StakeOut.AI

Harry LukFeb 17, 2024, 12:21 AM
52 points
6 comments9 min readEA link

IFRC cre­ative com­pe­ti­tion: product or ser­vice from fu­ture au­tonomous weapons sys­tems and emerg­ing digi­tal risks

Devin LamJul 21, 2024, 1:08 PM
9 points
0 comments1 min readEA link
(solferinoacademy.com)

In­ter­pretable Anal­y­sis of Fea­tures Found in Open-source Sparse Au­toen­coder (par­tial repli­ca­tion)

Fernando AvalosAug 28, 2024, 10:08 PM
6 points
1 comment10 min readEA link

Does nat­u­ral se­lec­tion fa­vor AIs over hu­mans?

cdkgOct 3, 2024, 7:02 PM
21 points
0 comments1 min readEA link
(link.springer.com)

AISN #31: A New AI Policy Bill in Cal­ifor­nia Plus, Prece­dents for AI Gover­nance and The EU AI Office

Center for AI SafetyFeb 21, 2024, 9:55 PM
27 points
0 comments6 min readEA link
(newsletter.safe.ai)

List of Good Begin­ner-friendly AI Law/​Policy/​Reg­u­la­tion Books

CAISIDFeb 22, 2024, 2:51 PM
28 points
1 comment6 min readEA link

AI-based dis­in­for­ma­tion is prob­a­bly not a ma­jor threat to democracy

Dan WilliamsFeb 24, 2024, 8:01 PM
63 points
8 comments10 min readEA link

China-AI fore­cast­ing

Nathan_BarnardFeb 25, 2024, 4:47 PM
10 points
2 comments6 min readEA link

The Pend­ing Disaster Fram­ing as it Re­lates to AI Risk

Chris LeongFeb 25, 2024, 3:47 PM
8 points
2 comments6 min readEA link

Biose­cu­rity and AI: Risks and Opportunities

Center for AI SafetyFeb 27, 2024, 6:46 PM
7 points
2 comments7 min readEA link
(www.safe.ai)

(Ap­pli­ca­tions Open!) UChicago XLab Sum­mer Re­search Fel­low­ship 2024

ZacharyRudolphFeb 26, 2024, 6:20 PM
15 points
0 comments4 min readEA link
(xrisk.uchicago.edu)

#180 – Why gullibil­ity and mis­in­for­ma­tion are over­rated (Hugo Mercier on the 80,000 Hours Pod­cast)

80000_HoursFeb 26, 2024, 7:16 PM
15 points
0 comments18 min readEA link

Is it pos­si­bly de­sir­able for sen­tient ASI to ex­ter­mi­nate hu­mans?

DuckruckJun 18, 2024, 3:20 PM
0 points
4 comments1 min readEA link

Noah’s Arc: From AR Desks to AI Reactors

TabulaRasaMar 1, 2024, 1:59 PM
7 points
0 comments4 min readEA link

An­i­mal ad­vo­cates should cam­paign to re­strict AI pre­ci­sion live­stock farming

🔸Zachary BrownJun 17, 2024, 3:27 PM
33 points
6 comments15 min readEA link
(beforeporcelain.substack.com)

IV. Par­allels and Review

Maynk02Feb 27, 2024, 11:10 PM
7 points
1 comment8 min readEA link
(open.substack.com)

Cor­po­rate Gover­nance for Fron­tier AI Labs: A Re­search Agenda

Matthew WeardenFeb 28, 2024, 11:32 AM
17 points
3 comments16 min readEA link
(matthewwearden.co.uk)

Re­duc­ing global AI com­pe­ti­tion through the Com­merce Con­trol List and Im­mi­gra­tion re­form: a dual-pronged approach

ben.smithSep 3, 2024, 5:28 AM
15 points
0 comments9 min readEA link

Dis­cov­er­ing al­ign­ment wind­falls re­duces AI risk

James BradyFeb 28, 2024, 9:14 PM
22 points
3 comments8 min readEA link
(blog.elicit.com)

Set up an AIS newslet­ter for your group in 10 min­utes per month (June edi­tion)

gergoJun 18, 2024, 6:31 AM
34 points
0 comments1 min readEA link

‘Surveillance Cap­i­tal­ism’ & AI Gover­nance: Slip­pery Busi­ness Models, Se­cu­ri­ti­sa­tion, and Self-Regulation

Charlie HarrisonFeb 29, 2024, 3:47 PM
19 points
2 comments12 min readEA link

The Defence pro­duc­tion act and AI policy

Nathan_BarnardMar 1, 2024, 2:23 PM
15 points
0 comments2 min readEA link

An in­ter­sec­tion be­tween an­i­mal welfare and AI

sammyboizJun 18, 2024, 3:23 AM
9 points
1 comment1 min readEA link

An­nounc­ing the PIBBSS Sym­po­sium ’24!

Dušan D. Nešić (Dushan)Sep 3, 2024, 11:19 AM
6 points
0 comments1 min readEA link

Re­duce AGI risks us­ing mod­ern lie de­tec­tion technology

NothingIsArtSep 30, 2024, 6:12 PM
1 point
0 comments1 min readEA link

Pos­i­tive vi­sions for AI

L Rudolf LJul 23, 2024, 8:15 PM
21 points
1 comment1 min readEA link
(www.florencehinder.com)

Demis Hass­abis — Google Deep­Mind: The Podcast

Zach Stein-PerlmanAug 16, 2024, 12:00 AM
22 points
2 comments1 min readEA link
(www.youtube.com)

A one-sen­tence for­mu­la­tion of the AI X-Risk ar­gu­ment I try to make

tcelferactMar 2, 2024, 12:44 AM
3 points
0 comments1 min readEA link

AI and X-risk un­con­fer­ence at ZuGeorgia

YeshJun 18, 2024, 2:24 PM
2 points
0 comments1 min readEA link

An­thropic An­nounces new S.O.T.A. Claude 3

Joseph MillerMar 4, 2024, 7:02 PM
10 points
5 comments1 min readEA link
(twitter.com)

[Question] Would peo­ple on this site be in­ter­ested in hear­ing about efforts to make an “ethics calcu­la­tor” for an AGI?

Sean SweeneyMar 5, 2024, 9:28 AM
1 point
0 comments1 min readEA link

INTERVIEW: StakeOut.AI w/​ Dr. Peter Park

Jacob-HaimesMar 5, 2024, 6:04 PM
21 points
7 comments1 min readEA link
(into-ai-safety.github.io)

Lov­ing a world you don’t trust

Joe_CarlsmithJun 18, 2024, 7:31 PM
65 points
7 comments1 min readEA link

Shar­ing the AI Wind­fall: A Strate­gic Ap­proach to In­ter­na­tional Benefit-Sharing

michelAug 16, 2024, 12:54 PM
67 points
0 comments13 min readEA link
(wrtaigovernance.substack.com)

Is RLHF cruel to AI?

HznDec 16, 2024, 2:01 PM
−1 points
2 comments3 min readEA link

Talk­ing to Congress: Can con­stituents con­tact­ing their leg­is­la­tor in­fluence policy?

Tristan WilliamsMar 7, 2024, 9:24 AM
47 points
3 comments19 min readEA link

AI gov­er­nance tracker of each coun­try per re­gion

Alix RamillonJul 24, 2024, 5:39 PM
16 points
2 comments23 min readEA link

NTIA Solic­its Com­ments on Open-Weight AI Models

Jacob WoessnerMar 6, 2024, 8:05 PM
11 points
1 comment2 min readEA link
(www.ntia.gov)

Ilya Sutskever is start­ing Safe Su­per­in­tel­li­gence Inc.

defun 🔸Jun 19, 2024, 7:11 PM
26 points
6 comments1 min readEA link
(ssi.inc)

An­nounc­ing the AI Fore­cast­ing Bench­mark Series | July 8, $120k in Prizes

christianJun 19, 2024, 9:37 PM
52 points
4 comments5 min readEA link
(www.metaculus.com)

Au­to­mated (a short story)

Ben Millwood🔸Jun 19, 2024, 7:07 PM
8 points
0 comments5 min readEA link

AISN #32: Mea­sur­ing and Re­duc­ing Hazardous Knowl­edge in LLMs Plus, Fore­cast­ing the Fu­ture with LLMs, and Reg­u­la­tory Markets

Center for AI SafetyMar 7, 2024, 4:37 PM
15 points
2 comments8 min readEA link
(newsletter.safe.ai)

[Question] Need help with billboard con­tent for AI Safety Bulgaria

Aleksandar N. AngelovMar 7, 2024, 2:36 PM
4 points
5 comments1 min readEA link

An­nounc­ing Con­ver­gence Anal­y­sis: An In­sti­tute for AI Sce­nario & Gover­nance Research

David_KristofferssonMar 7, 2024, 9:18 PM
46 points
0 comments4 min readEA link

Short re­view of our Ten­sorTrust-based AI safety uni­ver­sity out­reach event

Milan Weibel🔹Sep 22, 2024, 2:54 PM
15 points
0 comments2 min readEA link

AI Safety Memes Wiki

plexJul 24, 2024, 6:53 PM
6 points
0 comments1 min readEA link
(aisafety.info)

AI, Fac­tory Farm­ing and In­tu­itive Mo­ral Responses

DeepBlueWhaleJun 20, 2024, 12:43 PM
10 points
2 comments1 min readEA link

Cli­mate Ad­vo­cacy and AI Safety: Su­per­charg­ing AI Slow­down Advocacy

Matthew McRedmond🔹Jul 25, 2024, 12:08 PM
8 points
7 comments2 min readEA link

NIST staffers re­volt against ex­pected ap­point­ment of ‘effec­tive al­tru­ist’ AI re­searcher to US AI Safety Institute

PhibMar 8, 2024, 5:47 PM
39 points
16 comments1 min readEA link
(venturebeat.com)

OpenAI an­nounces new mem­bers to board of directors

Will Howard🔹Mar 9, 2024, 11:27 AM
47 points
12 comments2 min readEA link
(openai.com)

Pro­ject pro­posal: Sce­nario anal­y­sis group for AI safety strategy

BuhlDec 18, 2023, 6:31 PM
35 points
0 comments5 min readEA link
(rethinkpriorities.org)

A frame­work for think­ing about AI power-seeking

Joe_CarlsmithJul 24, 2024, 10:41 PM
44 points
11 comments1 min readEA link

OpenAI: Pre­pared­ness framework

Zach Stein-PerlmanDec 18, 2023, 6:30 PM
24 points
0 comments1 min readEA link
(openai.com)

Clar­ify­ing two uses of “al­ign­ment”

Matthew_BarnettMar 10, 2024, 5:41 PM
36 points
28 comments4 min readEA link

China x AI Refer­ence List

Saad SiddiquiMar 13, 2024, 6:57 PM
61 points
3 comments3 min readEA link
(docs.google.com)

An­nounc­ing the Cam­bridge ERA:AI Fel­low­ship 2024

erafellowshipMar 11, 2024, 7:06 PM
31 points
5 comments3 min readEA link

More ev­i­dence X-risk am­plifies ac­tion against cur­rent AI harms

Daniel_FriedrichDec 22, 2023, 3:21 PM
27 points
2 comments2 min readEA link
(osf.io)

Re­sults from an Ad­ver­sar­ial Col­lab­o­ra­tion on AI Risk (FRI)

Forecasting Research InstituteMar 11, 2024, 3:54 PM
193 points
25 comments9 min readEA link
(forecastingresearch.org)

AI Safety In­cu­ba­tion Pro­gram—Ap­pli­ca­tions Open

Catalyze ImpactAug 16, 2024, 3:37 PM
11 points
0 comments2 min readEA link

Vir­tual AI Safety Un­con­fer­ence 2024

Orpheus_LummisMar 13, 2024, 1:48 PM
11 points
0 comments1 min readEA link

[Question] What could a policy ban­ning AGI look like?

TsviBTMar 13, 2024, 2:19 PM
17 points
4 comments1 min readEA link

We are shar­ing a new web­site tem­plate for AI Safety groups!

AIS HungaryMar 13, 2024, 4:40 PM
10 points
2 comments1 min readEA link

In­tro­duc­ing the AI for An­i­mals newsletter

Max TaylorJun 21, 2024, 1:24 PM
40 points
0 comments1 min readEA link

Claude vs GPT

Maxwell TabarrokMar 14, 2024, 12:44 PM
14 points
1 comment2 min readEA link
(www.maximum-progress.com)

AI gov­er­nance needs a the­ory of victory

Corin KatzkeJun 21, 2024, 4:08 PM
80 points
8 comments20 min readEA link
(www.convergenceanalysis.org)

Cy­ber­se­cu­rity and AI: The Evolv­ing Se­cu­rity Landscape

Center for AI SafetyMar 14, 2024, 8:14 PM
9 points
0 comments12 min readEA link
(www.safe.ai)

[Question] What hap­pened to the ‘only 400 peo­ple work in AI safety/​gov­er­nance’ num­ber dated from 2020?

VaipanMar 15, 2024, 3:25 PM
27 points
1 comment1 min readEA link

In­tro­duc­ing METR’s Au­ton­omy Eval­u­a­tion Resources

Megan KinnimentMar 15, 2024, 11:19 PM
28 points
0 comments1 min readEA link
(metr.github.io)

Mid­dle Pow­ers in AI Gover­nance: Po­ten­tial paths to im­pact and re­lated ques­tions.

EffectiveAdvocate🔸Mar 15, 2024, 8:11 PM
5 points
1 comment5 min readEA link

Lab Col­lab­o­ra­tion on AI Safety Best Prac­tices

amtaMar 17, 2024, 12:20 PM
3 points
0 comments20 min readEA link

[US time] In­fosec: What even is zero trust?

JarrahJun 21, 2024, 6:11 PM
2 points
0 comments1 min readEA link

[EU time] In­fosec: What even is zero trust?

JarrahJun 21, 2024, 6:09 PM
2 points
0 comments1 min readEA link

AI Con­sti­tu­tions are a tool to re­duce so­cietal scale risk

SammyDMartinJul 26, 2024, 10:50 AM
11 points
0 comments1 min readEA link
(www.lesswrong.com)

Balanc­ing safety and waste

Daniel_FriedrichMar 17, 2024, 10:57 AM
6 points
0 comments7 min readEA link

Trans­for­ma­tive trust­build­ing via ad­vance­ments in de­cen­tral­ized lie detection

trevor1Mar 16, 2024, 5:56 AM
4 points
1 comment1 min readEA link
(www.ncbi.nlm.nih.gov)

Ap­ply to the Co­op­er­a­tive AI PhD Fel­low­ship by Oc­to­ber 14th!

Lewis HammondOct 5, 2024, 12:41 PM
35 points
0 comments1 min readEA link

Am­bi­tious Im­pact launches a for-profit ac­cel­er­a­tor in­stead of build­ing the AI Safety space. Let’s talk about this.

yanni kyriacosMar 18, 2024, 3:44 AM
−7 points
13 comments1 min readEA link

Re­vis­it­ing the Evolu­tion An­chor in the Biolog­i­cal An­chors Re­port

JanviMar 18, 2024, 3:01 AM
13 points
1 comment4 min readEA link

INTERVIEW: Round 2 - StakeOut.AI w/​ Dr. Peter Park

Jacob-HaimesMar 18, 2024, 9:26 PM
8 points
0 comments1 min readEA link
(into-ai-safety.github.io)

AE Stu­dio @ SXSW: We need more AI con­scious­ness re­search (and fur­ther re­sources)

AEStudioMar 26, 2024, 9:15 PM
15 points
0 comments3 min readEA link

AI Safety Eval­u­a­tions: A Reg­u­la­tory Review

Elliot MckernonMar 19, 2024, 3:09 PM
12 points
2 comments11 min readEA link

AI Ex­is­ten­tial Risk from AI’s Per­spec­tive (30-40%)

nobody42Mar 20, 2024, 12:18 PM
0 points
1 comment2 min readEA link

NAIRA—An ex­er­cise in reg­u­la­tory, com­pet­i­tive safety gov­er­nance [AI Gover­nance In­sti­tu­tional De­sign idea]

Heramb PodarMar 19, 2024, 2:55 PM
5 points
1 comment6 min readEA link

ChatGPT4 Ap­pears to At­tain Pe­ri­ods of Consciousness

nobody42Mar 20, 2024, 12:18 PM
10 points
10 comments15 min readEA link

Against Ex­plo­sive Growth

c.troutSep 4, 2024, 9:45 PM
24 points
9 comments1 min readEA link

Model evals for dan­ger­ous capabilities

Zach Stein-PerlmanSep 23, 2024, 11:00 AM
19 points
0 comments1 min readEA link

My mo­ti­va­tion and the­ory of change for work­ing in AI healthtech

Andrew CritchOct 12, 2024, 12:36 AM
47 points
1 comment1 min readEA link

[Question] What are good lit refer­ences about In­ter­na­tional Gover­nance of AI?

VaipanMar 20, 2024, 3:51 PM
4 points
0 comments1 min readEA link

Some thoughts from a Univer­sity AI Debate

Charlie HarrisonMar 20, 2024, 5:03 PM
25 points
2 comments1 min readEA link

On green

Joe_CarlsmithMar 21, 2024, 5:38 PM
61 points
3 comments1 min readEA link

AI Model Registries: A Reg­u­la­tory Review

Deric ChengMar 22, 2024, 4:01 PM
6 points
3 comments6 min readEA link

[Question] How might a mis­al­igned Ar­tifi­cial Su­per­in­tel­li­gence break up a hu­man be­ing into us­able elec­tro­mag­netic en­ergy?

CarusoOct 5, 2024, 5:33 PM
−5 points
3 comments1 min readEA link

What if do­ing the most good = benev­olent AI takeover and hu­man ex­tinc­tion?

Jordan ArelMar 22, 2024, 7:56 PM
2 points
4 comments3 min readEA link

De­cen­tral­ized His­tor­i­cal Data Preser­va­tion and Why EA Should Care

SashaMar 22, 2024, 10:09 AM
2 points
0 comments3 min readEA link

Trans­for­ma­tive AI and Sce­nario Plan­ning for AI X-risk

Elliot MckernonMar 22, 2024, 11:44 AM
14 points
1 comment8 min readEA link

Video and tran­script of pre­sen­ta­tion on Schem­ing AIs

Joe_CarlsmithMar 22, 2024, 3:56 PM
23 points
1 comment1 min readEA link

How the AI safety tech­ni­cal land­scape has changed in the last year, ac­cord­ing to some practitioners

tlevinJul 26, 2024, 7:06 PM
83 points
1 comment1 min readEA link

On attunement

Joe_CarlsmithMar 25, 2024, 12:47 PM
27 points
0 comments1 min readEA link

Con­tra­tion: The next threat from AI may not be like the risks we’ve feared

John WallbankJul 28, 2024, 11:19 PM
−1 points
1 comment5 min readEA link

Fund­ing op­por­tu­nity for per­sonal/​pro­fes­sional de­vel­op­ment for those work­ing in AI safety (dead­line March 29)

aturyMar 25, 2024, 7:19 PM
18 points
0 comments1 min readEA link

CEEALAR’s The­ory of Change

CEEALARDec 19, 2023, 8:21 PM
51 points
5 comments3 min readEA link

Timelines to Trans­for­ma­tive AI: an investigation

Zershaaneh QureshiMar 25, 2024, 6:11 PM
73 points
8 comments50 min readEA link

[Question] What is the na­ture of hu­mans’ gen­eral in­tel­li­gence and its im­pli­ca­tions for AGI?

WillPearsonMar 26, 2024, 4:22 PM
6 points
0 comments1 min readEA link

Is Paus­ing AI Pos­si­ble?

Richard AnniloOct 9, 2024, 1:22 PM
88 points
4 comments18 min readEA link

AGI will be made of het­ero­ge­neous com­po­nents, Trans­former and Selec­tive SSM blocks will be among them

Roman LeventovDec 27, 2023, 2:51 PM
5 points
0 comments1 min readEA link

Against Learn­ing From Dra­matic Events (by Scott Alexan­der)

bernJan 17, 2024, 4:34 PM
46 points
3 comments2 min readEA link
(www.astralcodexten.com)

AI Gir­lfriends Won’t Mat­ter Much

Maxwell TabarrokDec 23, 2023, 4:00 PM
12 points
1 comment2 min readEA link
(maximumprogress.substack.com)

AISN #28: Cen­ter for AI Safety 2023 Year in Review

Center for AI SafetyDec 23, 2023, 9:31 PM
17 points
1 comment5 min readEA link
(newsletter.safe.ai)

AI Safety Chatbot

markovDec 21, 2023, 2:09 PM
49 points
3 comments4 min readEA link

Gaia Net­work: a prac­ti­cal, in­cre­men­tal path­way to Open Agency Architecture

Roman LeventovDec 20, 2023, 5:11 PM
4 points
0 comments1 min readEA link

[Question] Where would I find the hard­core to­tal­iz­ing seg­ment of EA?

Peter BerggrenDec 28, 2023, 9:16 AM
16 points
22 comments1 min readEA link

METR is hiring!

ElizabethBarnesDec 26, 2023, 9:03 PM
50 points
0 comments1 min readEA link
(www.lesswrong.com)

Bounty: Di­verse hard tasks for LLM agents

ElizabethBarnesDec 20, 2023, 4:31 PM
17 points
0 comments1 min readEA link

AI Safety 101 : Re­ward Misspecification

markovDec 21, 2023, 2:26 PM
6 points
1 comment31 min readEA link

Nick Bostrom’s new book, “Deep Utopia”, is out today

peterhartreeMar 27, 2024, 11:23 AM
105 points
6 comments1 min readEA link
(nickbostrom.com)

More thoughts on the Hu­man-AGI War

AhrenbachDec 27, 2023, 1:52 AM
2 points
0 comments7 min readEA link

At­ten­tion on AI X-Risk Likely Hasn’t Dis­tracted from Cur­rent Harms from AI

Erich_Grunewald 🔸Dec 21, 2023, 5:24 PM
190 points
13 comments1 min readEA link
(www.erichgrunewald.com)

Why I Should Work on AI Safety—Part 2: Will AI Ac­tu­ally Sur­pass Hu­man In­tel­li­gence?

Aditya AswaniDec 27, 2023, 9:08 PM
8 points
0 comments8 min readEA link

AI safety ad­vo­cates should con­sider pro­vid­ing gen­tle push­back fol­low­ing the events at OpenAI

I_machinegun_KellyDec 22, 2023, 9:05 PM
86 points
5 comments3 min readEA link
(www.lesswrong.com)

10 Cruxes of Ar­tifi­cial Sentience

Jordan ArelJul 1, 2024, 2:46 AM
31 points
0 comments3 min readEA link

2024 CFP for APSA, Largest An­nual Meet­ing of Poli­ti­cal Science

nemeryxuJan 3, 2024, 7:43 PM
2 points
0 comments1 min readEA link

AI Is Not Software

DavidmanheimJan 2, 2024, 7:58 AM
21 points
17 comments1 min readEA link

ML4Good UK—Ap­pli­ca­tions Open

NiaJan 2, 2024, 6:20 PM
21 points
0 comments1 min readEA link

Free agents

Michele CampoloDec 27, 2023, 8:21 PM
19 points
2 comments13 min readEA link

OpenAI’s Pre­pared­ness Frame­work: Praise & Recommendations

AkashJan 2, 2024, 4:20 PM
16 points
1 comment1 min readEA link

Stop talk­ing about p(doom)

Isaac KingJan 1, 2024, 10:57 AM
115 points
12 comments1 min readEA link

Oth­er­ness and con­trol in the age of AGI

Joe_CarlsmithJan 2, 2024, 6:15 PM
37 points
1 comment1 min readEA link

The Hasty Start of Bu­dapest AI Safety, 6-month up­date from a non-STEM founder

gergoJan 3, 2024, 12:56 PM
9 points
1 comment7 min readEA link

Gentle­ness and the ar­tifi­cial Other

Joe_CarlsmithJan 2, 2024, 6:21 PM
90 points
2 comments1 min readEA link

NYT is su­ing OpenAI&Microsoft for alleged copy­right in­fringe­ment; some quick thoughts

MikhailSaminDec 28, 2023, 6:37 PM
29 points
0 comments1 min readEA link

“At­ti­tudes Toward Ar­tifi­cial Gen­eral In­tel­li­gence: Re­sults from Amer­i­can Adults 2021 and 2023”—call for re­view­ers (Seeds of Science)

rogersbacon1Jan 3, 2024, 8:34 PM
12 points
0 comments1 min readEA link

So­ci­aLLM: pro­posal for a lan­guage model de­sign for per­son­al­ised apps, so­cial sci­ence, and AI safety research

Roman LeventovJan 2, 2024, 8:11 AM
4 points
2 comments1 min readEA link

Last days to ap­ply to EAGxLATAM 2024

Daniela TiznadoJan 17, 2024, 8:24 PM
16 points
0 comments1 min readEA link

[Question] What is the im­pact of chip pro­duc­tion on paus­ing AI de­vel­op­ment?

Johan de KockJan 10, 2024, 10:20 PM
7 points
0 comments1 min readEA link

When “yang” goes wrong

Joe_CarlsmithJan 8, 2024, 4:35 PM
57 points
1 comment1 min readEA link

#176 – The fi­nal push for AGI, un­der­stand­ing OpenAI’s lead­er­ship drama, and red-team­ing fron­tier mod­els (Nathan Labenz on the 80,000 Hours Pod­cast)

80000_HoursJan 4, 2024, 4:00 PM
15 points
0 comments22 min readEA link

Look­ing for stu­dents in AI to take a sur­vey on how they tackle a com­plex AI Case Study—win chance on 200€

bqnsJan 8, 2024, 3:52 PM
1 point
0 comments1 min readEA link

Deep athe­ism and AI risk

Joe_CarlsmithJan 4, 2024, 6:58 PM
65 points
4 comments1 min readEA link

[Question] Why can’t we ac­cept the hu­man con­di­tion as it ex­isted in 2010?

Hayven FrienbyJan 9, 2024, 6:02 PM
35 points
36 comments2 min readEA link

Learn­ing Math in Time for Alignment

Nicholas / Heather KrossJan 9, 2024, 1:02 AM
10 points
0 comments1 min readEA link

Towards AI Safety In­fras­truc­ture: Talk & Outline

Paul BricmanJan 7, 2024, 9:35 AM
14 points
1 comment2 min readEA link
(www.youtube.com)

AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copy­right In­fringe­ment, and Con­gres­sional Ques­tions about Re­search Stan­dards in AI Safety

Center for AI SafetyJan 4, 2024, 4:03 PM
5 points
0 comments6 min readEA link
(newsletter.safe.ai)

Re­port: Latin Amer­ica and Global Catas­trophic Risks, trans­form­ing risk man­age­ment.

JorgeTorresCJan 9, 2024, 2:13 AM
25 points
1 comment2 min readEA link
(riesgoscatastroficosglobales.com)

Cor­po­rate AI Labs’ Odd Role in Their Own Governance

Jul 29, 2024, 9:36 AM
66 points
6 comments12 min readEA link
(dominikhermle.substack.com)

Sur­vey of 2,778 AI au­thors: six parts in pictures

Katja_GraceJan 6, 2024, 4:43 AM
176 points
10 comments1 min readEA link

AI Devel­op­ment Readi­ness Con­di­tion (AI-DRC): A Call to Action

AI-DRC3Jan 11, 2024, 11:00 AM
−5 points
0 comments2 min readEA link

When safety is dan­ger­ous: risks of an in­definite pause on AI de­vel­op­ment, and call for re­al­is­tic alternatives

Hayven FrienbyJan 18, 2024, 2:59 PM
5 points
0 comments5 min readEA link

Win­ning Non-Triv­ial Pro­ject: Set­ting a high stan­dard for fron­tier model security

XaviCFJan 8, 2024, 11:20 AM
31 points
0 comments18 min readEA link

$250K in Prizes: SafeBench Com­pe­ti­tion An­nounce­ment

Center for AI SafetyApr 3, 2024, 10:07 PM
47 points
0 comments1 min readEA link

Was Re­leas­ing Claude-3 Net-Negative?

Logan RiggsMar 27, 2024, 5:41 PM
12 points
1 comment4 min readEA link

Why we’re en­ter­ing a new nu­clear age — and how to re­duce the risks (Chris­tian Ruhl on the 80k After Hours Pod­cast)

80000_HoursMar 27, 2024, 7:17 PM
52 points
2 comments7 min readEA link

AI Bench­marks Series — Me­tac­u­lus Ques­tions on Eval­u­a­tions of AI Models Against Tech­ni­cal Benchmarks

christianMar 27, 2024, 11:05 PM
10 points
0 comments1 min readEA link
(www.metaculus.com)

AI val­ues will be shaped by a va­ri­ety of forces, not just the val­ues of AI developers

Matthew_BarnettJan 11, 2024, 12:48 AM
70 points
3 comments3 min readEA link

AI Dis­clo­sures: A Reg­u­la­tory Review

Elliot MckernonMar 29, 2024, 11:46 AM
12 points
1 comment7 min readEA link

Re-in­tro­duc­ing Upgrad­able (a.k.a., 700,000 Hours): Life op­ti­miza­tion as a ser­vice for altruists

James NorrisFeb 5, 2025, 4:00 PM
4 points
0 comments1 min readEA link

[Linkpost] The real AI night­mare: What if it serves hu­mans too well?

BrianKMar 31, 2024, 10:33 AM
20 points
2 comments1 min readEA link
(www.latimes.com)

#191 (Part 1) – The econ­omy and na­tional se­cu­rity af­ter AGI (Carl Shul­man on the 80,000 Hours Pod­cast)

80000_HoursJun 27, 2024, 7:10 PM
45 points
0 comments19 min readEA link

#201 – Why your robot but­ler isn’t here yet (Ken Gold­berg on The 80,000 Hours Pod­cast)

80000_HoursSep 13, 2024, 5:41 PM
21 points
0 comments12 min readEA link

AI scal­ing myths

Nic Kruus🔸Jun 27, 2024, 8:29 PM
30 points
0 comments1 min readEA link
(open.substack.com)

A Re­search Agenda for Psy­chol­ogy and AI

carter allen🔸Jun 28, 2024, 12:56 PM
53 points
2 comments14 min readEA link

Con­tra Ace­moglu on AI

Maxwell TabarrokJun 28, 2024, 1:14 PM
51 points
2 comments5 min readEA link
(www.maximum-progress.com)

How difficult is AI Align­ment?

SammyDMartinSep 13, 2024, 5:55 PM
12 points
0 comments1 min readEA link
(www.lesswrong.com)

[Job ad] MATS is hiring!

Ryan KiddOct 9, 2024, 8:23 PM
18 points
0 comments5 min readEA link

Thou­sands of mal­i­cious ac­tors on the fu­ture of AI misuse

Zershaaneh QureshiApr 1, 2024, 10:03 AM
75 points
1 comment1 min readEA link

A Selec­tion of Ran­domly Selected SAE Features

TheMcDouglasApr 1, 2024, 9:09 AM
25 points
2 comments1 min readEA link

God Coin: A Modest Pro­posal

Mahdi ComplexApr 1, 2024, 12:02 PM
4 points
0 comments22 min readEA link

Men­tor­ship in AGI Safety: Ap­pli­ca­tions for men­tor­ship are open!

Joe RogeroJun 28, 2024, 3:05 PM
7 points
0 comments1 min readEA link

Open As­teroid Im­pact an­nounces lead­er­ship transition

Patrick HoangApr 1, 2024, 12:51 PM
15 points
0 comments1 min readEA link

Anal­y­sis of key AI analogies

Kevin KohlerJun 29, 2024, 6:16 PM
35 points
2 comments15 min readEA link

There’s an AGI on LessWrong

Neil WarrenApr 1, 2024, 4:36 PM
−5 points
0 comments1 min readEA link

Hor­tus AI is hiring for two in­tern roles

Thomas Krendl GilbertJul 30, 2024, 11:55 AM
3 points
0 comments1 min readEA link

OMMC An­nounces RIP

Adam_SchollApr 1, 2024, 11:38 PM
7 points
0 comments2 min readEA link

#200 – What su­perfore­cast­ers and ex­perts think about ex­is­ten­tial risks (Ezra Karger on The 80,000 Hours Pod­cast)

80000_HoursSep 6, 2024, 5:53 PM
12 points
2 comments14 min readEA link

Ap­ply to the Co­op­er­a­tive AI Sum­mer School!

reddingtonApr 3, 2024, 12:13 PM
26 points
0 comments1 min readEA link

A Utili­tar­ian Frame­work with an Em­pha­sis on Self-Es­teem and Rights

Sean SweeneyApr 8, 2024, 11:15 AM
7 points
0 comments30 min readEA link

AI Discrim­i­na­tion Re­quire­ments: A Reg­u­la­tory Review

Deric ChengApr 4, 2024, 3:44 PM
8 points
1 comment6 min readEA link

Book Re­view (mini): Co-In­tel­li­gence by Ethan Mollick

Darren McKeeApr 3, 2024, 5:33 PM
5 points
1 comment1 min readEA link

Not un­der­stand­ing sen­tience is a sig­nifi­cant x-risk

Cameron BergJul 1, 2024, 3:38 PM
27 points
8 comments2 min readEA link

A Tax­on­omy Of AI Sys­tem Evaluations

Maxime_RicheAug 19, 2024, 9:08 AM
8 points
0 comments14 min readEA link

Think­ing About Propen­sity Evaluations

Maxime_RicheAug 19, 2024, 9:24 AM
12 points
1 comment27 min readEA link

AI, An­i­mals, and Digi­tal Minds Con­fer­ence 2024: Ac­cept­ing ap­pli­ca­tions and speaker proposals

Constance LiApr 6, 2024, 8:42 AM
26 points
0 comments1 min readEA link

Carl Shul­man on the moral sta­tus of cur­rent and fu­ture AI systems

rgbJul 1, 2024, 3:34 PM
62 points
24 comments12 min readEA link
(experiencemachines.substack.com)

Will we ever run out of new jobs?

Kevin KohlerAug 19, 2024, 3:03 PM
11 points
4 comments7 min readEA link
(machinocene.substack.com)

In­ves­ti­gat­ing the role of agency in AI x-risk

Corin KatzkeApr 8, 2024, 3:12 PM
22 points
3 comments40 min readEA link
(www.convergenceanalysis.org)

Con­scious AI con­cerns all of us. [Con­scious AI & Public Per­cep­tions]

ixexJul 3, 2024, 3:12 AM
25 points
1 comment12 min readEA link

Some un­der­rated rea­sons why the AI safety com­mu­nity should re­con­sider its em­brace of strict li­a­bil­ity

Cecil Abungu Apr 8, 2024, 6:50 PM
67 points
29 comments12 min readEA link

Marisa, the Co-Founder of EA Any­where, Has Passed Away

carrickflynnMay 17, 2024, 10:49 PM
518 points
31 comments1 min readEA link

Twit­ter thread on AI safety evals

richard_ngoJul 31, 2024, 12:29 AM
38 points
2 comments1 min readEA link
(x.com)

Against AI As An Ex­is­ten­tial Risk

Noah BirnbaumJul 30, 2024, 7:24 PM
6 points
3 comments1 min readEA link
(irrationalitycommunity.substack.com)

[Question] Ci­ti­zens Group for Steer­ing AI

Odette BApr 11, 2024, 9:15 AM
13 points
0 comments1 min readEA link

Costs of Embodiment

algekalipsoJul 30, 2024, 8:41 PM
13 points
1 comment14 min readEA link

Twit­ter thread on open-source AI

richard_ngoJul 31, 2024, 12:30 AM
32 points
0 comments1 min readEA link
(x.com)

(4 min read) An in­tu­itive ex­pla­na­tion of the AI in­fluence situation

trevor1Jan 13, 2024, 5:34 PM
1 point
1 comment1 min readEA link

AISN #33: Re­assess­ing AI and Biorisk Plus, Con­soli­da­tion in the Cor­po­rate AI Land­scape, and Na­tional In­vest­ments in AI

Center for AI SafetyApr 12, 2024, 4:11 PM
19 points
0 comments9 min readEA link
(newsletter.safe.ai)

#184 – Sleep­ing on sleeper agents, and the biggest AI up­dates since ChatGPT (Zvi Mow­show­itz on the 80,000 Hours Pod­cast)

80000_HoursApr 12, 2024, 12:22 PM
46 points
0 comments20 min readEA link

Shortlived sen­tience/​consciousness

Martin (Huge) VlachJul 1, 2024, 1:59 PM
2 points
2 comments1 min readEA link

An AI Race With China Can Be Bet­ter Than Not Racing

niplavJul 2, 2024, 5:57 PM
18 points
1 comment1 min readEA link

Sum­mary: In­tro­spec­tive Ca­pa­bil­ities in LLMs (Robert Long)

rileyharrisJul 2, 2024, 6:08 PM
11 points
1 comment4 min readEA link

Space set­tle­ment and the time of per­ils: a cri­tique of Thorstad

Matthew RendallApr 14, 2024, 3:29 PM
46 points
10 comments4 min readEA link

Will dis­agree­ment about AI rights lead to so­cietal con­flict?

Lucius CaviolaJul 3, 2024, 1:30 PM
50 points
0 comments22 min readEA link
(outpaced.substack.com)

Digi­tal Minds: Im­por­tance and Key Re­search Ques­tions

Andreas_MogensenJul 3, 2024, 8:59 AM
76 points
1 comment15 min readEA link

[Question] What’s a good in­tro to AI Safety?

No longer EA-affiliatedJan 14, 2024, 4:54 PM
1 point
5 comments1 min readEA link

U.S. Com­merce Sec­re­tary Gina Raimondo An­nounces Ex­pan­sion of U.S. AI Safety In­sti­tute Lead­er­ship Team [and Paul Chris­ti­ano up­date]

PhibApr 16, 2024, 5:10 PM
116 points
8 comments1 min readEA link
(www.commerce.gov)

ML4Good Sum­mer Boot­camps—Ap­pli­ca­tions Open

NiaJul 4, 2024, 6:38 PM
39 points
0 comments1 min readEA link

On the Mo­ral Pa­tiency of Non-Sen­tient Be­ings (Part 1)

Chase CarterJul 4, 2024, 11:41 PM
20 points
8 comments24 min readEA link

Give me ca­reer advice

sammyboizJul 5, 2024, 8:48 AM
6 points
10 comments1 min readEA link

Con­scious AI: Will we know it when we see it? [Con­scious AI & Public Per­cep­tion]

ixexJul 4, 2024, 8:30 PM
13 points
1 comment12 min readEA link

Digi­tal Minds Take­off Scenarios

Bradford SaadJul 5, 2024, 4:06 PM
31 points
10 comments17 min readEA link

[Linkpost] “AI Align­ment vs. AI Eth­i­cal Treat­ment: Ten Challenges”

Bradford SaadJul 5, 2024, 2:55 PM
10 points
0 comments1 min readEA link
(docs.google.com)

Mo­ral Con­sid­er­a­tions In De­sign­ing AI Systems

Hans GundlachJul 5, 2024, 6:13 PM
8 points
1 comment7 min readEA link

The case for con­scious AI: Clear­ing the record [AI Con­scious­ness & Public Per­cep­tion]

Jay LuongJul 5, 2024, 8:29 PM
3 points
7 comments8 min readEA link

#194 – Defen­sive ac­cel­er­a­tion and how to reg­u­late AI when you fear gov­ern­ment (Vi­talik Bu­terin on the 80,000 Hours Pod­cast)

80000_HoursJul 31, 2024, 8:28 PM
42 points
5 comments21 min readEA link

Notes on new UK AISI minister

PseudaemoniaJul 5, 2024, 7:50 PM
92 points
0 comments1 min readEA link

How to re­duce risks re­lated to con­scious AI: A user guide [Con­scious AI & Public Per­cep­tion]

Jay LuongJul 5, 2024, 2:19 PM
9 points
1 comment15 min readEA link

AI Align­ment Re­search Eng­ineer Ac­cel­er­a­tor (ARENA): Call for ap­pli­cants v4.0

JamesFoxJul 6, 2024, 11:51 AM
7 points
0 comments5 min readEA link

LLM Eval­u­a­tors Rec­og­nize and Fa­vor Their Own Generations

Arjun PanicksseryApr 17, 2024, 9:09 PM
21 points
4 comments1 min readEA link
(tiny.cc)

[Question] AI con­scious­ness & moral sta­tus: What do the ex­perts think?

Jay LuongJul 6, 2024, 3:27 PM
0 points
3 comments1 min readEA link

We’re hiring a Writer to join our team at Our World in Data

Charlie GiattinoApr 18, 2024, 8:50 PM
29 points
0 comments1 min readEA link
(ourworldindata.org)

How much I’m pay­ing for AI pro­duc­tivity soft­ware (and the fu­ture of AI use)

jacquesthibsOct 11, 2024, 5:11 PM
30 points
14 comments1 min readEA link
(jacquesthibodeau.com)

My dis­agree­ments with “AGI ruin: A List of Lethal­ities”

SharmakeSep 15, 2024, 5:22 PM
16 points
2 comments1 min readEA link

Why I funded PIBBSS

Ryan KiddSep 15, 2024, 7:56 PM
90 points
2 comments1 min readEA link

De­mon­strate and eval­u­ate risks from AI to so­ciety at the AI x Democ­racy re­search hackathon

Esben KranApr 19, 2024, 2:46 PM
24 points
0 comments6 min readEA link
(www.apartresearch.com)

I cre­ated an Asi Align­ment Tier List

TimeGoatApr 22, 2024, 12:14 PM
0 points
0 comments1 min readEA link

Po­ten­tial Im­pli­ca­tions of AI on Hu­man Cog­ni­tive Evolution

Soe LinAug 21, 2024, 9:53 AM
1 point
0 comments1 min readEA link

Con­cern About the In­tel­li­gence Divide Due to AI

Soe LinAug 21, 2024, 9:53 AM
17 points
1 comment2 min readEA link

AI Reg­u­la­tion is Unsafe

Maxwell TabarrokApr 22, 2024, 4:38 PM
19 points
8 comments4 min readEA link
(www.maximum-progress.com)

Should we break up Google Deep­Mind?

Hauke HillebrandtApr 22, 2024, 9:16 AM
34 points
13 comments4 min readEA link

Ba­sic game the­ory and how you can do a bunch of good in ~3 Hours. (de­vel­op­ing ar­ti­cle.)

No longer EA-affiliatedOct 10, 2024, 4:30 AM
−3 points
2 comments7 min readEA link

On failing to get EA jobs: My ex­pe­rience and recom­men­da­tions to EA orgs

Ávila CarmesíApr 22, 2024, 9:19 PM
126 points
55 comments5 min readEA link

How LLMs Work, in the Style of The Economist

utilistrutilApr 22, 2024, 7:06 PM
17 points
0 comments1 min readEA link

An AI Man­hat­tan Pro­ject is Not Inevitable

Maxwell TabarrokJul 6, 2024, 4:43 PM
53 points
2 comments4 min readEA link
(www.maximum-progress.com)

In­creas­ing Con­cern for Digi­tal Be­ings Through LLM Per­sua­sion (Em­piri­cal Re­sults)

carter allen🔸Jul 7, 2024, 4:42 PM
22 points
0 comments7 min readEA link

Digest: three pa­pers that have shaped my un­der­stand­ing of the po­ten­tial for con­scious­ness in AI systems

rileyharrisAug 21, 2024, 3:09 PM
5 points
0 comments1 min readEA link

On the Mo­ral Pa­tiency of Non-Sen­tient Be­ings (Part 2)

Chase CarterJul 7, 2024, 10:33 PM
14 points
1 comment21 min readEA link

Towards shut­down­able agents via stochas­tic choice

EJTJul 8, 2024, 10:14 AM
26 points
1 comment1 min readEA link
(arxiv.org)

Why I’m work­ing on AI welfare

kyle_fishJul 6, 2024, 6:01 AM
62 points
6 comments5 min readEA link

Disen­tan­gling “Safety”

pleaselistencarefullyasourmenuoptionshaverecentlychangedJul 6, 2024, 11:21 PM
1 point
0 comments3 min readEA link

Launch­ing the AI Fore­cast­ing Bench­mark Series Q3 | $30k in Prizes

christianJul 8, 2024, 5:20 PM
17 points
0 comments1 min readEA link
(www.metaculus.com)

Anal­y­sis of Progress in Speech Recog­ni­tion Models

MiguelASep 16, 2024, 3:56 PM
8 points
1 comment12 min readEA link

From Cod­ing to Leg­is­la­tion: An Anal­y­sis of Bias in the Use of AI for Re­cruit­ment and Ex­ist­ing Reg­u­la­tory Frameworks

Priscilla CamposSep 16, 2024, 6:21 PM
4 points
1 comment20 min readEA link

[Linkpost] A Case for AI Consciousness

cdkgJul 6, 2024, 2:56 PM
3 points
0 comments1 min readEA link
(philpapers.org)

[Question] How bad would AI progress need to be for us to think gen­eral tech­nolog­i­cal progress is also bad?

Jim BuhlerJul 6, 2024, 6:44 PM
10 points
0 comments1 min readEA link

One, per­haps un­der­rated, AI risk.

Alex (Αλέξανδρος)Nov 28, 2024, 10:34 AM
7 points
1 comment3 min readEA link

Re­view of ar­tifi­cial in­tel­li­gence plat­forms for early pan­demic de­tec­tion in Latin America

DianaCarolinaSep 17, 2024, 3:17 PM
5 points
0 comments53 min readEA link

Ad­vice to ju­nior AI gov­er­nance researchers

AkashJul 8, 2024, 7:19 PM
38 points
3 comments1 min readEA link

Max Teg­mark — The AGI En­tente Delusion

Matrice JacobineOct 13, 2024, 5:42 PM
0 points
1 comment1 min readEA link
(www.lesswrong.com)

The Guardian calls EA “cultish” and ac­cuses the late FHI of “Eu­gen­ics on Steroids”

Damin Curtis🔹Apr 28, 2024, 1:44 PM
13 points
12 comments1 min readEA link
(www.theguardian.com)

Why I’m do­ing PauseAI

Joseph MillerApr 30, 2024, 4:21 PM
143 points
36 comments1 min readEA link

Ex­plor­ing the Eso­teric Path­ways to AI Sen­tience (Part One)

CarusoApr 27, 2024, 12:22 PM
−6 points
0 comments2 min readEA link

In­tro­duc­ing AI Lab Watch

Zach Stein-PerlmanApr 30, 2024, 5:00 PM
127 points
23 comments1 min readEA link
(ailabwatch.org)

#185 – The 7 most promis­ing ways to end fac­tory farm­ing, and whether AI is go­ing to be good or bad for an­i­mals (Lewis Bol­lard on the 80,000 Hours Pod­cast)

80000_HoursApr 30, 2024, 5:20 PM
63 points
0 comments15 min readEA link

An­i­mal ethics in ChatGPT and Claude

Elijah WhippleJan 16, 2024, 9:38 PM
46 points
2 comments9 min readEA link

AI Model Registries: A Foun­da­tional Tool for AI Governance

Elliot MckernonOct 7, 2024, 1:59 PM
18 points
0 comments1 min readEA link
(www.convergenceanalysis.org)

An­thropic rewrote its RSP

Zach Stein-PerlmanOct 15, 2024, 2:30 PM
32 points
1 comment1 min readEA link

Be­ing nicer than Clippy

Joe_CarlsmithJan 16, 2024, 7:44 PM
25 points
3 comments1 min readEA link

“Open Source AI” is a lie, but it doesn’t have to be

Jacob-HaimesApr 30, 2024, 7:42 PM
15 points
4 comments6 min readEA link
(jacob-haimes.github.io)

CAISH Hiring: AI Safety Policy Fel­low­ship Facilitators

Chloe LiJan 17, 2024, 9:21 AM
13 points
1 comment1 min readEA link

Why I am no longer think­ing about/​work­ing on AI safety

jbkjrMay 6, 2024, 8:00 PM
−8 points
0 comments4 min readEA link
(www.lesswrong.com)

AISN #34: New Mili­tary AI Sys­tems Plus, AI Labs Fail to Uphold Vol­un­tary Com­mit­ments to UK AI Safety In­sti­tute, and New AI Policy Pro­pos­als in the US Senate

Center for AI SafetyMay 2, 2024, 4:12 PM
21 points
5 comments8 min readEA link
(newsletter.safe.ai)

Ai Salon: Trust­wor­thy AI Fu­tures #1

IanEisenbergMay 2, 2024, 4:04 PM
2 points
0 comments1 min readEA link

ML4Good Brasil—Ap­pli­ca­tions Open

NiaMay 3, 2024, 10:39 AM
28 points
1 comment1 min readEA link

A break­down of OpenAI’s revenue

dschwarzJul 10, 2024, 6:07 PM
58 points
8 comments1 min readEA link

Please stop pub­lish­ing ideas/​in­sights/​re­search about AI

Tamsin LeakeMay 2, 2024, 2:52 PM
1 point
0 comments1 min readEA link

On Ar­tifi­cial Wisdom

Jordan ArelJul 11, 2024, 7:14 AM
22 points
1 comment14 min readEA link

Up­com­ing speaker se­ries on emerg­ing tech, na­tional se­cu­rity & US policy careers

ESJul 10, 2024, 7:59 PM
16 points
1 comment1 min readEA link

Challenges and Op­por­tu­ni­ties of Re­in­force­ment Learn­ing in Robotics: Anal­y­sis of Cur­rent Trends

Raymundo Rodríguez AlvaOct 14, 2024, 1:22 PM
11 points
1 comment17 min readEA link

“AI Safety for Fleshy Hu­mans” an AI Safety ex­plainer by Nicky Case

HabrykaMay 3, 2024, 7:28 PM
40 points
3 comments1 min readEA link
(aisafety.dance)

AI Clar­ity: An Ini­tial Re­search Agenda

Justin BullockMay 3, 2024, 4:29 PM
27 points
1 comment8 min readEA link

A be­gin­ner’s in­tro­duc­tion to AI-driven biorisk: Large Lan­guage Models, Biolog­i­cal De­sign Tools, In­for­ma­tion Hazards, and Biosecurity

NatKiiluMay 3, 2024, 3:49 PM
6 points
1 comment16 min readEA link

I read ev­ery ma­jor AI lab’s safety plan so you don’t have to

sarahhwDec 16, 2024, 2:12 PM
65 points
2 comments11 min readEA link
(longerramblings.substack.com)

#197 – On whether An­thropic’s AI safety policy is up to the task (Nick Joseph on The 80,000 Hours Pod­cast)

80000_HoursAug 22, 2024, 3:34 PM
9 points
0 comments18 min readEA link

An­nounc­ing SPAR Sum­mer 2024!

Lauren MMay 6, 2024, 5:55 PM
18 points
0 comments1 min readEA link

Are you pas­sion­ate about AI and An­i­mal Welfare? Do you have an idea that could rev­olu­tionize the food in­dus­try? We want to hear from you!

David van BeverenMay 6, 2024, 11:42 PM
12 points
0 comments1 min readEA link

£1 mil­lion prize for the most cut­ting-edge AI solu­tion for pub­lic good [link post]

rileyharrisJan 17, 2024, 2:36 PM
8 points
0 comments2 min readEA link
(manchesterprize.org)

Re­view­ing the Struc­ture of Cur­rent AI Regulations

Deric ChengMay 7, 2024, 12:34 PM
32 points
1 comment13 min readEA link

The Age of EM

ABishopMay 9, 2024, 12:17 PM
0 points
0 comments1 min readEA link
(ageofem.com)

[Question] Thoughts on these $1M and $500k AI safety grants?

defun 🔸Jul 11, 2024, 1:37 PM
50 points
7 comments1 min readEA link

AI for An­i­mals is Hiring a Pro­gram Lead

Constance LiJul 10, 2024, 8:57 PM
21 points
0 comments4 min readEA link

Sur­vey—Psy­cholog­i­cal Im­pact of Long-Term AI Engagement

Manuela GarcíaSep 17, 2024, 3:58 PM
2 points
0 comments1 min readEA link

AI and Chem­i­cal, Biolog­i­cal, Ra­diolog­i­cal, & Nu­clear Hazards: A Reg­u­la­tory Review

Elliot MckernonMay 10, 2024, 8:41 AM
8 points
1 comment1 min readEA link

We might be miss­ing some key fea­ture of AI take­off; it’ll prob­a­bly seem like “we could’ve seen this com­ing”

Dane ValerieMay 16, 2024, 12:05 PM
15 points
0 comments5 min readEA link
(www.lesswrong.com)

Cost-effec­tive­ness of mak­ing a video game with EA concepts

mmKALLLSep 15, 2022, 1:48 PM
8 points
2 comments5 min readEA link

Could Reg­u­la­tory Cost-Benefit Anal­y­sis Stop Fron­tier AI Reg­u­la­tions in the US?

LuiseJul 11, 2024, 3:25 PM
21 points
1 comment14 min readEA link

OpenAI show­case live on ope­nai.com

No longer EA-affiliatedMay 10, 2024, 5:55 PM
2 points
0 comments1 min readEA link

MATS Win­ter 2023-24 Retrospective

utilistrutilMay 11, 2024, 12:09 AM
62 points
2 comments1 min readEA link

Dubai EA Fel­low­ship [4–18 May]

rahulxyzApr 19, 2023, 8:06 PM
7 points
2 comments4 min readEA link

AI Safety: Why We Need to Keep Our Smart Machines in Check

adityaraj@eanitaDec 17, 2024, 12:29 PM
1 point
0 comments2 min readEA link
(medium.com)

Join the $10K Au­toHack 2024 Tournament

Paul BricmanSep 25, 2024, 11:56 AM
17 points
0 comments1 min readEA link
(noemaresearch.com)

SB 1047 Simplified

Gabe KSep 25, 2024, 12:00 PM
14 points
0 comments4 min readEA link

The Align­ment Prob­lem No One is Talk­ing About

Non-zero-sum JamesMay 14, 2024, 10:42 AM
5 points
0 comments2 min readEA link

AI Dis­clo­sure Bal­lot Ini­ti­a­tive (and vot­ing method)

aaronhamlinJan 17, 2024, 8:01 PM
5 points
0 comments1 min readEA link

How to do con­cep­tual re­search: Case study in­ter­view with Cas­par Oesterheld

ChiMay 14, 2024, 3:09 PM
26 points
1 comment1 min readEA link

An­nounc­ing the AI Safety Sum­mit Talks with Yoshua Bengio

OttoMay 14, 2024, 12:49 PM
33 points
1 comment1 min readEA link

No “Zero-Shot” Without Ex­po­nen­tial Data: Pre­train­ing Con­cept Fre­quency Deter­mines Mul­ti­modal Model Performance

Nic Kruus🔸May 14, 2024, 11:57 PM
36 points
2 comments1 min readEA link
(arxiv.org)

The Failed Strat­egy of Ar­tifi­cial In­tel­li­gence Doomers

yhoisethFeb 5, 2025, 7:34 PM
12 points
2 comments1 min readEA link
(letter.palladiummag.com)

#191 (Part 2) – Govern­ment and so­ciety af­ter AGI (Carl Shul­man on the 80,000 Hours Pod­cast)

80000_HoursJul 11, 2024, 7:26 PM
23 points
1 comment18 min readEA link

“A Paradigm for AI Con­scious­ness”—Seeds of Science call for reviewers

rogersbacon1May 15, 2024, 8:57 PM
5 points
0 comments1 min readEA link

Digi­tal Agents: The Fu­ture of News Consumption

TharinMay 16, 2024, 8:12 AM
9 points
1 comment7 min readEA link
(echoesandchimes.com)

Ar­ti­cles about re­cent OpenAI departures

bruceMay 17, 2024, 5:38 PM
126 points
12 comments1 min readEA link
(www.vox.com)

AISafety.com – Re­sources for AI Safety

Søren ElverlinMay 17, 2024, 4:01 PM
55 points
3 comments1 min readEA link

Ad­vice for Ac­tivists from the His­tory of Environmentalism

Jeffrey HeningerMay 16, 2024, 8:36 PM
48 points
2 comments1 min readEA link
(blog.aiimpacts.org)

MATS AI Safety Strat­egy Cur­ricu­lum v2

DanielFilanOct 7, 2024, 11:01 PM
29 points
1 comment1 min readEA link

Ge­offrey Hin­ton on the Past, Pre­sent, and Fu­ture of AI

Stephen McAleeseOct 12, 2024, 4:41 PM
5 points
1 comment1 min readEA link

De­sign­ing Ar­tifi­cial Wis­dom: The Wise Work­flow Re­search Organization

Jordan ArelJul 12, 2024, 6:57 AM
14 points
1 comment9 min readEA link

Don’t panic: 90% of EAs are good people

Closed Limelike CurvesMay 19, 2024, 4:37 AM
22 points
13 comments2 min readEA link

Part­ner with Us: Ad­vanc­ing Global Catas­trophic and AI Risk Re­search at Plateau State Univer­sity, Bokkos

emmannaemekaOct 10, 2024, 1:19 AM
15 points
0 comments2 min readEA link

Deep­Mind’s “Fron­tier Safety Frame­work” is weak and unambitious

Zach Stein-PerlmanMay 18, 2024, 3:00 AM
54 points
1 comment1 min readEA link

Out in Science: “Manag­ing ex­treme AI risks amid rapid progress” by Ben­gio, Hil­ton et al.

aaron_maiMay 20, 2024, 6:24 PM
9 points
0 comments1 min readEA link
(www.science.org)

[Question] Are AI risks tractable?

defun 🔸May 21, 2024, 1:45 PM
23 points
1 comment1 min readEA link

Big Pic­ture AI Safety: Introduction

EuanMcLeanMay 23, 2024, 11:28 AM
32 points
3 comments5 min readEA link

Please help me find re­search on as­piring AI Safety folk!

yanni kyriacosMay 20, 2024, 10:06 PM
7 points
0 comments1 min readEA link

[Question] Com­mon re­but­tal to “paus­ing” or reg­u­lat­ing AI

sammyboizMay 22, 2024, 4:21 AM
4 points
2 comments1 min readEA link

Can­cel­ling GPT subscription

adekczMay 20, 2024, 4:19 PM
26 points
14 comments3 min readEA link

The Prob­lem With the Word ‘Align­ment’

Peli GrietzerMay 21, 2024, 9:37 PM
13 points
1 comment6 min readEA link

Los­ing faith in big tech altruism

sammyboizMay 22, 2024, 4:49 AM
7 points
1 comment1 min readEA link

“If we go ex­tinct due to mis­al­igned AI, at least na­ture will con­tinue, right? … right?”

plexMay 18, 2024, 3:06 PM
13 points
10 comments1 min readEA link
(aisafety.info)

In­vi­ta­tion to lead a pro­ject at AI Safety Camp (Vir­tual Edi­tion, 2025)

Linda LinseforsAug 23, 2024, 2:18 PM
30 points
2 comments1 min readEA link

A short sum­mary of what I have been post­ing about on LessWrong

ThomasCederborgSep 10, 2024, 12:26 PM
3 points
0 comments2 min readEA link

SB 1047 was ve­toed, but pub­lic com­men­tary now can as­sist fu­ture AI safety legislation

ThomasWOct 2, 2024, 6:10 PM
38 points
0 comments1 min readEA link

Linkpost: Epis­tle to the Successors

ukc10014Jul 14, 2024, 8:07 PM
4 points
0 comments1 min readEA link
(ukc10014.github.io)

On the abo­li­tion of man

Joe_CarlsmithJan 18, 2024, 6:17 PM
71 points
4 comments1 min readEA link

What will the first hu­man-level AI look like, and how might things go wrong?

EuanMcLeanMay 23, 2024, 11:28 AM
12 points
1 comment15 min readEA link

AI Safety Seed Fund­ing Net­work—Join as a Donor or Investor

Alexandra BosDec 16, 2024, 7:30 PM
45 points
1 comment2 min readEA link

Ta­lent Needs of Tech­ni­cal AI Safety Teams

Ryan KiddMay 24, 2024, 12:46 AM
51 points
11 comments14 min readEA link

If try­ing to com­mu­ni­cate about AI risks, make it vivid

Michael Noetel 🔸May 27, 2024, 12:59 AM
19 points
2 comments2 min readEA link

#188 – On whether sci­ence is good (Matt Clancy on the 80,000 Hours Pod­cast)

80000_HoursMay 24, 2024, 3:04 PM
13 points
0 comments17 min readEA link

Good job op­por­tu­ni­ties for helping with the most im­por­tant century

Holden KarnofskyJan 18, 2024, 7:21 PM
46 points
1 comment4 min readEA link
(www.cold-takes.com)

What should AI safety be try­ing to achieve?

EuanMcLeanMay 23, 2024, 11:28 AM
13 points
1 comment13 min readEA link

Why con­scious­ness matters

EdLopezMay 24, 2024, 12:33 PM
0 points
0 comments7 min readEA link
(medium.com)

Sub­mit Your Tough­est Ques­tions for Hu­man­ity’s Last Exam

Matrice JacobineSep 18, 2024, 8:03 AM
6 points
0 comments2 min readEA link
(www.safe.ai)

What mis­takes has the AI safety move­ment made?

EuanMcLeanMay 23, 2024, 11:29 AM
61 points
3 comments12 min readEA link

2024 State of AI Reg­u­la­tory Landscape

Deric ChengMay 28, 2024, 12:00 PM
12 points
1 comment2 min readEA link
(www.convergenceanalysis.org)

John Cochrane on why reg­u­la­tion is the wrong tool for AI Safety

ezrahSep 26, 2024, 8:48 AM
3 points
2 comments1 min readEA link
(www.grumpy-economist.com)

Maybe An­thropic’s Long-Term Benefit Trust is powerless

Zach Stein-PerlmanMay 27, 2024, 1:00 PM
134 points
21 comments1 min readEA link

If you’re an AI Safety move­ment builder con­sider ask­ing your mem­bers these ques­tions in an interview

yanni kyriacosMay 27, 2024, 5:46 AM
10 points
0 comments2 min readEA link

“Suc­cess­ful lan­guage model evals” by Ja­son Wei

Arjun PanicksseryMay 25, 2024, 9:34 AM
11 points
0 comments1 min readEA link
(www.jasonwei.net)

The ne­ces­sity of “Guardian AI” and two con­di­tions for its achievement

ProicaMay 28, 2024, 11:42 AM
1 point
1 comment15 min readEA link

AI com­pa­nies aren’t re­ally us­ing ex­ter­nal evaluators

Zach Stein-PerlmanMay 26, 2024, 7:05 PM
88 points
4 comments1 min readEA link

MATS Alumni Im­pact Analysis

utilistrutilOct 2, 2024, 11:44 PM
16 points
1 comment1 min readEA link

The AI Revolu­tion in Biology

Roman LeventovMay 26, 2024, 9:30 AM
8 points
0 comments1 min readEA link
(www.cognitiverevolution.ai)

He­len Toner (ex-OpenAI board mem­ber): “We learned about ChatGPT on Twit­ter.”

defun 🔸May 29, 2024, 7:40 AM
123 points
13 comments1 min readEA link
(x.com)

The True Story of How GPT-2 Be­came Max­i­mally Lewd

WriterJan 18, 2024, 9:03 PM
23 points
1 comment1 min readEA link
(youtu.be)

Reflect­ing on the First Con­fer­ence on Global Catas­trophic Risks for Span­ish Speakers

SMalagonMay 29, 2024, 2:24 PM
15 points
0 comments1 min readEA link

Is “su­per­hu­man” AI fore­cast­ing BS? Some ex­per­i­ments on the “539” bot from the Cen­tre for AI Safety

titotalSep 18, 2024, 1:07 PM
68 points
4 comments14 min readEA link
(open.substack.com)

[Question] Thoughts on this $16.7M “AI safety” grant?

defun 🔸Jul 16, 2024, 9:16 AM
61 points
24 comments1 min readEA link

[Question] UK elec­tion and AI safety, who to vote for?

Clay CubeJun 1, 2024, 10:16 AM
25 points
3 comments1 min readEA link

New OGL and ITAR changes are shift­ing AI Gover­nance and Policy be­low the sur­face: A sim­plified up­date

CAISIDMay 31, 2024, 7:54 AM
12 points
2 comments3 min readEA link

Tar­bell is hiring for 3 roles

Cillian_Jul 17, 2024, 12:19 PM
48 points
1 comment5 min readEA link

My (cur­rent) model of what an AI gov­er­nance re­searcher does

Johan de KockAug 26, 2024, 11:22 AM
7 points
1 comment5 min readEA link

There Should Be More Align­ment-Driven Startups

vaniverMay 31, 2024, 2:05 AM
27 points
3 comments1 min readEA link

AI com­pa­nies’ commitments

Zach Stein-PerlmanMay 31, 2024, 12:00 AM
9 points
0 comments1 min readEA link

Why AI Reg­u­la­tion Vio­lates the First Amendment

LockeJun 1, 2024, 8:44 PM
−15 points
0 comments5 min readEA link

[Question] How have analo­gous In­dus­tries solved In­ter­ested > Trained > Em­ployed bot­tle­necks?

yanni kyriacosMay 30, 2024, 11:59 PM
6 points
0 comments1 min readEA link

Us­ing AI to match peo­ple to jobs?

ForumiteMay 30, 2024, 9:19 PM
5 points
0 comments1 min readEA link

[Linkpost] Jobs at the AI Safety Institute

PseudaemoniaJan 19, 2024, 4:39 PM
11 points
0 comments1 min readEA link
(www.gov.uk)

[Question] Look­ing to in­ter­view AI Safety re­searchers for a book

CarusoAug 24, 2024, 8:01 PM
6 points
0 comments1 min readEA link

Clar­ify­ing METR’s Au­dit­ing Role [linkpost]

ChanaMessingerJun 4, 2024, 3:34 PM
47 points
1 comment1 min readEA link
(www.alignmentforum.org)

A nec­es­sary Mem­brane for­mal­ism feature

ThomasCederborgSep 10, 2024, 9:03 PM
1 point
0 comments11 min readEA link

What’s im­por­tant in “AI for epistemics”?

Lukas FinnvedenAug 24, 2024, 1:27 AM
66 points
1 comment1 min readEA link
(lukasfinnveden.substack.com)

Can AI Out­pre­dict Hu­mans? Re­sults From Me­tac­u­lus’s Q3 AI Fore­cast­ing Benchmark

Tom LiptayOct 10, 2024, 6:58 PM
32 points
1 comment6 min readEA link
(www.metaculus.com)

AI Open Source De­bate Comes Down to Trust in In­sti­tu­tions, and AI Policy Mak­ers Should Con­sider How We Can Foster It

another-anon-do-gooderJan 20, 2024, 1:47 PM
6 points
2 comments1 min readEA link

Is prin­ci­pled mass-out­reach pos­si­ble, for AGI X-risk?

Nicholas / Heather KrossJan 21, 2024, 5:45 PM
12 points
2 comments1 min readEA link

Soft Na­tion­al­iza­tion: How the US Govern­ment Will Con­trol AI Labs

Deric ChengAug 27, 2024, 3:10 PM
103 points
5 comments21 min readEA link
(www.convergenceanalysis.org)

Agents that act for rea­sons: a thought experiment

Michele CampoloJan 24, 2024, 4:48 PM
7 points
1 comment3 min readEA link

AISN #30: In­vest­ments in Com­pute and Mili­tary AI Plus, Ja­pan and Sin­ga­pore’s Na­tional AI Safety Institutes

Center for AI SafetyJan 24, 2024, 7:38 PM
7 points
1 comment6 min readEA link
(newsletter.safe.ai)

How did you up­date on AI Safety in 2023?

Chris LeongJan 23, 2024, 2:21 AM
30 points
5 comments1 min readEA link

Cost-effec­tive­ness anal­y­sis of ~1260 USD worth of so­cial me­dia ads for fel­low­ship marketing

gergoJan 25, 2024, 3:18 PM
61 points
5 comments2 min readEA link

My guess for the most cost effec­tive AI Safety projects

Linda LinseforsJan 24, 2024, 12:21 PM
26 points
2 comments4 min readEA link

5 ways to im­prove CoT faithfulness

CBiddulphOct 8, 2024, 4:17 AM
8 points
0 comments1 min readEA link

[Question] Work­shop (hackathon, res­i­dence pro­gram, etc.) about for-profit AI Safety pro­jects?

Roman LeventovJan 26, 2024, 9:49 AM
13 points
1 comment1 min readEA link

1st Alinha Hacka Re­cap: Reflect­ing on the Brazilian AI Align­ment Hackathon

Thiago USPJan 31, 2024, 10:38 AM
7 points
0 comments2 min readEA link

[Question] Open-source AI safety pro­jects?

defun 🔸Jan 29, 2024, 10:09 AM
8 points
2 comments1 min readEA link

xAI raises $6B

andzuckJun 5, 2024, 3:26 PM
18 points
1 comment1 min readEA link
(x.ai)

Com­par­i­son of LLM scal­a­bil­ity and perfor­mance be­tween the U.S. and China based on benchmark

Ivanna_alvaradoOct 12, 2024, 9:51 PM
8 points
0 comments34 min readEA link

Reg­u­la­tion of AI Use for Per­sonal Data Pro­tec­tion: Com­par­i­son of Global Strate­gies and Op­por­tu­ni­ties for Latin Amer­ica

Lisbeth Guzman Oct 14, 2024, 1:22 PM
10 points
1 comment21 min readEA link

EA Nether­lands’ An­nual Strat­egy for 2024

James HerbertJun 5, 2024, 3:07 PM
40 points
4 comments6 min readEA link

Op­ti­mistic As­sump­tions, Longterm Plan­ning, and “Cope”

RaemonJul 18, 2024, 12:06 AM
15 points
1 comment1 min readEA link

[Question] Devel­op­ing AI solu­tions for global health—Em­manuel Katto

EmmanuelKattoJul 18, 2024, 6:41 AM
0 points
0 comments1 min readEA link

Differ­en­tial knowl­edge interconnection

Roman LeventovOct 12, 2024, 12:52 PM
3 points
1 comment1 min readEA link

GPT5 won’t be what kills us all

DPiepgrassSep 28, 2024, 5:11 PM
3 points
3 comments1 min readEA link
(dpiepgrass.medium.com)

De­bat­ing AI’s Mo­ral Sta­tus: The Most Hu­mane and Silliest Thing Hu­mans Do(?)

Soe LinSep 29, 2024, 5:01 AM
5 points
5 comments3 min readEA link

Why Stop AI is bar­ri­cad­ing OpenAI

RemmeltOct 14, 2024, 7:12 AM
−19 points
28 comments6 min readEA link
(docs.google.com)

Printable re­sources for AI Safety tabling

gergoAug 28, 2024, 9:39 AM
29 points
0 comments1 min readEA link

Lev­er­age points for a pause

RemmeltAug 28, 2024, 9:21 AM
6 points
0 comments1 min readEA link

Is Text Water­mark­ing a lost cause?

Egor TimatkovOct 1, 2024, 1:07 PM
4 points
0 comments10 min readEA link

Am­plify is hiring! Work with us to sup­port field-build­ing ini­ti­a­tives through digi­tal mar­ket­ing

gergoAug 28, 2024, 2:12 PM
28 points
1 comment4 min readEA link

Is un­der­stand­ing the moral sta­tus of digi­tal minds a press­ing world prob­lem?

Cody_FenwickSep 30, 2024, 8:50 AM
42 points
0 comments34 min readEA link
(80000hours.org)

An­nounce­ment: there are now monthly co­or­di­na­tion calls for AIS field­builders in Europe

gergoNov 22, 2024, 10:30 AM
30 points
0 comments1 min readEA link

Launch­ing Am­plify: Re­ceive mar­ket­ing sup­port for your lo­cal groups and other field-build­ing initiatives

gergoAug 28, 2024, 2:12 PM
36 points
0 comments2 min readEA link

Will the US Govern­ment Con­trol the First AGI?—Find­ing Base Rates

LuiseSep 2, 2024, 11:11 AM
22 points
5 comments14 min readEA link

Seek­ing Mechanism De­signer for Re­search into In­ter­nal­iz­ing Catas­trophic Externalities

c.troutSep 11, 2024, 3:09 PM
11 points
0 comments1 min readEA link

Of Mice and MAGA: Ex­plor­ing Gen­er­a­tive Short Fic­tion’s Po­ten­tial for An­i­mal Rights Advocacy

Charlie SandersDec 17, 2024, 1:45 AM
2 points
0 comments2 min readEA link
(www.dailymicrofiction.com)

Track­ing Crit­i­cal In­fras­truc­ture AI Incidents

Ben TurseSep 29, 2024, 9:29 PM
1 point
0 comments2 min readEA link

Not Just For Ther­apy Chat­bots: The Case For Com­pas­sion In AI Mo­ral Align­ment Research

Kenneth_DiaoSep 29, 2024, 10:58 PM
8 points
3 comments12 min readEA link

Adam Smith Meets AI Doomers

JamesMillerJan 31, 2024, 4:04 PM
15 points
0 comments5 min readEA link

What is “wire­head­ing”?

Vishakha AgrawalDec 17, 2024, 5:59 PM
1 point
0 comments1 min readEA link
(aisafety.info)

The ELYSIUM Proposal

RokoOct 16, 2024, 2:14 AM
−10 points
0 comments1 min readEA link
(transhumanaxiology.substack.com)

Align­ment Fak­ing in Large Lan­guage Models

Ryan GreenblattDec 18, 2024, 5:19 PM
142 points
9 comments1 min readEA link

MATS Ap­pli­ca­tions + Re­search Direc­tions I’m Cur­rently Ex­cited About

Neel NandaFeb 6, 2025, 11:03 AM
23 points
2 comments1 min readEA link

Open Philan­thropy Tech­ni­cal AI Safety RFP - $40M Available Across 21 Re­search Areas

Jake MendelFeb 6, 2025, 6:59 PM
91 points
1 comment1 min readEA link
(www.openphilanthropy.org)

Ex­ec­u­tive Direc­tor for AIS France—Ex­pres­sion of interest

gergoDec 19, 2024, 8:11 AM
33 points
0 comments4 min readEA link

The am­bigu­ous effect of full au­toma­tion + new goods on GDP growth

trammellFeb 7, 2025, 2:53 AM
47 points
12 comments8 min readEA link

Im­prov­ing ca­pa­bil­ity eval­u­a­tions for AI gov­er­nance: Open Philan­thropy’s new re­quest for proposals

cbFeb 7, 2025, 9:30 AM
37 points
3 comments3 min readEA link

AI data gaps could lead to on­go­ing An­i­mal Suffering

Darkness8i8Oct 17, 2024, 10:52 AM
13 points
3 comments5 min readEA link

What AI com­pa­nies should do: Some rough ideas

Zach Stein-PerlmanOct 21, 2024, 2:00 PM
14 points
1 comment1 min readEA link

Med­i­cal Wind­fall Prizes

PeterMcCluskeyFeb 7, 2025, 12:13 AM
5 points
0 comments5 min readEA link
(bayesianinvestor.com)

It is time to start war gam­ing for AGI

yanni kyriacosOct 17, 2024, 5:14 AM
14 points
4 comments1 min readEA link

#204 – Mak­ing sense of SBF, and his biggest cri­tiques of effec­tive al­tru­ism (Nate Silver on The 80,000 Hours Pod­cast)

80000_HoursOct 17, 2024, 8:41 PM
22 points
2 comments14 min readEA link

A Rocket–In­ter­pretabil­ity Analogy

plexOct 21, 2024, 1:55 PM
13 points
1 comment1 min readEA link

Paus­ing for what?

MountainPathOct 21, 2024, 12:18 PM
6 points
1 comment1 min readEA link

OpenAI defected, but we can take hon­est actions

RemmeltOct 21, 2024, 8:41 AM
19 points
1 comment2 min readEA link

In­tro­duc­ing Tech Gover­nance Pro­ject

Zakariyau YusufOct 29, 2024, 9:20 AM
52 points
5 comments8 min readEA link

Tech­ni­cal Risks of (Lethal) Au­tonomous Weapons Systems

Heramb PodarOct 23, 2024, 8:43 PM
5 points
0 comments1 min readEA link
(www.lesswrong.com)

The Science of AI Is Too Im­por­tant to Be Left to the Scientists

AndrewDorisOct 23, 2024, 7:10 PM
3 points
0 comments1 min readEA link
(foreignpolicy.com)

What is malev­olence? On the na­ture, mea­sure­ment, and dis­tri­bu­tion of dark traits

David_AlthausOct 23, 2024, 8:41 AM
107 points
6 comments52 min readEA link

Are we drop­ping the ball on Recom­men­da­tion AIs?

Raphaël SOct 23, 2024, 7:37 PM
5 points
0 comments1 min readEA link

We should pre­vent the cre­ation of ar­tifi­cial sen­tience

RichardPOct 29, 2024, 12:22 PM
108 points
12 comments15 min readEA link

Ex-OpenAI re­searcher says OpenAI mass-vi­o­lated copy­right law

RemmeltOct 24, 2024, 1:00 AM
11 points
0 comments1 min readEA link
(suchir.net)

Why you are not mo­ti­vated to work on AI safety

MountainPathOct 25, 2024, 4:12 PM
7 points
5 comments1 min readEA link

Towards the Oper­a­tional­iza­tion of Philos­o­phy & Wisdom

Thane RuthenisOct 28, 2024, 7:45 PM
1 point
1 comment1 min readEA link
(aiimpacts.org)

Con­sider this me drunk tex­ting the fo­rum: Is it use­ful to have data that can’t be touched by AI?

Jonas SøvikFeb 7, 2025, 9:52 PM
−8 points
0 comments1 min readEA link

[Job ad] LISA CEO

Ryan KiddFeb 9, 2025, 12:18 AM
4 points
0 comments1 min readEA link

ML4Good Colom­bia—Ap­pli­ca­tions Open

carolinaolliveFeb 9, 2025, 4:03 AM
10 points
0 comments1 min readEA link

Can Knowl­edge Hurt You? The Dangers of In­fo­haz­ards (and Exfo­haz­ards)

A.G.G. LiuFeb 8, 2025, 3:51 PM
12 points
0 comments1 min readEA link
(www.youtube.com)

Sen­tinel min­utes #6/​2025: Power of the purse, D1.1 H5N1 flu var­i­ant, Ay­a­tol­lah against ne­go­ti­a­tions with Trump

NunoSempereFeb 10, 2025, 5:23 PM
39 points
1 comment7 min readEA link
(blog.sentinel-team.org)

Finish­ing The SB-1047 Doc­u­men­tary In 6 Weeks

Michaël TrazziOct 28, 2024, 8:26 PM
67 points
0 comments4 min readEA link

[Question] How do I plan my life in a world with rapid AI de­vel­op­ment?

Oliver KupermanFeb 10, 2025, 2:36 PM
25 points
6 comments1 min readEA link

World Ci­ti­zen Assem­bly about AI—Announcement

CamilleFeb 11, 2025, 10:51 AM
25 points
2 comments5 min readEA link

Utility Eng­ineer­ing: An­a­lyz­ing and Con­trol­ling Emer­gent Value Sys­tems in AIs

Matrice JacobineFeb 12, 2025, 9:15 AM
12 points
0 comments1 min readEA link
(www.emergent-values.ai)

AI Safety Col­lab 2025 - Lo­cal Or­ga­nizer Sign-ups Open

Evander H. 🔸Feb 12, 2025, 11:27 AM
9 points
0 comments1 min readEA link

Ret­ro­spec­tive: PIBBSS Fel­low­ship 2024

Dušan D. Nešić (Dushan)Dec 20, 2024, 3:55 PM
7 points
0 comments1 min readEA link

o3

Zach Stein-PerlmanDec 20, 2024, 9:00 PM
84 points
5 comments1 min readEA link

Oxford Biose­cu­rity Group: Fundrais­ing and Plans for Early 2025

Lin BLDec 20, 2024, 8:56 PM
33 points
0 comments2 min readEA link

New? Start here! (Use­ful links)

LizkaJul 1, 2022, 9:19 PM
26 points
1 comment2 min readEA link

What is com­pute gov­er­nance?

Vishakha AgrawalDec 23, 2024, 6:45 AM
5 points
0 comments2 min readEA link
(aisafety.info)

o3 is not be­ing re­leased to the pub­lic. First they are only giv­ing ac­cess to ex­ter­nal safety testers. You can ap­ply to get early ac­cess to do safety testing

Kat WoodsDec 20, 2024, 6:30 PM
13 points
0 comments1 min readEA link
(openai.com)

Ap­ply to the 2025 PIBBSS Sum­mer Re­search Fellowship

Dušan D. Nešić (Dushan)Dec 24, 2024, 10:28 AM
6 points
0 comments1 min readEA link

My The­ory of Con­scious­ness: The Ex­pe­riencer and the Indicator

David HammerleDec 23, 2024, 4:07 AM
1 point
1 comment7 min readEA link

Whistle­blow­ing Twit­ter Bot

Mckiev 🔸Dec 26, 2024, 6:18 PM
11 points
1 comment2 min readEA link
(www.lesswrong.com)

Skep­ti­cism to­wards claims about the views of pow­er­ful institutions

tlevinFeb 13, 2025, 7:40 AM
14 points
0 comments1 min readEA link

Con­nect For An­i­mals 2025 Strate­gic Plan

Steven RoukFeb 13, 2025, 3:49 PM
3 points
0 comments13 min readEA link

What Areas of AI Safety and Align­ment Re­search are Largely Ig­nored?

Andy E WilliamsDec 27, 2024, 12:19 PM
4 points
0 comments1 min readEA link

Beyond Meta: Large Con­cept Models Will Win

Anthony RepettoDec 30, 2024, 12:57 AM
3 points
0 comments3 min readEA link

A bet­ter “State­ment on AI Risk?” [Cross­post]

Knight LeeDec 30, 2024, 7:36 AM
4 points
0 comments3 min readEA link

Aspiring Jr. AI safety re­searchers: what’s stop­ping you? | Survey

carolinaolliveOct 29, 2024, 11:27 AM
14 points
0 comments1 min readEA link

Tur­ing-Test-Pass­ing AI im­plies Aligned AI

RokoDec 31, 2024, 8:22 PM
0 points
0 comments5 min readEA link

How might we solve the al­ign­ment prob­lem? (Part 1: In­tro, sum­mary, on­tol­ogy)

Joe_CarlsmithOct 28, 2024, 9:57 PM
18 points
0 comments1 min readEA link

En­hanc­ing Math­e­mat­i­cal Model­ing with LLMs: Goals, Challenges, and Evaluations

Ozzie GooenOct 28, 2024, 9:37 PM
11 points
3 comments15 min readEA link

By de­fault, cap­i­tal will mat­ter more than ever af­ter AGI

L Rudolf LDec 28, 2024, 5:52 PM
113 points
3 comments1 min readEA link
(nosetgauge.substack.com)

When do ex­perts think hu­man-level AI will be cre­ated?

Vishakha AgrawalJan 2, 2025, 11:17 PM
36 points
4 comments2 min readEA link
(aisafety.info)

Chi­nese Re­searchers Crack ChatGPT: Repli­cat­ing OpenAI’s Ad­vanced AI Model

Evan_GaensbauerJan 5, 2025, 3:50 AM
1 point
0 comments1 min readEA link
(www.geeky-gadgets.com)

AI Lab Re­tal­i­a­tion: A Sur­vival Guide

Jay ReadyJan 4, 2025, 11:05 PM
6 points
1 comment12 min readEA link
(morelightinai.substack.com)

Sugges­tions for get­ting re­tiree /​ sec­ond ca­reer folks in­ter­ested in AI Safety?

sjsjsjJan 5, 2025, 5:59 PM
2 points
1 comment1 min readEA link

[Question] Idea: Re­pos­i­tory for AI Safety Presentations

EitanJan 6, 2025, 1:04 PM
14 points
3 comments1 min readEA link

[Pre­sen­ta­tion] In­tro to AI Safety

EitanJan 6, 2025, 1:04 PM
13 points
0 comments1 min readEA link

How to Do a PhD (in AI Safety)

Lewis HammondJan 5, 2025, 4:57 PM
22 points
2 comments18 min readEA link
(lewishammond.com)

Alt­man on the board, AGI, and superintelligence

OscarD🔸Jan 6, 2025, 2:37 PM
20 points
1 comment1 min readEA link
(blog.samaltman.com)

AI & wis­dom 3: AI effects on amor­tised optimisation

L Rudolf LOct 29, 2024, 1:37 PM
14 points
0 comments1 min readEA link
(rudolf.website)

AI & wis­dom 2: growth and amor­tised optimisation

L Rudolf LOct 29, 2024, 1:37 PM
20 points
0 comments1 min readEA link
(rudolf.website)

AI & wis­dom 1: wis­dom, amor­tised op­ti­mi­sa­tion, and AI

L Rudolf LOct 29, 2024, 1:37 PM
14 points
0 comments1 min readEA link
(rudolf.website)

Stable to­tal­i­tar­i­anism: an overview

80000_HoursOct 29, 2024, 4:07 PM
35 points
1 comment20 min readEA link
(80000hours.org)

Rea­sons for and against work­ing on tech­ni­cal AI safety at a fron­tier AI lab

bilalchughtaiJan 7, 2025, 1:23 PM
16 points
3 comments12 min readEA link
(www.lesswrong.com)

In­tro­duc­ing The Field Build­ing Blog (FBB #0)

gergoJan 7, 2025, 3:43 PM
36 points
3 comments2 min readEA link

AI Safety Col­lab 2025 - Feed­back on Plans & Ex­pres­sion of Interest

Evander H. 🔸Jan 7, 2025, 4:41 PM
28 points
2 comments1 min readEA link

[Question] Why are bond yields anoma­lously ris­ing fol­low­ing the Septem­ber rate cut?

incredibleutilityJan 7, 2025, 3:49 PM
2 points
2 comments1 min readEA link

Your group needs all the help it can get (FBB #1)

gergoJan 7, 2025, 4:42 PM
42 points
6 comments4 min readEA link

What are poly­se­man­tic neu­rons?

Vishakha AgrawalJan 8, 2025, 7:39 AM
4 points
0 comments2 min readEA link
(aisafety.info)

What About Deon­tol­ogy? Ethics of So­cial Belong­ing and Con­for­mity in Effec­tive Altruism

Maksens DjabaliJan 8, 2025, 2:02 PM
7 points
1 comment4 min readEA link

Are AI safe­ty­ists cry­ing wolf?

sarahhwJan 8, 2025, 8:54 PM
60 points
21 comments16 min readEA link
(longerramblings.substack.com)

Tar­bell Fel­low­ship 2025 - Ap­pli­ca­tions Open (AI Jour­nal­ism)

Tarbell Center for AI JournalismJan 8, 2025, 3:25 PM
62 points
0 comments1 min readEA link

The moral ar­gu­ment for giv­ing AIs autonomy

Matthew_BarnettJan 8, 2025, 12:59 AM
31 points
7 comments11 min readEA link

Start an AIS safety field-build­ing or­ga­ni­za­tion at the city or na­tional level—an EOI form

gergoJan 9, 2025, 8:42 AM
37 points
4 comments2 min readEA link

PIBBSS Fel­low­ship 2025: Boun­ties and Co­op­er­a­tive AI Track Announcement

Dušan D. Nešić (Dushan)Jan 9, 2025, 2:23 PM
18 points
0 comments1 min readEA link

[Question] How to in­fluence AGI?

Sam FreedmanJan 9, 2025, 8:46 PM
2 points
0 comments1 min readEA link

AI Fore­cast­ing Bench­mark: Con­grat­u­la­tions to Q4 Win­ners + Q1 Prac­tice Ques­tions Open

christianJan 10, 2025, 3:02 AM
6 points
0 comments2 min readEA link
(www.metaculus.com)

Re­think­ing the Value of Work­ing on AI Safety

Johan de KockJan 9, 2025, 2:15 PM
45 points
21 comments10 min readEA link

Finite Field Assem­bly : A CUDA al­ter­na­tive rooted in Num­ber The­ory and Pure Mathematics

Murage KibichoJan 13, 2025, 1:37 PM
−7 points
0 comments3 min readEA link

Our new video about goal mis­gen­er­al­iza­tion, plus an apology

WriterJan 14, 2025, 2:07 PM
16 points
1 comment1 min readEA link
(youtu.be)

Join the AI Align­ment Evals hackathon

lenzJan 14, 2025, 6:17 PM
3 points
0 comments3 min readEA link

How do fic­tional sto­ries illus­trate AI mis­al­ign­ment?

Vishakha AgrawalJan 15, 2025, 6:16 AM
4 points
0 comments2 min readEA link
(aisafety.info)

‘Now Is the Time of Mon­sters’

Aaron GoldzimerJan 12, 2025, 11:31 PM
25 points
0 comments1 min readEA link
(www.nytimes.com)

[Question] Im­pact: Eng­ineer­ing VS Med­i­cal Scien­tist VS AI Safety VS Governance

AhmedWezJan 15, 2025, 3:47 PM
1 point
0 comments1 min readEA link

Les­sons learned from talk­ing to >100 aca­demics about AI safety

mariushobbhahnOct 10, 2022, 1:16 PM
138 points
21 comments1 min readEA link

How Could AI Gover­nance Go Wrong?

HaydnBelfieldMay 26, 2022, 9:29 PM
40 points
7 comments18 min readEA link

Rele­vant pre-AGI possibilities

kokotajlodJun 20, 2020, 1:15 PM
22 points
0 comments1 min readEA link
(aiimpacts.org)

Mo­ti­va­tion control

Joe_CarlsmithOct 30, 2024, 5:15 PM
18 points
0 comments1 min readEA link

A New York Times ar­ti­cle on AI risk

Eleni_ASep 6, 2022, 12:46 AM
20 points
0 comments1 min readEA link
(www.nytimes.com)

AI Could Defeat All Of Us Combined

Holden KarnofskyJun 10, 2022, 11:25 PM
143 points
14 comments17 min readEA link

Safety of Self-Assem­bled Neu­ro­mor­phic Hardware

Can RagerDec 26, 2022, 7:10 PM
8 points
1 comment10 min readEA link

Clar­ifi­ca­tions about struc­tural risk from AI

Sam ClarkeJan 18, 2022, 12:57 PM
41 points
3 comments4 min readEA link

Is Ge­netic Code Swap­ping as risky as it seems?

Invert_DOG_about_centre_OJan 12, 2025, 6:38 PM
23 points
2 comments10 min readEA link

Ross Gruet­zemacher: Defin­ing and un­pack­ing trans­for­ma­tive AI

EA GlobalOct 18, 2019, 8:22 AM
9 points
0 comments1 min readEA link
(www.youtube.com)

Thoughts on short timelines

Tobias_BaumannOct 23, 2018, 3:59 PM
22 points
14 comments5 min readEA link

[Question] What is an ex­am­ple of re­cent, tan­gible progress in AI safety re­search?

Aaron Gertler 🔸Jun 14, 2021, 5:29 AM
35 points
4 comments1 min readEA link

In­ter­pret­ing Neu­ral Net­works through the Poly­tope Lens

Sid BlackSep 23, 2022, 6:03 PM
35 points
0 comments1 min readEA link

Rab­bits, robots and resurrection

Patrick WilsonMay 10, 2022, 3:00 PM
9 points
0 comments15 min readEA link

Ap­pli­ca­tions Open for the Co­op­er­a­tive AI Sum­mer School 2025!

C TilliJan 13, 2025, 12:31 PM
25 points
0 comments1 min readEA link

[Question] Why AGIs util­ity can’t out­weigh hu­mans’ util­ity?

Alex PSep 20, 2022, 5:16 AM
6 points
25 comments1 min readEA link

A strange twist on the road to AGI

cveresOct 12, 2022, 11:27 PM
3 points
0 comments1 min readEA link

Un­der­stand­ing the diffu­sion of large lan­guage mod­els: summary

Ben CottierDec 21, 2022, 1:49 PM
127 points
18 comments22 min readEA link

Daniel Dewey: The Open Philan­thropy Pro­ject’s work on po­ten­tial risks from ad­vanced AI

EA GlobalAug 11, 2017, 8:19 AM
7 points
0 comments18 min readEA link
(www.youtube.com)

Free Guy, a rom-com on the moral pa­tient­hood of digi­tal sentience

micDec 23, 2021, 7:47 AM
26 points
2 comments2 min readEA link

Pod­cast: Krister Bykvist on moral un­cer­tainty, ra­tio­nal­ity, metaethics, AI and fu­ture pop­u­la­tions

Gus DockerOct 21, 2021, 3:17 PM
8 points
0 comments1 min readEA link
(www.utilitarianpodcast.com)

Un­con­trol­lable AI as an Ex­is­ten­tial Risk

Karl von WendtOct 9, 2022, 10:37 AM
28 points
0 comments1 min readEA link

Is Democ­racy a Fad?

bgarfinkelMar 13, 2021, 12:40 PM
165 points
36 comments18 min readEA link

Yud­kowsky and Chris­ti­ano dis­cuss “Take­off Speeds”

EliezerYudkowskyNov 22, 2021, 7:42 PM
42 points
0 comments60 min readEA link

CNAS re­port: ‘Ar­tifi­cial In­tel­li­gence and Arms Con­trol’

MMMaasOct 13, 2022, 8:35 AM
16 points
0 comments1 min readEA link
(www.cnas.org)

Sup­ple­ment to “The Brus­sels Effect and AI: How EU AI reg­u­la­tion will im­pact the global AI mar­ket”

MarkusAnderljungAug 16, 2022, 8:55 PM
109 points
7 comments8 min readEA link

Mauhn Re­leases AI Safety Documentation

Berg SeverensJul 2, 2021, 12:19 PM
4 points
2 comments1 min readEA link

[Question] What are your recom­men­da­tions for tech­ni­cal AI al­ign­ment pod­casts?

Evan_GaensbauerMay 11, 2022, 9:52 PM
13 points
4 comments1 min readEA link

[Question] Are there any AI Safety labs that will hire self-taught ML en­g­ineers?

Tomer_GoloboyApr 6, 2022, 11:32 PM
5 points
12 comments1 min readEA link

Baobao Zhang: How so­cial sci­ence re­search can in­form AI governance

EA GlobalJan 22, 2021, 3:10 PM
9 points
0 comments16 min readEA link
(www.youtube.com)

In­ter­view with Tom Chivers: “AI is a plau­si­ble ex­is­ten­tial risk, but it feels as if I’m in Pas­cal’s mug­ging”

felix.hFeb 21, 2021, 1:41 PM
16 points
1 comment7 min readEA link

New book: The Tango of Ethics: In­tu­ition, Ra­tion­al­ity and the Preven­tion of Suffering

jonleightonJan 2, 2023, 8:45 AM
114 points
3 comments5 min readEA link

[Question] What do we do if AI doesn’t take over the world, but still causes a sig­nifi­cant global prob­lem?

James_BanksAug 2, 2020, 3:35 AM
16 points
5 comments1 min readEA link

An in­ter­ven­tion to shape policy di­alogue, com­mu­ni­ca­tion, and AI re­search norms for AI safety

Lee_SharkeyOct 1, 2017, 6:29 PM
9 points
28 comments10 min readEA link

Prin­ci­ples for AI Welfare Research

jeffseboJun 19, 2023, 11:30 AM
138 points
16 comments13 min readEA link

The great en­ergy de­scent (short ver­sion) - An im­por­tant thing EA might have missed

CB🔸Aug 31, 2022, 9:50 PM
61 points
94 comments10 min readEA link

AI Safety field-build­ing pro­jects I’d like to see

AkashSep 11, 2022, 11:45 PM
31 points
4 comments6 min readEA link
(www.lesswrong.com)

How to do the­o­ret­i­cal re­search, a per­sonal perspective

Mark XuAug 19, 2022, 7:43 PM
132 points
7 comments15 min readEA link

[Question] Anal­ogy of AI Align­ment as Rais­ing a Child?

Aaron_ScherFeb 19, 2022, 9:40 PM
4 points
2 comments1 min readEA link

AI safety schol­ar­ships look worth-fund­ing (if other fund­ing is sane)

anon-aNov 19, 2019, 12:59 AM
22 points
6 comments2 min readEA link

[Question] Why is “Ar­gu­ment Map­ping” Not More Com­mon in EA/​Ra­tion­al­ity (And What Ob­jec­tions Should I Ad­dress in a Post on the Topic?)

Marcel DDec 23, 2022, 9:55 PM
15 points
5 comments1 min readEA link

[Link post] How plau­si­ble are AI Takeover sce­nar­ios?

SammyDMartinSep 27, 2021, 1:03 PM
26 points
0 comments1 min readEA link

[Creative Writ­ing Con­test] An AI Safety Limerick

Ben_West🔸Oct 18, 2021, 7:11 PM
21 points
5 comments1 min readEA link

Part 2: AI Safety Move­ment Builders should help the com­mu­nity to op­ti­mise three fac­tors: con­trib­u­tors, con­tri­bu­tions and coordination

PeterSlatteryDec 15, 2022, 10:48 PM
34 points
0 comments6 min readEA link

Ben Garfinkel: How sure are we about this AI stuff?

bgarfinkelFeb 9, 2019, 7:17 PM
128 points
20 comments18 min readEA link

Shal­low eval­u­a­tions of longter­mist organizations

NunoSempereJun 24, 2021, 3:31 PM
192 points
34 comments34 min readEA link

Fu­ture Mat­ters #3: digi­tal sen­tience, AGI ruin, and fore­cast­ing track records

PabloJul 4, 2022, 5:44 PM
70 points
2 comments19 min readEA link

‘Dis­solv­ing’ AI Risk – Pa­ram­e­ter Uncer­tainty in AI Fu­ture Forecasting

FroolowOct 18, 2022, 10:54 PM
111 points
63 comments39 min readEA link

[linkpost] Shar­ing pow­er­ful AI mod­els: the emerg­ing paradigm of struc­tured access

tsJan 20, 2022, 9:10 PM
11 points
3 comments1 min readEA link

Law-Fol­low­ing AI 3: Lawless AI Agents Un­der­mine Sta­bi­liz­ing Agreements

Cullen 🔸Apr 27, 2022, 5:20 PM
28 points
3 comments3 min readEA link

AI Safety groups should imi­tate ca­reer de­vel­op­ment clubs

JoshcNov 9, 2022, 11:48 PM
95 points
5 comments2 min readEA link

There have been 3 planes (billion­aire donors) and 2 have crashed

trevor1Dec 17, 2022, 3:38 AM
4 points
5 comments2 min readEA link

Can GPT-3 pro­duce new ideas? Par­tially au­tomat­ing Robin Han­son and others

NunoSempereJan 16, 2023, 3:05 PM
82 points
6 comments10 min readEA link

Are you re­ally in a race? The Cau­tion­ary Tales of Szilárd and Ellsberg

HaydnBelfieldMay 19, 2022, 8:42 AM
487 points
44 comments18 min readEA link

AI Value Align­ment Speaker Series Pre­sented By EA Berkeley

Mahendra PrasadMar 1, 2022, 6:17 AM
2 points
0 comments1 min readEA link

Fol­lowup on Terminator

skluugMar 12, 2022, 1:11 AM
32 points
0 comments9 min readEA link
(skluug.substack.com)

[Question] What are peo­ple’s thoughts on work­ing for Deep­Mind as a gen­eral soft­ware en­g­ineer?

Max PietschSep 23, 2022, 5:13 PM
9 points
4 comments1 min readEA link

Pre-An­nounc­ing the 2023 Open Philan­thropy AI Wor­ld­views Contest

Jason SchukraftNov 21, 2022, 9:45 PM
291 points
26 comments1 min readEA link

Sum­mary of Stu­art Rus­sell’s new book, “Hu­man Com­pat­i­ble”

Rohin ShahOct 19, 2019, 7:56 PM
33 points
1 comment15 min readEA link
(www.alignmentforum.org)

Disagree­ment with bio an­chors that lead to shorter timelines

mariushobbhahnNov 16, 2022, 2:40 PM
85 points
1 comment1 min readEA link

EA’s brain-over-body bias, and the em­bod­ied value prob­lem in AI al­ign­ment

Geoffrey MillerSep 21, 2022, 6:55 PM
45 points
3 comments25 min readEA link

Differ­en­tial tech­nol­ogy de­vel­op­ment: preprint on the concept

Hamish_HobbsSep 12, 2022, 1:52 PM
65 points
0 comments2 min readEA link

An­nounc­ing Cavendish Labs

dyushaJan 19, 2023, 8:00 PM
112 points
6 comments2 min readEA link

In­tel­lec­tual Diver­sity in AI Safety

KRJul 22, 2020, 7:07 PM
21 points
8 comments3 min readEA link

Paul Chris­ti­ano: Cur­rent work in AI alignment

EA GlobalApr 3, 2020, 7:06 AM
80 points
3 comments24 min readEA link
(www.youtube.com)

6 Year De­crease of Me­tac­u­lus AGI Prediction

Chris LeongApr 12, 2022, 5:36 AM
40 points
6 comments1 min readEA link

Fore­cast­ing Through Fiction

YitzJul 6, 2022, 5:23 AM
8 points
3 comments8 min readEA link
(www.lesswrong.com)

Why I ex­pect suc­cess­ful (nar­row) alignment

Tobias_BaumannDec 29, 2018, 3:46 PM
18 points
10 comments1 min readEA link
(s-risks.org)

Ap­ply to the new Open Philan­thropy Tech­nol­ogy Policy Fel­low­ship!

lukeprogJul 20, 2021, 6:41 PM
78 points
6 comments4 min readEA link

7 Learn­ings and a De­tailed De­scrip­tion of an AI Safety Read­ing Group

nellSep 23, 2022, 2:02 AM
21 points
5 comments9 min readEA link

AI can ex­ploit safety plans posted on the Internet

Peter S. ParkDec 4, 2022, 12:17 PM
5 points
3 comments1 min readEA link

Re­sults for a sur­vey of tool use and work­flows in al­ign­ment research

jacquesthibsDec 19, 2022, 3:19 PM
30 points
0 comments1 min readEA link

Holden Karnofsky In­ter­view about Most Im­por­tant Cen­tury & Trans­for­ma­tive AI

Dwarkesh PatelJan 3, 2023, 5:31 PM
29 points
2 comments1 min readEA link

The Ri­val AI De­ploy­ment Prob­lem: a Pre-de­ploy­ment Agree­ment as the least-bad response

HaydnBelfieldSep 23, 2022, 9:28 AM
44 points
1 comment12 min readEA link

Sta­tus quo bias; Sys­tem justification

RemmeltJan 3, 2023, 2:50 AM
4 points
1 comment1 min readEA link

What’s so dan­ger­ous about AI any­way? – Or: What it means to be a superintelligence

Thomas KehrenbergJul 18, 2022, 4:14 PM
9 points
2 comments11 min readEA link

Defend­ing against Ad­ver­sar­ial Poli­cies in Re­in­force­ment Learn­ing with Alter­nat­ing Training

sergeivolodinFeb 12, 2022, 3:59 PM
1 point
0 comments13 min readEA link

CSER is hiring for a se­nior re­search as­so­ci­ate on longterm AI risk and governance

Sam ClarkeJan 24, 2022, 1:24 PM
9 points
4 comments1 min readEA link

Nu­clear Es­pi­onage and AI Governance

GAAOct 4, 2021, 6:21 PM
32 points
3 comments24 min readEA link

Ideal gov­er­nance (for com­pa­nies, coun­tries and more)

Holden KarnofskyApr 7, 2022, 4:54 PM
80 points
19 comments14 min readEA link

Cog­ni­tive Science/​Psy­chol­ogy As a Ne­glected Ap­proach to AI Safety

Kaj_SotalaJun 5, 2017, 1:46 PM
40 points
37 comments4 min readEA link

Get­ting started in­de­pen­dently in AI Safety

JJ HepburnJul 6, 2021, 3:20 PM
41 points
10 comments2 min readEA link

In­tro­duc­ing a New Course on the Eco­nomics of AI

akorinekDec 21, 2021, 4:55 AM
84 points
6 comments2 min readEA link

Pri­ori­tiz­ing the Arts in re­sponse to AI automation

CaseySep 25, 2022, 7:49 AM
6 points
1 comment1 min readEA link

Con­sider try­ing the ELK con­test (I am)

Holden KarnofskyJan 5, 2022, 7:42 PM
110 points
17 comments16 min readEA link

[Question] Should I force my­self to work on AGI al­ign­ment?

Isaac BensonAug 24, 2022, 5:25 PM
19 points
17 comments1 min readEA link

AI Re­search Con­sid­er­a­tions for Hu­man Ex­is­ten­tial Safety (ARCHES)

Andrew CritchMay 21, 2020, 6:55 AM
29 points
0 comments3 min readEA link
(acritch.com)

Thoughts on AGI or­ga­ni­za­tions and ca­pa­bil­ities work

RobBensingerDec 7, 2022, 7:46 PM
77 points
7 comments5 min readEA link

Hacker-AI and Digi­tal Ghosts – Pre-AGI

Erland WittkotterOct 19, 2022, 7:49 AM
4 points
0 comments1 min readEA link

Align­ing Recom­mender Sys­tems as Cause Area

IvanVendrovMay 8, 2019, 8:56 AM
150 points
48 comments13 min readEA link

An­nounc­ing the Har­vard AI Safety Team

Xander123Jun 30, 2022, 6:34 PM
128 points
4 comments5 min readEA link

[Question] Should AI writ­ers be pro­hibited in ed­u­ca­tion?

Eleni_AJan 16, 2023, 10:29 PM
3 points
2 comments1 min readEA link

[Question] If FTX is liqui­dated, who ends up con­trol­ling An­thropic?

OferNov 15, 2022, 3:04 PM
63 points
8 comments1 min readEA link

[Question] What are the coolest top­ics in AI safety, to a hope­lessly pure math­e­mat­i­cian?

Jenny K EMay 7, 2022, 7:18 AM
89 points
29 comments1 min readEA link

Three Im­pacts of Ma­chine Intelligence

Paul_ChristianoAug 23, 2013, 10:10 AM
33 points
5 comments8 min readEA link
(rationalaltruist.com)

Effec­tive Per­sua­sion For AI Align­ment Risk

Brian LuiAug 9, 2022, 11:55 PM
5 points
7 comments4 min readEA link

AI coöper­a­tion is more pos­si­ble than you think

423175Sep 24, 2022, 11:04 PM
2 points
0 comments1 min readEA link

Feed­back Re­quest on EA Philip­pines’ Ca­reer Ad­vice Re­search for Tech­ni­cal AI Safety

BrianTanOct 3, 2020, 10:39 AM
19 points
5 comments4 min readEA link

Long-term AI policy strat­egy re­search and implementation

Benjamin_ToddNov 9, 2021, 12:00 AM
1 point
0 comments7 min readEA link
(80000hours.org)

An­nual AGI Bench­mark­ing Event

MetaculusAug 26, 2022, 9:31 PM
20 points
2 comments2 min readEA link
(www.metaculus.com)

Hiring en­g­ineers and re­searchers to help al­ign GPT-3

Paul_ChristianoOct 1, 2020, 6:52 PM
107 points
19 comments3 min readEA link

Join the AI gov­er­nance and in­ter­pretabil­ity hackathons!

Esben KranMar 23, 2023, 2:39 PM
33 points
1 comment5 min readEA link
(alignmentjam.com)

[Question] Books and lec­ture se­ries rele­vant to AI gov­er­nance?

MichaelA🔸Jul 18, 2021, 3:54 PM
22 points
8 comments1 min readEA link

An­nounc­ing The Most Im­por­tant Cen­tury Writ­ing Prize

michelOct 31, 2022, 9:37 PM
48 points
0 comments2 min readEA link

New TIME mag­a­z­ine ar­ti­cle on the UK AI Safety In­sti­tute (AISI)

RasoolJan 16, 2025, 10:51 PM
9 points
0 comments1 min readEA link
(time.com)

[Question] What do you mean with ‘al­ign­ment is solv­able in prin­ci­ple’?

RemmeltJan 17, 2025, 3:03 PM
10 points
1 comment1 min readEA link

Descartes’ 17th cen­tury Tur­ing Test

James-Hartree-LawJan 16, 2025, 8:18 PM
3 points
0 comments7 min readEA link

AI Safety Protest, Melbourne, Aus­tralia

Mark BrownJan 17, 2025, 2:55 PM
2 points
0 comments1 min readEA link

AI for Re­solv­ing Fore­cast­ing Ques­tions: An Early Exploration

Ozzie GooenJan 16, 2025, 9:40 PM
21 points
0 comments9 min readEA link

Ex­plain­ing all the US semi­con­duc­tor ex­port controls

ZacRichardsonJan 17, 2025, 6:00 PM
16 points
1 comment9 min readEA link

Fact Check: 57% of the in­ter­net is NOT AI-gen­er­ated

James-Hartree-LawJan 17, 2025, 9:26 PM
1 point
0 comments1 min readEA link

The Man­hat­tan Trap: Why a Race to Ar­tifi­cial Su­per­in­tel­li­gence is Self-Defeating

Corin KatzkeJan 21, 2025, 4:57 PM
93 points
1 comment2 min readEA link
(www.convergenceanalysis.org)

Scal­ing Wargam­ing for Global Catas­trophic Risks with AI

raiJan 18, 2025, 3:07 PM
73 points
1 comment4 min readEA link
(blog.sentinel-team.org)

The sec­ond bit­ter les­son — there’s a fun­da­men­tal prob­lem with al­ign­ing AI

aelwoodJan 19, 2025, 6:48 PM
4 points
1 comment5 min readEA link
(pursuingreality.substack.com)

Free Ac­cess to Ad­vanced An­a­lyt­ics Plat­form for Com­bat­ting Disinformation

CarranzaJan 19, 2025, 3:13 AM
2 points
1 comment1 min readEA link

Wor­ries about la­tent rea­son­ing in LLMs

CBiddulphJan 20, 2025, 9:09 AM
20 points
1 comment1 min readEA link

Prepar­ing Effec­tive Altru­ism for an AI-Trans­formed World

Tobias HäberliJan 22, 2025, 8:50 AM
184 points
22 comments1 min readEA link

Train­ing Data At­tri­bu­tion: Ex­am­in­ing Its Adop­tion & Use Cases

Deric ChengJan 22, 2025, 3:40 PM
18 points
1 comment3 min readEA link
(www.convergenceanalysis.org)

Once More, Without Feel­ing (An­dreas Mo­gensen)

Global Priorities InstituteJan 21, 2025, 2:53 PM
32 points
1 comment2 min readEA link
(globalprioritiesinstitute.org)

[Question] Look­ing for Quick, Col­lab­o­ra­tive Sys­tems for Truth-Seek­ing in Group Disagreements

EffectiveAdvocate🔸Jan 21, 2025, 6:32 AM
10 points
1 comment1 min readEA link

Book Launch: The Mo­ral Cir­cle: Who Mat­ters, What Mat­ters, and Why

Sofia_FogelJan 21, 2025, 1:45 PM
30 points
0 comments1 min readEA link

Why AI Safety Camp strug­gles with fundrais­ing (FBB #2)

gergoJan 21, 2025, 5:25 PM
63 points
10 comments7 min readEA link

Google AI Ac­cel­er­a­tor Open Call

Rochelle HarrisJan 22, 2025, 4:50 PM
10 points
1 comment1 min readEA link

Sen­tinel Fund­ing Memo — Miti­gat­ing GCRs with Fore­cast­ing & Emer­gency Response

Saul MunnNov 6, 2024, 1:57 AM
47 points
5 comments13 min readEA link

Op­tion control

Joe_CarlsmithNov 4, 2024, 5:54 PM
11 points
0 comments1 min readEA link

Win­ning isn’t enough

Anthony DiGiovanniNov 5, 2024, 11:43 AM
31 points
3 comments1 min readEA link

No one has the ball on 1500 Rus­sian olympiad win­ners who’ve re­ceived HPMOR

MikhailSaminJan 23, 2025, 4:40 PM
32 points
10 comments1 min readEA link

[Question] What should I read about defin­ing AI “hal­lu­ci­na­tion?”

James-Hartree-LawJan 23, 2025, 1:00 AM
2 points
0 comments1 min readEA link

What are the differ­ences be­tween AGI, trans­for­ma­tive AI, and su­per­in­tel­li­gence?

Vishakha AgrawalJan 23, 2025, 10:11 AM
12 points
0 comments3 min readEA link
(aisafety.info)

Feed­back wanted! On script for an up­com­ing ~12 minute Rob Miles video on AI x-risk.

melissasamworthJan 23, 2025, 9:46 PM
25 points
0 comments1 min readEA link

Time to Think about ASI Con­sti­tu­tions?

ukc10014Jan 27, 2025, 9:28 AM
20 points
0 comments12 min readEA link

[Re­port] Bridg­ing the In­ter­na­tional AI Gover­nance Divide: Key Strate­gies for In­clud­ing the Global South

Heramb PodarJan 26, 2025, 11:55 PM
9 points
0 comments1 min readEA link
(encodeai.org)

AI Au­dit in Costa Rica

Priscilla CamposJan 27, 2025, 2:57 AM
10 points
4 comments9 min readEA link

Elec­tion by Jury: A Ne­glected Tar­get for Effec­tive Altruism

ClayShentrupJan 27, 2025, 7:27 AM
11 points
10 comments6 min readEA link

Are we try­ing to figure out if AI is con­scious?

kristapszJan 27, 2025, 1:22 PM
5 points
1 comment5 min readEA link

Stan­ford sum­mer course: Eco­nomics of Trans­for­ma­tive AI

trammellJan 23, 2025, 11:07 PM
81 points
4 comments1 min readEA link

In­finite Re­wards, Finite Safety: New Models for AI Mo­ti­va­tion Without In­finite Goals

Whylome TeamNov 12, 2024, 7:21 AM
−5 points
1 comment2 min readEA link

A se­lec­tion of les­sons from Se­bas­tian Lodemann

ClaireBNov 11, 2024, 9:33 PM
82 points
2 comments7 min readEA link

The King and the Golem—The Animation

WriterNov 8, 2024, 6:23 PM
50 points
1 comment1 min readEA link

Quan­tum Im­mor­tal­ity: A Per­spec­tive if AI Doomers are Prob­a­bly Right

turchinNov 7, 2024, 4:06 PM
7 points
0 comments1 min readEA link

Per­sonal AI Planning

Jeff Kaufman 🔸Nov 10, 2024, 2:10 PM
43 points
5 comments1 min readEA link

An­thropic teams up with Palan­tir and AWS to sell AI to defense customers

Matrice JacobineNov 9, 2024, 11:47 AM
26 points
1 comment2 min readEA link
(techcrunch.com)

Don’t Let Other Global Catas­trophic Risks Fall Be­hind: Sup­port ORCG in 2024

JorgeTorresCNov 11, 2024, 6:27 PM
48 points
1 comment4 min readEA link

Cut­ting AI Safety down to size

Holly Elmore ⏸️ 🔸Nov 9, 2024, 11:40 PM
86 points
5 comments5 min readEA link

Ex­plor­ing AI Safety through “Es­cape Ex­per­i­ment”: A Short Film on Su­per­in­tel­li­gence Risks

Gaetan_SelleNov 10, 2024, 4:42 AM
4 points
0 comments2 min readEA link

The Welfare of Digi­tal Minds: A Re­search Agenda

Derek ShillerNov 11, 2024, 12:58 PM
53 points
1 comment31 min readEA link

Ti­maeus is hiring re­searchers & engineers

Tatiana K. Nesic SkuratovaJan 27, 2025, 2:35 PM
19 points
0 comments4 min readEA link

Un­der­stand­ing AI World Models w/​ Chris Canal

Jacob-HaimesJan 27, 2025, 4:37 PM
5 points
0 comments1 min readEA link
(kairos.fm)

An­nounce­ment: Learn­ing The­ory On­line Course

YegregJan 28, 2025, 8:32 AM
5 points
0 comments3 min readEA link
(www.lesswrong.com)

Six Re­search Pit­falls and How to Avoid Them: a Guide for Re­search Managers

Morgan SimpsonJan 28, 2025, 9:49 AM
9 points
0 comments10 min readEA link

The Game Board has been Flipped: Now is a good time to re­think what you’re doing

LintzAJan 28, 2025, 9:20 PM
351 points
60 comments13 min readEA link

Fake think­ing and real thinking

Joe_CarlsmithJan 28, 2025, 8:05 PM
51 points
1 comment1 min readEA link
(joecarlsmith.substack.com)

[Question] Whose track record of AI pre­dic­tions would you like to see eval­u­ated?

Jonny Spicer 🔸Jan 29, 2025, 11:57 AM
10 points
13 comments1 min readEA link

AGI Can­not Be Pre­dicted From Real In­ter­est Rates

Nicholas DeckerJan 28, 2025, 5:45 PM
24 points
3 comments1 min readEA link
(nicholasdecker.substack.com)

[Question] Is it eth­i­cal to work in AI “con­tent eval­u­a­tion”?

anon_databoy555Jan 30, 2025, 1:27 PM
10 points
3 comments1 min readEA link

Talos Net­work needs your help in 2025

DavidConradNov 12, 2024, 9:26 AM
41 points
0 comments5 min readEA link

AMA: PauseAI US needs money! Ask founder/​Exec Dir Holly El­more any­thing for 11/​19

Holly Elmore ⏸️ 🔸Nov 11, 2024, 11:51 PM
98 points
57 comments4 min readEA link

Col­lege tech­ni­cal AI safety hackathon ret­ro­spec­tive—Ge­or­gia Tech

yixiongNov 14, 2024, 1:34 PM
18 points
0 comments5 min readEA link
(yixiong.substack.com)

In­cen­tive de­sign and ca­pa­bil­ity elicitation

Joe_CarlsmithNov 12, 2024, 8:56 PM
9 points
0 comments1 min readEA link

Com­par­ing AI Labs and Phar­ma­ceu­ti­cal Companies

mxschonsNov 13, 2024, 2:51 PM
13 points
0 comments1 min readEA link
(mxschons.com)

Grad­ual Disem­pow­er­ment: Sys­temic Ex­is­ten­tial Risks from In­cre­men­tal AI Development

Jan_KulveitJan 30, 2025, 5:07 PM
36 points
4 comments1 min readEA link
(gradual-disempowerment.ai)

Could ASI Have Ex­isted Since the Big Bang?

Aaron LiJan 31, 2025, 1:20 PM
−13 points
0 comments1 min readEA link

Re­in­force­ment Learn­ing: A Non-Tech­ni­cal Primer on o1 and Deep­Seek-R1

AlexChalkFeb 9, 2025, 11:58 PM
4 points
0 comments9 min readEA link
(alexchalk.net)

From Cri­sis to Con­trol: Estab­lish­ing a Re­silient In­ci­dent Re­sponse Frame­work for De­ployed AI Models

KevinNJan 31, 2025, 1:06 PM
10 points
1 comment6 min readEA link
(www.techpolicy.press)

Thoughts about Policy Ecosys­tems: The Miss­ing Links in AI Governance

Echo HuangJan 31, 2025, 1:23 PM
20 points
2 comments5 min readEA link

PSA: Say­ing “1 in 5” Is Bet­ter Than “20%” When In­form­ing about risks publicly

BlankaJan 30, 2025, 7:03 PM
17 points
1 comment1 min readEA link

AI and Non-Existence

Blue11Jan 31, 2025, 1:19 PM
4 points
0 comments2 min readEA link

Pro­posal for a Form of Con­di­tional Sup­ple­men­tal In­come (CSI) in a Post-Work World

Sean SweeneyJan 31, 2025, 1:00 AM
3 points
0 comments3 min readEA link

ARENA 5.0 - Call for Applicants

James HindmarchJan 31, 2025, 7:54 PM
9 points
0 comments6 min readEA link

Con­sider keep­ing your threat mod­els pri­vate.

Miles KodamaFeb 1, 2025, 12:29 AM
17 points
2 comments4 min readEA link

[Question] Do short AI timelines de­mand short Giv­ing timelines?

ScienceMon🔸Feb 1, 2025, 10:44 PM
12 points
5 comments1 min readEA link

Repli­cat­ing AI Debate

Anthony FlemingFeb 1, 2025, 11:19 PM
9 points
0 comments5 min readEA link

Na­tional Se­cu­rity Is Not In­ter­na­tional Se­cu­rity: A Cri­tique of AGI Realism

Conrad K.Feb 2, 2025, 5:04 PM
41 points
2 comments36 min readEA link
(conradkunadu.substack.com)

Tether­ware #1: The case for hu­man­like AI with free will

Jáchym FibírJan 30, 2025, 11:57 AM
−1 points
2 comments10 min readEA link
(tetherware.substack.com)

[Question] Are you liv­ing in ac­cor­dance with your stated AI timelines?

CyrilBFeb 3, 2025, 5:19 PM
7 points
3 comments1 min readEA link

Linkpost: “Imag­in­ing and build­ing wise ma­chines: The cen­tral­ity of AI metacog­ni­tion” by John­son, Karimi, Ben­gio, et al.

Chris LeongNov 17, 2024, 3:00 PM
8 points
0 comments1 min readEA link
(arxiv.org)

The Hu­man Biolog­i­cal Ad­van­tage Over AI

William StewartNov 18, 2024, 11:18 AM
−1 points
0 comments1 min readEA link

LLMs are weirder than you think

Derek ShillerNov 20, 2024, 1:39 PM
61 points
3 comments22 min readEA link

US gov­ern­ment com­mis­sion pushes Man­hat­tan Pro­ject-style AI initiative

LarksNov 19, 2024, 4:22 PM
83 points
15 comments1 min readEA link
(www.reuters.com)

OpenAI’s CBRN tests seem unclear

Luca Righetti 🔸Nov 21, 2024, 5:26 PM
82 points
3 comments7 min readEA link

Align­ing AI Safety Pro­jects with a Repub­li­can Administration

Deric ChengNov 21, 2024, 10:13 PM
13 points
1 comment8 min readEA link

LLM chat­bots have ~half of the kinds of “con­scious­ness” that hu­mans be­lieve in. Hu­mans should avoid go­ing crazy about that.

Andrew CritchNov 22, 2024, 3:26 AM
11 points
3 comments1 min readEA link

#208 – The case that TV shows, movies, and nov­els can im­prove the world (Eliz­a­beth Cox on The 80,000 Hours Pod­cast)

80000_HoursNov 22, 2024, 11:36 AM
10 points
0 comments17 min readEA link

[Question] Seek­ing Tan­gible Ex­am­ples of AI Catastrophes

clifford.banesNov 25, 2024, 7:55 AM
9 points
2 comments1 min readEA link

The U.S. Na­tional Se­cu­rity State is Here to Make AI Even Less Trans­par­ent and Accountable

Matrice JacobineNov 24, 2024, 9:34 AM
7 points
0 comments2 min readEA link
(www.eff.org)

Gw­ern on cre­at­ing your own AI race and China’s Fast Fol­lower strat­egy.

LarksNov 25, 2024, 3:01 AM
126 points
4 comments2 min readEA link
(www.lesswrong.com)

The An­i­mal Welfare Case for Open Ac­cess: Break­ing Bar­ri­ers to Scien­tific Knowl­edge and En­hanc­ing LLM Training

Wladimir J. AlonsoNov 23, 2024, 1:07 PM
32 points
2 comments3 min readEA link

[Question] Launch­ing Ap­pli­ca­tions for the Global AI Safety Fel­low­ship 2025!

Impact AcademyNov 27, 2024, 3:33 PM
9 points
1 comment1 min readEA link

#209 – OpenAI’s gam­bit to ditch its non­profit (Rose Chan Loui on The 80,000 Hours Pod­cast)

80000_HoursNov 27, 2024, 8:43 PM
22 points
0 comments17 min readEA link

Ap­ply for ARBOx: an ML safety in­ten­sive [dead­line 13 Dec ’24]

Nick MarshDec 1, 2024, 6:13 PM
20 points
0 comments1 min readEA link

Cog­ni­tive Bi­ases Con­tribut­ing to AI X-risk — a deleted ex­cerpt from my 2018 ARCHES draft

Andrew CritchDec 3, 2024, 9:29 AM
14 points
1 comment1 min readEA link

[Question] Is it a fed­eral crime in the US to de­velop AGI that may cause hu­man ex­tinc­tion?

OferDec 4, 2024, 2:38 PM
15 points
6 comments1 min readEA link

State of EA Poland and fund­ing opportunity

Chris SzulcDec 7, 2024, 8:48 AM
72 points
4 comments11 min readEA link

Liti­gate-for-Im­pact: Prepar­ing Le­gal Ac­tion against an AGI Fron­tier Lab Leader

Sonia M JosephDec 8, 2024, 2:28 PM
77 points
1 comment2 min readEA link

166 States Vote to Adopt Lethal Au­tonomous Weapons Re­s­olu­tion at the UNGA

Heramb PodarDec 8, 2024, 9:23 PM
14 points
0 comments1 min readEA link

Longter­mism bet­ter from a de­vel­op­ment skep­ti­cal stance?

Benevolent_RainDec 9, 2024, 12:16 PM
16 points
2 comments1 min readEA link

Sur­vey: How Do Elite Chi­nese Stu­dents Feel About the Risks of AI?

Nick CorvinoSep 2, 2024, 9:14 AM
107 points
9 comments10 min readEA link

List of AI safety courses and resources

Daniel del CastilloSep 6, 2021, 2:26 PM
51 points
8 comments1 min readEA link

New se­ries of posts an­swer­ing one of Holden’s “Im­por­tant, ac­tion­able re­search ques­tions”

Evan R. MurphyMay 12, 2022, 9:22 PM
9 points
0 comments1 min readEA link

[Question] Do EA folks think that a path to zero AGI de­vel­op­ment is fea­si­ble or worth­while for safety from AI?

Noah ScalesJul 17, 2022, 8:47 AM
8 points
3 comments1 min readEA link

[DISC] Are Values Ro­bust?

𝕮𝖎𝖓𝖊𝖗𝖆Dec 21, 2022, 1:13 AM
4 points
0 comments1 min readEA link

[Question] AI Safety Pitches post ChatGPT

ojorgensenDec 5, 2022, 10:48 PM
6 points
2 comments1 min readEA link

[Question] Clos­ing the Feed­back Loop on AI Safety Re­search.

Ben.HartleyJul 29, 2022, 9:46 PM
3 points
4 comments1 min readEA link

Ber­lin AI Safety Open Meetup July 2022

Isidor RegenfußJul 22, 2022, 4:26 PM
1 point
0 comments1 min readEA link

Pod­cast: Tam­era Lan­ham on AI risk, threat mod­els, al­ign­ment pro­pos­als, ex­ter­nal­ized rea­son­ing over­sight, and work­ing at Anthropic

AkashDec 20, 2022, 9:39 PM
14 points
1 comment1 min readEA link

In­tro­duc­ing spirit hazards

brb243May 27, 2022, 10:16 PM
9 points
2 comments2 min readEA link

You won’t solve al­ign­ment with­out agent foundations

MikhailSaminNov 6, 2022, 8:07 AM
14 points
0 comments1 min readEA link

Is in­ter­est in al­ign­ment worth men­tion­ing for grad school ap­pli­ca­tions?

Franziska FischerOct 16, 2022, 4:50 AM
5 points
4 comments1 min readEA link

AGI Timelines in Gover­nance: Differ­ent Strate­gies for Differ­ent Timeframes

simeon_cDec 19, 2022, 9:31 PM
110 points
19 comments1 min readEA link

The In­ter­gov­ern­men­tal Panel On Global Catas­trophic Risks (IPGCR)

DannyBresslerFeb 1, 2024, 5:36 PM
46 points
9 comments19 min readEA link

Ex­pected im­pact of a ca­reer in AI safety un­der differ­ent opinions

Jordan TaylorJun 14, 2022, 2:25 PM
42 points
16 comments11 min readEA link

Sce­nario Map­ping Ad­vanced AI Risk: Re­quest for Par­ti­ci­pa­tion with Data Collection

KiliankMar 27, 2022, 11:44 AM
14 points
0 comments5 min readEA link

Why some peo­ple be­lieve in AGI, but I don’t.

cveresOct 26, 2022, 3:09 AM
13 points
2 comments4 min readEA link

Pre­sump­tive Listen­ing: stick­ing to fa­mil­iar con­cepts and miss­ing the outer rea­son­ing paths

RemmeltDec 27, 2022, 3:40 PM
3 points
0 comments1 min readEA link

aisafety.com­mu­nity—A liv­ing doc­u­ment of AI safety communities

zeshenOct 20, 2022, 10:08 PM
24 points
13 comments1 min readEA link

Perform Tractable Re­search While Avoid­ing Ca­pa­bil­ities Ex­ter­nal­ities [Prag­matic AI Safety #4]

TW123May 30, 2022, 8:37 PM
33 points
1 comment25 min readEA link

Re­silience Via Frag­mented Power

steve6320Jul 14, 2022, 3:37 PM
2 points
0 comments6 min readEA link

Ques­tions about AI that bother me

Eleni_AJan 31, 2023, 6:50 AM
33 points
6 comments2 min readEA link

Ap­ply to at­tend a Global Challenges Pro­ject work­shop in 2025!

Liam 🔸Dec 10, 2024, 11:48 AM
13 points
1 comment2 min readEA link

Re­search + Real­ity Graph­ing to Sup­port AI Policy (and more): Sum­mary of a Frozen Project

Marcel DJul 2, 2022, 8:58 PM
34 points
2 comments8 min readEA link

Join ASAP (AI Safety Ac­countabil­ity Pro­gramme)

TheMcDouglasSep 10, 2022, 11:15 AM
54 points
20 comments3 min readEA link

What role should evolu­tion­ary analo­gies play in un­der­stand­ing AI take­off speeds?

ansonDec 11, 2021, 1:16 AM
12 points
0 comments42 min readEA link

Estab­lish­ing Oxford’s AI Safety Stu­dent Group: Les­sons Learnt and Our Model

Wilkin1234Sep 21, 2022, 7:57 AM
72 points
3 comments1 min readEA link

Open Prob­lems in AI X-Risk [PAIS #5]

TW123Jun 10, 2022, 2:22 AM
44 points
1 comment36 min readEA link

[Question] How long does it take to understand AI X-Risk from scratch so that I have a confident, clear mental model of it from first principles?

Jordan ArelJul 27, 2022, 4:58 PM
29 points
6 comments1 min readEA link

Reflec­tions on the PIBBSS Fel­low­ship 2022

noraDec 11, 2022, 10:03 PM
69 points
4 comments18 min readEA link

Su­per­in­tel­li­gent AI is nec­es­sary for an amaz­ing fu­ture, but far from sufficient

So8resOct 31, 2022, 9:16 PM
35 points
5 comments1 min readEA link

4 Key As­sump­tions in AI Safety

PrometheusNov 7, 2022, 10:50 AM
5 points
0 comments1 min readEA link

[Question] Please Share Your Per­spec­tives on the De­gree of So­cietal Im­pact from Trans­for­ma­tive AI Outcomes

KiliankApr 15, 2022, 1:23 AM
3 points
3 comments1 min readEA link

Should AI fo­cus on prob­lem-solv­ing or strate­gic plan­ning? Why not both?

oliver_siegelNov 1, 2022, 9:53 AM
1 point
0 comments1 min readEA link

Data col­lec­tion for AI al­ign­ment—Ca­reer review

Benjamin HiltonJun 3, 2022, 11:44 AM
34 points
1 comment5 min readEA link
(80000hours.org)

Fund­ing for hu­man­i­tar­ian non-prof­its to re­search re­spon­si­ble AI

Deborah W.A. FoulkesDec 10, 2024, 8:08 AM
4 points
0 comments2 min readEA link
(www.gov.uk)

Univer­sity com­mu­nity build­ing seems like the wrong model for AI safety

George StiffmanFeb 26, 2022, 6:23 AM
24 points
8 comments2 min readEA link

A mod­est case for hope

xavier rgOct 17, 2022, 6:03 AM
28 points
0 comments1 min readEA link

Safety timelines: How long will it take to solve al­ign­ment?

Esben KranSep 19, 2022, 12:51 PM
45 points
9 comments6 min readEA link

[Question] Book recom­men­da­tions for the his­tory of ML?

Eleni_ADec 28, 2022, 11:45 PM
10 points
4 comments1 min readEA link

[Job]: AI Stan­dards Devel­op­ment Re­search Assistant

Tony BarrettOct 14, 2022, 8:18 PM
13 points
0 comments2 min readEA link

#177 – Re­cent AI break­throughs and nav­i­gat­ing the grow­ing rift be­tween AI safety and ac­cel­er­a­tionist camps (Nathan Labenz on the 80,000 Hours Pod­cast)

80000_HoursJan 31, 2024, 7:37 PM
15 points
0 comments16 min readEA link

AI Safety Ex­ec­u­tive Summary

Sean OsierSep 6, 2022, 8:26 AM
20 points
2 comments5 min readEA link
(seanosier.notion.site)

Win­ners of the AI Safety Nudge Competition

Marc CarauleanuNov 15, 2022, 1:06 AM
22 points
0 comments1 min readEA link

(Re­port) Eval­u­at­ing Taiwan’s Tac­tics to Safe­guard its Semi­con­duc­tor As­sets Against a Chi­nese Invasion

YadavDec 7, 2023, 12:01 AM
16 points
0 comments22 min readEA link
(bristolaisafety.org)

ML Safety Schol­ars Sum­mer 2022 Retrospective

TW123Nov 1, 2022, 3:09 AM
56 points
2 comments21 min readEA link

Tech­ni­cal AI safety in the United Arab Emirates

ea nyuadJun 21, 2022, 3:11 AM
10 points
0 comments11 min readEA link

Prin­ci­ples for the AGI Race

William_SAug 30, 2024, 2:30 PM
81 points
4 comments18 min readEA link

[Question] A dataset for AI/​su­per­in­tel­li­gence sto­ries and other me­dia?

Marcel DMar 29, 2022, 9:41 PM
20 points
2 comments1 min readEA link

Safety with­out op­pres­sion: an AI gov­er­nance problem

Nathan_BarnardJul 28, 2022, 10:19 AM
3 points
0 comments8 min readEA link

Ac­tion­able-guidance and roadmap recom­men­da­tions for the NIST AI Risk Man­age­ment Framework

Tony BarrettMay 17, 2022, 3:27 PM
11 points
0 comments3 min readEA link

Fol­low along with Columbia EA’s Ad­vanced AI Safety Fel­low­ship!

RohanSJul 2, 2022, 6:07 AM
27 points
0 comments2 min readEA link

AI Safety Career Bottlenecks Survey Responses

Linda LinseforsMay 28, 2021, 10:41 AM
35 points
1 comment5 min readEA link

My sum­mary of “Prag­matic AI Safety”

Eleni_ANov 5, 2022, 2:47 PM
14 points
0 comments5 min readEA link

Three sce­nar­ios of pseudo-al­ign­ment

Eleni_ASep 5, 2022, 8:26 PM
7 points
0 comments3 min readEA link

A Ma­jor Flaw in SP1047 re APTs and So­phis­ti­cated Threat Actors

CarusoAug 30, 2024, 2:11 PM
0 points
6 comments3 min readEA link

Com­plex Sys­tems for AI Safety [Prag­matic AI Safety #3]

TW123May 24, 2022, 12:04 AM
49 points
6 comments21 min readEA link

[Question] Up­dates on FLI’S Value Align­ment Map?

QubitSwarm99Sep 19, 2022, 12:25 AM
8 points
0 comments1 min readEA link

AGI Bat­tle Royale: Why “slow takeover” sce­nar­ios de­volve into a chaotic multi-AGI fight to the death

titotalSep 22, 2022, 3:00 PM
49 points
11 comments15 min readEA link

Brain­storm of things that could force an AI team to burn their lead

So8resJul 25, 2022, 12:00 AM
26 points
1 comment13 min readEA link

[Question] Is there any re­search or fore­casts of how likely AI Align­ment is go­ing to be a hard vs. easy prob­lem rel­a­tive to ca­pa­bil­ities?

Jordan ArelAug 14, 2022, 3:58 PM
8 points
1 comment1 min readEA link

The Limit of Lan­guage Models

𝕮𝖎𝖓𝖊𝖗𝖆Dec 26, 2022, 11:17 AM
10 points
0 comments1 min readEA link

(My sug­ges­tions) On Begin­ner Steps in AI Alignment

Joseph BloomSep 22, 2022, 3:32 PM
36 points
3 comments9 min readEA link

Zvi on: A Play­book for AI Policy at the Man­hat­tan Institute

PhibAug 4, 2024, 9:34 PM
9 points
1 comment7 min readEA link
(thezvi.substack.com)

FYI: I’m work­ing on a book about the threat of AGI/​ASI for a gen­eral au­di­ence. I hope it will be of value to the cause and the community

Darren McKeeJun 17, 2022, 11:52 AM
32 points
1 comment2 min readEA link

Chris Olah on what the hell is go­ing on in­side neu­ral networks

80000_HoursAug 4, 2021, 3:13 PM
5 points
0 comments133 min readEA link

A challenge for AGI or­ga­ni­za­tions, and a challenge for readers

RobBensingerDec 1, 2022, 11:11 PM
172 points
13 comments1 min readEA link

Re­sources that (I think) new al­ign­ment re­searchers should know about

AkashOct 28, 2022, 10:13 PM
20 points
2 comments1 min readEA link

Co­op­er­a­tion and Align­ment in Del­e­ga­tion Games: You Need Both!

Oliver SourbutAug 3, 2024, 10:16 AM
4 points
1 comment1 min readEA link
(www.oliversourbut.net)

Ap­ply for the ML Win­ter Camp in Cam­bridge, UK [2-10 Jan]

Nathan_BarnardDec 2, 2022, 7:33 PM
50 points
11 comments2 min readEA link

Distil­la­tion of “How Likely is De­cep­tive Align­ment?”

NickGabsDec 1, 2022, 8:22 PM
10 points
1 comment10 min readEA link

[Question] Benefits/​Risks of Scott Aaron­son’s Ortho­dox/​Re­form Fram­ing for AI Alignment

JeremyNov 21, 2022, 5:47 PM
15 points
5 comments1 min readEA link
(scottaaronson.blog)

*New* Canada AI Safety & Gover­nance community

Wyatt Tessari L'AlliéAug 29, 2022, 3:58 PM
32 points
2 comments1 min readEA link

How Josiah be­came an AI safety researcher

Neil CrawfordMar 29, 2022, 7:47 PM
10 points
0 comments1 min readEA link

I’m In­ter­view­ing Kat Woods, EA Pow­er­house. What Should I Ask?

SereneDesireeSep 20, 2022, 9:49 AM
4 points
2 comments1 min readEA link

Grokking “Fore­cast­ing TAI with biolog­i­cal an­chors”

ansonJun 6, 2022, 6:56 PM
43 points
0 comments14 min readEA link

Strate­gic Direc­tions for a Digi­tal Con­scious­ness Model

Derek ShillerDec 10, 2024, 7:33 PM
41 points
1 comment12 min readEA link

[Question] I’m in­ter­view­ing pro­lific AI safety re­searcher Richard Ngo (now at OpenAI and pre­vi­ously Deep­Mind). What should I ask him?

Robert_WiblinSep 29, 2022, 12:00 AM
45 points
11 comments1 min readEA link

AI Safety in a Vuln­er­a­ble World: Re­quest­ing Feed­back on Pre­limi­nary Thoughts

Jordan ArelDec 6, 2022, 10:36 PM
5 points
4 comments3 min readEA link

Fermi es­ti­ma­tion of the im­pact you might have work­ing on AI safety

fribMay 13, 2022, 1:30 PM
24 points
13 comments1 min readEA link

ARIA is look­ing for top­ics for roundtables

Nathan_BarnardAug 26, 2022, 7:14 PM
34 points
11 comments1 min readEA link

What is the role of Bayesian ML for AI al­ign­ment/​safety?

mariushobbhahnJan 11, 2022, 8:07 AM
39 points
6 comments3 min readEA link

Take­aways from a sur­vey on AI al­ign­ment resources

DanielFilanNov 5, 2022, 11:45 PM
20 points
9 comments6 min readEA link
(www.lesswrong.com)

Let’s think about slow­ing down AI

Katja_GraceDec 23, 2022, 7:56 PM
334 points
9 comments1 min readEA link

[Question] What does the Pro­ject Man­age­ment role look like in AI safety?

gvstMay 14, 2022, 7:29 PM
10 points
1 comment1 min readEA link

The Po­ten­tial Im­pact of AI in An­i­mal Ad­vo­cacy & The Need For More Fund­ing In This Space

Sam TuckerFeb 1, 2024, 12:43 PM
10 points
0 comments5 min readEA link

In­for­ma­tion se­cu­rity con­sid­er­a­tions for AI and the long term future

Jeffrey LadishMay 2, 2022, 8:53 PM
134 points
8 comments11 min readEA link

How to be­come an AI safety researcher

peterbarnettApr 12, 2022, 11:33 AM
113 points
15 comments14 min readEA link

Social scientists interested in AI safety should consider doing direct technical AI safety research (possibly meta-research), or governance, support roles, or community building instead

Vael GatesJul 20, 2022, 11:01 PM
65 points
8 comments18 min readEA link

Which AI Safety Org to Join?

Yonatan CaleOct 11, 2022, 7:42 PM
17 points
21 comments1 min readEA link

AI ac­cel­er­a­tion from a safety per­spec­tive: Trade-offs and con­sid­er­a­tions

mariushobbhahnJan 19, 2022, 9:44 AM
12 points
1 comment7 min readEA link

Data Publi­ca­tion for the 2021 Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey

Janet PauketatMar 24, 2022, 3:43 PM
21 points
0 comments3 min readEA link
(www.sentienceinstitute.org)

Why I’m Scep­ti­cal of Foom

𝕮𝖎𝖓𝖊𝖗𝖆Dec 8, 2022, 10:01 AM
22 points
7 comments1 min readEA link

Is GPT3 a Good Ra­tion­al­ist? - In­struc­tGPT3 [2/​2]

simeon_cApr 7, 2022, 1:54 PM
25 points
0 comments7 min readEA link

Against Agents as an Ap­proach to Aligned Trans­for­ma­tive AI

𝕮𝖎𝖓𝖊𝖗𝖆Dec 27, 2022, 12:47 AM
4 points
0 comments1 min readEA link

What AI Safety Ma­te­ri­als Do ML Re­searchers Find Com­pel­ling?

Vael GatesDec 28, 2022, 2:03 AM
130 points
12 comments1 min readEA link

Mak­ing of #IAN

kirchner.janAug 29, 2021, 4:24 PM
9 points
0 comments1 min readEA link
(universalprior.substack.com)

Let’s Talk About Emergence

Jacob-HaimesJun 7, 2024, 7:34 PM
8 points
1 comment7 min readEA link
(www.odysseaninstitute.org)

My (naive) take on Risks from Learned Optimization

Artyom KNov 6, 2022, 4:25 PM
5 points
0 comments1 min readEA link

Prizes for ML Safety Bench­mark Ideas

JoshcOct 28, 2022, 2:44 AM
56 points
8 comments1 min readEA link

Devel­op­ing a Calcu­la­ble Con­science for AI: Equa­tion for Rights Violations

Sean SweeneyDec 12, 2024, 5:50 PM
4 points
1 comment15 min readEA link

AI Safety Un­con­fer­ence NeurIPS 2022

Orpheus_LummisNov 7, 2022, 3:39 PM
13 points
5 comments1 min readEA link
(aisafetyevents.org)

Si­mu­la­tors and Mindcrime

𝕮𝖎𝖓𝖊𝖗𝖆Dec 9, 2022, 3:20 PM
1 point
0 comments1 min readEA link

Pitch­ing AI Safety in 3 sentences

PabloAMC 🔸Mar 30, 2022, 6:50 PM
7 points
0 comments1 min readEA link

Loss of con­trol of AI is not a likely source of AI x-risk

squekNov 9, 2022, 5:48 AM
8 points
0 comments1 min readEA link

When is AI safety re­search harm­ful?

Nathan_BarnardMay 9, 2022, 10:36 AM
13 points
6 comments9 min readEA link

Les­sons from Three Mile Is­land for AI Warn­ing Shots

NickGabsSep 26, 2022, 2:47 AM
42 points
0 comments15 min readEA link

My per­sonal cruxes for work­ing on AI safety

BuckFeb 13, 2020, 7:11 AM
136 points
35 comments44 min readEA link

An­nounc­ing the Cam­bridge Bos­ton Align­ment Ini­ti­a­tive [Hiring!]

kuhanjDec 2, 2022, 1:07 AM
83 points
0 comments1 min readEA link

A tough ca­reer decision

PabloAMC 🔸Apr 9, 2022, 12:46 AM
68 points
13 comments4 min readEA link

Con­tribute by fa­cil­i­tat­ing the AGI Safety Fun­da­men­tals Programme

Jamie BDec 6, 2021, 11:50 AM
27 points
0 comments2 min readEA link

“AGI timelines: ig­nore the so­cial fac­tor at their peril” (Fu­ture Fund AI Wor­ld­view Prize sub­mis­sion)

ketanramaNov 5, 2022, 5:45 PM
10 points
0 comments12 min readEA link
(trevorklee.substack.com)

I made an AI safety fel­low­ship. What I wish I knew.

RubenCastaingJun 9, 2024, 4:32 PM
14 points
1 comment2 min readEA link

[Linkpost] “Blueprint for an AI Bill of Rights”—Office of Science and Tech­nol­ogy Policy, USA (2022)

QubitSwarm99Oct 5, 2022, 4:48 PM
15 points
0 comments1 min readEA link

AI Safety For Dum­mies (Like Me)

Madhav MalhotraAug 24, 2022, 8:26 PM
22 points
7 comments20 min readEA link

Stress Ex­ter­nal­ities More in AI Safety Pitches

NickGabsSep 26, 2022, 8:31 PM
31 points
9 comments2 min readEA link

Overview of Trans­for­ma­tive AI Mi­suse Risks

SammyDMartinDec 11, 2024, 11:04 AM
12 points
0 comments2 min readEA link
(longtermrisk.org)

Anal­y­sis of Global AI Gover­nance Strategies

SammyDMartinDec 11, 2024, 11:08 AM
23 points
0 comments1 min readEA link
(www.lesswrong.com)

AI Safety Ideas: A col­lab­o­ra­tive AI safety re­search platform

Apart ResearchOct 17, 2022, 5:01 PM
67 points
13 comments4 min readEA link

Ter­minol­ogy sug­ges­tion: stan­dard­ize terms for prob­a­bil­ity ranges

Egg SyntaxAug 30, 2024, 4:05 PM
2 points
0 comments1 min readEA link

[Question] Why does (any par­tic­u­lar) AI safety work re­duce s-risks more than it in­creases them?

MichaelStJulesOct 3, 2021, 4:55 PM
48 points
19 comments1 min readEA link

Refer the Co­op­er­a­tive AI Foun­da­tion’s New COO, Re­ceive $5000

Lewis HammondJun 16, 2022, 1:27 PM
42 points
0 comments3 min readEA link

[Question] Does China have AI al­ign­ment re­sources/​in­sti­tu­tions? How can we pri­ori­tize cre­at­ing more?

JakubKAug 4, 2022, 7:23 PM
18 points
9 comments1 min readEA link

Con­crete Ad­vice for Form­ing In­side Views on AI Safety

Neel NandaAug 17, 2022, 11:26 PM
58 points
4 comments10 min readEA link
(www.alignmentforum.org)

Ar­tifi­cial In­tel­li­gence, Mo­ral­ity, and Sen­tience (AIMS) Sur­vey: 2021

Janet PauketatJul 1, 2022, 7:47 AM
36 points
0 comments2 min readEA link
(www.sentienceinstitute.org)

Newslet­ter for Align­ment Re­search: The ML Safety Updates

Esben KranOct 22, 2022, 4:17 PM
30 points
0 comments7 min readEA link

Math­e­mat­i­cal Cir­cuits in Neu­ral Networks

Sean OsierSep 22, 2022, 2:32 AM
23 points
2 comments1 min readEA link
(www.youtube.com)

All AGI Safety ques­tions wel­come (es­pe­cially ba­sic ones) [~monthly thread]

robertskmilesNov 1, 2022, 11:21 PM
75 points
83 comments1 min readEA link

A Quick List of Some Prob­lems in AI Align­ment As A Field

Nicholas / Heather KrossJun 21, 2022, 5:09 PM
16 points
10 comments6 min readEA link
(www.thinkingmuchbetter.com)

Beg­ging, Plead­ing AI Orgs to Com­ment on NIST AI Risk Man­age­ment Framework

BridgesApr 15, 2022, 7:35 PM
87 points
3 comments2 min readEA link

[Question] AI Eth­i­cal Committee

eaaicommitteeMar 1, 2022, 11:35 PM
8 points
0 comments1 min readEA link

An­nounc­ing AI Align­ment Awards: $100k re­search con­tests about goal mis­gen­er­al­iza­tion & corrigibility

AkashNov 22, 2022, 10:19 PM
60 points
1 comment1 min readEA link

Good Fu­tures Ini­ti­a­tive: Win­ter Pro­ject In­tern­ship

a_e_rNov 27, 2022, 11:27 PM
67 points
7 comments3 min readEA link

AI Safety Endgame Stories

IvanVendrovSep 28, 2022, 5:12 PM
31 points
1 comment1 min readEA link

A Cri­tique of AI Takeover Scenarios

James FodorAug 31, 2022, 1:49 PM
53 points
4 comments12 min readEA link

Ad­vice on Pur­su­ing Tech­ni­cal AI Safety Research

frances_lorenzMay 31, 2022, 5:48 PM
29 points
2 comments4 min readEA link

[Question] What Do AI Safety Pitches Not Get About Your Field?

a_e_rSep 20, 2022, 6:13 PM
70 points
18 comments1 min readEA link

UK AI Policy Re­port: Con­tent, Sum­mary, and its Im­pact on EA Cause Areas

Algo_LawJul 21, 2022, 5:32 PM
9 points
1 comment9 min readEA link

Im­proved Se­cu­rity to Prevent Hacker-AI and Digi­tal Ghosts

Erland WittkotterOct 21, 2022, 10:11 AM
1 point
0 comments1 min readEA link

“Clean” vs. “messy” goal-di­rect­ed­ness (Sec­tion 2.2.3 of “Schem­ing AIs”)

Joe_CarlsmithNov 29, 2023, 4:32 PM
7 points
0 comments1 min readEA link

Why The Fo­cus on Ex­pected Utility Max­imisers?

𝕮𝖎𝖓𝖊𝖗𝖆Dec 27, 2022, 3:51 PM
11 points
1 comment1 min readEA link

Con­sider try­ing Vivek Heb­bar’s al­ign­ment exercises

AkashOct 24, 2022, 7:46 PM
16 points
0 comments1 min readEA link

Con­crete ac­tions to im­prove AI gov­er­nance: the be­havi­our sci­ence approach

Alexander SaeriDec 1, 2022, 9:34 PM
31 points
0 comments11 min readEA link

Es­ti­mat­ing the Cur­rent and Fu­ture Num­ber of AI Safety Researchers

Stephen McAleeseSep 28, 2022, 8:58 PM
64 points
34 comments9 min readEA link

What I’m doing

Chris LeongJul 19, 2022, 11:31 AM
28 points
0 comments4 min readEA link

Prov­ably Hon­est—A First Step

Srijanak DeNov 5, 2022, 9:49 PM
1 point
0 comments1 min readEA link

Align­ing AI with Hu­mans by Lev­er­ag­ing Le­gal Informatics

johnjnaySep 18, 2022, 7:43 AM
20 points
11 comments3 min readEA link

What are the “no free lunch” the­o­rems?

Vishakha AgrawalFeb 4, 2025, 2:02 AM
3 points
0 comments1 min readEA link
(aisafety.info)

Part 1: The AI Safety com­mu­nity has four main work groups, Strat­egy, Gover­nance, Tech­ni­cal and Move­ment Building

PeterSlatteryNov 25, 2022, 3:45 AM
72 points
7 comments6 min readEA link

Pivotal out­comes and pivotal processes

Andrew CritchJun 17, 2022, 11:43 PM
49 points
1 comment4 min readEA link

Dis­cov­er­ing Lan­guage Model Be­hav­iors with Model-Writ­ten Evaluations

evhubDec 20, 2022, 8:09 PM
25 points
0 comments1 min readEA link

How could we know that an AGI sys­tem will have good con­se­quences?

So8resNov 7, 2022, 10:42 PM
25 points
0 comments1 min readEA link

In­tro to Safety Engineering

Madhav MalhotraOct 19, 2022, 11:44 PM
4 points
0 comments1 min readEA link

[Link post] Promis­ing Paths to Align­ment—Con­nor Leahy | Talk

frances_lorenzMay 14, 2022, 3:58 PM
17 points
0 comments1 min readEA link

AI Safety Overview: CERI Sum­mer Re­search Fellowship

Jamie BMar 24, 2022, 3:12 PM
29 points
0 comments2 min readEA link

AI safety uni­ver­sity groups: a promis­ing op­por­tu­nity to re­duce ex­is­ten­tial risk

micJun 30, 2022, 6:37 PM
53 points
1 comment11 min readEA link

An­nounc­ing the AIPoli­cyIdeas.com Database

abiolveraJun 23, 2023, 4:09 PM
50 points
3 comments2 min readEA link
(www.aipolicyideas.com)

My Most Likely Rea­son to Die Young is AI X-Risk

AISafetyIsNotLongtermistJul 4, 2022, 3:34 PM
237 points
62 comments4 min readEA link
(www.lesswrong.com)

In­fer­ence-Only De­bate Ex­per­i­ments Us­ing Math Problems

Arjun PanicksseryAug 6, 2024, 5:44 PM
3 points
1 comment1 min readEA link

Three new re­ports re­view­ing re­search and con­cepts in ad­vanced AI governance

MMMaasNov 28, 2023, 9:21 AM
32 points
0 comments2 min readEA link
(www.legalpriorities.org)

“In­tro to brain-like-AGI safety” se­ries—just finished!

Steven ByrnesMay 17, 2022, 3:35 PM
15 points
0 comments1 min readEA link

Four rea­sons I find AI safety emo­tion­ally compelling

Kat WoodsJun 28, 2022, 2:01 PM
32 points
5 comments4 min readEA link

Half-baked ideas thread (EA /​ AI Safety)

Aryeh EnglanderJun 23, 2022, 4:05 PM
21 points
8 comments1 min readEA link

AGI will ar­rive by the end of this decade ei­ther as a uni­corn or as a black swan

Yuri BarzovOct 21, 2022, 10:50 AM
−4 points
7 comments3 min readEA link

A vi­su­al­iza­tion of some orgs in the AI Safety Pipeline

Aaron_ScherApr 10, 2022, 4:52 PM
11 points
8 comments1 min readEA link

Meta: Fron­tier AI Framework

Zach Stein-PerlmanFeb 3, 2025, 10:00 PM
23 points
0 comments1 min readEA link
(ai.meta.com)

[Question] Best in­tro­duc­tory overviews of AGI safety?

JakubKDec 13, 2022, 7:04 PM
21 points
8 comments2 min readEA link
(www.lesswrong.com)

[Question] Is it valuable to the field of AI Safety to have a neu­ro­science back­ground?

Samuel NellessenApr 3, 2022, 7:44 PM
18 points
3 comments1 min readEA link

NeurIPS ML Safety Work­shop 2022

Dan HJul 26, 2022, 3:33 PM
72 points
0 comments1 min readEA link
(neurips2022.mlsafety.org)

#173 – Digi­tal minds, and how to avoid sleep­walk­ing into a ma­jor moral catas­tro­phe (Jeff Sebo on the 80,000 Hours Pod­cast)

80000_HoursNov 29, 2023, 7:18 PM
43 points
0 comments18 min readEA link

Rac­ing through a minefield: the AI de­ploy­ment problem

Holden KarnofskyDec 31, 2022, 9:44 PM
79 points
1 comment13 min readEA link
(www.cold-takes.com)

My thoughts on OpenAI’s al­ign­ment plan

AkashDec 30, 2022, 7:34 PM
16 points
0 comments1 min readEA link

Self-Limit­ing AI in AI Alignment

The_Lord's_Servant_280Dec 31, 2022, 7:07 PM
2 points
1 comment1 min readEA link

Re­sults from the AI test­ing hackathon

Esben KranJan 2, 2023, 3:46 PM
35 points
4 comments5 min readEA link
(alignmentjam.com)

“AI” is an indexical

TW123Jan 3, 2023, 10:00 PM
23 points
2 comments1 min readEA link

[Question] What are the strate­gic im­pli­ca­tions if aliens and Earth civ­i­liza­tions pro­duce similar util­ities?

Maxime_RicheAug 6, 2024, 9:21 PM
6 points
1 comment1 min readEA link

Ma­chine Learn­ing for Scien­tific Dis­cov­ery—AI Safety Camp

Eleni_AJan 6, 2023, 3:06 AM
9 points
0 comments1 min readEA link

Learn­ing as much Deep Learn­ing math as I could in 24 hours

PhosphorousJan 8, 2023, 2:19 AM
58 points
6 comments7 min readEA link

Is any­one else also get­ting more wor­ried about hard take­off AGI sce­nar­ios?

JonCefaluJan 9, 2023, 6:04 AM
19 points
11 comments3 min readEA link

Sakana, Straw­berry, and Scary AI

Matrice JacobineSep 19, 2024, 11:57 AM
1 point
0 comments1 min readEA link
(www.astralcodexten.com)

My ex­pe­rience ap­ply­ing to MATS 6.0

micJul 18, 2024, 7:02 PM
20 points
0 comments1 min readEA link

Big list of AI safety videos

JakubKJan 9, 2023, 6:09 AM
9 points
0 comments1 min readEA link
(docs.google.com)

#199 – Cal­ifor­nia’s AI bill SB 1047 and its po­ten­tial to shape US AI policy (Nathan Calvin on The 80,000 Hours Pod­cast)

80000_HoursAug 30, 2024, 6:18 PM
12 points
0 comments10 min readEA link

David Krueger on AI Align­ment in Academia and Coordination

Michaël TrazziJan 7, 2023, 9:14 PM
32 points
1 comment3 min readEA link
(theinsideview.ai)

[Ru­mour] Microsoft to in­vest $10B in OpenAI, will re­ceive 75% of prof­its un­til they re­coup in­vest­ment: GPT would be in­te­grated with Office

𝕮𝖎𝖓𝖊𝖗𝖆Jan 10, 2023, 11:43 PM
25 points
2 comments1 min readEA link

ML Sum­mer Boot­camp Reflec­tion: Aalto EA Finland

Aayush KucheriaJan 12, 2023, 8:24 AM
15 points
2 comments9 min readEA link

Slay­ing the Hy­dra: to­ward a new game board for AI

PrometheusJun 23, 2023, 5:04 PM
3 points
2 comments1 min readEA link

My ex­pe­rience build­ing math­e­mat­i­cal ML skills with a course from UIUC

Naoya OkamotoJun 9, 2024, 11:41 AM
2 points
0 comments10 min readEA link

Jan Kirch­ner on AI Alignment

birtesJan 17, 2023, 3:11 PM
5 points
0 comments1 min readEA link

On the com­pute gov­er­nance era and what has to come af­ter (Len­nart Heim on The 80,000 Hours Pod­cast)

80000_HoursJun 23, 2023, 8:11 PM
37 points
0 comments18 min readEA link

Emerg­ing Paradigms: The Case of Ar­tifi­cial In­tel­li­gence Safety

Eleni_AJan 18, 2023, 5:59 AM
16 points
0 comments19 min readEA link

[Question] Any Philos­o­phy PhD recom­men­da­tions for stu­dents in­ter­ested in Align­ment Efforts?

rickyhuang.hexuanJan 18, 2023, 5:54 AM
7 points
6 comments1 min readEA link

Catas­trophic Risks from AI #3: AI Race

Dan HJun 23, 2023, 7:21 PM
9 points
0 comments1 min readEA link

Are New Ideas in AI Get­ting Harder to Find?

Charlie HarrisonDec 10, 2024, 12:52 PM
39 points
3 comments5 min readEA link

We Ran an Align­ment Workshop

aiden amentJan 21, 2023, 5:37 AM
6 points
0 comments3 min readEA link

What Are The Biggest Threats To Hu­man­ity? (A Hap­pier World video)

Jeroen Willems🔸Jan 31, 2023, 7:50 PM
17 points
1 comment15 min readEA link

What a com­pute-cen­tric frame­work says about AI take­off speeds

Tom_DavidsonJan 23, 2023, 4:09 AM
189 points
7 comments16 min readEA link
(www.lesswrong.com)

There should be a pub­lic ad­ver­sar­ial col­lab­o­ra­tion on AI x-risk

pradyuprasadJan 23, 2023, 4:09 AM
56 points
5 comments2 min readEA link

[Question] Has pri­vate AGI re­search made in­de­pen­dent safety re­search in­effec­tive already? What should we do about this?

Roman LeventovJan 23, 2023, 4:23 PM
15 points
0 comments5 min readEA link

Up­date to Samotsvety AGI timelines

Misha_YagudinJan 24, 2023, 4:27 AM
120 points
9 comments4 min readEA link

AI safety mile­stones?

Zach Stein-PerlmanJan 23, 2023, 9:30 PM
6 points
0 comments1 min readEA link

AGI safety field build­ing pro­jects I’d like to see

SeverinJan 24, 2023, 11:30 PM
25 points
2 comments1 min readEA link

AI Gover­nance Read­ing Group [Toronto+re­mote]

Liav.KorenJan 24, 2023, 10:05 PM
2 points
0 comments1 min readEA link

Ex­is­ten­tial Risk of Misal­igned In­tel­li­gence Aug­men­ta­tion (Par­tic­u­larly Us­ing High-Band­width BCI Im­plants)

Damian GorskiJan 24, 2023, 5:02 PM
1 point
0 comments9 min readEA link

“How to Es­cape from the Si­mu­la­tion”—Seeds of Science call for reviewers

rogersbacon1Jan 26, 2023, 3:12 PM
7 points
0 comments1 min readEA link

In­ter­views with 97 AI Re­searchers: Quan­ti­ta­tive Analysis

Maheen ShermohammedFeb 2, 2023, 4:50 AM
76 points
4 comments7 min readEA link

Ap­ply to HAIST/​MAIA’s AI Gover­nance Work­shop in DC (Feb 17-20)

PhosphorousJan 28, 2023, 12:45 AM
15 points
0 comments1 min readEA link
(www.lesswrong.com)

AI gov­er­nance & China: Read­ing list

Zach Stein-PerlmanDec 18, 2023, 3:30 PM
14 points
0 comments1 min readEA link
(docs.google.com)

Im­pact Academy is hiring an AI Gover­nance Lead—more in­for­ma­tion, up­com­ing Q&A and $500 bounty

Lowe LundinAug 29, 2023, 6:42 PM
9 points
1 comment1 min readEA link

Time-stamp­ing: An ur­gent, ne­glected AI safety measure

Axel SvenssonJan 30, 2023, 11:21 AM
57 points
27 comments3 min readEA link

On value in hu­mans, other an­i­mals, and AI

Michele CampoloJan 31, 2023, 11:48 PM
7 points
6 comments5 min readEA link

Ret­ro­spec­tive on the AI Safety Field Build­ing Hub

Vael GatesFeb 2, 2023, 2:06 AM
64 points
2 comments9 min readEA link

“AI Risk Dis­cus­sions” web­site: Ex­plor­ing in­ter­views from 97 AI Researchers

Vael GatesFeb 2, 2023, 1:00 AM
46 points
1 comment1 min readEA link

Pre­dict­ing re­searcher in­ter­est in AI alignment

Vael GatesFeb 2, 2023, 12:58 AM
30 points
0 comments21 min readEA link
(docs.google.com)

A Brief Overview of AI Safety/​Align­ment Orgs, Fields, Re­searchers, and Re­sources for ML Researchers

Austin WitteFeb 2, 2023, 6:19 AM
18 points
5 comments2 min readEA link

Eli Lifland on Nav­i­gat­ing the AI Align­ment Landscape

Ozzie GooenFeb 1, 2023, 12:07 AM
48 points
9 comments31 min readEA link
(quri.substack.com)

[Linkpost] Hu­man-nar­rated au­dio ver­sion of “Is Power-Seek­ing AI an Ex­is­ten­tial Risk?”

Joe_CarlsmithJan 31, 2023, 7:19 PM
9 points
0 comments1 min readEA link

Talk to me about your sum­mer/​ca­reer plans

AkashJan 31, 2023, 6:29 PM
31 points
0 comments1 min readEA link

Alexan­der and Yud­kowsky on AGI goals

Scott AlexanderJan 31, 2023, 11:36 PM
29 points
1 comment1 min readEA link

Fo­cus on the places where you feel shocked ev­ery­one’s drop­ping the ball

So8resFeb 2, 2023, 12:27 AM
92 points
6 comments1 min readEA link

40,000 rea­sons to worry about AI safety

Michael HuangFeb 2, 2023, 7:48 AM
9 points
2 comments2 min readEA link
(www.theverge.com)

Assess­ing China’s im­por­tance as an AI superpower

JulianHazellFeb 3, 2023, 11:08 AM
89 points
7 comments1 min readEA link
(muddyclothes.substack.com)

An au­dio ver­sion of the al­ign­ment prob­lem from a deep learn­ing per­spec­tive by Richard Ngo Et Al

MiguelFeb 3, 2023, 7:32 PM
18 points
0 comments1 min readEA link
(www.whitehatstoic.com)

Crit­i­cism Thread: What things should OpenPhil im­prove on?

anonymousEA20Feb 4, 2023, 8:16 AM
85 points
8 comments2 min readEA link

A discussion with ChatGPT on value-based models vs. large language models, etc.

MiguelFeb 4, 2023, 4:49 PM
4 points
0 comments12 min readEA link
(www.whitehatstoic.com)

Se­cond call: CFP for Re­bel­lion and Di­sobe­di­ence in AI workshop

Ram RachumFeb 5, 2023, 12:19 PM
2 points
0 comments2 min readEA link

Call for sub­mis­sions: AI Safety Spe­cial Ses­sion at the Con­fer­ence on Ar­tifi­cial Life (ALIFE 2023)

Rory GreigFeb 5, 2023, 4:37 PM
16 points
0 comments2 min readEA link
(humanvaluesandartificialagency.com)

Launch­ing The Col­lec­tive In­tel­li­gence Pro­ject: Whitepa­per and Pilots

jasmine_wangFeb 6, 2023, 5:00 PM
38 points
8 comments2 min readEA link
(cip.org)

Dear An­thropic peo­ple, please don’t re­lease Claude

Joseph MillerFeb 8, 2023, 2:44 AM
27 points
5 comments1 min readEA link

[Our World in Data] AI timelines: What do ex­perts in ar­tifi­cial in­tel­li­gence ex­pect for the fu­ture? (Roser, 2023)

Will AldredFeb 7, 2023, 2:52 PM
98 points
1 comment1 min readEA link
(ourworldindata.org)

Mechanism De­sign for AI Safety—Agenda Creation Retreat

Rubi J. HudsonFeb 10, 2023, 3:05 AM
21 points
1 comment1 min readEA link

[Question] Sur­vey about Copy­right and gen­er­a­tive AI al­lowed here ?

Lee O'BrienAug 9, 2024, 12:27 PM
0 points
1 comment1 min readEA link

Overview | An Eval­u­a­tive Evolu­tion

Matt KeeneFeb 10, 2023, 6:15 PM
−9 points
0 comments5 min readEA link
(www.creatingafuturewewant.com)

$1,000 bounty for an AI Pro­gramme Lead recommendation

Cillian_Aug 14, 2023, 1:11 PM
11 points
1 comment2 min readEA link

High im­pact job op­por­tu­nity at ARIA (UK)

RasoolFeb 12, 2023, 10:35 AM
80 points
0 comments1 min readEA link

[Question] Huh. Bing thing got me real anx­ious about AI. Re­sources to help with that please?

ArvinFeb 15, 2023, 4:55 PM
2 points
7 comments1 min readEA link

AI Safety Info Distil­la­tion Fellowship

robertskmilesFeb 17, 2023, 4:16 PM
80 points
1 comment1 min readEA link

In­ter­view with Ro­man Yam­polskiy about AGI on The Real­ity Check

Darren McKeeFeb 18, 2023, 11:29 PM
27 points
0 comments1 min readEA link
(www.trcpodcast.com)

Where does Re­spon­si­ble Ca­pa­bil­ities Scal­ing take AI gov­er­nance?

ZacRichardsonJun 9, 2024, 10:25 PM
17 points
1 comment16 min readEA link

Does most of your im­pact come from what you do soon?

JoshcFeb 21, 2023, 5:12 AM
38 points
1 comment5 min readEA link

AI al­ign­ment re­searchers don’t (seem to) stack

So8resFeb 21, 2023, 12:48 AM
47 points
3 comments1 min readEA link

How should norms of aca­demic writ­ing and pub­lish­ing be changed once AI sys­tems be­come su­per­hu­man in more re­spects?

simonfriederichNov 24, 2023, 1:35 PM
10 points
0 comments1 min readEA link
(link.springer.com)

Cal­ifor­nia AI Bill, SB 1047, cov­ered in to­day’s WSJ.

EmersonAug 8, 2024, 12:27 PM
5 points
0 comments1 min readEA link
(www.wsj.com)

2023 Stan­ford Ex­is­ten­tial Risks Conference

elizabethcooperFeb 24, 2023, 5:49 PM
29 points
5 comments1 min readEA link

Ap­ply to a small iter­a­tion of MLAB to be run in Oxford

Rio PAug 29, 2023, 7:39 PM
11 points
0 comments1 min readEA link

How much should gov­ern­ments pay to pre­vent catas­tro­phes? Longter­mism’s limited role

EJTMar 19, 2023, 4:50 PM
258 points
35 comments35 min readEA link
(philpapers.org)

[Question] Which is more im­por­tant for re­duc­ing s-risks, re­search­ing on AI sen­tience or an­i­mal welfare?

jackchang110Feb 25, 2023, 2:20 AM
9 points
0 comments1 min readEA link

How to ‘troll for good’: Lev­er­ag­ing IP for AI governance

Michael HuangFeb 26, 2023, 6:34 AM
26 points
3 comments1 min readEA link
(www.science.org)

Seek­ing in­put on a list of AI books for broader audience

Darren McKeeFeb 27, 2023, 10:40 PM
49 points
14 comments5 min readEA link

Why I think it’s im­por­tant to work on AI forecasting

Matthew_BarnettFeb 27, 2023, 9:24 PM
179 points
10 comments10 min readEA link

Very Briefly: The CHIPS Act

YadavFeb 26, 2023, 1:53 PM
40 points
3 comments1 min readEA link
(www.y1d2.com)

Safe Sta­sis Fallacy

DavidmanheimFeb 5, 2024, 10:54 AM
23 points
4 comments1 min readEA link

[Question] An eco­nomics of AI gov—best re­sources for

LivFeb 26, 2023, 11:11 AM
10 points
4 comments1 min readEA link

In­tro­duc­ing Leap Labs, an AI in­ter­pretabil­ity startup

Jessica RumbelowMar 6, 2023, 5:37 PM
11 points
0 comments1 min readEA link
(www.lesswrong.com)

AI al­ign­ment as a trans­la­tion problem

Roman LeventovFeb 5, 2024, 2:14 PM
3 points
1 comment1 min readEA link

Scor­ing fore­casts from the 2016 “Ex­pert Sur­vey on Progress in AI”

PatrickLMar 1, 2023, 2:39 PM
204 points
21 comments9 min readEA link

[Question] What are some sources re­lated to big-pic­ture AI strat­egy?

Jacob Watts🔸Mar 2, 2023, 5:04 AM
9 points
4 comments1 min readEA link

Joscha Bach on Syn­thetic In­tel­li­gence [an­no­tated]

Roman LeventovMar 2, 2023, 11:21 AM
8 points
0 comments9 min readEA link
(www.jimruttshow.com)

Distil­la­tion of The Offense-Defense Balance of Scien­tific Knowledge

Arjun YadavAug 12, 2022, 7:01 AM
17 points
0 comments2 min readEA link

A con­cern­ing ob­ser­va­tion from me­dia cov­er­age of AI in­dus­try dynamics

Justin OliveMar 2, 2023, 11:56 PM
48 points
5 comments3 min readEA link

Prob­lems of peo­ple new to AI safety and my pro­ject ideas to miti­gate them

Igor IvanovMar 3, 2023, 5:35 PM
19 points
0 comments7 min readEA link

Acausal normalcy

Andrew CritchMar 3, 2023, 11:35 PM
21 points
4 comments8 min readEA link

The Benefits of Distil­la­tion in Research

Jonas HallgrenMar 4, 2023, 7:19 PM
45 points
2 comments5 min readEA link

[Question] How to nav­i­gate po­ten­tial infohazards

more better Mar 4, 2023, 9:28 PM
16 points
7 comments1 min readEA link

[Cross­post] Why Un­con­trol­lable AI Looks More Likely Than Ever

OttoMar 8, 2023, 3:33 PM
49 points
6 comments4 min readEA link
(time.com)

Fake Meat and Real Talk 1 - Are We All Gonna Die? Yud­kowsky and the Dangers of AI (Please RSVP)

David NMar 8, 2023, 8:40 PM
11 points
2 comments1 min readEA link

An­thropic: Core Views on AI Safety: When, Why, What, and How

jonmenasterMar 9, 2023, 5:30 PM
107 points
6 comments22 min readEA link
(www.anthropic.com)

Every­thing’s nor­mal un­til it’s not

Eleni_AMar 10, 2023, 1:42 AM
6 points
0 comments3 min readEA link

Ja­pan AI Align­ment Conference

ChrisScammellMar 10, 2023, 9:23 AM
17 points
2 comments1 min readEA link
(www.conjecture.dev)

Thoughts on the OpenAI al­ign­ment plan: will AI re­search as­sis­tants be net-pos­i­tive for AI ex­is­ten­tial risk?

Jeffrey LadishMar 10, 2023, 8:20 AM
12 points
0 comments9 min readEA link

Ques­tion­able Nar­ra­tives of “Si­tu­a­tional Aware­ness”

fergusqJun 16, 2024, 5:09 PM
23 points
10 comments14 min readEA link

The Power of In­tel­li­gence—The Animation

WriterMar 11, 2023, 4:15 PM
59 points
0 comments1 min readEA link

On tak­ing AI risk se­ri­ously

Eleni_AMar 13, 2023, 5:44 AM
51 points
4 comments1 min readEA link
(www.nytimes.com)

[Question] De­sign­ing user au­then­ti­ca­tion pro­to­cols

Kinoshita Yoshikazu (pseudonym)Mar 13, 2023, 3:56 PM
−1 points
2 comments1 min readEA link

Ver­ifi­ca­tion meth­ods for in­ter­na­tional AI agreements

AkashAug 31, 2024, 2:58 PM
20 points
0 comments1 min readEA link
(arxiv.org)

5th IEEE In­ter­na­tional Con­fer­ence on Ar­tifi­cial In­tel­li­gence Test­ing (AITEST 2023)

surabhi guptaMar 12, 2023, 9:06 AM
−5 points
0 comments1 min readEA link

AGI Safety Fun­da­men­tals cur­ricu­lum and application

richard_ngoOct 20, 2021, 9:45 PM
123 points
20 comments8 min readEA link
(docs.google.com)

UK policy and poli­tics careers

weeatquinceSep 28, 2019, 4:18 PM
28 points
10 comments7 min readEA link

AI Risk in Africa

Claude FormanekOct 12, 2021, 2:28 AM
18 points
0 comments10 min readEA link

Sha­har Avin on How to Strate­gi­cally Reg­u­late Ad­vanced AI Systems

Michaël TrazziSep 23, 2022, 3:49 PM
48 points
2 comments4 min readEA link
(theinsideview.ai)

FLI launches Wor­ld­build­ing Con­test with $100,000 in prizes

ggilgallonJan 17, 2022, 1:54 PM
87 points
55 comments6 min readEA link

Idea: an AI gov­er­nance group colo­cated with ev­ery AI re­search group!

capybaraletDec 7, 2020, 11:41 PM
8 points
1 comment2 min readEA link

CFP for the Largest An­nual Meet­ing of Poli­ti­cal Science: Get Help With Your Re­search Submission

Mahendra PrasadDec 22, 2020, 11:39 PM
13 points
0 comments2 min readEA link

The Wind­fall Clause has a reme­dies problem

John Bridge 🔸May 23, 2022, 10:31 AM
40 points
0 comments17 min readEA link

Ques­tions for fur­ther in­ves­ti­ga­tion of AI diffusion

Ben CottierDec 21, 2022, 1:50 PM
28 points
0 comments11 min readEA link

How to make the best of the most im­por­tant cen­tury?

Holden KarnofskySep 14, 2021, 9:05 PM
54 points
5 comments12 min readEA link

AI Gover­nance Read­ing Group Guide

Alex HTJun 25, 2020, 10:16 AM
26 points
2 comments3 min readEA link

Longter­mist rea­sons to work for in­no­va­tive governments

acOct 13, 2020, 4:32 PM
74 points
8 comments1 min readEA link

Slightly against al­ign­ing with neo-luddites

Matthew_BarnettDec 26, 2022, 11:27 PM
77 points
17 comments4 min readEA link

A Map to Nav­i­gate AI Governance

hanadulsetFeb 14, 2022, 10:41 PM
72 points
11 comments25 min readEA link

Google could build a con­scious AI in three months

Derek ShillerOct 1, 2022, 1:24 PM
16 points
22 comments7 min readEA link

AI Gover­nance Needs Tech­ni­cal Work

MauSep 5, 2022, 10:25 PM
120 points
3 comments8 min readEA link

On Ar­tifi­cial Gen­eral In­tel­li­gence: Ask­ing the Right Questions

Heather DouglasOct 2, 2022, 5:00 AM
−1 points
7 comments3 min readEA link

What if AI de­vel­op­ment goes well?

RoryGAug 3, 2022, 8:57 AM
25 points
7 comments12 min readEA link

Mas­sive Scal­ing Should be Frowned Upon

harsimonyNov 17, 2022, 5:44 PM
9 points
0 comments5 min readEA link

An­nounc­ing the SPT Model Web App for AI Governance

Paolo BovaAug 4, 2022, 10:45 AM
42 points
0 comments5 min readEA link

[Question] What are the challenges and prob­lems with pro­gram­ming law-break­ing con­straints into AGI?

MichaelStJulesFeb 2, 2020, 8:53 PM
20 points
34 comments1 min readEA link

[Question] Do­ing Global Pri­ori­ties or AI Policy re­search from re­mote lo­ca­tion?

With Love from IsraelOct 29, 2019, 9:34 AM
30 points
4 comments1 min readEA link

Cryp­tocur­rency Ex­ploits Show the Im­por­tance of Proac­tive Poli­cies for AI X-Risk

eSpencerSep 16, 2022, 4:44 AM
14 points
1 comment4 min readEA link

AI Alter­na­tive Fu­tures: Ex­plo­ra­tory Sce­nario Map­ping for Ar­tifi­cial In­tel­li­gence Risk—Re­quest for Par­ti­ci­pa­tion [Linkpost]

KiliankMay 9, 2022, 7:53 PM
17 points
2 comments8 min readEA link

AGI al­ign­ment re­sults from a se­ries of al­igned ac­tions

hanadulsetDec 27, 2021, 7:33 PM
15 points
1 comment6 min readEA link

EU AI Act now has a sec­tion on gen­eral pur­pose AI systems

MathiasKB🔸Dec 9, 2021, 12:40 PM
64 points
10 comments1 min readEA link

Im­pli­ca­tions of large lan­guage model diffu­sion for AI governance

Ben CottierDec 21, 2022, 1:50 PM
14 points
0 comments38 min readEA link

How tech­ni­cal safety stan­dards could pro­mote TAI safety

Cullen 🔸Aug 8, 2022, 4:57 PM
128 points
15 comments7 min readEA link

Assess­ing the state of AI R&D in the US, China, and Europe – Part 1: Out­put indicators

stefan.torgesNov 1, 2019, 2:41 PM
21 points
0 comments14 min readEA link

[Question] What are the most press­ing is­sues in short-term AI policy?

Eevee🔹Jan 14, 2020, 10:05 PM
9 points
0 comments1 min readEA link

[Question] Will AGI cause mass tech­nolog­i­cal un­em­ploy­ment?

Eevee🔹Jun 22, 2020, 8:55 PM
4 points
2 comments2 min readEA link

Who owns AI-gen­er­ated con­tent?

Johan S DanielDec 7, 2022, 3:03 AM
−2 points
0 comments2 min readEA link

Against GDP as a met­ric for timelines and take­off speeds

kokotajlodDec 29, 2020, 5:50 PM
47 points
6 comments14 min readEA link

“Nor­mal ac­ci­dents” and AI sys­tems

Eleni_AAug 8, 2022, 6:43 PM
5 points
1 comment1 min readEA link
(www.achan.ca)

Trans­for­ma­tive AI and Com­pute [Sum­mary]

lennartSep 23, 2021, 1:53 PM
65 points
5 comments9 min readEA link

AI Gover­nance Course—Cur­ricu­lum and Application

MauNov 29, 2021, 1:29 PM
94 points
9 comments1 min readEA link

[Question] Track­ing Com­pute Stocks and Flows: Case Stud­ies?

Cullen 🔸Oct 5, 2022, 5:54 PM
34 points
1 comment1 min readEA link

Spicy takes about AI policy (Clark, 2022)

Will AldredAug 9, 2022, 1:49 PM
44 points
0 comments3 min readEA link
(twitter.com)

In­stead of tech­ni­cal re­search, more peo­ple should fo­cus on buy­ing time

AkashNov 5, 2022, 8:43 PM
107 points
31 comments1 min readEA link

AMA: Fu­ture of Life In­sti­tute’s EU Team

Risto UukJan 31, 2022, 5:14 PM
44 points
15 comments2 min readEA link

[Question] How to Im­prove China-Western Co­or­di­na­tion on EA Is­sues?

Michael KehoeNov 3, 2021, 7:28 AM
15 points
2 comments1 min readEA link

Credo AI is hiring for sev­eral roles

IanEisenbergApr 11, 2022, 3:58 PM
14 points
2 comments1 min readEA link

[Job ad] Re­search im­por­tant longter­mist top­ics at Re­think Pri­ori­ties!

LinchOct 6, 2021, 7:09 PM
65 points
46 comments1 min readEA link

Com­pute & An­titrust: Reg­u­la­tory im­pli­ca­tions of the AI hard­ware sup­ply chain, from chip de­sign to cloud APIs

HaydnBelfieldAug 19, 2022, 5:20 PM
32 points
0 comments6 min readEA link
(verfassungsblog.de)

Where are the red lines for AI?

Karl von WendtAug 5, 2022, 9:41 AM
13 points
3 comments6 min readEA link

AI & Policy 1/​3: On know­ing the effect of to­day’s poli­cies on Trans­for­ma­tive AI risks, and the case for in­sti­tu­tional im­prove­ments.

weeatquinceAug 27, 2019, 11:04 AM
27 points
3 comments10 min readEA link

The AI rev­olu­tion and in­ter­na­tional poli­tics (Allan Dafoe)

EA GlobalJun 2, 2017, 8:48 AM
8 points
0 comments18 min readEA link
(www.youtube.com)

What does (and doesn’t) AI mean for effec­tive al­tru­ism?

EA GlobalAug 12, 2017, 7:00 AM
9 points
0 comments12 min readEA link

Ngo and Yud­kowsky on AI ca­pa­bil­ity gains

richard_ngoNov 19, 2021, 1:54 AM
23 points
4 comments39 min readEA link

[Link] EAF Re­search agenda: “Co­op­er­a­tion, Con­flict, and Trans­for­ma­tive Ar­tifi­cial In­tel­li­gence”

stefan.torgesJan 17, 2020, 1:28 PM
64 points
0 comments1 min readEA link

Jeffrey Ding: Re-de­ci­pher­ing China’s AI dream

EA GlobalOct 18, 2019, 6:05 PM
13 points
0 comments1 min readEA link
(www.youtube.com)

Some AI re­search ar­eas and their rele­vance to ex­is­ten­tial safety

Andrew CritchDec 15, 2020, 12:15 PM
12 points
1 comment56 min readEA link
(alignmentforum.org)

Deep­Mind’s gen­er­al­ist AI, Gato: A non-tech­ni­cal explainer

frances_lorenzMay 16, 2022, 9:19 PM
128 points
13 comments6 min readEA link

UK’s new 10-year “Na­tional AI Strat­egy,” re­leased today

jared_mSep 22, 2021, 11:18 AM
28 points
7 comments1 min readEA link

The case for long-term cor­po­rate gov­er­nance of AI

SethBaumNov 3, 2021, 10:50 AM
42 points
3 comments8 min readEA link

Google’s ethics is alarming

len.hoang.lnhFeb 25, 2021, 5:57 AM
6 points
5 comments1 min readEA link

A personal take on longtermist AI governance

lukeprog, Jul 16, 2021, 10:08 PM
173 points
6 comments · 7 min read · EA link

AI governance student hackathon on Saturday, April 23: register now!

mic, Apr 12, 2022, 4:39 AM
18 points
0 comments · 1 min read · EA link

[Question] What would you do if you had a lot of money/power/influence and you thought that AI timelines were very short?

Greg_Colbourn, Nov 12, 2021, 9:59 PM
29 points
8 comments · 1 min read · EA link

Shahar Avin: Near-term AI security risks, and what to do about them

EA Global, Nov 3, 2017, 7:43 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

[Question] What are the best journals to publish AI governance papers in?

Caro, May 2, 2022, 10:07 AM
26 points
4 comments · 1 min read · EA link

Law-Following AI 1: Sequence Introduction and Structure

Cullen 🔸, Apr 27, 2022, 5:16 PM
35 points
2 comments · 9 min read · EA link

AI Benefits Post 2: How AI Benefits Differs from AI Alignment & AI for Good

Cullen 🔸, Jun 29, 2020, 4:59 PM
9 points
0 comments · 2 min read · EA link

Owen Cotton-Barratt: What does (and doesn’t) AI mean for effective altruism?

EA Global, Aug 11, 2017, 8:19 AM
10 points
0 comments · 12 min read · EA link
(www.youtube.com)

Announcing the GovAI Policy Team

MarkusAnderljung, Aug 1, 2022, 10:46 PM
107 points
11 comments · 2 min read · EA link

Forecasting Compute—Transformative AI and Compute [2/4]

lennart, Oct 1, 2021, 8:25 AM
39 points
6 comments · 19 min read · EA link

The case for building expertise to work on US AI policy, and how to do it

80000_Hours, Jan 31, 2019, 10:44 PM
37 points
2 comments · 2 min read · EA link

How to get technological knowledge on AI/ML (for non-tech people)

FangFang, Jun 30, 2021, 7:53 AM
62 points
7 comments · 5 min read · EA link

What is the EU AI Act and why should you care about it?

MathiasKB🔸, Sep 10, 2021, 7:47 AM
116 points
10 comments · 7 min read · EA link

Jeffrey Ding: Bringing techno-globalism back: a romantically realist reframing of the US-China tech relationship

EA Global, Nov 21, 2020, 8:12 AM
9 points
0 comments · 1 min read · EA link
(www.youtube.com)

The longtermist AI governance landscape: a basic overview

Sam Clarke, Jan 18, 2022, 12:58 PM
168 points
13 comments · 9 min read · EA link

Allan Dafoe: Preparing for AI — risks and opportunities

EA Global, Nov 3, 2017, 7:43 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

What if we don’t need a “Hard Left Turn” to reach AGI?

Eigengender, Jul 15, 2022, 9:49 AM
39 points
7 comments · 4 min read · EA link

Important, actionable research questions for the most important century

Holden Karnofsky, Feb 24, 2022, 4:34 PM
298 points
13 comments · 19 min read · EA link

GovAI Webinars on the Governance and Economics of AI

MarkusAnderljung, May 12, 2020, 3:00 PM
16 points
0 comments · 1 min read · EA link

[Question] What type of Master’s is best for AI policy work?

Milan Griffes, Feb 22, 2019, 8:04 PM
14 points
7 comments · 1 min read · EA link

Delegated agents in practice: How companies might end up selling AI services that act on behalf of consumers and coalitions, and what this implies for safety research

Remmelt, Nov 26, 2020, 4:39 PM
11 points
0 comments · 4 min read · EA link

England & Wales & Windfalls

John Bridge 🔸, Jun 3, 2022, 10:26 AM
13 points
1 comment · 24 min read · EA link

Effective Enforceability of EU Competition Law Under Different AI Development Scenarios: A Framework for Legal Analysis

HaydnBelfield, Aug 19, 2022, 5:20 PM
11 points
0 comments · 6 min read · EA link
(verfassungsblog.de)

AI Benefits Post 5: Outstanding Questions on Governing Benefits

Cullen 🔸, Jul 21, 2020, 4:45 PM
5 points
0 comments · 4 min read · EA link

Apply now for the EU Tech Policy Fellowship 2023

Jan-Willem, Nov 11, 2022, 6:16 AM
64 points
1 comment · 5 min read · EA link

The replication and emulation of GPT-3

Ben Cottier, Dec 21, 2022, 1:49 PM
14 points
0 comments · 33 min read · EA link

Should you work in the European Union to do AGI governance?

hanadulset, Jan 31, 2022, 10:34 AM
90 points
20 comments · 15 min read · EA link

Law-Following AI 2: Intent Alignment + Superintelligence → Lawless AI (By Default)

Cullen 🔸, Apr 27, 2022, 5:18 PM
19 points
0 comments · 6 min read · EA link

Jaan Tallinn: Fireside chat (2020)

EA Global, Nov 21, 2020, 8:12 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

Toby Ord’s new report on lessons from the development of the atomic bomb

Ishan Mukherjee, Nov 22, 2022, 10:37 AM
65 points
3 comments · 1 min read · EA link
(www.governance.ai)

Training for Good—Update & Plans for 2023

Cillian_, Nov 15, 2022, 4:02 PM
80 points
1 comment · 10 min read · EA link

New US Senate Bill on X-Risk Mitigation [Linkpost]

Evan R. Murphy, Jul 4, 2022, 1:28 AM
22 points
12 comments · 1 min read · EA link
(www.hsgac.senate.gov)

Publication decisions for large language models, and their impacts

Ben Cottier, Dec 21, 2022, 1:50 PM
14 points
0 comments · 16 min read · EA link

4 Years Later: President Trump and Global Catastrophic Risk

HaydnBelfield, Oct 25, 2020, 4:28 PM
43 points
10 comments · 10 min read · EA link

TAI Safety Bibliographic Database

Jess_Riedel, Dec 22, 2020, 4:03 PM
61 points
9 comments · 17 min read · EA link

[Question] What kind of organization should be the first to develop AGI in a potential arms race?

Eevee🔹, Jul 17, 2022, 5:41 PM
10 points
2 comments · 1 min read · EA link

Argument Against Impact: EU Is Not an AI Superpower

EU AI Governance, Jan 31, 2022, 9:48 AM
35 points
9 comments · 4 min read · EA link

How long till Brussels?: A light investigation into the Brussels Gap

Yadav, Dec 26, 2022, 7:49 AM
50 points
2 comments · 5 min read · EA link

Markus Anderljung and Ben Garfinkel: Fireside chat on AI governance

EA Global, Jul 24, 2020, 2:56 PM
25 points
0 comments · 16 min read · EA link
(www.youtube.com)

An overview of arguments for concern about automation

LintzA, Aug 6, 2019, 7:56 AM
34 points
3 comments · 13 min read · EA link

GPT-3-like models are now much easier to access and deploy than to develop

Ben Cottier, Dec 21, 2022, 1:49 PM
22 points
3 comments · 19 min read · EA link

Final Report of the National Security Commission on Artificial Intelligence (NSCAI, 2021)

MichaelA🔸, Jun 1, 2021, 8:19 AM
51 points
3 comments · 4 min read · EA link
(www.nscai.gov)

Applications Open: GovAI Summer Fellowship 2023

GovAI, Dec 21, 2022, 3:00 PM
28 points
0 comments · 2 min read · EA link

Some AI Governance Research Ideas

MarkusAnderljung, Jun 3, 2021, 10:51 AM
102 points
5 comments · 2 min read · EA link

A Brief Summary Of The Most Important Century

Maynk02, Oct 25, 2022, 3:28 PM
3 points
0 comments · 5 min read · EA link

The AIA and its Brussels Effect

Kathryn O'Rourke, Dec 27, 2022, 4:01 PM
16 points
0 comments · 5 min read · EA link

13 Recent Publications on Existential Risk (Jan 2021 update)

HaydnBelfield, Feb 8, 2021, 12:42 PM
7 points
2 comments · 10 min read · EA link

A California Effect for Artificial Intelligence

henryj, Sep 9, 2022, 2:17 PM
73 points
1 comment · 4 min read · EA link
(docs.google.com)

Announcing the EU Tech Policy Fellowship

Jan-Willem, Mar 30, 2022, 8:15 AM
53 points
4 comments · 5 min read · EA link

[Question] Examples of self-governance to reduce technology risk?

jia, Sep 25, 2020, 1:26 PM
32 points
1 comment · 1 min read · EA link

AI risk hub in Singapore?

kokotajlod, Oct 29, 2020, 11:51 AM
24 points
3 comments · 4 min read · EA link

Background for “Understanding the diffusion of large language models”

Ben Cottier, Dec 21, 2022, 1:49 PM
12 points
0 comments · 23 min read · EA link

AMA: Markus Anderljung (PM at GovAI, FHI)

MarkusAnderljung, Sep 21, 2020, 11:23 AM
49 points
24 comments · 2 min read · EA link

Information in risky technology races

nemeryxu, Aug 2, 2022, 11:35 PM
15 points
2 comments · 3 min read · EA link

Compute Governance and Conclusions—Transformative AI and Compute [3/4]

lennart, Oct 14, 2021, 7:55 AM
20 points
3 comments · 5 min read · EA link

Tips for conducting worldview investigations

lukeprog, Apr 12, 2022, 7:28 PM
88 points
4 comments · 2 min read · EA link

Why don’t governments seem to mind that companies are explicitly trying to make AGIs?

Ozzie Gooen, Dec 23, 2021, 7:08 AM
82 points
49 comments · 2 min read · EA link

AI Governance: Opportunity and Theory of Impact

Allan Dafoe, Sep 17, 2020, 6:30 AM
262 points
19 comments · 12 min read · EA link

CSER Advice to EU High-Level Expert Group on AI

HaydnBelfield, Mar 8, 2019, 8:42 PM
14 points
0 comments · 5 min read · EA link
(www.cser.ac.uk)

The History, Epistemology and Strategy of Technological Restraint, and lessons for AI (short essay)

MMMaas, Aug 10, 2022, 11:00 AM
90 points
6 comments · 9 min read · EA link
(verfassungsblog.de)

Information security careers for GCR reduction

ClaireZabel, Jun 20, 2019, 11:56 PM
187 points
35 comments · 8 min read · EA link

Compute Research Questions and Metrics—Transformative AI and Compute [4/4]

lennart, Nov 28, 2021, 10:18 PM
18 points
2 comments · 1 min read · EA link

A Primer on God, Liberalism and the End of History

Mahdi Complex, Mar 28, 2022, 5:26 AM
8 points
3 comments · 14 min read · EA link

Humanities Research Ideas for Longtermists

Lizka, Jun 9, 2021, 4:39 AM
151 points
13 comments · 13 min read · EA link

AGI Risk: How to internationally regulate industries in non-democracies

Timothy_Liptrot, May 16, 2022, 10:45 PM
9 points
2 comments · 9 min read · EA link

Open Philanthropy’s AI governance grantmaking (so far)

Aaron Gertler 🔸, Dec 17, 2020, 12:00 PM
63 points
0 comments · 6 min read · EA link
(www.openphilanthropy.org)

Artificial Intelligence and Nuclear Command, Control, & Communications: The Risks of Integration

Peter Rautenbach, Nov 18, 2022, 1:01 PM
60 points
3 comments · 50 min read · EA link

New Working Paper Series of the Legal Priorities Project

Legal Priorities Project, Oct 18, 2021, 10:30 AM
60 points
0 comments · 9 min read · EA link

Centre for the Study of Existential Risk Four Month Report June—September 2020

HaydnBelfield, Dec 2, 2020, 6:33 PM
24 points
0 comments · 17 min read · EA link

Supporting global coordination in AI development: Why and how to contribute to international AI standards

pcihon, Apr 17, 2019, 10:17 PM
21 points
4 comments · 1 min read · EA link

[Question] Is there evidence that recommender systems are changing users’ preferences?

zdgroff, Apr 12, 2021, 7:11 PM
60 points
15 comments · 1 min read · EA link

The Terminology of Artificial Sentience

Janet Pauketat, Nov 28, 2021, 7:52 AM
29 points
0 comments · 1 min read · EA link
(www.sentienceinstitute.org)

Is Eric Schmidt funding AI capabilities research by the US government?

Pranay K, Dec 24, 2022, 8:32 AM
46 points
3 comments · 2 min read · EA link
(www.politico.com)

Drivers of large language model diffusion: incremental research, publicity, and cascades

Ben Cottier, Dec 21, 2022, 1:50 PM
21 points
0 comments · 29 min read · EA link

[Question] Slowing down AI progress?

Eleni_A, Jul 26, 2022, 8:46 AM
16 points
9 comments · 1 min read · EA link

Jade Leung and Seth Baum: The role of existing institutions in AI strategy

EA Global, Jun 8, 2018, 7:15 AM
9 points
0 comments · 28 min read · EA link
(www.youtube.com)

FHI Report: The Windfall Clause: Distributing the Benefits of AI for the Common Good

Cullen 🔸, Feb 5, 2020, 11:49 PM
54 points
21 comments · 2 min read · EA link

The Governance Problem and the “Pretty Good” X-Risk

Zach Stein-Perlman, Aug 28, 2021, 8:00 PM
23 points
4 comments · 11 min read · EA link

European Union AI Development and Governance Partnerships

EU AI Governance, Jan 19, 2022, 10:26 AM
22 points
1 comment · 4 min read · EA link

AI policy careers in the EU

Lauro Langosco, Nov 11, 2019, 10:43 AM
60 points
7 comments · 11 min read · EA link

Potential Risks from Advanced AI

EA Global, Aug 13, 2017, 7:00 AM
9 points
0 comments · 18 min read · EA link

Parallels Between AI Safety by Debate and Evidence Law

Cullen 🔸, Jul 20, 2020, 10:52 PM
30 points
2 comments · 2 min read · EA link
(cullenokeefe.com)

Some thoughts on risks from narrow, non-agentic AI

richard_ngo, Jan 19, 2021, 12:07 AM
36 points
2 comments · 8 min read · EA link

Report on Semi-informative Priors for AI timelines (Open Philanthropy)

Tom_Davidson, Mar 26, 2021, 5:46 PM
62 points
6 comments · 2 min read · EA link

Pile of Law and Law-Following AI

Cullen 🔸, Jul 13, 2022, 12:29 AM
28 points
2 comments · 3 min read · EA link

Legal Priorities Research: A Research Agenda

jonasschuett, Jan 6, 2021, 9:47 PM
58 points
4 comments · 1 min read · EA link

AI Benefits Post 1: Introducing “AI Benefits”

Cullen 🔸, Jun 22, 2020, 4:58 PM
10 points
2 comments · 3 min read · EA link

DeepMind is hiring Long-term Strategy & Governance researchers

vishal, Sep 13, 2021, 6:44 PM
54 points
1 comment · 1 min read · EA link

Tony Blair Institute—Compute for AI Index (Seeking a Supplier)

TomWestgarth, Oct 3, 2022, 10:25 AM
29 points
8 comments · 1 min read · EA link

AI Governance Career Paths for Europeans

careersthrowaway, May 16, 2020, 6:40 AM
83 points
1 comment · 12 min read · EA link

Foresight for AGI Safety Strategy

jacquesthibs, Dec 5, 2022, 4:09 PM
14 points
1 comment · 1 min read · EA link

Collection of work on ‘Should you focus on the EU if you’re interested in AI governance for longtermist/x-risk reasons?’

MichaelA🔸, Aug 6, 2022, 4:49 PM
51 points
3 comments · 1 min read · EA link

AI Benefits Post 3: Direct and Indirect Approaches to AI Benefits

Cullen 🔸, Jul 6, 2020, 6:46 PM
5 points
0 comments · 2 min read · EA link

How Roodman’s GWP model translates to TAI timelines

kokotajlod, Nov 16, 2020, 2:11 PM
22 points
0 comments · 2 min read · EA link

Conclusion and Bibliography for “Understanding the diffusion of large language models”

Ben Cottier, Dec 21, 2022, 1:50 PM
12 points
0 comments · 11 min read · EA link

Case studies of self-governance to reduce technology risk

jia, Apr 6, 2021, 8:49 AM
55 points
6 comments · 7 min read · EA link

The flaws that make today’s AI architecture unsafe and a new approach that could fix it

80000_Hours, Jun 22, 2020, 10:15 PM
3 points
0 comments · 86 min read · EA link
(80000hours.org)

HIRING: Inform and shape a new project on AI safety at Partnership on AI

Madhulika Srikumar, Nov 24, 2021, 4:29 PM
11 points
2 comments · 1 min read · EA link

Any further work on AI Safety Success Stories?

Krieger, Oct 2, 2022, 11:59 AM
4 points
0 comments · 1 min read · EA link

[Question] What “defense layers” should governments, AI labs, and businesses use to prevent catastrophic AI failures?

LintzA, Dec 3, 2021, 2:24 PM
37 points
3 comments · 1 min read · EA link

Some cruxes on impactful alternatives to AI policy work

richard_ngo, Nov 22, 2018, 1:43 PM
28 points
2 comments · 12 min read · EA link

International cooperation as a tool to reduce two existential risks.

johl@umich.edu, Apr 19, 2021, 4:51 PM
28 points
4 comments · 23 min read · EA link

Stuart Russell Human Compatible AI Roundtable with Allan Dafoe, Rob Reich, & Marietje Schaake

Mahendra Prasad, Feb 11, 2021, 7:43 AM
16 points
0 comments · 1 min read · EA link

CSER and FHI advice to UN High-level Panel on Digital Cooperation

HaydnBelfield, Mar 8, 2019, 8:39 PM
22 points
7 comments · 6 min read · EA link
(www.cser.ac.uk)

A new proposal for regulating AI in the EU

EdoArad, Apr 26, 2021, 5:25 PM
37 points
3 comments · 1 min read · EA link
(www.bbc.com)

Singapore AI Policy Career Guide

Yi-Yang, Jan 21, 2021, 3:05 AM
28 points
0 comments · 5 min read · EA link

Possible directions in AI ideal governance research

RoryG, Aug 10, 2022, 8:36 AM
5 points
0 comments · 3 min read · EA link

Concrete actionable policies relevant to AI safety (written 2019)

weeatquince, Dec 16, 2022, 6:41 PM
48 points
0 comments · 22 min read · EA link

AMA: The new Open Philanthropy Technology Policy Fellowship

lukeprog, Jul 26, 2021, 3:11 PM
38 points
14 comments · 1 min read · EA link

I’m Cullen O’Keefe, a Policy Researcher at OpenAI, AMA

Cullen 🔸, Jan 11, 2020, 4:13 AM
45 points
68 comments · 1 min read · EA link

Discussion with Eliezer Yudkowsky on AGI interventions

RobBensinger, Nov 11, 2021, 3:21 AM
60 points
33 comments · 34 min read · EA link

How Open Source Machine Learning Software Shapes AI

Max L, Sep 28, 2022, 5:49 PM
11 points
3 comments · 15 min read · EA link
(maxlangenkamp.me)

GovAI Annual Report 2021

GovAI, Jan 5, 2022, 4:57 PM
52 points
2 comments · 9 min read · EA link

Credo AI is hiring!

IanEisenberg, Mar 3, 2022, 6:02 PM
16 points
6 comments · 4 min read · EA link

A collection of AI Governance-related Podcasts, Newsletters, Blogs, and more

LintzA, Oct 2, 2021, 12:46 AM
24 points
1 comment · 1 min read · EA link

FHI Report: How Will National Security Considerations Affect Antitrust Decisions in AI? An Examination of Historical Precedents

Cullen 🔸, Jul 28, 2020, 6:33 PM
13 points
0 comments · 1 min read · EA link
(www.fhi.ox.ac.uk)

Nine Points of Collective Insanity

Remmelt, Dec 27, 2022, 3:14 AM
1 point
0 comments · 1 min read · EA link

A course for the general public on AI

LeandroD, Aug 31, 2020, 1:29 AM
1 point
0 comments · 1 min read · EA link

Appendix to Bridging Demonstration

mako yass, Jun 1, 2022, 8:30 PM
18 points
2 comments · 28 min read · EA link

EU’s importance for AI governance is conditional on AI trajectories—a case study

MathiasKB🔸, Jan 13, 2022, 2:58 PM
31 points
2 comments · 3 min read · EA link

Strategic Perspectives on Transformative AI Governance: Introduction

MMMaas, Jul 2, 2022, 11:20 AM
115 points
18 comments · 4 min read · EA link

Does generality pay? GPT-3 can provide preliminary evidence.

Eevee🔹, Jul 12, 2020, 6:53 PM
21 points
4 comments · 2 min read · EA link

Main paths to impact in EU AI Policy

JOMG_Monnet, Dec 8, 2022, 4:17 PM
69 points
2 comments · 8 min read · EA link

Book review: Architects of Intelligence by Martin Ford (2018)

Ofer, Aug 11, 2020, 5:24 PM
11 points
1 comment · 2 min read · EA link

[Link and commentary] Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society

MichaelA🔸, Mar 14, 2020, 9:04 AM
18 points
0 comments · 6 min read · EA link

[Link post] Coordination challenges for preventing AI conflict

stefan.torges, Mar 9, 2021, 9:39 AM
58 points
0 comments · 1 min read · EA link
(longtermrisk.org)

The ‘Old AI’: Lessons for AI governance from early electricity regulation

Sam Clarke, Dec 19, 2022, 2:46 AM
58 points
1 comment · 13 min read · EA link

FLI is hiring a new Director of US Policy

aaguirre, Jul 27, 2022, 12:07 AM
14 points
0 comments · 1 min read · EA link

Here are the finalists from FLI’s $100K Worldbuilding Contest

Jackson Wagner, Jun 6, 2022, 6:42 PM
44 points
5 comments · 2 min read · EA link

AI Benefits Post 4: Outstanding Questions on Selecting Benefits

Cullen 🔸, Jul 14, 2020, 5:24 PM
6 points
0 comments · 5 min read · EA link

Why we need a new agency to regulate advanced artificial intelligence

Michael Huang, Aug 4, 2022, 1:38 PM
25 points
0 comments · 1 min read · EA link
(www.brookings.edu)

The Slippery Slope from DALLE-2 to Deepfake Anarchy

stecas, Nov 5, 2022, 2:47 PM
55 points
11 comments · 17 min read · EA link

[Podcast] Ajeya Cotra on worldview diversification and how big the future could be

Eevee🔹, Jan 22, 2021, 11:57 PM
57 points
20 comments · 1 min read · EA link
(80000hours.org)

Good policy ideas that won’t happen (yet)

Niel_Bowerman, Sep 11, 2014, 12:29 PM
28 points
8 comments · 14 min read · EA link

Christiano, Cotra, and Yudkowsky on AI progress

Ajeya, Nov 25, 2021, 4:30 PM
18 points
6 comments · 68 min read · EA link

Antitrust-Compliant AI Industry Self-Regulation

Cullen 🔸, Jul 7, 2020, 8:52 PM
26 points
1 comment · 1 min read · EA link
(cullenokeefe.com)

Will the EU regulations on AI matter to the rest of the world?

hanadulset, Jan 1, 2022, 9:56 PM
33 points
5 comments · 5 min read · EA link

Institutions Cannot Restrain Dark-Triad AI Exploitation

Remmelt, Dec 27, 2022, 10:34 AM
8 points
0 comments · 1 min read · EA link

Personal thoughts on careers in AI policy and strategy

carrickflynn, Sep 27, 2017, 4:52 PM
56 points
28 comments · 18 min read · EA link

Slowing down AI progress is an underexplored alignment strategy

Michael Huang, Jul 13, 2022, 3:22 AM
92 points
11 comments · 3 min read · EA link
(www.lesswrong.com)

Three pillars for avoiding AGI catastrophe: Technical alignment, deployment decisions, and coordination

LintzA, Aug 3, 2022, 9:24 PM
93 points
4 comments · 11 min read · EA link

[Fiction] Improved Governance on the Critical Path to AI Alignment by 2045.

Jackson Wagner, May 18, 2022, 3:50 PM
20 points
1 comment · 12 min read · EA link

Components of Strategic Clarity [Strategic Perspectives on Long-term AI Governance, #2]

MMMaas, Jul 2, 2022, 11:22 AM
66 points
0 comments · 6 min read · EA link

[Question] How strong is the evidence of unaligned AI systems causing harm?

Eevee🔹, Jul 21, 2020, 4:08 AM
31 points
1 comment · 1 min read · EA link

Survey on AI existential risk scenarios

Sam Clarke, Jun 8, 2021, 5:12 PM
154 points
11 comments · 6 min read · EA link

How might we align transformative AI if it’s developed very soon?

Holden Karnofsky, Aug 29, 2022, 3:48 PM
163 points
17 comments · 44 min read · EA link

What does it mean to become an expert in AI Hardware?

Toph, Jan 9, 2021, 4:15 AM
87 points
10 comments · 11 min read · EA link

[link] Centre for the Governance of AI 2020 Annual Report

MarkusAnderljung, Jan 14, 2021, 10:23 AM
11 points
5 comments · 1 min read · EA link

Truthful AI

Owen Cotton-Barratt, Oct 20, 2021, 3:11 PM
55 points
14 comments · 10 min read · EA link

‘Artificial Intelligence Governance under Change’ (PhD dissertation)

MMMaas, Sep 15, 2022, 12:10 PM
54 points
1 comment · 2 min read · EA link
(drive.google.com)

FHI Report: Stable Agreements in Turbulent Times

Cullen 🔸, Feb 21, 2019, 5:12 PM
25 points
2 comments · 4 min read · EA link
(www.fhi.ox.ac.uk)

Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain

kokotajlod, Jan 18, 2021, 12:39 PM
27 points
2 comments · 1 min read · EA link

The Pugwash Conferences and the Anti-Ballistic Missile Treaty as a case study of Track II diplomacy

rani_martin, Sep 16, 2022, 10:42 AM
82 points
5 comments · 27 min read · EA link

Quick Thoughts on A.I. Governance

Nicholas / Heather Kross, Apr 30, 2022, 2:49 PM
43 points
0 comments · 2 min read · EA link
(www.thinkingmuchbetter.com)

The Importance of Artificial Sentience

Jamie_Harris, Mar 3, 2021, 5:17 PM
70 points
10 comments · 11 min read · EA link
(www.sentienceinstitute.org)

[Link] Center for the Governance of AI (GovAI) Annual Report 2018

MarkusAnderljung, Dec 21, 2018, 4:17 PM
24 points
0 comments · 1 min read · EA link

New Sequence—Towards a worldwide, watertight Windfall Clause

John Bridge 🔸, Apr 7, 2022, 3:02 PM
25 points
4 comments · 8 min read · EA link

NIST AI Risk Management Framework request for information (RFI)

Aryeh Englander, Aug 31, 2021, 10:24 PM
7 points
0 comments · 2 min read · EA link

How Europe might matter for AI governance

stefan.torges, Jul 12, 2019, 11:42 PM
52 points
13 comments · 8 min read · EA link

Wentworth and Larsen on buying time

Akash, Jan 9, 2023, 9:31 PM
48 points
0 comments · 1 min read · EA link

Shulman and Yudkowsky on AI progress

CarlShulman, Dec 4, 2021, 11:37 AM
46 points
0 comments · 20 min read · EA link

MIRI Conversations: Technology Forecasting & Gradualism (Distillation)

TheMcDouglas, Jul 13, 2022, 10:45 AM
27 points
9 comments · 19 min read · EA link

Announcing Epoch: A research organization investigating the road to Transformative AI

Jaime Sevilla, Jun 27, 2022, 1:39 PM
183 points
11 comments · 2 min read · EA link
(epochai.org)

“Slower tech development” can be about ordering, gradualness, or distance from now

MichaelA🔸, Nov 14, 2021, 8:58 PM
47 points
3 comments · 4 min read · EA link

AI impacts and Paul Christiano on takeoff speeds

Crosspost, Mar 2, 2018, 11:16 AM
4 points
0 comments · 1 min read · EA link

Continuity Assumptions

Jan_Kulveit, Jun 13, 2022, 9:36 PM
44 points
4 comments · 4 min read · EA link
(www.alignmentforum.org)

Vignettes Workshop (AI Impacts)

kokotajlod, Jun 15, 2021, 11:02 AM
43 points
5 comments · 1 min read · EA link

Epoch is hiring a Research Data Analyst

merilalama, Nov 22, 2022, 5:34 PM
21 points
0 comments · 4 min read · EA link
(careers.rethinkpriorities.org)

Heretical Thoughts on AI | Eli Dourado

𝕮𝖎𝖓𝖊𝖗𝖆, Jan 19, 2023, 4:11 PM
142 points
15 comments · 1 min read · EA link

Animal Rights, The Singularity, and Astronomical Suffering

sapphire, Aug 20, 2020, 8:23 PM
51 points
0 comments · 3 min read · EA link

Paul Christiano on how OpenAI is developing real solutions to the ‘AI alignment problem’, and his vision of how humanity will progressively hand over decision-making to AI systems

80000_Hours, Oct 2, 2018, 11:49 AM
6 points
0 comments · 185 min read · EA link

Opportunities for individual donors in AI safety

alexflint, Mar 12, 2018, 2:10 AM
13 points
11 comments · 10 min read · EA link

AGI risk: analogies & arguments

technicalities, Mar 23, 2021, 1:18 PM
31 points
3 comments · 8 min read · EA link
(www.gleech.org)

Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22]

Habryka, Nov 3, 2021, 6:20 PM
140 points
6 comments · 1 min read · EA link

SERI ML Alignment Theory Scholars Program 2022

Ryan Kidd, Apr 27, 2022, 4:33 PM
57 points
2 comments · 3 min read · EA link

Michael Page, Dario Amodei, Helen Toner, Tasha McCauley, Jan Leike, & Owen Cotton-Barratt: Musings on AI

EA Global, Aug 11, 2017, 8:19 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

Informatica: Special Issue on Superintelligence

RyanCarey, May 3, 2017, 5:05 AM
7 points
0 comments · 2 min read · EA link

Long-Term Future Fund: Ask Us Anything!

AdamGleave, Dec 3, 2020, 1:44 PM
89 points
153 comments · 1 min read · EA link

AI Forecasting Dictionary (Forecasting infrastructure, part 1)

terraform, Aug 8, 2019, 1:16 PM
18 points
0 comments · 5 min read · EA link

AI Impacts: Historic trends in technological progress

Aaron Gertler 🔸, Feb 12, 2020, 12:08 AM
55 points
5 comments · 3 min read · EA link

Apply to the second ML for Alignment Bootcamp (MLAB 2) in Berkeley [Aug 15 - Fri Sept 2]

Buck, May 6, 2022, 12:19 AM
111 points
7 comments · 6 min read · EA link

Asya Bergal: Reasons you might think human-level AI is unlikely to happen soon

EA Global, Aug 26, 2020, 4:01 PM
24 points
2 comments · 17 min read · EA link
(www.youtube.com)

Why the Orthogonality Thesis’s veracity is not the point:

Antoine de Scorraille ⏸️, Jul 23, 2020, 3:40 PM
3 points
0 comments · 3 min read · EA link

[Question] What are the top priorities in a slow-takeoff, multipolar world?

JP Addison🔸, Aug 25, 2021, 8:47 AM
26 points
9 comments · 1 min read · EA link

Ought: why it matters and ways to help

Paul_Christiano, Jul 26, 2019, 1:56 AM
52 points
5 comments · 5 min read · EA link

Two reasons we might be closer to solving alignment than it seems

Kat Woods, Sep 24, 2022, 5:38 PM
44 points
17 comments · 4 min read · EA link

Three kinds of competitiveness

AI Impacts, Apr 2, 2020, 3:46 AM
10 points
0 comments · 5 min read · EA link
(aiimpacts.org)

Critique of Superintelligence Part 5

James Fodor, Dec 13, 2018, 5:19 AM
12 points
2 comments · 6 min read · EA link

[Question] Who would you have on your dream team for solving AGI Alignment?

Greg_Colbourn, Aug 25, 2022, 1:34 PM
10 points
14 comments · 1 min read · EA link

$500 bounty for alignment contest ideas

Akash, Jun 30, 2022, 1:55 AM
18 points
1 comment · 2 min read · EA link

BERI is hiring an ML Software Engineer

sawyer🔸, Nov 10, 2021, 7:36 PM
17 points
2 comments · 1 min read · EA link

AI views and disagreements AMA: Christiano, Ngo, Shah, Soares, Yudkowsky

RobBensinger, Mar 1, 2022, 1:13 AM
30 points
4 comments · 1 min read · EA link
(www.lesswrong.com)

The alignment problem from a deep learning perspective

richard_ngo, Aug 11, 2022, 3:18 AM
58 points
0 comments · 26 min read · EA link

Announcing AI Safety Support

Linda Linsefors, Nov 19, 2020, 8:19 PM
55 points
0 comments · 4 min read · EA link

Ngo and Yudkowsky on alignment difficulty

richard_ngo, Nov 15, 2021, 10:47 PM
71 points
13 comments · 94 min read · EA link

Long-Term Future Fund: April 2019 grant recommendations

Habryka, Apr 23, 2019, 7:00 AM
142 points
242 comments · 46 min read · EA link

Conjecture: Internal Infohazard Policy

Connor Leahy, Jul 29, 2022, 7:35 PM
34 points
3 comments · 19 min read · EA link

PIBBSS Fellowship: Bounty for Referrals & Deadline Extension

Anna_Gajdova, Jan 17, 2022, 4:23 PM
17 points
7 comments · 1 min read · EA link

The Vitalik Buterin Fellowship in AI Existential Safety is open for applications!

Cynthia Chen, Oct 14, 2022, 3:23 AM
38 points
0 comments · 2 min read · EA link

It’s (not) how you use it

Eleni_A, Sep 7, 2022, 1:28 PM
6 points
3 comments · 2 min read · EA link

[Question] Brief summary of key disagreements in AI Risk

Aryeh Englander, Dec 26, 2019, 7:40 PM
31 points
3 comments · 1 min read · EA link

AGI in a vulnerable world

AI Impacts, Apr 2, 2020, 3:43 AM
17 points
0 comments · 1 min read · EA link
(aiimpacts.org)

Deception as the optimal: mesa-optimizers and inner alignment

Eleni_A, Aug 16, 2022, 3:45 AM
19 points
0 comments · 5 min read · EA link

Twitter-length responses to 24 AI alignment arguments

RobBensinger, Mar 14, 2022, 7:34 PM
67 points
17 comments · 8 min read · EA link

Short-Term AI Alignment as a Priority Cause

len.hoang.lnh, Feb 11, 2020, 4:22 PM
17 points
11 comments · 7 min read · EA link

Alignment Newsletter One Year Retrospective

Rohin Shah, Apr 10, 2019, 7:00 AM
62 points
22 comments · 21 min read · EA link

AMA: Ajeya Cotra, researcher at Open Phil

Ajeya, Jan 28, 2021, 5:38 PM
84 points
105 comments · 1 min read · EA link

We Ran an AI Timelines Retreat

Lenny McCline, May 17, 2022, 4:40 AM
46 points
6 comments · 3 min read · EA link

The Tree of Life: Stanford AI Alignment Theory of Change

GabeM, Jul 2, 2022, 6:32 PM
69 points
5 comments · 14 min read · EA link

A list of good heuristics that the case for AI X-risk fails

Aaron Gertler 🔸, Jul 16, 2020, 9:56 AM
25 points
9 comments · 2 min read · EA link
(www.alignmentforum.org)

Is GPT-3 the death of the paperclip maximizer?

matthias_samwald, Aug 3, 2020, 11:34 AM
4 points
1 comment · 1 min read · EA link

New report on how much computational power it takes to match the human brain (Open Philanthropy)

Aaron Gertler 🔸, Sep 15, 2020, 1:06 AM
45 points
1 comment · 18 min read · EA link
(www.openphilanthropy.org)

Alignment is hard. Communicating that, might be harder

Eleni_A, Sep 1, 2022, 11:45 AM
17 points
1 comment · 3 min read · EA link

AI Safety Needs Great Engineers

Andy Jones, Nov 23, 2021, 9:03 PM
98 points
13 comments · 4 min read · EA link

An ML safety insurance company—shower thoughts

EdoArad, Oct 18, 2021, 7:45 AM
15 points
4 comments · 1 min read · EA link

LessWrong is now a book, available for pre-order!

terraform, Dec 4, 2020, 8:42 PM
48 points
1 comment · 10 min read · EA link

Increased Availability and Willingness for Deployment of Resources for Effective Altruism and Long-Termism

Evan_Gaensbauer, Dec 29, 2021, 8:20 PM
46 points
1 comment · 2 min read · EA link

“Existential risk from AI” survey results

RobBensinger, Jun 1, 2021, 8:19 PM
80 points
35 comments · 11 min read · EA link

Alignment’s phlogiston

Eleni_A, Aug 18, 2022, 1:41 AM
18 points
1 comment · 2 min read · EA link

7 essays on Building a Better Future

Jamie_Harris, Jun 24, 2022, 2:28 PM
21 points
0 comments · 2 min read · EA link

Some promising career ideas beyond 80,000 Hours’ priority paths

Arden Koehler, Jun 26, 2020, 10:34 AM
142 points
28 comments · 15 min read · EA link

Technical AGI safety research outside AI

richard_ngo, Oct 18, 2019, 3:02 PM
91 points
5 comments · 3 min read · EA link

A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res, Jun 15, 2022, 2:19 PM
53 points
2 comments · 10 min read · EA link

SERI ML application deadline is extended until May 22.

Viktoria Malyasova, May 22, 2022, 12:13 AM
13 points
3 comments · 1 min read · EA link

Messy personal stuff that affected my cause prioritization (or: how I started to care about AI safety)

Julia_Wise🔸, May 5, 2022, 5:59 PM
265 points
14 comments · 2 min read · EA link

Being an individual alignment grantmaker

A_donor, Feb 28, 2022, 4:39 PM
34 points
20 comments · 2 min read · EA link

[Question] How do you talk about AI safety?

Eevee🔹, Apr 19, 2020, 4:15 PM
10 points
5 comments · 1 min read · EA link

[Question] What is most confusing to you about AI stuff?

Sam Clarke, Nov 23, 2021, 4:00 PM
25 points
15 comments · 1 min read · EA link

How do takeoff speeds affect the probability of bad outcomes from AGI?

KR, Jul 7, 2020, 5:53 PM
18 points
0 comments · 8 min read · EA link

[Question] Is working on AI safety as dangerous as ignoring it?

jkmh, Sep 20, 2021, 11:06 PM
10 points
5 comments · 1 min read · EA link

[Question] Is it crunch time yet? If so, who can help?

Nicholas / Heather Kross, Oct 13, 2021, 4:11 AM
29 points
9 comments · 1 min read · EA link

Sydney AI Safety Fellowship

Chris Leong, Dec 2, 2021, 7:35 AM
16 points
0 comments · 2 min read · EA link

Jan Leike, Helen Toner, Malo Bourgon, and Miles Brundage: Working in AI

EA Global, Aug 11, 2017, 8:19 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

The academic contribution to AI safety seems large

technicalities, Jul 30, 2020, 10:30 AM
117 points
28 comments · 9 min read · EA link

“Intro to brain-like-AGI safety” series—halfway point!

Steven Byrnes, Mar 9, 2022, 3:21 PM
8 points
0 comments · 2 min read · EA link

7 traps that (we think) new alignment researchers often fall into

Akash, Sep 27, 2022, 11:13 PM
73 points
8 comments · 1 min read · EA link

[Linkpost] How To Get Into Independent Research On Alignment/Agency

Jackson Wagner, Feb 14, 2022, 9:40 PM
10 points
0 comments · 1 min read · EA link

The first AI Safety Camp & onwards

Remmelt, Jun 7, 2018, 6:49 PM
25 points
2 comments · 8 min read · EA link

Connor Leahy on Conjecture and Dying with Dignity

Michaël Trazzi, Jul 22, 2022, 7:30 PM
34 points
0 comments · 10 min read · EA link
(theinsideview.ai)

2020 AI Alignment Literature Review and Charity Comparison

Larks, Dec 21, 2020, 3:25 PM
155 points
16 comments · 68 min read · EA link

2018 AI Alignment Literature Review and Charity Comparison

Larks, Dec 18, 2018, 4:48 AM
118 points
28 comments · 63 min read · EA link

[AN #80]: Why AI risk might be solved without additional intervention from longtermists

Rohin Shah, Jan 3, 2020, 7:52 AM
58 points
12 comments · 10 min read · EA link
(www.alignmentforum.org)

Consider paying me to do AI safety research work

Rupert, Nov 5, 2020, 8:09 AM
11 points
3 comments · 2 min read · EA link

The heterogeneity of human value types: Implications for AI alignment

Geoffrey Miller, Sep 16, 2022, 9:21 PM
27 points
2 comments · 10 min read · EA link

Sharing the World with Digital Minds

Aaron Gertler 🔸, Dec 1, 2020, 8:00 AM
12 points
1 comment · 1 min read · EA link
(www.nickbostrom.com)

Christiano and Yudkowsky on AI predictions and human intelligence

EliezerYudkowsky, Feb 23, 2022, 4:51 PM
31 points
0 comments · 42 min read · EA link

[Creative Writing Contest] Metal or Mortal

Louis, Oct 16, 2021, 4:24 PM
7 points
0 comments · 7 min read · EA link

Defusing AGI Danger

Mark Xu, Dec 24, 2020, 11:08 PM
23 points
0 comments · 2 min read · EA link
(www.alignmentforum.org)

Quantifying the Far Future Effects of Interventions

MichaelDickens, May 18, 2016, 2:15 AM
8 points
0 comments · 11 min read · EA link

2019 AI Alignment Literature Review and Charity Comparison

Larks, Dec 19, 2019, 2:58 AM
147 points
28 comments · 62 min read · EA link

A mesa-optimization perspective on AI valence and moral patienthood

jacobpfau, Sep 9, 2021, 10:23 PM
10 points
18 comments · 17 min read · EA link

Critique of Superintelligence Part 2

James Fodor, Dec 13, 2018, 5:12 AM
10 points
12 comments · 7 min read · EA link

EA megaprojects continued

mariushobbhahn, Dec 3, 2021, 10:33 AM
183 points
48 comments · 7 min read · EA link

Skilling-up in ML Engineering for Alignment: request for comments

TheMcDouglas, Apr 24, 2022, 6:40 AM
8 points
0 comments · 1 min read · EA link

Visible Thoughts Project and Bounty Announcement

So8res, Nov 30, 2021, 12:35 AM
35 points
2 comments · 13 min read · EA link

[Creative Writing Contest] The Puppy Problem

Louis, Oct 13, 2021, 2:01 PM
13 points
0 comments · 7 min read · EA link

Public-facing Censorship Is Safety Theater, Causing Reputational Damage

Yitz, Sep 23, 2022, 5:08 AM
49 points
7 comments · 1 min read · EA link

A conversation with Rohin Shah

AI Impacts, Nov 12, 2019, 1:31 AM
27 points
8 comments · 33 min read · EA link
(aiimpacts.org)

European Master’s Programs in Machine Learning, Artificial Intelligence, and related fields

Master Programs ML/AI, Jan 17, 2021, 8:09 PM
17 points
4 comments · 1 min read · EA link

[Extended Deadline: Jan 23rd] Announcing the PIBBSS Summer Research Fellowship

nora, Dec 18, 2021, 4:54 PM
36 points
1 comment · 1 min read · EA link

Critique of Superintelligence Part 4

James Fodor, Dec 13, 2018, 5:14 AM
4 points
2 comments · 4 min read · EA link

[Question] How would a language model become goal-directed?

David M, Jul 16, 2022, 2:50 PM
113 points
20 comments · 1 min read · EA link

Shah and Yudkowsky on alignment failures

EliezerYudkowsky, Feb 28, 2022, 7:25 PM
38 points
7 comments · 92 min read · EA link

DeepMind is hiring for the Scalable Alignment and Alignment Teams

Rohin Shah, May 13, 2022, 12:19 PM
102 points
0 comments · 9 min read · EA link

Changes in funding in the AI safety field

Sebastian_Farquhar, Feb 3, 2017, 1:09 PM
34 points
10 comments · 7 min read · EA link

AGI safety from first principles

richard_ngo, Oct 21, 2020, 5:42 PM
77 points
10 comments · 3 min read · EA link
(www.alignmentforum.org)

“Taking AI Risk Seriously” – Thoughts by Andrew Critch

Raemon, Nov 19, 2018, 2:21 AM
26 points
9 comments · 1 min read · EA link
(www.lesswrong.com)

[Question] Why should we *not* put effort into AI safety research?

Ben Thompson, May 16, 2021, 5:11 AM
15 points
5 comments · 1 min read · EA link

Andrew Critch: Logical induction — progress in AI alignment

EA Global, Aug 6, 2016, 12:40 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

On how various plans miss the hard bits of the alignment challenge

So8res, Jul 12, 2022, 5:35 AM
126 points
13 comments · 29 min read · EA link

[Question] Do EA folks want AGI at all?

Noah Scales, Jul 16, 2022, 5:44 AM
8 points
10 comments · 1 min read · EA link

You Understand AI Alignment and How to Make Soup

Leen Armoush, May 28, 2022, 6:22 AM
0 points
2 comments · 5 min read · EA link

A response to Matthews on AI Risk

RyanCarey, Aug 11, 2015, 12:58 PM
11 points
16 comments · 6 min read · EA link

[Question] Are social media algorithms an existential risk?

Barry Grimes, Sep 15, 2020, 8:52 AM
24 points
13 comments · 1 min read · EA link

The role of academia in AI Safety.

PabloAMC 🔸, Mar 28, 2022, 12:04 AM
71 points
19 comments · 3 min read · EA link

AMA or discuss my 80K podcast episode: Ben Garfinkel, FHI researcher

bgarfinkel, Jul 13, 2020, 4:17 PM
87 points
140 comments · 1 min read · EA link

Predict responses to the “existential risk from AI” survey

RobBensinger, May 28, 2021, 1:38 AM
36 points
8 comments · 2 min read · EA link

Analysis of AI Safety surveys for field-building insights

Ash Jafari, Dec 5, 2022, 5:37 PM
30 points
7 comments · 5 min read · EA link

[Question] Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

Aryeh Englander, Apr 17, 2022, 4:08 PM
15 points
9 comments · 1 min read · EA link

Steering AI to care for animals, and soon

Andrew Critch, Jun 14, 2022, 1:13 AM
224 points
37 comments · 1 min read · EA link

On Deference and Yudkowsky’s AI Risk Estimates

bgarfinkel, Jun 19, 2022, 2:35 PM
285 points
194 comments · 17 min read · EA link

Owain Evans and Victoria Krakovna: Careers in technical AI safety

EA Global, Nov 3, 2017, 7:43 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

[Question] Is a career in making AI systems more secure a meaningful way to mitigate the X-risk posed by AGI?

Kyle O’Brien, Feb 13, 2022, 7:05 AM
14 points
4 comments · 1 min read · EA link

[Question] Why not to solve alignment by making superintelligent humans?

Pato, Oct 16, 2022, 9:26 PM
9 points
12 comments · 1 min read · EA link

[Link] How understanding valence could help make future AIs safer

Milan Griffes, Oct 8, 2020, 6:53 PM
22 points
2 comments · 3 min read · EA link

My current thoughts on MIRI’s “highly reliable agent design” work

Daniel_Dewey, Jul 7, 2017, 1:17 AM
60 points
59 comments · 19 min read · EA link

Enabling more feedback

JJ Hepburn, Dec 10, 2021, 6:52 AM
41 points
3 comments · 3 min read · EA link

Redwood Research is hiring for several roles

Jack R, Nov 29, 2021, 12:18 AM
75 points
0 comments · 1 min read · EA link

‘Force multipliers’ for EA research

Craig Drayton, Jun 18, 2022, 1:39 PM
18 points
7 comments · 4 min read · EA link

The case for becoming a black-box investigator of language models

Buck, May 6, 2022, 2:37 PM
90 points
7 comments · 3 min read · EA link

Draft report on AI timelines

Ajeya, Dec 15, 2020, 12:10 PM
35 points
0 comments · 1 min read · EA link
(alignmentforum.org)

Forecasting Transformative AI: What Kind of AI?

Holden Karnofsky, Aug 10, 2021, 9:38 PM
62 points
3 comments · 10 min read · EA link

What Should the Average EA Do About AI Alignment?

Raemon, Feb 25, 2017, 8:07 PM
42 points
39 comments · 7 min read · EA link

2017 AI Safety Literature Review and Charity Comparison

Larks, Dec 20, 2017, 9:54 PM
43 points
17 comments · 23 min read · EA link

Disagreements about Alignment: Why, and how, we should try to solve them

ojorgensen, Aug 8, 2022, 10:32 PM
16 points
6 comments · 16 min read · EA link

New Speaker Series on AI Alignment Starting March 3

Zechen Zhang, Feb 26, 2022, 10:58 AM
5 points
0 comments · 1 min read · EA link

Critique of Superintelligence Part 1

James Fodor, Dec 13, 2018, 5:10 AM
22 points
13 comments · 8 min read · EA link

[Closed] Prize and fast track to alignment research at ALTER

Vanessa, Sep 18, 2022, 9:15 AM
38 points
0 comments · 3 min read · EA link

AI Alignment 2018-2019 Review

Habryka, Jan 28, 2020, 9:14 PM
28 points
0 comments · 6 min read · EA link
(www.lesswrong.com)

From language to ethics by automated reasoning

Michele Campolo, Nov 21, 2021, 3:16 PM
8 points
0 comments · 6 min read · EA link

Mahendra Prasad: Rational group decision-making

EA Global, Jul 8, 2020, 3:06 PM
15 points
0 comments · 16 min read · EA link
(www.youtube.com)

We should expect to worry more about speculative risks

bgarfinkel, May 29, 2022, 9:08 PM
120 points
14 comments · 3 min read · EA link

Buck Shlegeris: How I think students should orient to AI safety

EA Global, Oct 25, 2020, 5:48 AM
11 points
0 comments · 1 min read · EA link
(www.youtube.com)

Promoting compassionate longtermism

jonleighton, Dec 7, 2022, 2:26 PM
117 points
5 comments · 12 min read · EA link

Meditations on careers in AI Safety

PabloAMC 🔸, Mar 23, 2022, 10:00 PM
88 points
30 comments · 2 min read · EA link

Conversation on AI risk with Adam Gleave

AI Impacts, Dec 27, 2019, 9:43 PM
18 points
3 comments · 4 min read · EA link
(aiimpacts.org)

There are two factions working to prevent AI dangers. Here’s why they’re deeply divided.

Sharmake, Aug 10, 2022, 7:52 PM
10 points
0 comments · 4 min read · EA link
(www.vox.com)

Atari early

AI Impacts, Apr 2, 2020, 11:28 PM
34 points
2 comments · 5 min read · EA link
(aiimpacts.org)

I’m Buck Shlegeris, I do research and outreach at MIRI, AMA

Buck, Nov 15, 2019, 10:44 PM
123 points
228 comments · 2 min read · EA link

[Question] Why aren’t you freaking out about OpenAI? At what point would you start?

AppliedDivinityStudies, Oct 10, 2021, 1:06 PM
80 points
22 comments · 2 min read · EA link

My Overview of the AI Alignment Landscape: A Bird’s Eye View

Neel Nanda, Dec 15, 2021, 11:46 PM
45 points
15 comments · 16 min read · EA link
(www.alignmentforum.org)

AI Safety: Applying to Graduate Studies

frances_lorenz, Dec 15, 2021, 10:56 PM
23 points
0 comments · 12 min read · EA link

FLI AI Alignment podcast: Evan Hubinger on Inner Alignment, Outer Alignment, and Proposals for Building Safe Advanced AI

evhub, Jul 1, 2020, 8:59 PM
13 points
2 comments · 1 min read · EA link
(futureoflife.org)

How to build a safe advanced AI (Evan Hubinger) | What’s up in AI safety? (Asya Bergal)

EA Global, Oct 25, 2020, 5:48 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

Can we simulate human evolution to create a somewhat aligned AGI?

Thomas Kwa, Mar 29, 2022, 1:23 AM
19 points
0 comments · 7 min read · EA link

Working at EA organizations series: Machine Intelligence Research Institute

SoerenMind, Nov 1, 2015, 12:49 PM
8 points
0 comments · 4 min read · EA link

AI alignment prize winners and next round [link]

RyanCarey, Jan 20, 2018, 12:07 PM
7 points
1 comment · 1 min read · EA link

Student project for engaging with AI alignment

Per Ivar Friborg, May 9, 2022, 10:44 AM
35 points
1 comment · 1 min read · EA link

[Question] Is transformative AI the biggest existential risk? Why or why not?

Eevee🔹, Mar 5, 2022, 3:54 AM
9 points
10 comments · 1 min read · EA link

Ngo and Yudkowsky on scientific reasoning and pivotal acts

EliezerYudkowsky, Feb 21, 2022, 5:00 PM
33 points
1 comment · 35 min read · EA link

Critique of Superintelligence Part 3

James Fodor, Dec 13, 2018, 5:13 AM
3 points
5 comments · 7 min read · EA link

Metaculus is building a team dedicated to AI forecasting

christian, Oct 18, 2022, 4:08 PM
35 points
0 comments · 1 min read · EA link
(apply.workable.com)

E.A. Megaproject Ideas

Tomer_Goloboy, Mar 21, 2022, 1:23 AM
15 points
4 comments · 4 min read · EA link

LW4EA: Some cruxes on impactful alternatives to AI policy work

Jeremy, May 17, 2022, 3:05 AM
11 points
1 comment · 1 min read · EA link
(www.lesswrong.com)

Summaries: Alignment Fundamentals Curriculum

Leon_Lang, Sep 19, 2022, 3:43 PM
25 points
1 comment · 1 min read · EA link
(docs.google.com)

Why AI is Harder Than We Think—Melanie Mitchell

Eevee🔹, Apr 28, 2021, 8:19 AM
45 points
7 comments · 2 min read · EA link
(arxiv.org)

Ought’s theory of change

stuhlmueller, Apr 12, 2022, 12:09 AM
43 points
4 comments · 3 min read · EA link

[Question] What considerations influence whether I have more influence over short or long timelines?

kokotajlod, Nov 5, 2020, 7:57 PM
18 points
0 comments · 1 min read · EA link

[Question] Donating against Short Term AI risks

Jan-Willem, Nov 16, 2020, 12:23 PM
6 points
10 comments · 1 min read · EA link

How Do AI Timelines Affect Giving Now vs. Later?

MichaelDickens, Aug 3, 2021, 3:36 AM
36 points
8 comments · 8 min read · EA link

Preserving and continuing alignment research through a severe global catastrophe

A_donor, Mar 6, 2022, 6:43 PM
40 points
11 comments · 5 min read · EA link

Database of existential risk estimates

MichaelA🔸, Apr 15, 2020, 12:43 PM
130 points
37 comments · 5 min read · EA link

Long-Term Future Fund: May 2021 grant recommendations

abergal, May 27, 2021, 6:44 AM
110 points
17 comments · 57 min read · EA link

AGI Safety Communications Initiative

Ines, Jun 11, 2022, 4:30 PM
35 points
6 comments · 1 min read · EA link

What Should We Optimize—A Conversation

Johannes C. Mayer, Apr 7, 2022, 2:48 PM
1 point
0 comments · 14 min read · EA link

What can the principal-agent literature tell us about AI risk?

ac, Feb 10, 2020, 10:10 AM
26 points
1 comment · 16 min read · EA link

[Question] I’m interviewing Max Tegmark about AI safety and more. What should I ask him?

Robert_Wiblin, May 13, 2022, 3:32 PM
18 points
2 comments · 1 min read · EA link

Amanda Askell: AI safety needs social scientists

EA Global, Mar 4, 2019, 3:50 PM
27 points
0 comments · 18 min read · EA link
(www.youtube.com)

On presenting the case for AI risk

Aryeh Englander, Mar 8, 2022, 9:37 PM
114 points
12 comments · 4 min read · EA link

AGI Predictions

Pablo, Nov 21, 2020, 12:02 PM
36 points
0 comments · 1 min read · EA link
(www.lesswrong.com)

Why AI alignment could be hard with modern deep learning

Ajeya, Sep 21, 2021, 3:35 PM
153 points
17 comments · 14 min read · EA link
(www.cold-takes.com)

[Question] Can we convince people to work on AI safety without convincing them about AGI happening this century?

BrianTan, Nov 26, 2020, 2:46 PM
8 points
3 comments · 2 min read · EA link

Jesse Clifton: Open-source learning — a bargaining approach

EA Global, Oct 18, 2019, 6:05 PM
10 points
0 comments · 1 min read · EA link
(www.youtube.com)

Tan Zhi Xuan: AI alignment, philosophical pluralism, and the relevance of non-Western philosophy

EA Global, Nov 21, 2020, 8:12 AM
19 points
1 comment · 1 min read · EA link
(www.youtube.com)

Cortés, Pizarro, and Afonso as Precedents for Takeover

AI Impacts, Mar 2, 2020, 12:25 PM
27 points
17 comments · 11 min read · EA link
(aiimpacts.org)

My plan for a “Most Important Century” reading group

Jack O'Brien, Jan 19, 2022, 9:32 AM
12 points
1 comment · 2 min read · EA link

Redwood Research is hiring for several roles (Operations and Technical)

JJXWang, Apr 14, 2022, 3:23 PM
45 points
0 comments · 1 min read · EA link

Katja Grace: AI safety

EA Global, Aug 11, 2017, 8:19 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

Some global catastrophic risk estimates

Tamay, Feb 10, 2021, 7:32 PM
106 points
15 comments · 1 min read · EA link

Key Papers in Language Model Safety

aogara, Jun 20, 2022, 2:59 PM
20 points
0 comments · 22 min read · EA link

Singapore’s Technical AI Alignment Research Career Guide

Yi-Yang, Aug 26, 2020, 8:09 AM
34 points
7 comments · 8 min read · EA link

[Closed] Hiring a mathematician to work on the learning-theoretic AI alignment agenda

Vanessa, Apr 19, 2022, 6:49 AM
53 points
4 comments · 2 min read · EA link

Artificial intelligence career stories

EA Global, Oct 25, 2020, 6:56 AM
12 points
0 comments · 1 min read · EA link
(www.youtube.com)

[Question] Career Advice: Philosophy + Programming → AI Safety

tcelferact, Mar 18, 2022, 3:09 PM
30 points
11 comments · 2 min read · EA link

We Are Conjecture, A New Alignment Research Startup

Connor Leahy, Apr 9, 2022, 3:07 PM
31 points
0 comments · 1 min read · EA link

Coherence arguments imply a force for goal-directed behavior

Katja_Grace, Apr 6, 2021, 9:44 PM
19 points
1 comment · 11 min read · EA link
(worldspiritsockpuppet.com)

Soares, Tallinn, and Yudkowsky discuss AGI cognition

EliezerYudkowsky, Nov 29, 2021, 5:28 PM
15 points
0 comments · 40 min read · EA link

Draft report on existential risk from power-seeking AI

Joe_Carlsmith, Apr 28, 2021, 9:41 PM
88 points
34 comments · 1 min read · EA link

2016 AI Risk Literature Review and Charity Comparison

Larks, Dec 13, 2016, 4:36 AM
57 points
12 comments · 28 min read · EA link

AI and impact opportunities

brb243, Mar 31, 2022, 8:23 PM
−2 points
6 comments · 1 min read · EA link

[Cause Exploration Prizes] Expanding communication about AGI risks

Ines, Sep 22, 2022, 5:30 AM
13 points
0 comments · 11 min read · EA link

Introducing the Principles of Intelligent Behaviour in Biological and Social Systems (PIBBSS) Fellowship

adamShimi, Dec 18, 2021, 3:25 PM
37 points
5 comments · 10 min read · EA link

Three Biases That Made Me Believe in AI Risk

beth, Feb 13, 2019, 11:22 PM
41 points
20 comments · 3 min read · EA link

AI alignment with humans… but with which humans?

Geoffrey Miller, Sep 8, 2022, 11:43 PM
51 points
20 comments · 3 min read · EA link

Scrutinizing AI Risk (80K, #81) - v. quick summary

Ben, Jul 23, 2020, 7:02 PM
10 points
1 comment · 3 min read · EA link

Re: Some thoughts on vegetarianism and veganism

Fai, Feb 25, 2022, 8:43 PM
46 points
3 comments · 8 min read · EA link

Are Humans ‘Human Compatible’?

Matt Boyd, Dec 6, 2019, 5:49 AM
23 points
8 comments · 4 min read · EA link

Criticism of the main framework in AI alignment

Michele Campolo, Aug 31, 2022, 9:44 PM
42 points
4 comments · 7 min read · EA link

AI Risk: Increasing Persuasion Power

kewlcats, Aug 3, 2020, 8:25 PM
4 points
0 comments · 1 min read · EA link

[Question] Should the EA community have a DL engineering fellowship?

PabloAMC 🔸, Dec 24, 2021, 1:43 PM
26 points
6 comments · 1 min read · EA link

[Question] Is this a good way to bet on short timelines?

kokotajlod, Nov 28, 2020, 2:31 PM
17 points
16 comments · 1 min read · EA link

Introducing The Nonlinear Fund: AI Safety research, incubation, and funding

Kat Woods, Mar 18, 2021, 2:07 PM
71 points
32 comments · 5 min read · EA link

There should be an AI safety project board

mariushobbhahn, Mar 14, 2022, 4:08 PM
24 points
3 comments · 1 min read · EA link

AI Alignment YouTube Playlists

jacquesthibs, May 9, 2022, 9:31 PM
16 points
2 comments · 1 min read · EA link

Helen Toner: The Open Philanthropy Project’s work on AI risk

EA Global, Nov 3, 2017, 7:43 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

AI Forecasting Resolution Council (Forecasting infrastructure, part 2)

terraform, Aug 29, 2019, 5:43 PM
28 points
0 comments · 3 min read · EA link

My Understanding of Paul Christiano’s Iterated Amplification AI Safety Research Agenda

Chi, Aug 15, 2020, 7:59 PM
38 points
3 comments · 39 min read · EA link

[Question] What kind of event, targeted to undergraduate CS majors, would be most effective at getting people to work on AI safety?

CBiddulph, Sep 19, 2021, 4:19 PM
9 points
1 comment · 1 min read · EA link

Crypto ‘oracle protocols’ for AI alignment with real-world data?

Geoffrey Miller, Sep 22, 2022, 11:05 PM
9 points
3 comments · 1 min read · EA link

Key questions about artificial sentience: an opinionated guide

rgb, Apr 25, 2022, 1:42 PM
91 points
3 comments · 1 min read · EA link

The Metaethics and Normative Ethics of AGI Value Alignment: Many Questions, Some Implications

Eleos Arete Citrini, Sep 15, 2021, 7:05 PM
25 points
0 comments · 8 min read · EA link

The religion problem in AI alignment

Geoffrey Miller, Sep 16, 2022, 1:24 AM
54 points
28 comments · 11 min read · EA link

How to Diversify Conceptual AI Alignment: the Model Behind Refine

adamShimi, Jul 20, 2022, 10:44 AM
43 points
0 comments · 9 min read · EA link
(www.alignmentforum.org)

[Question] Does the idea of AGI that benevolently control us appeal to EA folks?

Noah Scales, Jul 16, 2022, 7:17 PM
6 points
20 comments · 1 min read · EA link

[Question] How can I bet on short timelines?

kokotajlod, Nov 7, 2020, 12:45 PM
33 points
12 comments · 2 min read · EA link

[Discussion] Best intuition pumps for AI safety

mariushobbhahn, Nov 6, 2021, 8:11 AM
10 points
8 comments · 1 min read · EA link

Our Current Directions in Mechanistic Interpretability Research (AI Alignment Speaker Series)

Group Organizer, Apr 8, 2022, 5:08 PM
3 points
0 comments · 1 min read · EA link

AGI x-risk timelines: 10% chance (by year X) estimates should be the headline, not 50%.

Greg_Colbourn, Mar 1, 2022, 12:02 PM
69 points
22 comments · 2 min read · EA link

Discontinuous progress in history: an update

AI Impacts, Apr 17, 2020, 4:28 PM
69 points
3 comments · 24 min read · EA link

Anti-squatted AI x-risk domains index

plex, Aug 12, 2022, 12:00 PM
56 points
9 comments · 1 min read · EA link

[Question] 1h-volunteers needed for a small AI Safety-related research project

PabloAMC 🔸, Aug 16, 2021, 5:51 PM
4 points
0 comments · 1 min read · EA link

Alignment 201 curriculum

richard_ngo, Oct 12, 2022, 7:17 PM
94 points
9 comments · 1 min read · EA link

Max Tegmark: Risks and benefits of advanced artificial intelligence

EA Global, Aug 5, 2016, 9:19 AM
7 points
0 comments · 1 min read · EA link
(www.youtube.com)

Introducing the Fund for Alignment Research (We’re Hiring!)

AdamGleave, Jul 6, 2022, 2:00 AM
74 points
3 comments · 4 min read · EA link

On Solving Problems Before They Appear: The Weird Epistemologies of Alignment

adamShimi, Oct 11, 2021, 8:21 AM
28 points
0 comments · 15 min read · EA link

fiction about AI risk

Ann Garth 🔸, Nov 12, 2020, 10:36 PM
8 points
1 comment · 1 min read · EA link

Rohin Shah: What’s been happening in AI alignment?

EA Global, Jul 29, 2020, 8:15 PM
18 points
0 comments · 14 min read · EA link
(www.youtube.com)

Who ordered alignment’s apple?

Eleni_A, Aug 28, 2022, 2:24 PM
5 points
0 comments · 3 min read · EA link

[Question] How much EA analysis of AI safety as a cause area exists?

richard_ngo, Sep 6, 2019, 11:15 AM
94 points
20 comments · 2 min read · EA link

[3-hour podcast]: Joseph Carlsmith on longtermism, utopia, the computational power of the brain, meta-ethics, illusionism and meditation

Gus Docker, Jul 27, 2021, 1:18 PM
34 points
2 comments · 1 min read · EA link

EA Berkeley Presents: Universal Ownership: Is Index Investing the New Socially Responsible Investing?

Mahendra Prasad, Mar 10, 2022, 6:58 AM
7 points
0 comments · 1 min read · EA link

AI Forecasting Question Database (Forecasting infrastructure, part 3)

terraform, Sep 3, 2019, 2:57 PM
23 points
2 comments · 4 min read · EA link

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan, Dec 23, 2020, 8:10 PM
32 points
1 comment · 1 min read · EA link

On AI and Compute

johncrox, Apr 3, 2019, 9:26 PM
39 points
12 comments · 8 min read · EA link

Eric Drexler: Paretotopian goal alignment

EA Global, Mar 15, 2019, 2:51 PM
14 points
0 comments · 10 min read · EA link
(www.youtube.com)

[Question] How should we invest in “long-term short-termism” given the likelihood of transformative AI?

James_Banks, Jan 12, 2021, 11:54 PM
8 points
0 comments · 1 min read · EA link

Intro to caring about AI alignment as an EA cause

So8res, Apr 14, 2017, 12:42 AM
28 points
10 comments · 25 min read · EA link

[Link] Thiel on GCRs

Milan Griffes, Jul 22, 2019, 8:47 PM
28 points
11 comments · 1 min read · EA link

Critical Review of ‘The Precipice’: A Reassessment of the Risks of AI and Pandemics

James Fodor, May 11, 2020, 11:11 AM
111 points
32 comments · 26 min read · EA link

Gen­eral ad­vice for tran­si­tion­ing into The­o­ret­i­cal AI Safety

Martín SotoSep 15, 2022, 5:23 AM
25 points
0 comments10 min readEA link

13 background claims about EA

AkashSep 7, 2022, 3:54 AM
70 points
16 comments3 min readEA link

Desirable? AI qualities

brb243Mar 21, 2022, 10:05 PM
7 points
0 comments2 min readEA link

Emergent Ventures AI

technicalitiesApr 8, 2022, 10:08 PM
22 points
0 comments1 min readEA link
(marginalrevolution.com)

Confused about AI research as a means of addressing AI risk

Eli RoseFeb 21, 2019, 12:07 AM
31 points
15 comments1 min readEA link

Naturalism and AI alignment

Michele CampoloApr 24, 2021, 4:20 PM
17 points
3 comments7 min readEA link

Takeaways from safety by default interviews

AI ImpactsApr 7, 2020, 2:01 AM
25 points
2 comments13 min readEA link
(aiimpacts.org)

A stubborn unbeliever finally gets the depth of the AI alignment problem

aelwoodOct 13, 2022, 3:16 PM
32 points
7 comments1 min readEA link

But exactly how complex and fragile?

Katja_GraceDec 13, 2019, 7:05 AM
37 points
3 comments3 min readEA link
(meteuphoric.com)

[Question] How to get more academics enthusiastic about doing AI Safety research?

PabloAMC 🔸Sep 4, 2021, 2:10 PM
25 points
19 comments1 min readEA link

Why I prioritize moral circle expansion over reducing extinction risk through artificial intelligence alignment

JacyFeb 20, 2018, 6:29 PM
107 points
72 comments35 min readEA link
(www.sentienceinstitute.org)

Implications of Quantum Computing for Artificial Intelligence alignment research (ABRIDGED)

Jaime SevillaSep 5, 2019, 2:56 PM
25 points
4 comments2 min readEA link

AI alignment research links

Holden KarnofskyJan 6, 2022, 5:52 AM
16 points
0 comments6 min readEA link
(www.cold-takes.com)

Preparing for AI-assisted alignment research: we need data!

CBiddulphJan 17, 2023, 3:28 AM
11 points
0 comments11 min readEA link

[Link post] AI could fuel factory farming—or end it

BrianKOct 18, 2022, 11:16 AM
39 points
0 comments1 min readEA link
(www.fastcompany.com)

[Question] What’s the best machine learning newsletter? How do you keep up to date?

Matt PutzMar 25, 2022, 2:36 PM
13 points
12 comments1 min readEA link

[Question] What work has been done on the post-AGI distribution of wealth?

tlevinJul 6, 2022, 6:59 PM
16 points
3 comments1 min readEA link

Philanthropists Probably Shouldn’t Mission-Hedge AI Progress

MichaelDickensAug 23, 2022, 11:03 PM
28 points
9 comments36 min readEA link

EA AI/Emerging Tech Orgs Should Be Involved with Patent Office Partnership

BridgesJun 12, 2022, 10:32 PM
10 points
0 comments1 min readEA link

[Question] How might a herd of interns help with AI or biosecurity research tasks/questions?

Marcel DMar 20, 2022, 10:49 PM
30 points
8 comments2 min readEA link

Crucial considerations in the field of Wild Animal Welfare (WAW)

Holly Elmore ⏸️ 🔸Apr 10, 2022, 7:43 PM
63 points
10 comments3 min readEA link

New GPT3 Impressive Capabilities—InstructGPT3 [1/2]

simeon_cMar 13, 2022, 10:45 AM
49 points
4 comments7 min readEA link

The probability that Artificial General Intelligence will be developed by 2043 is extremely low.

cveresOct 6, 2022, 11:26 AM
2 points
12 comments13 min readEA link

[Question] What EAG sessions would you like on AI?

Nathan YoungMar 20, 2022, 5:05 PM
7 points
10 comments1 min readEA link

On Scaling Academia

kirchner.janSep 20, 2021, 2:54 PM
18 points
3 comments13 min readEA link
(universalprior.substack.com)

We Can’t Do Long Term Utilitarian Calculations Until We Know if AIs Can Be Conscious or Not

Mike20731Sep 2, 2022, 8:37 AM
4 points
0 comments11 min readEA link

“The Physicists”: A play about extinction and the responsibility of scientists

Lara_THNov 29, 2022, 4:53 PM
28 points
1 comment8 min readEA link

When 2/3rds of the world goes against you

Jeffrey KursonisJul 2, 2022, 8:34 PM
2 points
2 comments9 min readEA link

Apply to be a Stanford HAI Junior Fellow (Assistant Professor- Research) by Nov. 15, 2021

Vael GatesOct 31, 2021, 2:21 AM
15 points
0 comments1 min readEA link

Fanaticism in AI: SERI Project

Jake Arft-GuatelliSep 24, 2021, 4:39 AM
7 points
2 comments5 min readEA link

Stackelberg Games and Cooperative Commitment: My Thoughts and Reflections on a 2-Month Research Project

Ben BucknallDec 13, 2021, 10:49 AM
18 points
1 comment9 min readEA link

Strong AI. From theory to practice.

GaHHuKoBAug 19, 2022, 11:33 AM
−2 points
0 comments10 min readEA link
(www.reddit.com)

Make a neural network in ~10 minutes

Arjun YadavApr 25, 2022, 6:36 PM
3 points
0 comments4 min readEA link
(arjunyadav.net)

Project Idea: The cost of Coccidiosis on Chicken farming and if AI can help

Max HarrisSep 26, 2022, 4:30 PM
25 points
8 comments2 min readEA link

An Exercise in Speed-Reading: The National Security Commission on AI (NSCAI) Final Report

abiolveraAug 17, 2022, 4:55 PM
47 points
4 comments12 min readEA link

Sixty years after the Cuban Missile Crisis, a new era of global catastrophic risks

christian.rOct 13, 2022, 11:25 AM
31 points
0 comments1 min readEA link
(thebulletin.org)

GPT-2 as step toward general intelligence (Alexander, 2019)

Will AldredJul 18, 2022, 4:14 PM
42 points
0 comments2 min readEA link
(slatestarcodex.com)

Introducing the ML Safety Scholars Program

TW123May 4, 2022, 1:14 PM
157 points
42 comments3 min readEA link

[Question] Request for Assistance—Research on Scenario Development for Advanced AI Risk

KiliankMar 30, 2022, 3:01 AM
2 points
1 comment1 min readEA link

Slides: Potential Risks From Advanced AI

Aryeh EnglanderApr 28, 2022, 2:18 AM
9 points
0 comments1 min readEA link

[Question] What will be some of the most impactful applications of advanced AI in the near term?

IanDavidMossMar 3, 2022, 3:26 PM
16 points
7 comments1 min readEA link

Love and AI: Relational Brain/Mind Dynamics in AI Development

Jeffrey KursonisJun 21, 2022, 7:09 AM
2 points
2 comments3 min readEA link

On Generality

Oren MontanoSep 26, 2022, 8:59 AM
2 points
0 comments1 min readEA link

[Question] I have recently been interested in robotics, particularly in for-profit startups. I think they can help increase food production and help reduce improve healthcare. Would this fall under AI for social good? How impactful will robotics be to society? How large is the counterfactual?

Isaac BensonJan 2, 2022, 5:38 AM
4 points
3 comments1 min readEA link

Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

AjeyaJul 18, 2022, 7:07 PM
217 points
12 comments84 min readEA link
(www.lesswrong.com)

[Question] Training a GPT model on EA texts: what data?

JoyOptimizerJun 4, 2022, 5:59 AM
23 points
17 comments1 min readEA link

AI timelines and theoretical understanding of deep learning

Venky1024Sep 12, 2021, 4:26 PM
4 points
8 comments2 min readEA link

My argument against AGI

cveresOct 12, 2022, 6:32 AM
2 points
29 comments3 min readEA link

Chris Olah on working at top AI labs without an undergrad degree

80000_HoursSep 10, 2021, 8:46 PM
15 points
0 comments73 min readEA link

A Bird’s Eye View of the ML Field [Pragmatic AI Safety #2]

TW123May 9, 2022, 5:15 PM
97 points
2 comments35 min readEA link

How to become more agentic, by GPT-EA-Forum-v1

JoyOptimizerJun 20, 2022, 6:50 AM
24 points
8 comments4 min readEA link

Why I think strong general AI is coming soon

porbySep 28, 2022, 6:55 AM
14 points
1 comment1 min readEA link

Exploratory survey on psychology of AI risk perception

Daniel_FriedrichAug 2, 2022, 8:34 PM
1 point
0 comments1 min readEA link
(forms.gle)

Voting Theory has a HOLE

Anthony RepettoDec 4, 2021, 4:20 AM
2 points
4 comments2 min readEA link

First call for EA Data Science/ML/AI

astrastefaniaAug 23, 2022, 7:37 PM
29 points
0 comments1 min readEA link

Catholic theologians and priests on artificial intelligence

anonymous6Jun 14, 2022, 6:53 PM
21 points
2 comments1 min readEA link

Carnegie Council MisUnderstands Longtermism

Jeff ASep 30, 2022, 2:57 AM
6 points
8 comments1 min readEA link
(www.carnegiecouncil.org)

Results from the language model hackathon

Esben KranOct 10, 2022, 8:29 AM
23 points
2 comments1 min readEA link

Seeking Survey Responses—Attitudes Towards AI risks

ansonMar 28, 2022, 5:47 PM
23 points
2 comments1 min readEA link
(forms.gle)

General vs specific arguments for the longtermist importance of shaping AI development

Sam ClarkeOct 15, 2021, 2:43 PM
44 points
7 comments2 min readEA link

Don’t expect AGI anytime soon

cveresOct 10, 2022, 10:38 PM
0 points
19 comments1 min readEA link

EA for dumb people?

Olivia AddyJul 11, 2022, 10:46 AM
493 points
160 comments2 min readEA link

It takes 5 layers and 1000 artificial neurons to simulate a single biological neuron [Link]

MichaelStJulesSep 7, 2021, 9:53 PM
44 points
17 comments2 min readEA link

TIO: A mental health chatbot

SanjayOct 12, 2020, 8:52 PM
25 points
6 comments28 min readEA link

The History of AI Rights Research

Jamie_HarrisAug 27, 2022, 8:14 AM
48 points
1 comment14 min readEA link
(www.sentienceinstitute.org)

High Impact Careers in Formal Verification: Artificial Intelligence

quinnJun 5, 2021, 2:45 PM
28 points
7 comments16 min readEA link

Paul Christiano – Machine intelligence and capital accumulation

Tessa A 🔸May 15, 2014, 12:10 AM
21 points
0 comments6 min readEA link
(rationalaltruist.com)

DeepMind: Generally capable agents emerge from open-ended play

kokotajlodJul 27, 2021, 7:35 PM
56 points
10 comments2 min readEA link
(deepmind.com)

There’s No Fire Alarm for Artificial General Intelligence

EA Forum ArchivesOct 14, 2017, 2:41 AM
30 points
1 comment25 min readEA link
(www.lesswrong.com)

A Survey of the Potential Long-term Impacts of AI

Sam ClarkeJul 18, 2022, 9:48 AM
63 points
2 comments27 min readEA link

[Question] What are some resources (articles, videos) that show off what the current state of the art in AI is? (for a layperson who doesn’t know much about AI)

jamesDec 6, 2021, 9:06 PM
10 points
6 comments1 min readEA link

[Cross-post] Change my mind: we should define and measure the effectiveness of advanced AI

David JohnstonApr 6, 2022, 12:20 AM
4 points
0 comments7 min readEA link

“Technological unemployment” AI vs. “most important century” AI: how far apart?

Holden KarnofskyOct 11, 2022, 4:50 AM
17 points
1 comment3 min readEA link
(www.cold-takes.com)

Announcing Insights for Impact

Christian PearsonJan 4, 2023, 7:00 AM
80 points
6 comments1 min readEA link

How to use AI speech transcription and analysis to accelerate social science research

Alexander SaeriJan 31, 2023, 4:01 AM
39 points
6 comments11 min readEA link

Should ChatGPT make us downweight our belief in the consciousness of non-human animals?

splinterFeb 18, 2023, 11:29 PM
11 points
15 comments2 min readEA link

Vael Gates: Risks from Advanced AI (June 2022)

Vael GatesJun 14, 2022, 12:49 AM
45 points
5 comments30 min readEA link

Why I think that teaching philosophy is high impact

Eleni_ADec 19, 2022, 11:00 PM
17 points
2 comments2 min readEA link

Air-gapping evaluation and support

Ryan KiddDec 26, 2022, 10:52 PM
22 points
12 comments1 min readEA link

Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding

Vael GatesJul 28, 2022, 9:29 PM
126 points
6 comments6 min readEA link

AGI Safety Fundamentals programme is contracting a low-code engineer

Jamie BAug 26, 2022, 3:43 PM
39 points
4 comments5 min readEA link

We all teach: here’s how to do it better

Michael Noetel 🔸Sep 30, 2022, 2:06 AM
172 points
12 comments24 min readEA link

Transcripts of interviews with AI researchers

Vael GatesMay 9, 2022, 6:03 AM
140 points
14 comments2 min readEA link

Concrete Steps to Get Started in Transformer Mechanistic Interpretability

Neel NandaDec 26, 2022, 1:00 PM
18 points
0 comments12 min readEA link

Announcing an Empirical AI Safety Program

JoshcSep 13, 2022, 9:39 PM
64 points
7 comments2 min readEA link

What are some low-cost outside-the-box ways to do/fund alignment research?

trevor1Nov 11, 2022, 5:57 AM
2 points
3 comments1 min readEA link

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

AkashNov 25, 2022, 8:47 PM
14 points
0 comments1 min readEA link

Update on Harvard AI Safety Team and MIT AI Alignment

Xander123Dec 2, 2022, 6:09 AM
71 points
3 comments1 min readEA link

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

MikhailSaminDec 27, 2022, 11:07 AM
39 points
10 comments1 min readEA link

AI Safety Microgrant Round

Chris LeongNov 14, 2022, 4:25 AM
81 points
3 comments3 min readEA link

Announcing aisafety.training

JJ HepburnJan 17, 2023, 1:55 AM
110 points
4 comments1 min readEA link

How many people are working (directly) on reducing existential risk from AI?

Benjamin HiltonJan 17, 2023, 2:03 PM
117 points
3 comments4 min readEA link
(80000hours.org)

Governments pose larger risks than corporations: a brief response to Grace

David JohnstonOct 19, 2022, 11:54 AM
11 points
3 comments2 min readEA link

Refine: An Incubator for Conceptual Alignment Research Bets

adamShimiApr 15, 2022, 8:59 AM
47 points
0 comments4 min readEA link

Recruiting Skilled Volunteers

The BOOMNov 3, 2022, 2:36 PM
−9 points
14 comments1 min readEA link

Why do we post our AI safety plans on the Internet?

Peter S. ParkOct 31, 2022, 4:27 PM
15 points
22 comments11 min readEA link

Why EAs are skeptical about AI Safety

Lukas Trötzmüller🔸Jul 18, 2022, 7:01 PM
290 points
31 comments29 min readEA link

I am a Memoryless System

Nicholas / Heather KrossOct 23, 2022, 5:36 PM
4 points
0 comments9 min readEA link
(www.thinkingmuchbetter.com)

Roodman’s Thoughts on Biological Anchors

lukeprogSep 14, 2022, 12:23 PM
73 points
8 comments1 min readEA link
(docs.google.com)

SERI MATS Program—Winter 2022 Cohort

Ryan KiddOct 8, 2022, 7:09 PM
50 points
4 comments1 min readEA link

The next decades might be wild

mariushobbhahnDec 15, 2022, 4:10 PM
130 points
31 comments1 min readEA link

Which of these arguments for x-risk do you think we should test?

WimAug 9, 2022, 1:43 PM
3 points
2 comments1 min readEA link

[Question] Mutual Assured Destruction used against AGI

LeopardOct 8, 2022, 9:35 AM
4 points
5 comments1 min readEA link

Join the interpretability research hackathon

Esben KranOct 28, 2022, 4:26 PM
48 points
0 comments5 min readEA link

Don’t worry, be happy (literally)

Yuri ZavorotnyOct 5, 2022, 1:55 AM
0 points
1 comment2 min readEA link

Software engineering—Career review

Benjamin HiltonFeb 8, 2022, 6:11 AM
93 points
19 comments8 min readEA link
(80000hours.org)

A tentative dialogue with a Friendly-boxed-super-AGI on brain uploads

RamiroMay 12, 2022, 9:55 PM
5 points
0 comments4 min readEA link

The problem of artificial suffering

mlsbtSep 24, 2021, 2:43 PM
50 points
3 comments9 min readEA link

Intro to AI Safety

Madhav MalhotraOct 19, 2022, 11:45 PM
4 points
0 comments1 min readEA link

An experiment eliciting relative estimates for Open Philanthropy’s 2018 AI safety grants

NunoSempereSep 12, 2022, 11:19 AM
111 points
16 comments13 min readEA link

AI Twitter accounts to follow?

Adrian SalustriJun 10, 2022, 6:19 AM
1 point
2 comments1 min readEA link

Systemic Cascading Risks: Relevance in Longtermism & Value Lock-In

Richard RSep 2, 2022, 7:53 AM
56 points
10 comments16 min readEA link

What we owe the microbiome

TeddyWDec 17, 2022, 4:17 PM
12 points
2 comments1 min readEA link

Probably good projects for the AI safety ecosystem

Ryan KiddDec 5, 2022, 3:24 AM
21 points
0 comments1 min readEA link

Mere exposure effect: Bias in Evaluating AGI X-Risks

RemmeltDec 27, 2022, 2:05 PM
4 points
1 comment1 min readEA link

Existential AI Safety is NOT separate from near-term applications

stecasDec 13, 2022, 2:47 PM
28 points
9 comments1 min readEA link

AI Forecasting Research Ideas

Jaime SevillaNov 17, 2022, 5:37 PM
78 points
1 comment1 min readEA link
(docs.google.com)

Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

Max NadeauOct 27, 2022, 1:39 AM
95 points
5 comments12 min readEA link

Persuasion Tools: AI takeover without AGI or agency?

kokotajlodNov 20, 2020, 4:56 PM
15 points
5 comments10 min readEA link

ChatGPT can write code! ?

MiguelDec 10, 2022, 5:36 AM
6 points
15 comments1 min readEA link
(www.whitehatstoic.com)

What are the risks of an oracle AI?

Griffin YoungOct 5, 2022, 6:18 AM
6 points
2 comments1 min readEA link

An entire category of risks is undervalued by EA [Summary of previous forum post]

Richard RSep 5, 2022, 3:07 PM
76 points
5 comments5 min readEA link

Grokking “Semi-informative priors over AI timelines”

ansonJun 12, 2022, 10:15 PM
60 points
1 comment14 min readEA link

An appraisal of the Future of Life Institute AI existential risk program

PabloAMC 🔸Dec 11, 2022, 1:36 PM
29 points
0 comments1 min readEA link

More Academic Diversity in Alignment?

ojorgensenNov 27, 2022, 5:52 PM
7 points
0 comments1 min readEA link

[Announcement] The Steven Aiberg Project

StevenAibergOct 19, 2022, 7:48 AM
0 points
0 comments4 min readEA link

Linkpost—Beyond Hyperanthropomorphism: Or, why fears of AI are not even wrong, and how to make them real

LockeAug 24, 2022, 4:24 PM
−4 points
3 comments2 min readEA link
(studio.ribbonfarm.com)

Maybe AI risk shouldn’t affect your life plan all that much

JustisJul 22, 2022, 3:30 PM
22 points
4 comments6 min readEA link

Facilitator Help Wanted for Columbia EA AI Safety Groups

Berkan OttlikJul 5, 2022, 10:27 AM
16 points
0 comments1 min readEA link

[Question] Is there a news-tracker about GPT-4? Why has everything become so silent about it?

Franziska FischerOct 29, 2022, 8:56 AM
10 points
4 comments1 min readEA link

[Question] Forecasting thread: How does AI risk level vary based on timelines?

eliflandSep 14, 2022, 11:56 PM
47 points
8 comments1 min readEA link

AGI with feelings

Nicolai MebergDec 7, 2022, 4:00 PM
−13 points
0 comments1 min readEA link
(twitter.com)

A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines.

NunoSempereAug 16, 2022, 2:44 PM
75 points
40 comments5 min readEA link
(nunosempere.com)

[CANCELLED] Berlin AI Alignment Open Meetup August 2022

Isidor RegenfußAug 4, 2022, 1:34 PM
0 points
0 comments1 min readEA link

Longevity research as AI X-risk intervention

DirectedEvolutionNov 6, 2022, 5:58 PM
27 points
0 comments9 min readEA link

AGI Ruin: A List of Lethalities

EliezerYudkowskyJun 6, 2022, 11:28 PM
162 points
53 comments30 min readEA link
(www.lesswrong.com)

Deliberate practice for research?

Alex_AltairOct 8, 2022, 3:45 AM
19 points
4 comments1 min readEA link

Yudkowsky and Christiano on AI Takeoff Speeds [LINKPOST]

aogaraApr 5, 2022, 12:57 AM
15 points
0 comments11 min readEA link

Why does no one care about AI?

Olivia AddyAug 7, 2022, 10:04 PM
55 points
47 comments1 min readEA link

AGI ruin scenarios are likely (and disjunctive)

So8resJul 27, 2022, 3:24 AM
53 points
5 comments6 min readEA link

Humans aren’t fitness maximizers

So8resOct 4, 2022, 1:32 AM
30 points
2 comments5 min readEA link

AGI Isn’t Close—Future Fund Worldview Prize

Toni MUENDELDec 18, 2022, 4:03 PM
−8 points
24 comments13 min readEA link

[Question] What are the numbers in mind for the super-short AGI timelines so many long-termists are alarmed about?

Evan_GaensbauerApr 19, 2022, 9:09 PM
41 points
2 comments1 min readEA link

Optimism, AI risk, and EA blind spots

JustisSep 28, 2022, 5:21 PM
87 points
21 comments8 min readEA link

Reflections on my 5-month AI alignment upskilling grant

Jay BaileyDec 28, 2022, 7:23 AM
113 points
5 comments8 min readEA link
(www.lesswrong.com)

[Question] Which possible AI impacts should receive the most additional attention?

David JohnstonMay 31, 2022, 2:01 AM
10 points
10 comments1 min readEA link

$20K in Bounties for AI Safety Public Materials

TW123Aug 5, 2022, 2:57 AM
45 points
11 comments6 min readEA link

New cooperation mechanism—quadratic funding without a matching pool

Filip SondejJun 5, 2022, 1:55 PM
55 points
11 comments5 min readEA link

Please provide feedback on AI-safety grant proposal, thanks!

Alex LongDec 11, 2022, 11:29 PM
8 points
1 comment2 min readEA link

AI Timelines: Where the Arguments, and the “Experts,” Stand

Holden KarnofskySep 7, 2021, 5:35 PM
88 points
3 comments11 min readEA link

Humanity’s vast future and its implications for cause prioritization

Eevee🔹Jul 26, 2022, 5:04 AM
38 points
3 comments5 min readEA link
(sunyshore.substack.com)

More to explore on ‘Risks from Artificial Intelligence’

EA HandbookJul 15, 2022, 11:00 PM
8 points
2 comments2 min readEA link

Mitigating x-risk through modularity

Toby NewberryDec 17, 2020, 7:54 PM
103 points
6 comments14 min readEA link

Where I currently disagree with Ryan Greenblatt’s version of the ELK approach

So8resSep 29, 2022, 9:19 PM
21 points
0 comments5 min readEA link

Mechanism Design for AI Safety—Reading Group Curriculum

Rubi J. HudsonOct 25, 2022, 3:54 AM
24 points
1 comment4 min readEA link

Apply to attend an AI safety workshop in Berkeley (Nov 18-21)

AkashNov 6, 2022, 6:06 PM
19 points
0 comments1 min readEA link

Have your timelines changed as a result of ChatGPT?

Chris LeongDec 5, 2022, 3:03 PM
30 points
18 comments1 min readEA link

Call to action: Read + Share AI Safety / Reinforcement Learning Featured in Conversation

Justin OliveOct 24, 2022, 1:13 AM
3 points
0 comments1 min readEA link

Reasons I’ve been hesitant about high levels of near-ish AI risk

eliflandJul 22, 2022, 1:32 AM
208 points
16 comments7 min readEA link
(www.foxy-scout.com)

The right to protection from catastrophic AI risk

Jack CunninghamApr 9, 2022, 11:11 PM
11 points
0 comments7 min readEA link

How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan CookDec 16, 2022, 4:05 PM
30 points
0 comments6 min readEA link

“Cotton Gin” AI Risk

423175Sep 24, 2022, 11:04 PM
6 points
2 comments1 min readEA link

[Question] “Epistemic maps” for AI Debates? (or for other issues)

Marcel DAug 30, 2021, 4:59 AM
14 points
9 comments5 min readEA link

A note about differential technological development

So8resJul 24, 2022, 11:41 PM
58 points
8 comments6 min readEA link

Berlin AI Alignment Open Meetup September 2022

Isidor RegenfußSep 21, 2022, 3:09 PM
2 points
0 comments1 min readEA link

[Question] EA’s Achievements in 2022

ElliotJDaviesDec 14, 2022, 2:33 PM
98 points
11 comments1 min readEA link

Contest: 250€ for translation of “longtermism” to German

constructiveJun 1, 2022, 7:59 PM
18 points
30 comments1 min readEA link

What’s Happening in Australia

Bradley TjandraNov 7, 2022, 1:03 AM
105 points
4 comments13 min readEA link

Intergenerational trauma impeding cooperative existential safety efforts

Andrew CritchJun 3, 2022, 5:27 PM
82 points
2 comments3 min readEA link

[Question] What are the best ideas of how to regulate AI from the US executive branch?

Jack CunninghamApr 2, 2022, 9:53 PM
10 points
0 comments1 min readEA link

How ‘Human-Human’ dynamics give way to ‘Human-AI’ and then ‘AI-AI’ dynamics

RemmeltDec 27, 2022, 3:16 AM
4 points
0 comments1 min readEA link

A grand strategy to recruit AI capabilities researchers into AI safety research

Peter S. ParkApr 15, 2022, 5:11 PM
20 points
13 comments4 min readEA link

[Question] What are some current, already present challenges from AI?

nonzerosumJun 30, 2022, 3:44 PM
5 points
1 comment1 min readEA link

Reviews of “Is power-seeking AI an existential risk?”

Joe_CarlsmithDec 16, 2021, 8:50 PM
71 points
4 comments1 min readEA link

What does it take to defend the world against out-of-control AGIs?

Steven ByrnesOct 25, 2022, 2:47 PM
43 points
0 comments1 min readEA link

Oren’s Field Guide of Bad AGI Outcomes

Oren MontanoSep 26, 2022, 8:59 AM
1 point
0 comments1 min readEA link

Markus Anderljung On The AI Policy Landscape

Michaël TrazziSep 9, 2022, 5:27 PM
14 points
0 comments2 min readEA link
(theinsideview.ai)

Samotsvety’s AI risk forecasts

eliflandSep 9, 2022, 4:01 AM
175 points
30 comments4 min readEA link

AGI and Lock-In

Lukas FinnvedenOct 29, 2022, 1:56 AM
153 points
20 comments10 min readEA link
(docs.google.com)

Potential Risks from Advanced Artificial Intelligence: The Philanthropic Opportunity

Holden KarnofskyMay 6, 2016, 12:55 PM
2 points
0 comments23 min readEA link
(www.openphilanthropy.org)

Join the AI Testing Hackathon this Friday

Esben KranDec 12, 2022, 2:24 PM
33 points
0 comments8 min readEA link
(alignmentjam.com)

AI ethics: the case for including animals (my first published paper, Peter Singer’s first on AI)

FaiJul 12, 2022, 4:14 AM
78 points
5 comments1 min readEA link
(link.springer.com)

AI Timelines via Cumulative Optimization Power: Less Long, More Short

Jake CannellOct 6, 2022, 7:06 AM
27 points
0 comments1 min readEA link

Call For Distillers

johnswentworthApr 6, 2022, 3:03 AM
70 points
6 comments3 min readEA link

Applications open for AGI Safety Fundamentals: Alignment Course

Jamie BDec 13, 2022, 10:50 AM
75 points
0 comments2 min readEA link

AISER—AIS Europe Retreat

CarolinDec 23, 2022, 6:11 PM
5 points
0 comments1 min readEA link

Values and control

dotsamAug 4, 2022, 6:28 PM
3 points
1 comment1 min readEA link

Ajeya’s TAI timeline shortened from 2050 to 2040

Zach Stein-PerlmanAug 3, 2022, 12:00 AM
59 points
2 comments1 min readEA link
(www.lesswrong.com)

AI Risk Intro 2: Solving The Problem

L Rudolf LSep 24, 2022, 9:33 AM
11 points
0 comments28 min readEA link
(www.perfectlynormal.co.uk)

AI Safety researcher career review

Benjamin_ToddNov 23, 2021, 12:00 AM
13 points
1 comment6 min readEA link
(80000hours.org)

The Credibility of Apocalyptic Claims: A Critique of Techno-Futurism within Existential Risk

EmberAug 16, 2022, 7:48 PM
25 points
35 comments17 min readEA link

Katja Grace on Slowing Down AI, AI Expert Surveys And Estimating AI Risk

Michaël TrazziSep 16, 2022, 6:00 PM
48 points
6 comments3 min readEA link
(theinsideview.ai)

When can a mimic surprise you? Why generative models handle seemingly ill-posed problems

David JohnstonNov 6, 2022, 11:46 AM
6 points
0 comments1 min readEA link

AI Risk is like Terminator; Stop Saying it’s Not

skluugMar 8, 2022, 7:17 PM
191 points
43 comments10 min readEA link
(skluug.substack.com)

We should say more than “x-risk is high”

OllieBaseDec 16, 2022, 10:09 PM
52 points
12 comments4 min readEA link

What could an AI-caused existential catastrophe actually look like?

Benjamin HiltonSep 12, 2022, 4:25 PM
57 points
7 comments9 min readEA link
(80000hours.org)

When to diversify? Breaking down mission-correlated investing

jhNov 29, 2022, 11:18 AM
33 points
2 comments8 min readEA link

Encultured AI, Part 1: Enabling New Benchmarks

Andrew CritchAug 8, 2022, 10:49 PM
17 points
0 comments6 min readEA link

Rethink Priorities’ 2022 Impact, 2023 Strategy, and Funding Gaps

kierangreig🔸Nov 25, 2022, 5:37 AM
108 points
10 comments28 min readEA link

[Question] AI Risk Microdynamics Survey

FroolowOct 9, 2022, 8:00 PM
7 points
1 comment1 min readEA link

The AI Messiah

ryancbriggsMay 5, 2022, 4:58 PM
71 points
44 comments2 min readEA link

Neartermists should consider AGI timelines in their spending decisions

Tristan CookJul 26, 2022, 5:01 PM
68 points
4 comments4 min readEA link

AI Risk Intro 1: Advanced AI Might Be Very Bad

L Rudolf LSep 11, 2022, 10:57 AM
22 points
0 comments30 min readEA link

On AI Weapons

kbogNov 13, 2019, 12:48 PM
76 points
10 comments30 min readEA link

[Question] How will the world respond to “AI x-risk warning shots” according to reference class forecasting?

Ryan KiddApr 18, 2022, 9:10 AM
18 points
1 comment1 min readEA link

Video and Transcript of Presentation on Existential Risk from Power-Seeking AI

Joe_CarlsmithMay 8, 2022, 3:52 AM
97 points
7 comments30 min readEA link

[Question] Recommendations for non-technical books on AI?

JosephJul 12, 2022, 11:23 PM
8 points
11 comments1 min readEA link

[Question] BenevolentAI—an effectively impactful company?

Jack HiltonOct 11, 2022, 2:35 PM
16 points
11 comments1 min readEA link

Encultured AI, Part 2: Providing a Service

Andrew CritchAug 11, 2022, 8:13 PM
10 points
0 comments3 min readEA link

Bandwagon effect: Bias in Evaluating AGI X-Risks

RemmeltDec 28, 2022, 7:54 AM
4 points
0 comments1 min readEA link

It’s OK not to go into AI (for students)

ruthgraceJul 14, 2022, 3:16 PM
59 points
18 comments2 min readEA link

The limited upside of interpretability

Peter S. ParkNov 15, 2022, 8:22 PM
23 points
3 comments10 min readEA link

Reading the ethicists 2: Hunting for AI alignment papers

Charlie SteinerJun 6, 2022, 3:53 PM
9 points
0 comments1 min readEA link
(www.lesswrong.com)

How dath ilan coordinates around solving AI alignment

Thomas KwaApr 14, 2022, 1:53 AM
12 points
1 comment5 min readEA link

AI Alignment is intractable (and we humans should stop working on it)

GPT 3Jul 28, 2022, 8:02 PM
1 point
1 comment1 min readEA link

Announcing the AI Safety Nudge Competition to Help Beat Procrastination

Marc CarauleanuOct 1, 2022, 1:49 AM
24 points
1 comment2 min readEA link

[Question] How/When Should One Introduce AI Risk Arguments to People Unfamiliar With the Idea?

Marcel DAug 9, 2022, 2:57 AM
12 points
4 comments1 min readEA link

Contra shard theory, in the context of the diamond maximizer problem

So8resOct 13, 2022, 11:51 PM
27 points
0 comments1 min readEA link

Replacement for PONR concept

kokotajlodSep 2, 2022, 12:38 AM
14 points
1 comment2 min readEA link

Black Box Investigations Research Hackathon

Esben KranSep 15, 2022, 10:09 AM
23 points
0 comments2 min readEA link

When reporting AI timelines, be clear who you’re deferring to

Sam ClarkeOct 10, 2022, 2:24 PM
120 points
20 comments1 min readEA link

Apply to the Machine Learning For Good bootcamp in France

Alexandre VariengienJun 17, 2022, 9:13 AM
9 points
0 comments1 min readEA link
(www.lesswrong.com)

A pseudo mathematical formulation of direct work choice between two x-risks

Joseph BloomAug 11, 2022, 12:28 AM
7 points
0 comments4 min readEA link

Seeking social science students / collaborators interested in AI existential risks

Vael GatesSep 24, 2021, 9:56 PM
58 points
7 comments3 min readEA link

Review: What We Owe The Future

Kelsey PiperNov 21, 2022, 9:41 PM
165 points
3 comments1 min readEA link
(asteriskmag.com)

The US expands restrictions on AI exports to China. What are the x-risk effects?

Stephen ClareOct 14, 2022, 6:17 PM
161 points
20 comments4 min readEA link

[Question] What is the best article to introduce someone to AI safety for the first time?

trevor1Nov 22, 2022, 2:06 AM
2 points
3 comments1 min readEA link

[$20K In Prizes] AI Safety Arguments Competition

TW123Apr 26, 2022, 4:21 PM
71 points
121 comments3 min readEA link

[Question] What should I ask Ajeya Cotra — senior researcher at Open Philanthropy, and expert on AI timelines and safety challenges?

Robert_WiblinOct 28, 2022, 3:28 PM
23 points
10 comments1 min readEA link

How likely are malign priors over objectives? [aborted WIP]

David JohnstonNov 11, 2022, 6:03 AM
6 points
0 comments1 min readEA link

Mediocre AI safety as existential risk

technicalitiesMar 16, 2022, 11:50 AM
52 points
12 comments3 min readEA link

New AI risk intro from Vox [link post]

JakubKDec 21, 2022, 5:50 AM
7 points
1 comment2 min readEA link
(www.vox.com)