
Building the field of AI safety

Last edit: 28 Oct 2022 15:56 UTC by Lizka

Building the field of AI safety refers to the family of interventions aimed at growing, shaping, or otherwise improving AI safety as an intellectual community.

Related entries

AI risk | AI safety | existential risk | building effective altruism

Announcing the Cambridge Boston Alignment Initiative [Hiring!]

kuhanj · 2 Dec 2022 1:07 UTC
83 points
0 comments · 1 min read · EA link

There should be a public adversarial collaboration on AI x-risk

pradyuprasad · 23 Jan 2023 4:09 UTC
56 points
5 comments · 2 min read · EA link

Announcing the AI Safety Field Building Hub, a new effort to provide AISFB projects, mentorship, and funding

Vael Gates · 28 Jul 2022 21:29 UTC
126 points
6 comments · 6 min read · EA link

Spreading messages to help with the most important century

Holden Karnofsky · 25 Jan 2023 20:35 UTC
122 points
20 comments · 18 min read · EA link
(www.cold-takes.com)

Announcing the Harvard AI Safety Team

Xander Davies · 30 Jun 2022 18:34 UTC
128 points
4 comments · 5 min read · EA link

AGI Safety Fundamentals programme is contracting a low-code engineer

Jamie Bernardi · 26 Aug 2022 15:43 UTC
39 points
4 comments · 5 min read · EA link

Retrospective on the AI Safety Field Building Hub

Vael Gates · 2 Feb 2023 2:06 UTC
56 points
2 comments · 9 min read · EA link

Concrete Steps to Get Started in Transformer Mechanistic Interpretability

Neel Nanda · 26 Dec 2022 13:00 UTC
18 points
0 comments · 12 min read · EA link

Resources that (I think) new alignment researchers should know about

Akash · 28 Oct 2022 22:13 UTC
20 points
2 comments · 1 min read · EA link

Are alignment researchers devoting enough time to improving their research capacity?

Carson Jones · 4 Nov 2022 0:58 UTC
11 points
1 comment · 1 min read · EA link

The Tree of Life: Stanford AI Alignment Theory of Change

Gabriel Mukobi · 2 Jul 2022 18:32 UTC
68 points
5 comments · 14 min read · EA link

What are some low-cost outside-the-box ways to do/fund alignment research?

trevor1 · 11 Nov 2022 5:57 UTC
2 points
3 comments · 1 min read · EA link

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

Akash · 25 Nov 2022 20:47 UTC
14 points
0 comments · 1 min read · EA link

Update on Harvard AI Safety Team and MIT AI Alignment

Xander Davies · 2 Dec 2022 6:09 UTC
69 points
3 comments · 1 min read · EA link

The Benefits of Distillation in Research

Jonas Hallgren · 4 Mar 2023 19:19 UTC
43 points
2 comments · 5 min read · EA link

[Question] I have thousands of copies of HPMOR in Russian. How to use them with the most impact?

Samin · 27 Dec 2022 11:07 UTC
35 points
10 comments · 1 min read · EA link

AI safety university groups: a promising opportunity to reduce existential risk

mic · 30 Jun 2022 18:37 UTC
50 points
1 comment · 11 min read · EA link

Results for a survey of tool use and workflows in alignment research

jacquesthibs · 19 Dec 2022 15:19 UTC
29 points
0 comments · 1 min read · EA link

Transcripts of interviews with AI researchers

Vael Gates · 9 May 2022 6:03 UTC
140 points
14 comments · 2 min read · EA link

Announcing aisafety.training

JJ Hepburn · 17 Jan 2023 1:55 UTC
108 points
4 comments · 1 min read · EA link

How many people are working (directly) on reducing existential risk from AI?

Benjamin Hilton · 17 Jan 2023 14:03 UTC
117 points
3 comments · 4 min read · EA link
(80000hours.org)

Establishing Oxford’s AI Safety Student Group: Lessons Learnt and Our Model

Wilkin1234 · 21 Sep 2022 7:57 UTC
71 points
3 comments · 1 min read · EA link

AI Safety Seems Hard to Measure

Holden Karnofsky · 11 Dec 2022 1:31 UTC
88 points
2 comments · 14 min read · EA link

Recursive Middle Manager Hell

Raemon · 17 Jan 2023 19:02 UTC
78 points
2 comments · 1 min read · EA link

AGI safety field building projects I’d like to see

Severin · 24 Jan 2023 23:30 UTC
25 points
2 comments · 1 min read · EA link

We Ran an Alignment Workshop

aiden ament · 21 Jan 2023 5:37 UTC
6 points
0 comments · 3 min read · EA link

Vael Gates: Risks from Advanced AI (June 2022)

Vael Gates · 14 Jun 2022 0:49 UTC
45 points
5 comments · 30 min read · EA link

Why I think that teaching philosophy is high impact

Eleni_A · 19 Dec 2022 23:00 UTC
17 points
2 comments · 2 min read · EA link

Air-gapping evaluation and support

Ryan Kidd · 26 Dec 2022 22:52 UTC
18 points
12 comments · 1 min read · EA link

AI Safety field-building projects I’d like to see

Akash · 11 Sep 2022 23:45 UTC
19 points
4 comments · 7 min read · EA link
(www.lesswrong.com)

*New* Canada AI Safety & Governance community

Wyatt Tessari L'Allié · 29 Aug 2022 15:58 UTC
31 points
2 comments · 1 min read · EA link

Stress Externalities More in AI Safety Pitches

NickGabs · 26 Sep 2022 20:31 UTC
31 points
13 comments · 2 min read · EA link

[Question] Best introductory overviews of AGI safety?

Jakub Kraus · 13 Dec 2022 19:04 UTC
13 points
8 comments · 2 min read · EA link
(www.lesswrong.com)

We all teach: here’s how to do it better

Michael Noetel · 30 Sep 2022 2:06 UTC
158 points
12 comments · 24 min read · EA link

[Question] Does China have AI alignment resources/institutions? How can we prioritize creating more?

Jakub Kraus · 4 Aug 2022 19:23 UTC
17 points
9 comments · 1 min read · EA link

Analysis of AI Safety surveys for field-building insights

Ash Jafari · 5 Dec 2022 17:37 UTC
24 points
7 comments · 5 min read · EA link

What AI Safety Materials Do ML Researchers Find Compelling?

Vael Gates · 28 Dec 2022 2:03 UTC
129 points
12 comments · 1 min read · EA link

Announcing an Empirical AI Safety Program

Joshc · 13 Sep 2022 21:39 UTC
64 points
7 comments · 2 min read · EA link

AI Safety Microgrant Round

Chris Leong · 14 Nov 2022 4:25 UTC
81 points
1 comment · 3 min read · EA link

Resources I send to AI researchers about AI safety

Vael Gates · 11 Jan 2023 1:24 UTC
32 points
0 comments · 1 min read · EA link

Why people want to work on AI safety (but don’t)

Emily Grundy · 24 Jan 2023 6:41 UTC
67 points
10 comments · 7 min read · EA link

AI Safety Unconference NeurIPS 2022

Orpheus_Lummis · 7 Nov 2022 15:39 UTC
13 points
5 comments · 1 min read · EA link
(aisafetyevents.org)

AI Safety Arguments: An Interactive Guide

Lukas Trötzmüller · 1 Feb 2023 19:21 UTC
32 points
5 comments · 3 min read · EA link

“AI Risk Discussions” website: Exploring interviews from 97 AI Researchers

Vael Gates · 2 Feb 2023 1:00 UTC
41 points
1 comment · 1 min read · EA link

Predicting researcher interest in AI alignment

Vael Gates · 2 Feb 2023 0:58 UTC
30 points
0 comments · 21 min read · EA link
(docs.google.com)

Interviews with 97 AI Researchers: Quantitative Analysis

Maheen Shermohammed · 2 Feb 2023 4:50 UTC
73 points
4 comments · 7 min read · EA link

A Brief Overview of AI Safety/Alignment Orgs, Fields, Researchers, and Resources for ML Researchers

Austin Witte · 2 Feb 2023 6:19 UTC
18 points
5 comments · 2 min read · EA link

Talk to me about your summer/career plans

Akash · 31 Jan 2023 18:29 UTC
30 points
0 comments · 1 min read · EA link

Problems of people new to AI safety and my project ideas to mitigate them

Igor Ivanov · 3 Mar 2023 17:35 UTC
14 points
0 comments · 7 min read · EA link

A Proposed Approach for AI Safety Movement Building: Projects, Professions, Skills, and Ideas for the Future [long post][bounty for feedback]

PeterSlattery · 22 Mar 2023 0:54 UTC
20 points
8 comments · 32 min read · EA link