Joe_Carlsmith

Karma: 3,540

Senior advisor at Open Philanthropy. Doctorate in philosophy from the University of Oxford. Opinions my own.

Video and transcript of talk on “Can goodness compete?”

Joe_Carlsmith · Jul 17, 2025, 5:59 PM
30 points
4 comments · 34 min read · EA link
(joecarlsmith.substack.com)

Video and transcript of talk on AI welfare

Joe_Carlsmith · May 22, 2025, 4:15 PM
22 points
1 comment · 28 min read · EA link
(joecarlsmith.substack.com)

The stakes of AI moral status

Joe_Carlsmith · May 21, 2025, 6:20 PM
54 points
9 comments · 14 min read · EA link
(joecarlsmith.substack.com)

Video and transcript of talk on automating alignment research

Joe_Carlsmith · Apr 30, 2025, 5:43 PM
11 points
1 comment · 24 min read · EA link
(joecarlsmith.com)

Can we safely automate alignment research?

Joe_Carlsmith · Apr 30, 2025, 5:37 PM
13 points
1 comment · 48 min read · EA link
(joecarlsmith.com)

AI for AI safety

Joe_Carlsmith · Mar 14, 2025, 3:00 PM
34 points
1 comment · 17 min read · EA link
(joecarlsmith.substack.com)

Paths and waystations in AI safety

Joe_Carlsmith · Mar 11, 2025, 6:52 PM
22 points
2 comments · 11 min read · EA link
(joecarlsmith.substack.com)

When should we worry about AI power-seeking?

Joe_Carlsmith · Feb 19, 2025, 7:44 PM
21 points
2 comments · 18 min read · EA link
(joecarlsmith.substack.com)

What is it to solve the alignment problem?

Joe_Carlsmith · Feb 13, 2025, 6:42 PM
25 points
1 comment · 19 min read · EA link
(joecarlsmith.substack.com)

How do we solve the alignment problem?

Joe_Carlsmith · Feb 13, 2025, 6:27 PM
28 points
1 comment · 6 min read · EA link
(joecarlsmith.substack.com)

Fake thinking and real thinking

Joe_Carlsmith · Jan 28, 2025, 8:05 PM
77 points
3 comments · 38 min read · EA link

Takes on “Alignment Faking in Large Language Models”

Joe_Carlsmith · Dec 18, 2024, 6:22 PM
72 points
1 comment · 62 min read · EA link

Incentive design and capability elicitation

Joe_Carlsmith · Nov 12, 2024, 8:56 PM
9 points
0 comments · 12 min read · EA link

Option control

Joe_Carlsmith · Nov 4, 2024, 5:54 PM
11 points
0 comments · 54 min read · EA link

Motivation control

Joe_Carlsmith · Oct 30, 2024, 5:15 PM
18 points
0 comments · 52 min read · EA link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)

Joe_Carlsmith · Oct 28, 2024, 9:57 PM
18 points
0 comments · 32 min read · EA link

Video and transcript of presentation on Otherness and control in the age of AGI

Joe_Carlsmith · Oct 8, 2024, 10:30 PM
18 points
1 comment · 27 min read · EA link

What is it to solve the alignment problem? (Notes)

Joe_Carlsmith · Aug 24, 2024, 9:19 PM
32 points
1 comment · 53 min read · EA link

Value fragility and AI takeover

Joe_Carlsmith · Aug 5, 2024, 9:28 PM
39 points
3 comments · 30 min read · EA link