Joe_Carlsmith

Karma: 3,804

Working on Claude’s values at Anthropic. Former senior advisor at Coefficient Giving (then Open Philanthropy). Doctorate in philosophy from the University of Oxford. Opinions my own.

Video and transcript of talk on writing AI constitutions

Joe_Carlsmith 9 Apr 2026 17:21 UTC

22 points

2 comments, 47 min read

On restraining AI development for the sake of safety

Joe_Carlsmith 19 Mar 2026 16:36 UTC

21 points

1 comment, 50 min read

Building AIs that do human-like philosophy

Joe_Carlsmith 29 Jan 2026 18:03 UTC

16 points

1 comment, 21 min read

Video and transcript of talk on human-like-ness in AI safety

Joe_Carlsmith 17 Dec 2025 4:13 UTC

14 points

0 comments, 36 min read

How human-like do safe AI motivations need to be?

Joe_Carlsmith 12 Nov 2025 5:33 UTC

27 points

1 comment, 52 min read

Leaving Open Philanthropy, going to Anthropic

Joe_Carlsmith 3 Nov 2025 17:41 UTC

137 points

14 comments, 18 min read

Controlling the options AIs can pursue

Joe_Carlsmith 29 Sep 2025 17:24 UTC

9 points

0 comments, 35 min read

Video and transcript of talk on giving AIs safe motivations

Joe_Carlsmith 22 Sep 2025 16:47 UTC

10 points

1 comment, 50 min read

Giving AIs safe motivations

Joe_Carlsmith 18 Aug 2025 18:02 UTC

22 points

1 comment, 51 min read

Video and transcript of talk on “Can goodness compete?”

Joe_Carlsmith 17 Jul 2025 17:59 UTC

34 points

4 comments, 34 min read

(joecarlsmith.substack.com)

Video and transcript of talk on AI welfare

Joe_Carlsmith 22 May 2025 16:15 UTC

22 points

1 comment, 28 min read

(joecarlsmith.substack.com)

The stakes of AI moral status

Joe_Carlsmith 21 May 2025 18:20 UTC

54 points

9 comments, 14 min read

(joecarlsmith.substack.com)

Video and transcript of talk on automating alignment research

Joe_Carlsmith 30 Apr 2025 17:43 UTC

11 points

1 comment, 24 min read

(joecarlsmith.com)

Can we safely automate alignment research?

Joe_Carlsmith 30 Apr 2025 17:37 UTC

13 points

1 comment, 48 min read

(joecarlsmith.com)

AI for AI safety

Joe_Carlsmith 14 Mar 2025 15:00 UTC

34 points

1 comment, 17 min read

(joecarlsmith.substack.com)

Paths and waystations in AI safety

Joe_Carlsmith 11 Mar 2025 18:52 UTC

22 points

2 comments, 11 min read

(joecarlsmith.substack.com)

When should we worry about AI power-seeking?

Joe_Carlsmith 19 Feb 2025 19:44 UTC

21 points

2 comments, 18 min read

(joecarlsmith.substack.com)

What is it to solve the alignment problem?

Joe_Carlsmith 13 Feb 2025 18:42 UTC

25 points

1 comment, 19 min read

(joecarlsmith.substack.com)

How do we solve the alignment problem?

Joe_Carlsmith 13 Feb 2025 18:27 UTC

38 points

1 comment, 9 min read

(joecarlsmith.substack.com)

Fake thinking and real thinking

Joe_Carlsmith 28 Jan 2025 20:05 UTC

86 points

3 comments, 38 min read