Alignment tax

Tag. Last edit: 22 Jul 2022 21:00 UTC by Leo

An alignment tax (sometimes called a safety tax) is the additional cost of building an aligned AI system, relative to building an otherwise comparable unaligned one.

Approaches to the alignment tax

Paul Christiano distinguishes two main approaches for dealing with the alignment tax.[1][2] One approach seeks to find ways to pay the tax, such as persuading individual actors to pay it or facilitating coordination of the sort that would allow groups to pay it. The other approach tries to reduce the tax, by differentially advancing existing alignable algorithms or by making existing algorithms more alignable.

Further reading

Askell, Amanda et al. (2021) A general language assistant as a laboratory for alignment, arXiv:2112.00861 [cs].

Xu, Mark & Carl Shulman (2021) Rogue AGI embodies valuable intellectual property, LessWrong, June 3.

Yudkowsky, Eliezer (2017) Aligning an AGI adds significant development time, Arbital, February 22.

Related entries

AI alignment | AI governance | AI forecasting | differential progress

  1. Christiano, Paul (2020) Current work in AI alignment, Effective Altruism Global, April 3.

  2. For a summary, see Rohin Shah (2020) A framework for thinking about how to make AI go well, LessWrong, April 15.

Paul Christiano: Current work in AI alignment

EA Global · 3 Apr 2020 7:06 UTC
80 points
3 comments · 24 min read · EA link
(www.youtube.com)

My personal cruxes for working on AI safety

Buck · 13 Feb 2020 7:11 UTC
136 points
35 comments · 44 min read · EA link

Safety tax functions

Owen Cotton-Barratt · 20 Oct 2024 14:13 UTC
23 points
1 comment · 6 min read · EA link
(strangecities.substack.com)

AI safety tax dynamics

Owen Cotton-Barratt · 23 Oct 2024 12:21 UTC
21 points
9 comments · 6 min read · EA link
(strangecities.substack.com)

[Linkpost] Jan Leike on three kinds of alignment taxes

Akash · 6 Jan 2023 23:57 UTC
29 points
0 comments · 1 min read · EA link

Aligning AI with Humans by Leveraging Legal Informatics

johnjnay · 18 Sep 2022 7:43 UTC
20 points
11 comments · 3 min read · EA link

New cooperation mechanism - quadratic funding without a matching pool

Filip Sondej · 5 Jun 2022 13:55 UTC
55 points
11 comments · 5 min read · EA link