Nice post; agree with most of it.
One key strength of mathematical AI alignment work is that it’s probably extremely cheap compared to ‘technical AI alignment work’ that requires a lot of skilled programming and computational resources. (Just as mathematical evolutionary theory is much, much cheaper to fund than empirical evolutionary genomics research.)
I would just make a plea for more use of game theory in mathematical AI alignment work. The Yudkowsky-style agent foundations work is valuable. But I think a lot of alignment issues boil down to game-theoretic issues, and we’ve had a huge amount of increasingly sophisticated work on game theory since the foundational work in the 1940s and 1950s. This goes far, far beyond the pop science accounts of the Prisoner’s Dilemma, the Ultimatum Game, the Tragedy of the Commons, and other ‘Game Theory 101’ examples that many EAs are familiar with.
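For readers who want a concrete picture of the 'Game Theory 101' baseline the comment has in mind, here is a minimal Prisoner's Dilemma sketch in Python. The payoff numbers and function names are illustrative assumptions, not anything from the comment; the point is just that the textbook version of the game is a few lines of code, whereas the post-1950s literature the comment points to (mechanism design, repeated and evolutionary games, incomplete information, etc.) is much richer.

```python
# Illustrative sketch only: the 'Game Theory 101' baseline referred to above.
# Standard (assumed) Prisoner's Dilemma payoffs, plus a brute-force check that
# mutual defection is the unique pure-strategy Nash equilibrium.

ACTIONS = ["cooperate", "defect"]

# payoffs[(row_action, col_action)] = (row player's payoff, column player's payoff)
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (0, 5),
    ("defect",    "cooperate"): (5, 0),
    ("defect",    "defect"):    (1, 1),
}

def is_nash(row_action: str, col_action: str) -> bool:
    """A profile is a Nash equilibrium if neither player gains by deviating unilaterally."""
    row_payoff, col_payoff = PAYOFFS[(row_action, col_action)]
    row_ok = all(PAYOFFS[(a, col_action)][0] <= row_payoff for a in ACTIONS)
    col_ok = all(PAYOFFS[(row_action, a)][1] <= col_payoff for a in ACTIONS)
    return row_ok and col_ok

equilibria = [(r, c) for r in ACTIONS for c in ACTIONS if is_nash(r, c)]
print(equilibria)  # [('defect', 'defect')] -- even though mutual cooperation pays both players more
```

The output illustrates the familiar tension (individually rational defection versus collectively better cooperation) that the pop-science accounts stop at; the comment's point is that alignment-relevant game theory goes well beyond this.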