At one point in the podcast you discuss the claim that, as AI systems take on larger and larger real-world problems, the challenge of defining the reward function will become more and more important. For example, for a cleaning robot, a simple number-of-dust-particles objective is inadequate because we care about many other things (e.g. keeping the house tidy) and have many side constraints (e.g. not damaging household objects). This isn't quite an argument that AI alignment will solve itself, but it is an argument that the attention and resources poured into alignment may naturally rise to the challenge without EA effort, and thus that EA effort here may be misplaced.
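To make the contrast concrete, here is a minimal sketch (my own illustration, not something from the podcast) of a naive cleaning reward versus one that folds the side constraints into the objective. All of the names and penalty weights are hypothetical:

```python
from dataclasses import dataclass

# Illustrative sketch only: names and weights are hypothetical.

@dataclass
class HouseState:
    num_dust_particles: int
    num_damaged_objects: int
    untidiness_score: float

def naive_reward(s: HouseState) -> float:
    # Rewards dust removal and nothing else; a policy optimizing this
    # is indifferent to knocking over a vase while vacuuming.
    return -s.num_dust_particles

def constrained_reward(s: HouseState) -> float:
    # A designer who bears the cost of side effects has an incentive
    # to add penalty terms for them.
    return (-s.num_dust_particles
            - 10.0 * s.num_damaged_objects   # side constraint: no damage
            - 1.0 * s.untidiness_score)      # other thing we care about
```

The skeptic's point is that a company selling such a robot has a direct commercial incentive to write the second function rather than the first; my worry below concerns domains where the parties harmed by the missing penalty terms are not the ones writing the reward.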
First off, I think this is a great steel-man of the LeCun/Etzioni safety-skeptic position, and, importantly, it gives a more concrete and falsifiable position to argue against. On the other hand, the argument seems to go through only if most of the tasks AI researchers work on are of the kind described, i.e. ones where the system's designer has it in their own interest to handle the side constraints and fix the reward misspecification. In my view, this condition is unlikely to be met: most of the tasks AI companies work on seem likely to involve a principal-agent complication. Consider recommender systems, automated advertising, stock trading, and so on; in all of these domains, the work maximizes profit for the AI researchers' company precisely when it runs roughshod over side constraints. The side constraints here are mostly the preferences of users of the platform (in tech) and of other investors (in finance).
Does this seem right? If so, what are the upshots? Could the legal/lobbying work of strengthening the positions of these principals become a high-value task for EA to take on?