OscarD🔸 comments on Disagreements about Alignment: Why, and how, we should try to solve them

OscarD🔸 9 Aug 2022 10:26 UTC
4 points
Thanks for this, I agree that it seems valuable to think carefully about the foundations of different research agendas and how justified these are. Indeed, this seems analogous to the traditional EA pursuit of cause prioritisation: thinking carefully about the underlying assumptions and methodologies of different approaches to doing good, and comparing how well justified these are. To stretch the analogy, there may be some alignment equivalents of deworming that seem to have a strong chance of having little value but are still worthwhile in EV terms because of the possibility of having an outsized impact.
While I feel relatively unequipped to do useful direct alignment research (rowing), I feel even more unequipped to do steering. I think this is a general feature of the world rather than just of me: to usefully interrogate the axioms of a research agenda and compare how promising different agendas are, it is very valuable to be quite familiar with these approaches, especially by having already tried rowing in each. For instance, in biology, people often start out doing relatively menial lab work to help a senior person’s project, then start directing particular experiments, after several years run whole research projects, and usually only later in their careers are well placed to judge the overall merits of various research agendas. Even though senior researchers are better at pipetting than undergrads, the comparative advantage of the undergrads is to pipette, and that of the senior people is to steer and direct.
Likewise in alignment research, it seems most valuable for less experienced people to try rowing within one or more research agendas, and only later try to start their own or compare the value proposition of the different agendas.
I don’t think this disagrees with what you wrote; it just explains why I think I should not be steering (yet).