Funding an AI alignment institute: a Manhattan Project-scale effort for AI alignment.
Artificial intelligence
Aligning AI with human interests could be very hard, and the current growth in AI alignment research might be insufficient to achieve it. To speed up alignment research, we want to fund an ambitious institute attracting hundreds to thousands of researchers and engineers to work full-time on aligning AI. The institute would give these researchers computing resources competitive with those of top AI labs. It could also slow down risky AI capability research by offering top capability researchers competitive wages and autonomy, drawing them away from leading AI organizations. While small specialized teams would pursue innovative alignment research, the institute would enhance their collaboration, bridging AI alignment theory, experiment, and policy. The institute could also offer alignment fellowships optimized to speed up the onboarding of bright young students into alignment research; for example, we would fund stipends and mentorship competitive with doctoral programs or entry-level industry jobs. The institute would be located somewhere safe from global catastrophic risks and would provide access to high-quality healthcare, food, housing, and transportation to optimize researchers' well-being and productivity.
The definition of existential risk as 'humanity losing its long-term potential' in Toby Ord's The Precipice could be specified further. Assuming (perhaps without loss of generality) finite total value in our universe, existential risks could be divided into two broad categories:
Extinction risks (X-risks): Humanity's share of total value goes to zero. Examples could be extinction from pandemics, extreme climate change, or some natural event.
Agential risks (A-risks): Humanity's share of total value could be greater than in the X-risk scenarios, but remains strictly dominated by the share of total value held by misaligned agents. Examples could be misaligned institutions, AIs, or loud aliens controlling most of the value in the universe, from whom little gains from trade could be hoped for.
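A minimal formalization of the two categories, under the finite-total-value assumption above; the symbols V, h, and m are ours, not the source's:

```latex
% V < \infty : total attainable value in the universe
% h : share of V ultimately controlled by humanity (or aligned successors)
% m : largest share of V controlled by misaligned agents
\[
\text{X-risk:} \quad h = 0,
\qquad
\text{A-risk:} \quad 0 < h < m .
\]
```

On this reading, A-risks are strictly better than X-risks for humanity (h is positive), yet still existential in Ord's sense, since misaligned agents dominate the long-term potential.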