> How difficult should we expect AI alignment to be?
With many of the AI questions, one needs to reason backwards rather than pose the general question.
Suppose we all die because of unaligned AI. What form did the unaligned AI take? How did it work? Which things that exist now were progenitors of it, and what changed to make it dangerous? How could those problems have been avoided, technically? Organisationally?
I don’t see how useful alignment research can be done separately from capabilities research. Otherwise we’ll get people coming in at the wrong time with a bunch of ideas that lack technical purchase.
Similarly, the questions about what applications we’ll see first are already hinted at in capabilities research.
That being the case, it will take far more than a year for someone to upskill, because they actually need to understand something about capabilities work.