To what extent is AI alignment tractable?
I’m especially interested in subfields that are very tractable – as well as subfields that are not tractable at all (but where people are still working).
For everyone who wanted to participate in the poll but didn’t because it seemed like too much work – I updated it! Here’s the updated version. It should be easier to answer now :)
What is prosaic alignment? What are some examples of prosaic alignment?