There seems to be a nascent academic field that uses tools and methods from psychology to understand LLMs, e.g. https://www.pnas.org/doi/10.1073/pnas.2218523120; it might be interesting to think about how this intersects with alignment, e.g. which experiments to perform.
Maybe more on the neuroscience side, I’d be very excited to see (more) people think about how to build a neuroconnectionist research programme for alignment (I’ve also briefly mentioned this in the linkpost).
Another relevant article on “machine psychology”: https://arxiv.org/abs/2303.13988 (interestingly, it’s by a co-author of Peter Singer’s first AI paper).