Fair points. In particular, I think my response should have focused more on the role of academia + industry.
a disproportionate amount of progress on [mechanistic interpretability] has been made outside of academia, by Chris Olah & collaborators at OpenAI & Anthropic
Not entirely fair: if you broaden the field just a bit to “interpretability” in general, you will see that most of the important advances in the field (e.g. SHAP and LIME) were made inside academia.
I would also not be too surprised to find people within academia who are doing great mechanistic interpretability work, simply because of the sheer number of people researching interpretability.
There are various concrete problems here but it seems that more progress is being made by independent researchers (e.g. Vanessa Kosoy, John Wentworth) and researchers at nonprofits (MIRI) than by anyone in academia.
Granted, but part of the problem is convincing academics to engage. I think the math community would be 100x better at solving these problems if they ever became popular enough in academia.
However, I think it’s MUCH less clear that any particular Person X would be more productive as a grad student than as a nonprofit employee, or more productive as a professor than as a nonprofit technical co-founder. In fact, I strongly expect the reverse.
Matches my intuition as well (though I might be biased here). I’d add that I expect grad students to get better mentorship on average in academia than at nonprofits or doing independent research (but to mostly work on irrelevant problems while in academia).
One important intuition I have is that academia + industry scales to “crappy, but in the end advancing on the object level” despite having lots of mediocre people involved, while all the cool things happening in EA+LW are due to a few exceptionally talented people, and if we tried to scale them up we would end up with “horrible crackpottery”.
But I’d be delighted to be proven wrong!
I must say I strongly agree with Steven.
If you are saying academia has a good track record, then I must say (1) that’s wrong for stuff like ML, where in recent years much (arguably most) of the relevant progress has been made outside of academia, and (2) it may have a good track record over the long history of science, and when you say it’s good at solving problems, sure, I think it might solve alignment in 100 years, but we need it in 10, and academia is slow. (E.g. read Yudkowsky’s sequence on science if you don’t think academia is slow.)
Do you have some reason to think that a person can make more progress in academia than elsewhere? I agree that academia has people, and it’s good to get those people, but academia has badly shaped incentives, like (from my other comment): “Academia doesn’t have good incentives to make that kind of important progress: you are supposed to publish papers, so you (1) focus on what you can do with current ML systems instead of on more uncertain longer-term work, and (2) Goodhart on subproblems that don’t take long to solve instead of actually focusing on understanding the core difficulties and how one might address them.” So I expect a person can make more progress outside of academia. Much more, in fact.
Some important parts of the AI safety problem seem to me like they don’t fit well into academic work. There are of course exceptions, people in academia who can make useful progress here, but they are rare. I am not that confident in this, as my understanding of AI safety isn’t that deep, but I’m not just making this up. (EDIT: This mostly overlaps with the first two points I made, that academia is slow and that there are bad incentives, plus maybe some other minor considerations about why excellent people (e.g. John Wentworth) might choose not to work in academia. What I’m saying is that I think AI safety is a problem where those obstacles are big obstacles, whereas there might be other fields where they aren’t thaaat bad.)