I really appreciate specific career advice from people working in relevant jobs and the ideas and considerations outlined here, and am curating the post. (I’m also really interested in the discussion happening here.)
Personal highlights (note that I’m interested in hearing disagreement with these points!):
The emphasis on fast feedback loops, especially for people who are newer to a field (see also the bit about becoming an expert in something for governance)
“the best option for mentorship may be outside of alignment—but PhDs are long enough, and timelines short enough, that you should make sure that your mentor would be excited about supervising some kind of alignment-relevant research.”
This bit (I’d be interested in hearing disagreement, if there is much, though!):
“You’ll need to get hands-on. The best ML and alignment research engages heavily with neural networks (with only a few exceptions). Even if you’re more theoretically-minded, you should plan to be interacting with models regularly, and gain the relevant coding skills. In particular, I see a lot of junior researchers who want to do “conceptual research”. But you should assume that such research is useless until it cashes out in writing code or proving theorems, and that you’ll need to do the cashing out yourself (with threat modeling being the main exception, since it forces a different type of concreteness). …”
“You can get started quickly. People coming from fields like physics and mathematics often don’t realize how much shallower deep learning is as a field, and so think they need to spend a long time understanding the theoretical foundations first. You don’t...” [read the rest above]
The specific directions and research topics listed! (With links and commentary!)
On governance:
“The main advice I give people who want to enter this field: pick one relevant topic and try to become an expert on it.”
“In general I think people overrate “analysis” and underrate “proposals”.”
Yeah, I agree on priors and based on some arguments about feedback loops, though note that I don’t really have relevant experience. I remember hearing someone defend something like the opposite claim in a group setting where I couldn’t ask the follow-up questions I wanted to — so I don’t remember their main arguments and don’t know whether I should change my opinion.
I expect a bunch of more rationalist-type people disagree with this claim, FWIW. But I also think that they heavily overestimate the value of the types of conceptual research I’m talking about here.
This seems strongly true to me.
CC https://www.lesswrong.com/posts/fqryrxnvpSr5w2dDJ/touch-reality-as-soon-as-possible-when-doing-machine, which expands on getting “hands-on” experience in alignment.
I don’t know of any writing that directly contradicts these claims. I think https://www.lesswrong.com/s/v55BhXbpJuaExkpcD/p/3pinFH3jerMzAvmza contradicts them indirectly, since it broadly criticizes most empirical approaches and is more open to conceptual ones.