This is a fantastic resource, and I’m really glad to have it!
My own path has been a little more haphazard: I completed Level 2 (Software Engineering) years ago, and am currently working on AI safety (1), mathematics (3), and research engineering ability (4) simultaneously. Having just completed the last goal of Level 4 (completing 1-3 RL projects), I was planning to jump straight to Level 6, since transformers haven’t yet come up in my RL reading. But this document has me rethinking those plans; perhaps I should learn about transformers first.
All in all, the first four levels (the ones I feel qualified to write about, having gone through some or all of them) seem extremely good.
The thing that most surprised me about the rest of the document was Level 6, specifically the part about being able to reimplement a paper’s work in 10-20 hours. This seems pretty fast compared to other resources I’ve seen, though most of those are RL-focused. For instance, this post reports 220 hours. This post from DeepMind about job vacancies a few months ago also says:
“As a rough test for the Research Engineer role, if you can reproduce a typical ML paper in a few hundred hours and your interests align with ours, we’re probably interested in interviewing you.”
Thus, I don’t think it’s necessary to be able to replicate a paper in 10-20 hours. Replicating papers still seems like a great idea based on my own reading, but I think one can be considerably slower than that and still be at a useful standard.
If you have other sources that suggest otherwise, I’d be very interested to read them; it’s always good to refine my idea of where I’m heading!
Thanks for sharing your experiences, too! As for transformers, yeah, it seems pretty plausible that you could specialize in a bunch of traditional Deep RL methods and qualify as a good research engineer (i.e. be very employable). That’s what several professionals seem to have done, e.g. Daniel Ziegler.
But maybe that’s changing, and it’s worth starting to learn transformers now. It seems like most of the new RL papers incorporate some kind of transformer encoder in the loop, if not basically being a straight-up Decision Transformer.
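To make that concrete, here’s a minimal PyTorch sketch of the Decision Transformer framing (the class name, layer sizes, and token layout are my own illustrative choices, not any paper’s actual code): trajectories are flattened into interleaved (return-to-go, state, action) tokens, and a causally masked transformer predicts the action at each state token.

```python
import torch
import torch.nn as nn

class MinimalDecisionTransformer(nn.Module):
    """Toy sketch of RL-as-sequence-modeling. All sizes are illustrative."""

    def __init__(self, state_dim, act_dim, d_model=128, n_layers=3,
                 n_heads=4, max_len=1024):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)        # return-to-go token
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.embed_timestep = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B, T, 1), states: (B, T, state_dim),
        # actions: (B, T, act_dim), timesteps: (B, T) of ints.
        t_emb = self.embed_timestep(timesteps)
        # Interleave tokens as (rtg_1, s_1, a_1, rtg_2, s_2, a_2, ...).
        tokens = torch.stack(
            [self.embed_rtg(rtg) + t_emb,
             self.embed_state(states) + t_emb,
             self.embed_action(actions) + t_emb],
            dim=2,
        ).reshape(rtg.shape[0], -1, t_emb.shape[-1])  # (B, 3T, d_model)
        # Causal mask so each token only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(
            tokens.shape[1]).to(tokens.device)
        h = self.encoder(tokens, mask=mask)
        # Read out an action prediction from each state token's position.
        return self.predict_action(h[:, 1::3])
```

Training is then just supervised next-action prediction on offline trajectories; at evaluation time you condition on the return you’d like to achieve.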
Interesting. Do you have any good examples?
Sure!
A Generalist Agent (deepmind.com)
SayCan: Grounding Language in Robotic Affordances (say-can.github.io)
From motor control to embodied intelligence (deepmind.com)
Transformers are Sample Efficient World Models (arxiv.org)
Decision Transformer: Reinforcement Learning via Sequence Modeling (arxiv.org)
Thanks, that’s a good point! I was very uncertain about that; it was mostly a made-up number. I do think the time to implement an ML paper depends wildly on how complex the paper is (e.g. a new training algorithm takes a lot more time to test than a post-hoc interpretability method that uses pre-trained models) and on how much you implement (e.g. rewriting the code without doing any training, vs. reproducing the key result to get the most important graph, vs. trying to replicate almost all of the results).
I now think my original 10-20 hours per paper figure was probably an underestimate, but it feels really hard to come up with a robust estimate here, and I’m not sure how valuable one would be, so I’ve removed that parenthetical from the text.