Cornelis Dirk Haupt comments on Hackathon on Mon, 12/5 to follow EAGxBerkeley

Cornelis Dirk Haupt 2 Dec 2022 1:04 UTC
6 points
1 ∶ 0
I couldn’t attend the interpretability hackathon and was hoping to get acquainted with LLM interpretability research as a software dev with no experience in interpretability or transformers. So here’s a starting point, following in the footsteps of this submission (see their writeup here):
- https://github.com/neelnanda-io/Easy-Transformer
- Neel Nanda live coding and testing hypotheses about what a transformer is learning (1.5 hrs): https://www.youtube.com/watch?v=yo4QvDn-vsU&t=3403s
- Go through Transformer theory:
  - Attention is all you need paper: https://arxiv.org/abs/1706.03762
  - Good summary: https://nostalgebraist.tumblr.com/post/185326092369/the-transformer-explained
  - Positional embeddings explained (9mins): https://www.youtube.com/watch?v=1biZfFLPRSY
Basically, I am thinking we can use the hackathon as a collaborative study session to become more familiar with transformers and interpretability, ultimately culminating in replicating the results in the linked submission. It took them 3 days, but since we have a starting point, we can possibly replicate their project and grok what they did much more quickly.
I’m not wedded to this idea, though. If you think there is a better way to use the hackathon to upskill in LLM interpretability and transformers, do share.
- NicoleJaneway 🔸 3 Dec 2022 2:02 UTC
  1 point
  0 ∶ 0
  Nice, this seems ambitious. I really like this idea.
  Maybe you can start a study group in GatherTown to continue this virtually as well. I’m sure you’d get takers from other folks interested in ML research.
