
The best way to understand transformers is to take Andrej Karpathy's course on YouTube, with a keyboard and a lot of focus time.



It is hard but so worth it. It is hard to overstate how good it is: the pedagogy, the charisma and style, and the fact that he cofounded OpenAI and worked for Elon but is as modest as, say, a math tutor popping into your house to teach you some math!

There is a great Discord community attached too, which makes a big difference.

What is missing from this course and another one I took, and is very hard to find anywhere, is multivariate calculus applied to linear algebra. I feel motivated to create a resource on it because it's pretty hard. For example, how to differentiate matrix operations where broadcasting is involved. Not just the how, but really grokking it into working memory.
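To make the broadcasting case concrete, here is a minimal NumPy sketch of the kind of thing I mean. It is my own toy example, not taken from the course: the names X, W, b, G, loss and numeric_grad are all made up for illustration. The forward pass is Y = X @ W + b, where b has shape (2,) and gets broadcast across the 4 rows of the batch. The point is that whatever was broadcast in the forward pass has to be summed over the broadcast axis in the backward pass, and a finite-difference check is one way to convince yourself.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))  # batch of 4 inputs with 3 features (illustrative shapes)
W = rng.normal(size=(3, 2))  # weight matrix
b = rng.normal(size=(2,))    # bias, broadcast across the 4 rows in the forward pass
G = rng.normal(size=(4, 2))  # stand-in for the upstream gradient dL/dY


def loss(X, W, b):
    Y = X @ W + b            # b is implicitly copied to every row here
    return np.sum(Y * G)     # scalar loss chosen so that dL/dY is exactly G


# Analytic gradients
dW = X.T @ G                 # shape (3, 2), same as W
dX = G @ W.T                 # shape (4, 3), same as X
db = G.sum(axis=0)           # shape (2,): broadcast forward means sum backward


def numeric_grad(f, x, eps=1e-6):
    """Central finite differences of the scalar f() with respect to array x (edited in place)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        old = x[idx]
        x[idx] = old + eps
        up = f()
        x[idx] = old - eps
        down = f()
        x[idx] = old         # restore the original value
        grad[idx] = (up - down) / (2 * eps)
    return grad


db_num = numeric_grad(lambda: loss(X, W, b), b)
dW_num = numeric_grad(lambda: loss(X, W, b), W)
print(np.allclose(db, db_num, atol=1e-5), np.allclose(dW, dW_num, atol=1e-5))
```

The intuition for db: each entry of b touched every row of Y, so its gradient collects a contribution from every row of G. A quick sanity check that generalizes is that a gradient must always have the same shape as the thing it is a gradient of.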



