
Focusing on Deep Learning specifically:

- Most LLMs currently use the transformer architecture. You can learn about it visually (https://bbycroft.net/llm), through this blog post (https://jalammar.github.io/illustrated-transformer/), or through any number of Andrej Karpathy's blog posts and materials. (A minimal code sketch of the core attention operation follows this list.)

- To stay on top of the papers published every week, I read a summary every Sunday: https://github.com/dair-ai/ML-Papers-of-the-Week

- To learn more about the engineering side, you can join Discord servers such as EleutherAI's, or follow the GitHub discussions of projects like llama.cpp.
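If seeing the core mechanism in code helps, here is a minimal sketch of scaled dot-product attention, the operation at the heart of the transformer, written in PyTorch. The function and variable names are illustrative, not taken from any particular library:

  import torch
  import torch.nn.functional as F

  def attention(q, k, v):
      # Each query scores every key; softmax turns the scores into
      # weights; the output is a weighted average of the values.
      scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
      return F.softmax(scores, dim=-1) @ v

  # Toy input: a batch of 1 sequence, 4 tokens, 8-dimensional embeddings.
  x = torch.randn(1, 4, 8)
  out = attention(x, x, x)  # self-attention: q, k, v all come from x
  print(out.shape)          # torch.Size([1, 4, 8])

A full transformer block wraps this core in multi-head projections, a feed-forward layer, residual connections, and layer normalization, which is exactly what the visual guides above walk through.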

Personally, I think the fastest way to develop per unit of time is probably to re-implement some of the big papers in the field. There's a clear goal, there are clear signs of success, and there are many implementations out there to check your work against, compare with, and learn from (see the sketch below for one way to do that check).
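One concrete way to check your work: PyTorch itself ships a reference scaled_dot_product_attention (since 2.0), so you can compare your version's output against it on random inputs. A minimal sketch, assuming your re-implementation looks like the one above:

  import torch
  import torch.nn.functional as F

  def my_attention(q, k, v):
      # Your re-implementation under test (same as the earlier sketch).
      scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
      return scores.softmax(dim=-1) @ v

  q, k, v = (torch.randn(2, 4, 8) for _ in range(3))
  reference = F.scaled_dot_product_attention(q, k, v)
  # Agreement to within float32 noise suggests the port is correct.
  assert torch.allclose(my_attention(q, k, v), reference, atol=1e-5)
  print("matches reference")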

Good luck!

In case you're unsure which papers would be good to implement, here's a nice GitHub repo: https://github.com/aimerou/awesome-ai-papers

Try out the "historical papers"! :)


These are super helpful, thanks