
Focusing on Deep Learning specifically:

- Most LLMs currently use the transformer architecture. You can learn about it visually (https://bbycroft.net/llm), through this blog post (https://jalammar.github.io/illustrated-transformer/), or through any number of Andrej Karpathy's blog posts and materials. (A minimal code sketch of the core attention operation follows this list.)

- To stay on top of the papers published every week, I read a summary every Sunday: https://github.com/dair-ai/ML-Papers-of-the-Week

- To learn more about the engineering side, you can join Discord servers such as EleutherAI's, or follow the GitHub discussions of projects like llama.cpp.
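If seeing the core mechanism in code helps, here is a minimal sketch of scaled dot-product attention, the operation at the heart of the transformer, written in PyTorch. The function and variable names are illustrative, not taken from any particular library:

  import torch
  import torch.nn.functional as F

  def attention(q, k, v):
      # Each query scores every key; softmax turns the scores into
      # weights; the output is a weighted average of the values.
      scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
      return F.softmax(scores, dim=-1) @ v

  # Toy input: a batch of 1 sequence, 4 tokens, 8-dimensional embeddings.
  x = torch.randn(1, 4, 8)
  out = attention(x, x, x)  # self-attention: q, k, v all come from x
  print(out.shape)          # torch.Size([1, 4, 8])

A full transformer block wraps this core in multi-head projections, a feed-forward layer, residual connections, and layer normalization, which is exactly what the visual guides above walk through.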

Personally, I think the fastest way to develop per unit of time is probably to re-implement some of the big papers in the field. There's a clear goal, there are clear signs of success, and there are many implementations out there to check your work against, compare with, and learn from (see the sketch below for one way to do that check).
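One concrete way to check your work: PyTorch itself ships a reference scaled_dot_product_attention (since 2.0), so you can compare your version's output against it on random inputs. A minimal sketch, assuming your re-implementation looks like the one above:

  import torch
  import torch.nn.functional as F

  def my_attention(q, k, v):
      # Your re-implementation under test (same as the earlier sketch).
      scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5
      return scores.softmax(dim=-1) @ v

  q, k, v = (torch.randn(2, 4, 8) for _ in range(3))
  reference = F.scaled_dot_product_attention(q, k, v)
  # Agreement to within float32 noise suggests the port is correct.
  assert torch.allclose(my_attention(q, k, v), reference, atol=1e-5)
  print("matches reference")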

Good luck!

In case you're unsure which papers would be good to implement, here's a nice GitHub repo: https://github.com/aimerou/awesome-ai-papers

Try out the "historical papers"! :)


These are super helpful, thanks