
And if you want to understand it, I'd recommend this post (GPT-2 in 60 lines of NumPy) and the post on attention it links to. The concepts are mostly identical to LLaMA's, with only a few minor architectural tweaks. https://jaykmody.com/blog/gpt-from-scratch/



Thanks for sharing this!



