andy99 4 months ago | on: Llama3 implemented from scratch

And if you want to understand it, I'd recommend this post (GPT-2 in 60 lines of NumPy) and the post on attention it links to. The concepts are mostly identical to Llama's, with just a few minor architectural tweaks. https://jaykmody.com/blog/gpt-from-scratch/

bhavesh2712 4 months ago

Thanks for sharing this!