djsh's comments

Since we are talking about the throughput of API hosting providers, I wanted to mention the work we have done at Groq. I understand that the team is getting in touch with the ArtificialAnalysis folks to get benchmarked.

Mixtral running at >500 tokens/s @ Groq: https://www.youtube.com/watch?v=5fJyOVtOk4Y

Experience the speed for yourself with Llama2-70B at https://chat.groq.com/


If you like that speed, you would love Mixtral running at >500 tokens/s @ Groq https://www.youtube.com/watch?v=5fJyOVtOk4Y

Full disclosure: I worked on getting this up and running @ Groq.

PS: Experience the speed for yourself with Llama2-70B at https://chat.groq.com/

