Ask HN: Are there any reliable benchmarks for Machine Learning Model Serving?
6 points by KuriousCat 7 months ago | 3 comments
Hi, I am searching for benchmarks that compare the performance of various machine learning model serving frameworks. Some previous posts, such as the following, exist, but they don't paint the full picture. Is there any reliable benchmark that gives a good snapshot of the state of the art?
1. https://news.ycombinator.com/item?id=28760158
2. https://github.com/cortexlabs/cortex/tree/v0.15.1



This might be relevant. It’s a consortium with members like Baidu, NVIDIA, Arm, etc.

https://mlcommons.org/benchmarks/inference-datacenter/


Not exactly what you’re looking for, but perhaps you’ll find it useful: llama benchmarked on all M-series chips, and in the comments there are comparisons with NVIDIA.

https://github.com/ggerganov/llama.cpp/discussions/4167


There is no single answer, mostly a lot of strengths and weaknesses that depend on which formats you can use on your specific hardware, how much VRAM you have, whether you want to use multiple GPUs, whether they are the exact same GPU or not, etc.
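
Since the results depend so much on your own setup, it may be more useful to measure on your own hardware than to rely on a published number. A minimal sketch, assuming a hypothetical `predict` callable that wraps whatever serving framework you are testing and a hypothetical list of `payloads` (neither name comes from any particular framework). It sends requests one at a time, so it says nothing about batching or concurrent load:

```python
import statistics
import time


def benchmark(predict, payloads, warmup=5):
    """Time a callable over payloads; return latency percentiles and throughput.

    `predict` and `payloads` are hypothetical placeholders for your own
    model-serving call and request data.
    """
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for payload in payloads[:warmup]:
        predict(payload)

    latencies = []
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        predict(payload)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    # quantiles(n=100) returns 99 cut points: index 49 is p50, 94 is p95, 98 is p99.
    cuts = statistics.quantiles(latencies, n=100)
    return {
        "p50_s": cuts[49],
        "p95_s": cuts[94],
        "p99_s": cuts[98],
        "throughput_rps": len(payloads) / elapsed,
    }
```

Running the same harness against each framework, on the same GPU and with the same model format, at least keeps the comparison like for like.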



