Ask HN: Are there any reliable benchmarks for Machine Learning Model Serving?

ahurmazda · 2024-02-10T23:40:38.000000Z

This might be relevant. It’s a consortium with members like Baidu, NVIDIA, Arm, etc.

https://mlcommons.org/benchmarks/inference-datacenter/

kolinko · 2024-02-11T00:00:26.000000Z

Not exactly what you’re looking for, but perhaps you’ll find it useful: Llama benchmarked on all M-series chips, and the comments include comparisons with NVIDIA.

https://github.com/ggerganov/llama.cpp/discussions/4167

cjbprime · 2024-02-10T23:44:18.000000Z

There is no single answer, mostly a lot of strengths and weaknesses that depend on which formats you can use on your specific hardware, how much VRAM you have, whether you want to use multiple GPUs, whether they are the exact same GPU, etc.