
> 20GB RAM with Q4 quant. Closer to 25GB for the 4_K_M one

How does this math work? Are there rules of thumb that you guys know that the rest of us don't?




As a quick estimate, the size of a Q4-quantized model in GB is usually around 60-70% of the parameter count in billions (Q4 variants use roughly 4.5-5 bits per weight, so bytes ≈ params × bits-per-weight / 8). You can check the exact quantized size precisely from the .gguf files hosted on Hugging Face.
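A minimal sketch of that rule of thumb, assuming an average bits-per-weight figure for the quant type (llama.cpp's Q4_K_M is commonly cited as roughly 4.8-4.9 bits/weight; the exact value varies per model, and KV cache and runtime overhead come on top):

```python
def quant_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate on-disk size of a quantized model in GB (1 GB = 1e9 bytes).

    bytes = num_params * bits_per_weight / 8
    """
    return params_billions * bits_per_weight / 8


# Example: a 32B model at ~4.85 bits/weight (approximate Q4_K_M average)
print(round(quant_size_gb(32, 4.85), 1))  # ~19.4 GB of weights alone
```

Add a few GB for KV cache and inference overhead, and you land in the 20-25GB range mentioned above.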




