
> 20GB RAM with Q4 quant. Closer to 25GB for the 4_K_M one

How does this math work? Are there rules of thumb that you guys know that the rest of us don't?




As a quick estimate, the size of a Q4-quantized model in GB is usually around 60-70% of the parameter count in billions (Q4 variants use roughly 4.5-5 bits per weight, so bytes ≈ params × bits-per-weight / 8). You can check the exact quantized size precisely from the .gguf files hosted on Hugging Face.
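A minimal sketch of that rule of thumb, assuming an average bits-per-weight figure for the quant type (llama.cpp's Q4_K_M is commonly cited as roughly 4.8-4.9 bits/weight; the exact value varies per model, and KV cache and runtime overhead come on top):

```python
def quant_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate on-disk size of a quantized model in GB (1 GB = 1e9 bytes).

    bytes = num_params * bits_per_weight / 8
    """
    return params_billions * bits_per_weight / 8


# Example: a 32B model at ~4.85 bits/weight (approximate Q4_K_M average)
print(round(quant_size_gb(32, 4.85), 1))  # ~19.4 GB of weights alone
```

Add a few GB for KV cache and inference overhead, and you land in the 20-25GB range mentioned above.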




