
Forgot about R1, what hardware are you using to run it?



I haven’t run QwQ yet, but it’s a 32B model, so about 20GB of RAM with a Q4 quant, and closer to 25GB for the Q4_K_M one. You can wait a day or so for the quantized GGUFs to show up (we should see the Q4 in the next hour or so). I personally use Ollama on a MacBook Pro; it usually takes a day or two for new models to show up there. Any M-series MacBook with 32GB+ of RAM will run this.
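
If you go the Ollama route, it's just a pull and a chat call once the model lands in the library. Here's a minimal sketch with the ollama Python client (pip install ollama); the tag "qwq" is an assumption on my part, so check the Ollama library page for the actual tag once the quantized build is published:

    import ollama

    # Assumes `ollama pull qwq` has already fetched the (hypothetical) tag.
    resp = ollama.chat(
        model="qwq",
        messages=[{"role": "user", "content": "How many r's are in strawberry?"}],
    )
    print(resp["message"]["content"])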


On MacBooks with Apple Silicon, consider MLX models from the MLX community:

https://huggingface.co/collections/mlx-community/qwq-32b-pre...

For a GUI, LM Studio 0.3.x is iterating on MLX support: https://lmstudio.ai/beta-releases

When searching in LM Studio, you can narrow the search to the mlx-community models.
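
If you'd rather skip the GUI, here's a minimal sketch with the mlx-lm package (pip install mlx-lm); the repo name "mlx-community/QwQ-32B-Preview-4bit" is an assumption, so pick the actual quant from the collection linked above:

    from mlx_lm import load, generate

    # Downloads the weights from the mlx-community repo on first run.
    model, tokenizer = load("mlx-community/QwQ-32B-Preview-4bit")

    # Simple one-shot generation; QwQ tends to produce long chains of thought,
    # so give it plenty of tokens.
    print(generate(model, tokenizer, prompt="Why is the sky blue?", max_tokens=512))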


On macOS with LM Studio, is it better to use the mlx-community releases over the ones that LM Studio releases?

Also, I didn't install a beta, and mine says I'm on 0.3.5, which is what the beta also says. Is there a difference right now between the beta and the release version?


You're right, looks like 0.3.5 is now on the home page.



> 20GB RAM with Q4 quant. Closer to 25GB for the 4_K_M one

How does this math work? Are there rules of thumb that you guys know that the rest of us don't?


As a quick estimate, a Q4-quantized model's size in GB is usually around 60-70% of its parameter count in billions. You can check the quantized model size precisely from the .gguf files hosted on Hugging Face.
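
To make that concrete, here's a back-of-the-envelope sketch (the 4.85 bits/weight figure is an approximation for Q4_K_M, which keeps some tensors at higher precision; actual .gguf sizes and runtime RAM will vary):

    # Rough GGUF size estimate: parameters * average bits per weight / 8.
    def estimate_gguf_size_gb(params_billion, bits_per_weight=4.85):
        return params_billion * 1e9 * bits_per_weight / 8 / 1e9

    print(estimate_gguf_size_gb(32))       # ~19.4 GB for a 32B Q4_K_M file
    print(estimate_gguf_size_gb(32, 16))   # ~64 GB at fp16, for comparison

So a 32B model ends up around 19-20GB on disk, plus a few GB more in RAM once the context/KV cache is loaded, which is where the ~20-25GB figures above come from.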




