I haven’t run QwQ yet, but it’s a 32B, so about 20GB of RAM with a Q4 quant, closer to 25GB for the Q4_K_M one. You can wait a day or so for the quantized GGUFs to show up (we should see the Q4 in the next hour or so). I personally use Ollama on a MacBook Pro; it usually takes a day or two for models to show up there. Any M-series MacBook with 32GB+ of RAM will run this.
On macOS with LM Studio, is it better to use the mlx-community releases over the ones that LM Studio releases?
Also, I didn't install a beta and mine says I'm using 3.5, which is what the beta also says. Is there a difference right now between the beta and the release version?
As a quick estimate, the size of a Q4-quantized model is usually around 60-70% of the model's parameter count (in GB per billion parameters). You can check the exact quantized model size from the .gguf files hosted on Hugging Face.
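Here's a rough back-of-the-envelope sketch of that heuristic in Python. The bits-per-weight figure is an assumption (Q4_K_M averages roughly 4.5-5 bits per weight), so treat the output as a ballpark, not the exact file size:

```python
# Rough size estimate for a quantized GGUF.
# Assumption: Q4_K_M averages roughly 4.5-5 bits per weight; check the
# actual .gguf file on Hugging Face for the real figure.

def estimate_gguf_gb(params_billions: float, bits_per_weight: float = 4.8) -> float:
    """Approximate on-disk size in GB for a quantized model."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

if __name__ == "__main__":
    # A 32B model at roughly Q4_K_M precision:
    print(f"~{estimate_gguf_gb(32):.1f} GB")  # ~19 GB, i.e. ~60% of 32
```

Actual RAM use at runtime will be a bit higher than the file size once you add the KV cache and context.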