Qwen3 32B Instruct

32B

Alibaba Qwen3

Qwen3 dense 32B: the successor to Qwen2.5-32B, with stronger reasoning and a thinking mode.

10.1K HF downloads · 69 likes · Qwen/Qwen3-32B-GGUF · stats from 6/25/2026
Consumer GPU · Pro GPU

Max Context: 41K
Quant Variants: 4
Best Quality: GGUF Q4_K_M
Accuracy Retained: 97.5%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM     PPL Loss  Speed
GGUF    Q3_K_M  3.87  17.8 GB  7.2%      50 tok/s
GGUF    Q4_K_M  4.85  22.5 GB  2.5%      42 tok/s
EXL2    3.5bpw  3.5   16.8 GB  4.5%      62 tok/s
AWQ     INT4    4     19.5 GB  3.6%      55 tok/s
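
The BPW and VRAM figures above can be related with a little arithmetic. The following is a minimal sketch, not the page's own calculator: it assumes a nominal 32e9 parameters (read off the "32B" label), treats 1 GB as 10^9 bytes, and uses hypothetical names (`Quant`, `weight_gb`, `best_fit`) chosen for illustration. It estimates the size of the weights alone from bits per weight, then picks the listed variant with the lowest perplexity loss that fits a given VRAM budget. The listed VRAM is larger than the weight-only estimate; the page does not say what makes up the difference, though KV cache and runtime buffers are a plausible guess.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Quant:
    fmt: str             # quantization format, e.g. "GGUF"
    level: str           # quantization level, e.g. "Q4_K_M"
    bpw: float           # bits per weight, from the table
    vram_gb: float       # VRAM listed in the table
    ppl_loss_pct: float  # perplexity loss listed in the table
    tok_per_s: float     # speed on RTX 4090 listed in the table


# Rows copied from the table above.
QUANTS = [
    Quant("GGUF", "Q3_K_M", 3.87, 17.8, 7.2, 50),
    Quant("GGUF", "Q4_K_M", 4.85, 22.5, 2.5, 42),
    Quant("EXL2", "3.5bpw", 3.5, 16.8, 4.5, 62),
    Quant("AWQ", "INT4", 4.0, 19.5, 3.6, 55),
]

# Assumption: nominal parameter count taken from the "32B" label.
N_PARAMS = 32e9


def weight_gb(bpw: float, n_params: float = N_PARAMS) -> float:
    """Estimate weight-only size in GB (10^9 bytes): params * bits / 8."""
    return n_params * bpw / 8 / 1e9


def best_fit(budget_gb: float, quants: Sequence[Quant] = QUANTS) -> Optional[Quant]:
    """Return the variant with the lowest PPL loss whose listed VRAM fits the budget."""
    fitting = [q for q in quants if q.vram_gb <= budget_gb]
    if not fitting:
        return None
    return min(fitting, key=lambda q: q.ppl_loss_pct)


if __name__ == "__main__":
    for q in QUANTS:
        est = weight_gb(q.bpw)
        print(f"{q.fmt} {q.level}: weights ~{est:.1f} GB, listed VRAM {q.vram_gb} GB")
    choice = best_fit(24.0)  # 24 GB is the RTX 4090's memory size
    print("Best fit for 24 GB:", choice.level if choice else "none")
```

Under these assumptions, a 24 GB budget selects GGUF Q4_K_M, which matches the "Best Quality" entry above, while a 20 GB budget falls back to AWQ INT4.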