DeepSeek-R1-Distill-Qwen-32B

32B

DeepSeek

DeepSeek-R1 distilled into a 32B Qwen model. Near-frontier reasoning on a single 24 GB card at Q3/Q4 quantization.

18.2K HF downloads · 306 likes · bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF · stats from 6/24/2026

Consumer GPU · Pro GPU

Max Context: 131K

Quant Variants

Best Quality: GGUF Q4_K_M

Accuracy Retained: 97.4%

Quantization Variants

Per-quant VRAM, quality loss (perplexity increase), and inference speed on an RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q3_K_M	3.87	17.8 GB	7.2%	50 tok/s
GGUF	Q4_K_M	4.85	22.2 GB	2.6%	42 tok/s
EXL2	3.5bpw	3.5	16.8 GB	4.5%	65 tok/s
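
The BPW and VRAM columns can be related with simple arithmetic. The sketch below is not the page's own calculator. It assumes a parameter count of exactly 32e9 (taken from the nominal "32B" label; the true count may differ) and decimal gigabytes. It estimates the size of the quantized weights alone, then treats whatever is left of the listed VRAM as KV cache and runtime overhead. The names `PARAMS`, `VARIANTS`, and `weight_gb` are hypothetical.

```python
# Rough weight-memory estimate from bits per weight (BPW).
# Assumption: 32e9 parameters, from the nominal "32B" label (not an exact count).
PARAMS = 32e9

# (format, level, bpw, listed VRAM in GB), copied from the table above
VARIANTS = [
    ("GGUF", "Q3_K_M", 3.87, 17.8),
    ("GGUF", "Q4_K_M", 4.85, 22.2),
    ("EXL2", "3.5bpw", 3.5, 16.8),
]


def weight_gb(params: float, bpw: float) -> float:
    """Size of the quantized weights alone, in decimal gigabytes."""
    return params * bpw / 8 / 1e9


for fmt, level, bpw, listed in VARIANTS:
    weights = weight_gb(PARAMS, bpw)
    # The remainder is attributed to KV cache and runtime overhead (an assumption).
    remainder = listed - weights
    print(f"{fmt} {level}: weights ~{weights:.1f} GB, "
          f"listed {listed} GB, remainder ~{remainder:.1f} GB")
```

Under these assumptions the weights account for most of each listed figure, which fits the page's claim that Q3 and Q4 variants run on a single 24 GB card. The remainder will grow with longer contexts, so the listed VRAM should be read as a figure for some particular context length that the page does not state.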