Qwen3 30B-A3B Instruct

30B-A3B

Alibaba Qwen3

Qwen3 mixture-of-experts (MoE) model with only 3B active parameters per token. The Q4 file is about 19 GB; outperforms QwQ-32B on 16 GB cards.

11.6K HF downloads · 72 likes · Qwen/Qwen3-30B-A3B-GGUF · stats from 6/25/2026
Consumer GPU · Mac / Apple Silicon

Max Context: 41K

Quant Variants: 3

Best Quality: GGUF Q5_K_M

Accuracy Retained: 98.9%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

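| Format | Level | BPW | VRAM | PPL Loss | Speed |
|--------|--------|------|---------|----------|-----------|
| GGUF | Q4_K_M | 4.85 | 19.0 GB | 2.4% | 95 tok/s |
| GGUF | Q5_K_M | 5.68 | 22.0 GB | 1.1% | 82 tok/s |
| AWQ | INT4 | 4 | 17.5 GB | 3.4% | 118 tok/s |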
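The BPW and VRAM columns can be related with simple arithmetic: weights-only size is roughly parameters × bits per weight ÷ 8. The sketch below applies that to the rows of the table. It assumes a round 30e9 total parameters (read off the "30B" in the model name, not stated on this page) and counts all parameters, since an MoE model keeps every expert resident even though only about 3B are active per token. The names `ROWS`, `weight_gb`, and `summarize` are hypothetical, and treating "Accuracy Retained" as 100 minus PPL loss is a guess at how the headline figure was derived.

```python
# Rows copied from the table: (format, level, bits per weight, reported VRAM in GB, PPL loss in %).
ROWS = [
    ("GGUF", "Q4_K_M", 4.85, 19.0, 2.4),
    ("GGUF", "Q5_K_M", 5.68, 22.0, 1.1),
    ("AWQ", "INT4", 4.0, 17.5, 3.4),
]

# Assumption: total parameter count rounded to 30e9 from the "30B" in the model name.
TOTAL_PARAMS = 30e9


def weight_gb(params: float, bpw: float) -> float:
    """Weights-only size in decimal GB: params * bits per weight / 8 bits per byte."""
    return params * bpw / 8 / 1e9


def summarize(rows, params: float = TOTAL_PARAMS):
    """For each row, estimate weight size, the remaining reported VRAM, and retained accuracy."""
    summary = []
    for fmt, level, bpw, vram_gb, ppl_loss in rows:
        est = weight_gb(params, bpw)
        summary.append({
            "variant": f"{fmt} {level}",
            "weights_gb": round(est, 2),
            # Reported VRAM minus estimated weights; presumably KV cache and runtime buffers.
            "other_gb": round(vram_gb - est, 2),
            # Assumption: retained accuracy is simply 100 minus the PPL loss percentage.
            "retained_pct": round(100 - ppl_loss, 1),
        })
    return summary


if __name__ == "__main__":
    for row in summarize(ROWS):
        print(row)
```

Under these assumptions the weights account for most of each reported VRAM figure, and the Q5_K_M row gives 100 − 1.1 = 98.9, which matches the "Accuracy Retained" figure shown for the best-quality variant.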