Qwen3 235B-A22B Instruct
235B-A22B
Alibaba Qwen3
Flagship Qwen3 MoE model (22B active / 235B total parameters). The Q4_K_M file is ~142 GB; rivals DeepSeek-R1-class models.
Pro GPU
Max Context: 41K
Quant Variants: 3
Best Quality: GGUF Q4_K_M
Accuracy Retained: 97.8%
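As a rough sanity check on the ~142 GB Q4_K_M figure quoted for this model, file size can be estimated from the total parameter count and an average bits-per-weight value. The sketch below is illustrative only: the helper name `estimate_gguf_size_gb` is hypothetical, and the 4.85 bits per weight used for Q4_K_M is an assumed average, not a number taken from this page. Real files vary because different tensors use different quantization types.

```python
def estimate_gguf_size_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Rough quantized file size in GB (10^9 bytes).

    Uses the TOTAL parameter count: an MoE model stores every expert's
    weights, even though only a subset is active for each token.
    """
    total_bits = total_params_billion * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: ~4.85 bits per weight on average for Q4_K_M.
Q4_K_M_BITS_PER_WEIGHT = 4.85

size_gb = estimate_gguf_size_gb(235.0, Q4_K_M_BITS_PER_WEIGHT)
print(f"Qwen3 235B-A22B at Q4_K_M: ~{size_gb:.0f} GB")  # prints ~142 GB
```

Under that assumption the estimate lands close to the listed ~142 GB. It also shows why the 22B active-parameter count lowers compute per token but not the storage needed to hold the model.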
Quantization Variants
Per-quant VRAM, quality loss, and inference speed on RTX 4090
Similar models
Compare with Qwen3 32B
32B
Qwen3 32B Instruct
Alibaba Qwen3
Consumer GPU · Pro GPU
16.8 GB min VRAM · 97.5% accuracy
Qwen3 dense 32B — successor to Qwen2.5-32B with stronger reasoning and thinking mode.
8B
Qwen3 8B Instruct
Alibaba Qwen3
Consumer GPU · Mac / Apple Silicon
5.1 GB min VRAM · 99.4% accuracy
Latest Qwen3 dense 8B with thinking mode. Strong upgrade from Qwen2.5 7B for local deployment.
14B
Qwen3 14B Instruct
Alibaba Qwen3
Consumer GPU · Mac / Apple Silicon
9.5 GB min VRAM · 98.8% accuracy
Qwen3 14B — best balance of reasoning and VRAM in the 2026 Qwen lineup.
30B-A3B
Qwen3 30B-A3B Instruct
Alibaba Qwen3
Consumer GPU · Mac / Apple Silicon
17.5 GB min VRAM · 98.9% accuracy
Qwen3 MoE with only 3B active params. Q4 ~19GB file; outperforms QwQ-32B on 16GB cards.