Qwen3-Coder 30B-A3B Instruct

30B-A3B

Alibaba Qwen3

Agentic coding MoE with 3.3B active params and a 256K (262,144-token) native context. Top open coder for 16–24GB cards.

2.1M HF downloads · 1,123 likes · Qwen/Qwen3-Coder-30B-A3B-Instruct · stats from 6/25/2026

Consumer GPU · Mac / Apple Silicon

Max Context: 262K
Quant Variants: 3
Best Quality: GGUF Q5_K_M
Accuracy Retained: 99.0%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	19.2 GB	2.3%	92 tok/s
GGUF	Q5_K_M	5.68	22.0 GB	1.0%	80 tok/s
AWQ	INT4	4.00	17.8 GB	3.2%	115 tok/s
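
The BPW and VRAM columns can be cross-checked with simple arithmetic: weight size is roughly parameters × bits per weight ÷ 8. The sketch below is illustrative only. It assumes a total parameter count of about 30.5 billion (the page itself only says "30B"), and the helper names `weight_size_gb` and `best_fit` are hypothetical, not part of any tool. Under that assumption Q4_K_M comes out near 18.5 GB of weights, so the gap to the listed 19.2 GB is presumably runtime overhead such as KV cache and buffers. The second helper picks the lowest-PPL-loss variant that fits a given VRAM budget, using the table's figures as-is.

```python
from dataclasses import dataclass

# Assumption: roughly 30.5 billion total parameters (the page only says "30B").
TOTAL_PARAMS = 30.5e9


@dataclass(frozen=True)
class Quant:
    fmt: str
    level: str
    bpw: float           # bits per weight, from the table
    vram_gb: float       # listed VRAM on an RTX 4090, from the table
    ppl_loss_pct: float  # listed perplexity loss, from the table
    tok_per_s: float     # listed speed, from the table


# Rows copied from the quantization table above.
VARIANTS = [
    Quant("GGUF", "Q4_K_M", 4.85, 19.2, 2.3, 92),
    Quant("GGUF", "Q5_K_M", 5.68, 22.0, 1.0, 80),
    Quant("AWQ", "INT4", 4.00, 17.8, 3.2, 115),
]


def weight_size_gb(bpw, params=TOTAL_PARAMS):
    """Approximate size of the weights alone, in decimal GB."""
    return params * bpw / 8 / 1e9


def best_fit(vram_budget_gb, variants=VARIANTS):
    """Return the lowest-PPL-loss variant whose listed VRAM fits, or None."""
    fitting = [q for q in variants if q.vram_gb <= vram_budget_gb]
    return min(fitting, key=lambda q: q.ppl_loss_pct) if fitting else None


if __name__ == "__main__":
    for q in VARIANTS:
        est = weight_size_gb(q.bpw)
        print(f"{q.fmt} {q.level}: ~{est:.1f} GB weights, {q.vram_gb} GB listed")
    for budget in (16, 20, 24):
        pick = best_fit(budget)
        print(budget, "GB ->", pick.level if pick else "no listed variant fits")
```

Selecting on the listed VRAM rather than the estimated weight size is deliberate: the listed figure already includes whatever overhead the benchmark measured, so it is the safer number to compare against a card's capacity.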

Qwen3 MoE with only 3B active params. Q4 ~19GB file; outperforms QwQ-32B on 16GB cards.

Qwen3 dense 32B — successor to Qwen2.5-32B with stronger reasoning and thinking mode.

Latest Qwen3 dense 8B with thinking mode. Strong upgrade from Qwen2.5 7B for local deployment.

Qwen3 14B — best balance of reasoning and VRAM in the 2026 Qwen lineup.