Qwen3 14B Instruct

14B

Alibaba Qwen3

Qwen3 14B — best balance of reasoning and VRAM in the 2026 Qwen lineup.

30.1K HF downloads · 104 likes · Qwen/Qwen3-14B-GGUF · stats from 6/25/2026
Consumer GPU · Mac / Apple Silicon

Max Context: 41K
Quant Variants: 4
Best Quality: GGUF Q5_K_M
Accuracy Retained: 98.8%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level    BPW   VRAM     PPL Loss  Speed
GGUF    Q4_K_M   4.85  10.5 GB  2.6%      88 tok/s
GGUF    Q5_K_M   5.68  12.2 GB  1.2%      78 tok/s
AWQ     INT4     4     9.5 GB   3.5%      115 tok/s
EXL2    4.65bpw  4.65  10.0 GB  1.8%      125 tok/s
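
The BPW and VRAM columns can be related with a little arithmetic. The sketch below estimates the weights-only footprint of each variant as parameters × bits per weight / 8, then subtracts it from the listed VRAM. It assumes a nominal 14e9 parameters (taken from the "14B" label, not an exact count) and decimal gigabytes; the page does not say how its VRAM figures were measured, so the remainder is only a rough indication of non-weight memory such as KV cache and runtime buffers. The names `NOMINAL_PARAMS`, `weight_gb`, and `breakdown` are made up for this illustration.

```python
# Assumption: nominal parameter count read off the "14B" label, not an exact figure.
NOMINAL_PARAMS = 14e9

# (format, level, bits per weight, listed VRAM in GB), copied from the table above.
VARIANTS = [
    ("GGUF", "Q4_K_M", 4.85, 10.5),
    ("GGUF", "Q5_K_M", 5.68, 12.2),
    ("AWQ", "INT4", 4.0, 9.5),
    ("EXL2", "4.65bpw", 4.65, 10.0),
]


def weight_gb(params: float, bpw: float) -> float:
    """Weights-only size in decimal GB: params * bits per weight / 8 bits per byte / 1e9."""
    return params * bpw / 8 / 1e9


def breakdown(params: float = NOMINAL_PARAMS):
    """Return (format, level, estimated weight GB, listed VRAM minus weights) per variant."""
    result = []
    for fmt, level, bpw, vram in VARIANTS:
        weights = weight_gb(params, bpw)
        result.append((fmt, level, weights, vram - weights))
    return result


if __name__ == "__main__":
    for fmt, level, weights, remainder in breakdown():
        print(f"{fmt:5s} {level:8s} weights ~{weights:5.2f} GB, remainder ~{remainder:4.2f} GB")
```

Under these assumptions the Q4_K_M weights come to roughly 8.5 GB against the listed 10.5 GB, so about 2 GB of the figure would be something other than weights. A different parameter count or context length shifts that remainder, so treat it as a sanity check rather than a measurement.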

Similar models

Compare with Qwen3 8B