Qwen2.5 3B Instruct

3B

Alibaba Qwen2.5

Compact Qwen2.5 model for edge devices. Runs in 4 GB of VRAM or on Raspberry Pi-class hardware.

238.1K HF downloads · 140 likes · Qwen/Qwen2.5-3B-Instruct-GGUF · stats from 6/24/2026
Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 33K
Quant Variants: 2
Best Quality: GGUF Q8_0
Accuracy Retained: 99.7%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on an RTX 4090

Format  Level   BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M  4.85  2.1 GB  3.2%      340 tok/s
GGUF    Q8_0    8.5   3.5 GB  0.3%      290 tok/s
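
The BPW (bits per weight) column gives a quick sanity check on the VRAM figures, since weight storage is roughly parameters × BPW / 8 bytes. The sketch below is an illustration only: it assumes exactly 3e9 parameters (taken from the nominal "3B" label, not a measured count), and the function name `weight_size_gb` is hypothetical. Whatever remains between the estimate and the listed VRAM is presumably runtime overhead such as the KV cache and compute buffers, which is itself an assumption and not something this page states.

```python
# Rough weight-only size estimate from bits per weight (BPW).
# Assumption: 3e9 parameters, read off the nominal "3B" label.
NOMINAL_PARAMS = 3e9


def weight_size_gb(bpw: float, n_params: float = NOMINAL_PARAMS) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bpw / 8 / 1e9


# BPW and VRAM values copied from the quantization table.
variants = {"Q4_K_M": (4.85, 2.1), "Q8_0": (8.5, 3.5)}

for level, (bpw, listed_vram_gb) in variants.items():
    weights = weight_size_gb(bpw)
    remainder = listed_vram_gb - weights
    print(f"{level}: ~{weights:.2f} GB weights, ~{remainder:.2f} GB remainder")
```

Under these assumptions the weights alone come to about 1.8 GB for Q4_K_M and about 3.2 GB for Q8_0, each a few hundred MB below the listed VRAM figure.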