Qwen2.5 3B Instruct

Alibaba Qwen2.5

A small Qwen2.5 model for edge devices. Runs in 4 GB of VRAM or on Raspberry Pi-class hardware.

⬇ 238.1K HF downloads · ♥ 140 likes · Qwen/Qwen2.5-3B-Instruct-GGUF · stats from 6/24/2026

Consumer GPU · Mac / Apple Silicon · CPU / VPS

Max Context: 33K

Quant Variants: Q4_K_M, Q8_0

Best Quality: GGUF Q8_0

Accuracy Retained: 99.7%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	2.1 GB	3.2%	340 tok/s
GGUF	Q8_0	8.5	3.5 GB	0.3%	290 tok/s
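
As a rough sanity check on the table, the sketch below estimates the size of the quantized weights from the bits-per-weight (BPW) column and compares it with the listed VRAM. It is only an illustration under stated assumptions: the parameter count of about 3.09 billion is not given on this page, and the helper names (`weight_gb`, `fits`) are hypothetical. It also treats "accuracy retained" as 100% minus the PPL loss, which matches the 99.7% shown for Q8_0 but is an inference rather than a documented formula.

```python
# Rows copied from the table above:
# (level, bits per weight, listed VRAM in GB, PPL loss in %)
QUANTS = [
    ("Q4_K_M", 4.85, 2.1, 3.2),
    ("Q8_0", 8.5, 3.5, 0.3),
]

# Assumption: roughly 3.09e9 parameters. This figure is not stated on the page.
PARAMS = 3.09e9


def weight_gb(bpw: float, params: float = PARAMS) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return bpw * params / 8 / 1e9


def fits(listed_vram_gb: float, budget_gb: float) -> bool:
    """True if the listed VRAM figure is within the given budget."""
    return listed_vram_gb <= budget_gb


for level, bpw, vram, ppl_loss in QUANTS:
    est = weight_gb(bpw)
    print(
        f"{level}: weights ~{est:.2f} GB, listed VRAM {vram} GB, "
        f"remainder ~{vram - est:.2f} GB, "
        f"retained ~{100 - ppl_loss:.1f}%, fits in 4 GB: {fits(vram, 4.0)}"
    )
```

Under these assumptions the estimated weights come out a little below the listed VRAM for both variants. The remainder is plausibly context cache and runtime buffers, though the page does not say how its VRAM figures were measured.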