Qwen2-VL 7B Instruct

7B

Alibaba Qwen2

Vision-language model with video understanding. Strong OCR and chart reading.

Consumer GPU, Mac / Apple Silicon

Max Context: 33K

Quant Variants: 2

Best Quality: GGUF Q4_K_M

Accuracy Retained: 96.2%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format	Level	BPW	VRAM	PPL Loss	Speed
GGUF	Q4_K_M	4.85	6.8 GB	3.8%	72 tok/s
AWQ	INT4	4	6.0 GB	5.0%	95 tok/s
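
As a rough sanity check on the figures above, the sketch below estimates weight-only memory from bits per weight (BPW) and derives a retained-accuracy figure from PPL loss. It is an illustration under stated assumptions, not the method used to produce the table: the parameter count is a nominal 7e9 taken from the "7B" label, 1 GB is treated as 1e9 bytes, and "Accuracy Retained" is assumed to be 100% minus PPL loss, which matches the Q4_K_M row (96.2% and 3.8%). The function names are hypothetical.

```python
# Back-of-envelope estimates for the quantization table.
# Assumptions (not stated on the page): nominal 7e9 parameters,
# 1 GB = 1e9 bytes, accuracy retained = 100 - PPL loss.

NOMINAL_PARAMS = 7e9  # taken from the "7B" label; the exact count may differ


def weight_gb(params: float, bpw: float) -> float:
    """Approximate storage for the weights alone, in GB."""
    return params * bpw / 8 / 1e9


def accuracy_retained(ppl_loss_pct: float) -> float:
    """Retained accuracy in percent, assuming it is 100 minus PPL loss."""
    return 100.0 - ppl_loss_pct


# (name, BPW, listed VRAM in GB, PPL loss in %), copied from the table
variants = [
    ("GGUF Q4_K_M", 4.85, 6.8, 3.8),
    ("AWQ INT4", 4.0, 6.0, 5.0),
]

for name, bpw, listed_vram, ppl_loss in variants:
    weights = weight_gb(NOMINAL_PARAMS, bpw)
    # Whatever is left over is not weights; the page does not break it down.
    remainder = listed_vram - weights
    print(
        f"{name}: weights ~{weights:.2f} GB, "
        f"remainder ~{remainder:.2f} GB, "
        f"retained ~{accuracy_retained(ppl_loss):.1f}%"
    )
```

Under these assumptions the weights alone come to about 4.24 GB for Q4_K_M and 3.50 GB for AWQ INT4. The listed VRAM figures are roughly 2.5 GB higher in both cases. The page does not say what that difference includes, though runtime buffers such as the KV cache and activations are a plausible guess.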