Qwen2-VL 7B Instruct

7B

Alibaba Qwen2

Vision-language model with video understanding. Strong OCR and chart reading.

Consumer GPU, Mac / Apple Silicon

Max Context: 33K

Quant Variants: 2

Best Quality: GGUF Q4_K_M

Accuracy Retained: 96.2%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM    PPL Loss  Speed
GGUF    Q4_K_M  4.85  6.8 GB  3.8%      72 tok/s
AWQ     INT4    4     6.0 GB  5.0%      95 tok/s
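
As a rough sanity check on the rows above, the bits-per-weight (BPW) figure can be turned into an approximate weight size: parameters × BPW / 8 bytes. The sketch below assumes a nominal 7e9 parameters (taken from the "7B" label, not an exact count) and treats whatever is left of the reported VRAM as everything else the runtime holds in memory; the page does not break that remainder down, so the split is only an estimate.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: nominal parameter count implied by the "7B" label.
N_PARAMS = 7e9

# (BPW, reported VRAM in GB) copied from the quantization variants listed above.
variants = {
    "GGUF Q4_K_M": (4.85, 6.8),
    "AWQ INT4": (4.0, 6.0),
}

for name, (bpw, vram_gb) in variants.items():
    weights_gb = weight_size_gb(N_PARAMS, bpw)
    # Remainder = reported VRAM minus estimated weights (not itemized by the page).
    remainder_gb = vram_gb - weights_gb
    print(f"{name}: weights ~{weights_gb:.2f} GB, remainder ~{remainder_gb:.2f} GB")
```

Under these assumptions the weights account for roughly 4.2 GB (Q4_K_M) and 3.5 GB (INT4) of the reported VRAM, which is why the lower-BPW variant fits in less memory.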