Mistral Large 3 675B Instruct

675B MoE

Mistral AI

Mistral 3 flagship MoE (41B active / 675B total parameters) with a vision encoder. Runs in FP8 on 8×H200; the GGUF quants are practical only on research clusters.

2.9K HF downloads · 233 likes · mistralai/Mistral-Large-3-675B-Instruct-2512 · stats from 6/25/2026
Pro GPU

Max Context: 262K
Quant Variants: 2
Best Quality: GGUF Q4_K_M
Accuracy Retained: 97.9%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

Format  Level   BPW   VRAM      PPL Loss  Speed
GGUF    Q4_K_M  4.85  388.0 GB  2.1%      4 tok/s
GGUF    Q3_K_M  3.87  312.0 GB  4.5%      5 tok/s
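
As a rough cross-check of the VRAM column, the sketch below estimates weight-only memory from the total parameter count (675B) and the bits per weight (BPW) listed above, then counts how many 24 GB RTX 4090 cards the listed VRAM would span. The function names `weight_gib` and `min_gpus` are hypothetical, and the formula (parameters × BPW / 8) is an assumption about how such figures are typically derived, not this page's stated method. It ignores KV cache, activations, and runtime overhead, and the listed figures may use a different unit convention, so the results need not match the table exactly.

```python
import math

TOTAL_PARAMS = 675e9      # total parameters, from the model description
RTX_4090_VRAM_GB = 24.0   # memory of a single RTX 4090


def weight_gib(params: float, bpw: float) -> float:
    """Weight-only footprint in GiB: params * bits-per-weight / 8 bytes."""
    return params * bpw / 8 / 2**30


def min_gpus(vram_gb: float, per_gpu_gb: float) -> int:
    """Smallest number of equal-sized GPUs whose combined memory covers vram_gb."""
    return math.ceil(vram_gb / per_gpu_gb)


if __name__ == "__main__":
    # (level, BPW, listed VRAM in GB) taken from the table above
    variants = [("Q4_K_M", 4.85, 388.0), ("Q3_K_M", 3.87, 312.0)]
    for level, bpw, listed_gb in variants:
        est = weight_gib(TOTAL_PARAMS, bpw)
        gpus = min_gpus(listed_gb, RTX_4090_VRAM_GB)
        print(f"{level}: ~{est:.1f} GiB weights (listed {listed_gb} GB), "
              f"at least {gpus} x RTX 4090 by memory alone")
```

The GPU count is a lower bound by memory only; it says nothing about interconnect bandwidth or whether a given runtime can split the model that way.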