DeepSeek-V3

671B MoE

DeepSeek

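DeepSeek-V3 is a frontier mixture-of-experts (MoE) model with ~37B active parameters out of 671B total. It uses Multi-head Latent Attention (MLA) and FP8; even at Q4 quantization it requires a multi-node GPU cluster.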

1.0M HF downloads · 4092 likes · deepseek-ai/DeepSeek-V3 · stats from 6/25/2026
Pro GPU

Max Context: 164K

Quant Variants: 2

Best Quality: GGUF Q4_K_M

Accuracy Retained: 98.0%

Quantization Variants

Per-quant VRAM, quality loss, and inference speed on RTX 4090

| Format | Level  | BPW  | VRAM     | PPL Loss | Speed   |
|--------|--------|------|----------|----------|---------|
| GGUF   | Q4_K_M | 4.85 | 385.0 GB | 2.0%     | 4 tok/s |
| GGUF   | Q3_K_M | 3.87 | 310.0 GB | 4.5%     | 5 tok/s |
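To make the relationship between bits per weight (BPW), memory footprint, and GPU count concrete, here is a rough sketch in Python. It assumes a weights-only estimate of parameters × BPW / 8 bytes and 24 GB of VRAM per RTX 4090; the helper names `weight_size_gb` and `min_gpus` are hypothetical and not part of this page. The estimate ignores KV cache, activations, and runtime overhead, so it will not necessarily reproduce the listed VRAM figures, which may use different units or accounting.

```python
import math

TOTAL_PARAMS = 671e9       # total parameter count, as listed on this page
RTX_4090_VRAM_GB = 24.0    # assumption: 24 GB of VRAM per RTX 4090


def weight_size_gb(params: float, bpw: float) -> float:
    """Weights-only size in decimal GB: params * bits per weight / 8 bits per byte."""
    return params * bpw / 8 / 1e9


def min_gpus(vram_gb: float, per_gpu_gb: float = RTX_4090_VRAM_GB) -> int:
    """Lower bound on GPU count needed just to hold vram_gb across devices."""
    return math.ceil(vram_gb / per_gpu_gb)


# (BPW, listed VRAM in GB) for each quantization level in the table above
variants = {"Q4_K_M": (4.85, 385.0), "Q3_K_M": (3.87, 310.0)}

for level, (bpw, listed_gb) in variants.items():
    estimate = weight_size_gb(TOTAL_PARAMS, bpw)
    print(
        f"{level}: weights-only estimate {estimate:.0f} GB, "
        f"listed {listed_gb:.0f} GB, at least {min_gpus(listed_gb)} x 24 GB GPUs"
    )
```

Under these assumptions, either variant needs well over a dozen 24 GB cards just to hold the model, which is consistent with the multi-node requirement stated above.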