About quantized.uk

A practical reference for running quantized LLMs on your own hardware — built by someone who actually does it.

Page last updated: 24 June 2026 · Data last refreshed: 24 June 2026

Who maintains this?

quantized.uk is an indie side project by a developer who got tired of guessing VRAM requirements and digging through scattered Hugging Face repos. It is maintained in spare time — data gets corrected and expanded as new models ship and as real-world testing surfaces better numbers.

Why this exists

Running LLMs locally should not require a PhD in quantization. Yet most knowledge lives in Reddit threads, Discord messages, and model cards written for researchers, not for someone staring at a 12 GB VRAM bar wondering if Q4_K_M will fit.

quantized.uk exists to answer the questions people actually ask before downloading a 20 GB file: Will it fit? Which format? How much quality do I lose? What is the exact command to run it?
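
To make the "Will it fit?" question concrete, here is a back-of-envelope sketch in Python. It is not the site's VRAM calculator: the function names are hypothetical, the figure of roughly 4.8 bits per weight for Q4_K_M is an approximation, and the flat 1.5 GB allowance for KV cache and runtime buffers is a placeholder that varies with context length and backend.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits_in_vram(
    params_billion: float,
    bits_per_weight: float,
    vram_gb: float,
    overhead_gb: float = 1.5,  # assumed allowance for KV cache and buffers
) -> bool:
    """Rough check: weights plus a flat overhead must not exceed the VRAM budget."""
    needed_gb = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed_gb <= vram_gb


# Assumed value: Q4_K_M averages somewhere near 4.8 bits per weight.
Q4_K_M_BPW = 4.8

for size in (8, 14, 32):
    verdict = "fits" if fits_in_vram(size, Q4_K_M_BPW, vram_gb=12) else "does not fit"
    print(f"{size}B at Q4_K_M on a 12 GB card: {verdict}")
```

Treat the result as a first filter only. Long contexts, larger batch sizes, and other processes using the GPU can all push real usage well past this estimate.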

What we publish

A searchable index of 50+ models with per-quant VRAM, speed, and quality estimates; interactive tools (VRAM calculator, format wizard, CLI generator); real-hardware benchmarks; and step-by-step deployment cookbooks.

We index metadata and link to Hugging Face — we do not host, distribute, or sell model weights. Each model remains subject to its own license.

How we source data

Model stats are manually curated from Hugging Face model cards, community quant releases (bartowski, turboderp, unsloth, and others), and local benchmark runs where possible. Hugging Face download counts are fetched from the public API at each site build.
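
For readers curious what a build-time download-count lookup might look like, here is a minimal Python sketch. It is not the site's actual build script: the function names are hypothetical, and it assumes the public model endpoint at `https://huggingface.co/api/models/<repo_id>` returns JSON with a `downloads` field.

```python
import json
from typing import Optional
from urllib.request import urlopen

# Assumed base URL of the public Hugging Face model API.
API_ROOT = "https://huggingface.co/api/models"


def model_api_url(repo_id: str) -> str:
    """Build the API URL for a repo id such as 'org/name'."""
    return f"{API_ROOT}/{repo_id}"


def parse_downloads(payload: str) -> Optional[int]:
    """Pull the download count out of a JSON response, or None if it is absent."""
    data = json.loads(payload)
    value = data.get("downloads")  # assumed field name
    return value if isinstance(value, int) else None


def fetch_downloads(repo_id: str, timeout: float = 10.0) -> Optional[int]:
    """Fetch the download count for one model (performs a network request)."""
    with urlopen(model_api_url(repo_id), timeout=timeout) as response:
        return parse_downloads(response.read().decode("utf-8"))
```

Keeping the parsing separate from the network call means the parsing can be checked against a saved response, and a build can fall back to the previous value when the field is missing.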

Numbers are planning estimates, not guarantees. We document our methodology on the Benchmarks page and mark editorial estimates (like format heat) clearly. Always verify against your own hardware before production use.

How often things update

There is no fixed schedule — updates happen when new models ship or gaps are spotted. Model index and cookbook articles typically grow in batches every few weeks. The homepage changelog lists every data change with dates.

Hugging Face download stats refresh automatically on each site rebuild. Major updates are noted in the changelog, which is the best place to see what changed recently.

How data improves

Model entries, benchmarks, and cookbook guides are curated manually from community releases and hands-on testing. We prioritise models people actually download and hardware configs readers actually use.

If a number looks wrong, cross-check with our VRAM calculator and the official Hugging Face model card — then watch the homepage changelog for corrections in the next data batch.

What we are not

quantized.uk is not affiliated with Meta, NVIDIA, Alibaba, Mistral, Google, Hugging Face, or any model vendor. We are not a company, a hosting provider, or a model distributor.

For legal terms, liability limits, and trademark notices, see the Terms & Disclaimer page.

Found a wrong VRAM number? Know a model we should add?

Homepage changelog · Browse the model index · Terms & Disclaimer