Beginner · Mac / Apple · 7 min read

M1 / M2 Mac 8GB: Realistic Ollama Limits

Unified memory is shared with macOS — here is what actually works on base MacBooks without swapping.

M1 · M2 · 8GB RAM · Ollama · Metal

Memory budget

macOS and open apps typically use 3–4GB, which leaves roughly 4GB for the model (weights plus context cache) on an 8GB Mac. Stick to 3B models at Q4, or 7B models at Q2/Q3 with a short context window. Close browsers before loading a 7B model.

text

M1 8GB safe picks:
  llama3.2:3b       → smooth chat
  qwen2.5:3b        → good Chinese
  phi3.5:3.8b       → fast responses

Avoid on 8GB:
  llama3.1:8b @ Q4  → swap thrashing
  any 14B+ model
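
The picks above follow from simple arithmetic, which can be sketched as a small shell helper. This is a rough estimate only: it assumes the weights take about (parameters × bits per weight ÷ 8) bytes, uses about 4.5 bits per weight for Q4 as an assumed figure, and ignores the context cache and runtime overhead. The function names `estimate_gb` and `fits_budget` are made up for this example.

```shell
#!/bin/sh
# Rough size of the model weights in GB: billions of parameters x bits per weight / 8.
# Assumption: KV cache and runtime overhead are not included and come on top.
estimate_gb() {
  # $1 = parameters in billions, $2 = approximate bits per weight
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

# Succeeds (exit 0) when the estimated weights fit in the given budget.
fits_budget() {
  # $1 = parameters in billions, $2 = bits per weight, $3 = budget in GB
  awk -v p="$1" -v b="$2" -v g="$3" 'BEGIN { exit !(p * b / 8 <= g) }'
}

estimate_gb 3 4.5   # 3B model at ~4.5 bits per weight, prints 1.7
estimate_gb 8 4.5   # 8B model at ~4.5 bits per weight, prints 4.5

if fits_budget 8 4.5 4; then
  echo "8B at Q4 fits in 4 GB"
else
  echo "8B at Q4 exceeds 4 GB"
fi
```

Under these assumptions an 8B model at Q4 already exceeds the ~4GB budget before any context is allocated, while a 3B model leaves room to spare.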

Ollama settings

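On 8GB machines, set OLLAMA_NUM_PARALLEL=1 and OLLAMA_MAX_LOADED_MODELS=1 so Ollama handles one request and keeps one model in memory at a time, and keep the context window at 2048 tokens. Watch the Memory Pressure graph in Activity Monitor while the model is loaded.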

bash

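# Both variables are read by the Ollama server, not by `ollama run`.
# If the menu-bar app is running, set them with `launchctl setenv` and restart the app.
export OLLAMA_NUM_PARALLEL=1        # one request at a time
export OLLAMA_MAX_LOADED_MODELS=1   # one model in memory at a time
ollama pull llama3.2:3b             # download the 3B model
ollama run llama3.2:3b              # start an interactive chat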
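
The environment variables above do not set the context length. One way to pin it to 2048 tokens is a Modelfile with `PARAMETER num_ctx`, sketched below. The variant name `llama3.2-2k`, the helper `write_modelfile`, and the temporary file path are assumptions chosen for this example.

```shell
#!/bin/sh
# Write a minimal Modelfile that pins the context window for a base model.
write_modelfile() {
  # $1 = base model tag, $2 = context length in tokens, $3 = output path
  printf 'FROM %s\nPARAMETER num_ctx %s\n' "$1" "$2" > "$3"
}

# Hypothetical location for the generated file.
MODELFILE="${TMPDIR:-/tmp}/Modelfile.2k"
write_modelfile llama3.2:3b 2048 "$MODELFILE"

# Build the variant only when the ollama CLI is installed.
# "llama3.2-2k" is a made-up name for the 2048-token variant.
if command -v ollama >/dev/null 2>&1; then
  ollama create llama3.2-2k -f "$MODELFILE"
fi
```

After the variant is created, `ollama run llama3.2-2k` starts a chat with the smaller context. For a one-off session, typing `/set parameter num_ctx 2048` inside `ollama run` has the same effect without creating a new model.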

Deployment guides are educational. Each model is subject to its own license — read the official Hugging Face model card before downloading or deploying.