Direct answer
How Much VRAM Do You Need for a 7B Model?
Approximate VRAM planning for 7B LLMs across FP16, INT8, and INT4-style quantization.
Question
How much VRAM do you need for a 7B model?
A 7B model needs about 14 GB for FP16 weights, 7 GB for INT8 weights, or 3.5 to 4 GB for INT4 weights, plus runtime overhead.
7B FP16 weights: about 14 GB
7B INT8 weights: about 7 GB
7B INT4 weights: about 3.5 GB before overhead
Explanation
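A rough weight-only estimate is the parameter count multiplied by bytes per parameter: 7 billion parameters at 2 bytes each (FP16) is about 14 GB, at 1 byte each (INT8) about 7 GB, and at 0.5 bytes each (INT4) about 3.5 GB.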
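The weight-only rule can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the function name `weight_memory_gb` and the `BYTES_PER_PARAM` table are hypothetical names chosen here, and the result is in decimal gigabytes (1 GB = 1e9 bytes) with no runtime overhead included.

```python
# Bytes per parameter for each precision (weights only, no quantization metadata).
# These names are illustrative, not taken from any library.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the weight-only memory estimate in decimal GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"7B {precision}: {weight_memory_gb(7e9, precision):.1f} GB")
```

For 7e9 parameters this reproduces the 14 GB, 7 GB, and 3.5 GB figures above.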
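On top of the weights, the KV cache grows with batch size and context length, and the framework adds its own overhead. Quantization formats also store scales and similar metadata alongside the weights, which is why INT4 weights land at 3.5 to 4 GB rather than exactly 3.5 GB.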
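To see how context length and batch size drive the KV cache, here is a rough sketch of the usual estimate: two tensors (keys and values) per layer, each sized by KV heads, head dimension, context length, and batch size. The function name `kv_cache_bytes` is hypothetical, and the example shape (32 layers, 32 KV heads, head dimension 128, FP16 cache) is an assumption typical of a Llama-2-7B-style model, not something stated in this page. Real runtimes may allocate differently.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    context_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes assumes an FP16 cache
) -> int:
    """Estimate KV cache size in bytes for a decoder-only transformer."""
    # The leading 2 accounts for one key tensor and one value tensor per layer.
    return (
        2 * num_layers * num_kv_heads * head_dim
        * context_len * batch_size * bytes_per_value
    )


if __name__ == "__main__":
    # Assumed 7B-class shape; substitute the values from your model's config.
    size = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, context_len=4096)
    print(f"KV cache at 4096 tokens, batch 1: {size / 1e9:.2f} GB")
```

Under these assumptions a single 4096-token sequence adds roughly 2.1 GB on top of the weights, and the figure scales linearly with both context length and batch size. Models that use fewer KV heads than attention heads need proportionally less.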
For comfortable local inference, leave additional headroom beyond the weight-only number.