Best LLMs Under 8GB VRAM
If you have an 8GB GPU (RTX 3060 8GB, RTX 4060, RX 7600, or one of many laptop GPUs), these are the language models that run cleanly with comfortable headroom: every entry needs at most 5.3GB of VRAM at its listed quantization, leaving at least 2.7GB free. All entries below have been graded against an 8GB reference card and are sorted by quality.
- 1
DeepSeek R1 Distill 8B
8B params | S grade
Compact reasoning model. Good reasoning capabilities in a small package.
Min VRAM: 5.08GB | Quant: Q4_K_M | Size: 4.583GB | License: mit
- 2
Llama 3.1 8B Instruct
8B params | S grade
Meta's 8B parameter instruction-tuned model. Great balance of performance and efficiency for local deployment.
Min VRAM: 5.08GB | Quant: Q4_K_M | Size: 4.583GB | License: llama3.1
- 3
Granite 3.3 8B
8B params | S grade
IBM's 8B instruction model. Enterprise quality.
Min VRAM: 5.1GB | Quant: Q4_K_M | Size: 4.603GB | License: apache-2.0
- 4
EXAONE 3.5 7.8B
7.8B params | S grade
7.8B model from LG. Strong bilingual Korean/English.
Min VRAM: 4.94GB | Quant: Q4_K_M | Size: 4.443GB | License: other
- 5
InternLM 2.5 7B
7.7B params | S grade
Strong 7B model from China. Good at tool use and math.
Min VRAM: 4.89GB | Quant: Q4_K_M | Size: 4.389GB | License: apache-2.0
- 6
Qwen 2.5 7B Instruct
7.6B params | S grade
Efficient 7B model with strong coding and reasoning abilities.
Min VRAM: 5.3GB | Quant: Q4_K_M | Size: 4.7GB | License: apache-2.0
- 7
Mistral 7B Instruct v0.3
7.3B params | S grade
Efficient 7B model from Mistral AI with strong performance for its size.
Min VRAM: 5.28GB | Quant: Q5_K_M | Size: 4.783GB | License: apache-2.0
- 8
Falcon 3 7B
7B params | S grade
Full-size Falcon 3 with strong performance across benchmarks.
Min VRAM: 5GB | Quant: Q4_K_M | Size: 4.4GB | License: apache-2.0
- 9
OLMo 2 7B
7B params | S grade
Fully open research model. Transparent training.
Min VRAM: 4.67GB | Quant: Q4_K_M | Size: 4.165GB | License: apache-2.0
- 10
OpenChat 3.5 7B
7B params | S grade
Fine-tuned Mistral 7B for chat. Strong instruction following.
Min VRAM: 4.57GB | Quant: Q4_K_M | Size: 4.068GB | License: apache-2.0
- 11
Yi 1.5 6B Chat
6B params | S grade
Efficient 6B bilingual (English/Chinese) model.
Min VRAM: 3.92GB | Quant: Q4_K_M | Size: 3.422GB | License: apache-2.0
- 12
Gemma 3 4B
4B params | S grade
Balanced 4B model with strong reasoning. Great for iPhones.
Min VRAM: 4.35GB | Quant: Q8_0 | Size: 3.847GB | License: gemma
- 13
Nemotron Mini 4B
4B params | S grade
NVIDIA's compact 4B model optimized for edge deployment.
Min VRAM: 4.65GB | Quant: Q8_0 | Size: 4.154GB | License: other
- 14
Danube 3 4B
4B params | S grade
Capable 4B model from H2O.ai. Good for phones.
Min VRAM: 4.42GB | Quant: Q8_0 | Size: 3.922GB | License: apache-2.0
- 15
Phi-3.5 Mini 3.8B
3.8B params | S grade
Tiny but capable 3.8B model. Runs on almost any hardware including phones.
Min VRAM: 4.28GB | Quant: Q8_0 | Size: 3.782GB | License: mit
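To make the "comfortable headroom" claim concrete, the short Python sketch below works out how much of an 8GB card each model leaves free, using only the Min VRAM and Size figures listed above. The names `MODELS`, `headroom_gb`, and `overhead_gb` are illustrative and do not come from this page. The second helper shows the gap between Min VRAM and file size, which comes out at roughly 0.5 to 0.6GB for every entry; that may reflect a fixed runtime allowance, but the page does not say so, and the sketch treats it as an observation only.

```python
# Figures copied from the list above: (model, min VRAM in GB, file size in GB).
MODELS = [
    ("DeepSeek R1 Distill 8B", 5.08, 4.583),
    ("Llama 3.1 8B Instruct", 5.08, 4.583),
    ("Granite 3.3 8B", 5.10, 4.603),
    ("EXAONE 3.5 7.8B", 4.94, 4.443),
    ("InternLM 2.5 7B", 4.89, 4.389),
    ("Qwen 2.5 7B Instruct", 5.30, 4.700),
    ("Mistral 7B Instruct v0.3", 5.28, 4.783),
    ("Falcon 3 7B", 5.00, 4.400),
    ("OLMo 2 7B", 4.67, 4.165),
    ("OpenChat 3.5 7B", 4.57, 4.068),
    ("Yi 1.5 6B Chat", 3.92, 3.422),
    ("Gemma 3 4B", 4.35, 3.847),
    ("Nemotron Mini 4B", 4.65, 4.154),
    ("Danube 3 4B", 4.42, 3.922),
    ("Phi-3.5 Mini 3.8B", 4.28, 3.782),
]

CARD_VRAM_GB = 8.0  # the 8GB reference card this list is graded against


def headroom_gb(min_vram_gb, card_gb=CARD_VRAM_GB):
    """VRAM left free on the card once the model's minimum is met."""
    return round(card_gb - min_vram_gb, 2)


def overhead_gb(min_vram_gb, size_gb):
    """Gap between the listed minimum VRAM and the model file size."""
    return round(min_vram_gb - size_gb, 3)


if __name__ == "__main__":
    for name, min_vram, size in MODELS:
        print(
            f"{name:26s} headroom {headroom_gb(min_vram):.2f}GB  "
            f"overhead {overhead_gb(min_vram, size):.3f}GB"
        )
```

Changing `CARD_VRAM_GB` gives the same comparison for a smaller or larger card. Actual usage will also depend on context length and on whatever else is occupying the GPU, neither of which this sketch models.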