Best Models for the RTX 4060
The RTX 4060 (8GB) is one of the most common gaming GPUs, and it punches above its weight on 7B-class models with Q4 quantization. Below are the standout fits.
- 1
DeepSeek R1 Distill 8B
8B paramsS gradeCompact reasoning model. Good reasoning capabilities in a small package.
Min VRAM: 5.08GBQuant: Q4_K_MSize: 4.583GBLicense: mit - 2
Llama 3.1 8B Instruct
8B paramsS gradeMeta's 8B parameter instruction-tuned model. Great balance of performance and efficiency for local deployment.
Min VRAM: 5.08GBQuant: Q4_K_MSize: 4.583GBLicense: llama3.1 - 3
Granite 3.3 8B
8B paramsS gradeIBM's 8B instruction model. Enterprise quality.
Min VRAM: 5.1GBQuant: Q4_K_MSize: 4.603GBLicense: apache-2.0 - 4
EXAONE 3.5 7.8B
7.8B paramsS grade7.8B model from LG. Strong bilingual Korean/English.
Min VRAM: 4.94GBQuant: Q4_K_MSize: 4.443GBLicense: other - 5
InternLM 2.5 7B
7.7B paramsS gradeStrong 7B model from China. Good at tool use and math.
Min VRAM: 4.89GBQuant: Q4_K_MSize: 4.389GBLicense: apache-2.0 - 6
Qwen 2.5 7B Instruct
7.6B paramsS gradeEfficient 7B model with strong coding and reasoning abilities.
Min VRAM: 5.3GBQuant: Q4_K_MSize: 4.7GBLicense: apache-2.0 - 7
Qwen 2.5 Coder 7B
7.6B paramsS gradeStrong 7B code model rivaling larger coding models. Excellent for local development.
Min VRAM: 4.86GBQuant: Q4_K_MSize: 4.361GBLicense: apache-2.0 - 8
Mistral 7B Instruct v0.3
7.3B paramsS gradeEfficient 7B model from Mistral AI with strong performance for its size.
Min VRAM: 5.28GBQuant: Q5_K_MSize: 4.783GBLicense: apache-2.0 - 9
LLaVA 1.6 7B
7B paramsS gradeMultimodal vision-language model. Understands images and answers questions about them.
Min VRAM: 5GBQuant: Q4_K_MSize: 4.4GBLicense: apache-2.0 - 10
Falcon 3 7B
7B paramsS gradeFull-size Falcon 3 with strong performance across benchmarks.
Min VRAM: 5GBQuant: Q4_K_MSize: 4.4GBLicense: apache-2.0 - 11
OLMo 2 7B
7B paramsS gradeFully open research model. Transparent training.
Min VRAM: 4.67GBQuant: Q4_K_MSize: 4.165GBLicense: apache-2.0 - 12
OpenChat 3.5 7B
7B paramsS gradeFine-tuned Mistral 7B for chat. Strong instruction following.
Min VRAM: 4.57GBQuant: Q4_K_MSize: 4.068GBLicense: apache-2.0 - 13
StarCoder2 7B
7B paramsS gradeLarger code model with better completions.
Min VRAM: 4.66GBQuant: Q4_K_MSize: 4.155GBLicense: bigcode-openrail-m - 14
Code Llama 7B
7B paramsS gradeMeta's code-specialized Llama model. Good at code completion.
Min VRAM: 4.3GBQuant: Q4_K_MSize: 3.801GBLicense: llama2 - 15
DeepSeek Coder 6.7B
6.7B paramsS gradePowerful 6.7B code model with excellent code generation across many languages.
Min VRAM: 4.3GBQuant: Q4_K_MSize: 3.803GBLicense: mit