Best Models for the RTX 4060

The RTX 4060 (8GB) is one of the most common gaming GPUs, and its 8GB of VRAM comfortably holds 7B- to 8B-class models at 4-bit or 5-bit quantization (Q4_K_M or Q5_K_M). Every model below lists a minimum VRAM of 5.3GB or less. These are the standout fits.

  1. DeepSeek R1 Distill 8B

    8B params | S grade

    Compact reasoning model. Good reasoning capabilities in a small package.

    Min VRAM: 5.08GB | Quant: Q4_K_M | Size: 4.583GB | License: mit
  2. Llama 3.1 8B Instruct

    8B params | S grade

    Meta's 8B parameter instruction-tuned model. Great balance of performance and efficiency for local deployment.

    Min VRAM: 5.08GB | Quant: Q4_K_M | Size: 4.583GB | License: llama3.1
  3. Granite 3.3 8B

    8B params | S grade

    IBM's 8B instruction model. Enterprise quality.

    Min VRAM: 5.1GB | Quant: Q4_K_M | Size: 4.603GB | License: apache-2.0
  4. EXAONE 3.5 7.8B

    7.8B params | S grade

    7.8B model from LG. Strong bilingual Korean/English.

    Min VRAM: 4.94GB | Quant: Q4_K_M | Size: 4.443GB | License: other
  5. InternLM 2.5 7B

    7.7B params | S grade

    Strong 7B model from China. Good at tool use and math.

    Min VRAM: 4.89GB | Quant: Q4_K_M | Size: 4.389GB | License: apache-2.0
  6. Qwen 2.5 7B Instruct

    7.6B params | S grade

    Efficient 7B model with strong coding and reasoning abilities.

    Min VRAM: 5.3GB | Quant: Q4_K_M | Size: 4.7GB | License: apache-2.0
  7. Qwen 2.5 Coder 7B

    7.6B params | S grade

    Strong 7B code model rivaling larger coding models. Excellent for local development.

    Min VRAM: 4.86GB | Quant: Q4_K_M | Size: 4.361GB | License: apache-2.0
  8. Mistral 7B Instruct v0.3

    7.3B params | S grade

    Efficient 7B model from Mistral AI with strong performance for its size.

    Min VRAM: 5.28GB | Quant: Q5_K_M | Size: 4.783GB | License: apache-2.0
  9. LLaVA 1.6 7B

    7B params | S grade

    Multimodal vision-language model. Understands images and answers questions about them.

    Min VRAM: 5GB | Quant: Q4_K_M | Size: 4.4GB | License: apache-2.0
  10. Falcon 3 7B

    7B params | S grade

    Full-size Falcon 3 with strong performance across benchmarks.

    Min VRAM: 5GB | Quant: Q4_K_M | Size: 4.4GB | License: apache-2.0
  11. OLMo 2 7B

    7B params | S grade

    Fully open research model. Transparent training.

    Min VRAM: 4.67GB | Quant: Q4_K_M | Size: 4.165GB | License: apache-2.0
  12. OpenChat 3.5 7B

    7B params | S grade

    Fine-tuned Mistral 7B for chat. Strong instruction following.

    Min VRAM: 4.57GB | Quant: Q4_K_M | Size: 4.068GB | License: apache-2.0
  13. StarCoder2 7B

    7B params | S grade

    Larger code model with better completions.

    Min VRAM: 4.66GB | Quant: Q4_K_M | Size: 4.155GB | License: bigcode-openrail-m
  14. Code Llama 7B

    7B params | S grade

    Meta's code-specialized Llama model. Good at code completion.

    Min VRAM: 4.3GB | Quant: Q4_K_M | Size: 3.801GB | License: llama2
  15. DeepSeek Coder 6.7B

    6.7B params | S grade

    Powerful 6.7B code model with excellent code generation across many languages.

    Min VRAM: 4.3GB | Quant: Q4_K_M | Size: 3.803GB | License: mit
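
To make the fit concrete, here is a rough sketch of the arithmetic behind the figures in the list. It assumes that "Min VRAM" is the total memory the model needs to load and run, and that whatever remains of the card's 8GB is spare. The function names and the choice of sample entries are illustrative and are not taken from this page. In practice the desktop and other applications also use VRAM, so the real spare capacity is likely to be smaller.

```python
# Figures copied from the list above: (name, min VRAM in GB, file size in GB).
MODELS = [
    ("DeepSeek R1 Distill 8B", 5.08, 4.583),
    ("Qwen 2.5 7B Instruct", 5.3, 4.7),
    ("Mistral 7B Instruct v0.3", 5.28, 4.783),
    ("DeepSeek Coder 6.7B", 4.3, 3.803),
]

CARD_VRAM_GB = 8.0  # RTX 4060 (8GB)


def headroom_gb(min_vram_gb, card_vram_gb=CARD_VRAM_GB):
    """VRAM left on the card after the listed minimum (assumed spare capacity)."""
    return round(card_vram_gb - min_vram_gb, 2)


def overhead_gb(min_vram_gb, size_gb):
    """Gap between the listed minimum VRAM and the model file size."""
    return round(min_vram_gb - size_gb, 3)


def fits(min_vram_gb, card_vram_gb=CARD_VRAM_GB):
    """True when the listed minimum is within the card's VRAM."""
    return min_vram_gb <= card_vram_gb


if __name__ == "__main__":
    for name, min_vram, size in MODELS:
        print(
            f"{name}: fits={fits(min_vram)}, "
            f"headroom={headroom_gb(min_vram):.2f}GB, "
            f"overhead={overhead_gb(min_vram, size):.3f}GB"
        )
```

For the sampled entries, the gap between the listed minimum VRAM and the file size is about 0.5GB to 0.6GB, and each leaves more than 2.5GB of the card's 8GB unallocated under the assumption above.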
