Best Models for the RTX 4090
The RTX 4090's 24GB of VRAM and Ada Lovelace tensor cores make it the strongest single-card consumer setup short of the 5090. These are the standout models worth running on it.
- 1
Gemma 3 27B
27B paramsS gradeGoogle's flagship open model. Near GPT-4 quality. Needs 20GB+ RAM.
Min VRAM: 15.91GBQuant: Q4_K_MSize: 15.41GBLicense: gemma - 2
Mistral Small 22B
22B paramsS grade22B parameter model. Strong reasoning and multilingual. Needs 16GB+ RAM.
Min VRAM: 12.93GBQuant: Q4_K_MSize: 12.425GBLicense: apache-2.0 - 3
Phi-4
14B paramsS gradeMicrosoft's 14B parameter model. Punches well above its weight on reasoning.
Min VRAM: 15.01GBQuant: Q8_0Size: 14.51GBLicense: mit - 4
Qwen 2.5 14B
14B paramsS gradeStrong 14B model with excellent coding and reasoning. iPad Pro recommended.
Min VRAM: 15.12GBQuant: Q8_0Size: 14.623GBLicense: apache-2.0 - 5
Qwen 2.5 Coder 14B
14B paramsS gradePowerful 14B code model. Excellent for complex programming tasks.
Min VRAM: 15.12GBQuant: Q8_0Size: 14.623GBLicense: apache-2.0 - 6
Code Llama 13B Instruct
13B paramsS grade13B code model for complex tasks. iPad Pro recommended.
Min VRAM: 7.83GBQuant: Q4_K_MSize: 7.326GBLicense: llama2 - 7
Gemma 3 12B
12B paramsS gradeHigh quality 12B model. Excellent for iPad Pro and Mac.
Min VRAM: 12.15GBQuant: Q8_0Size: 11.651GBLicense: gemma - 8
Mistral Nemo 12B
12B paramsS gradeMistral's 12B model with excellent instruction following.
Min VRAM: 12.63GBQuant: Q8_0Size: 12.128GBLicense: apache-2.0 - 9
FLUX.1 Schnell (GGUF)
12B paramsS gradeFast 1-4 step generation. State-of-the-art quality. Needs 16GB+ RAM.
Min VRAM: 14GBQuant: Q5_0Size: 12GBLicense: apache-2.0 - 10
FLUX.1 Dev (GGUF)
12B paramsS gradeHighest quality FLUX model. 20-50 steps. Mac with 24GB+ RAM.
Min VRAM: 14GBQuant: Q5_0Size: 12GBLicense: flux-1-dev-non-commercial - 11
Solar 10.7B
10.7B paramsS gradeDepth-upscaled 10.7B model. Strong reasoning.
Min VRAM: 11.12GBQuant: Q8_0Size: 10.621GBLicense: apache-2.0 - 12
Falcon 3 10B
10B paramsS grade10B Falcon model. Good iPad model.
Min VRAM: 10.7GBQuant: Q8_0Size: 10.203GBLicense: apache-2.0 - 13
Gemma 2 9B Instruct
9.2B paramsS gradeGoogle's efficient 9B model. Great performance-to-size ratio.
Min VRAM: 9.65GBQuant: Q8_0Size: 9.152GBLicense: gemma - 14
Yi 1.5 9B Chat
9B paramsS grade9B bilingual model with strong reasoning.
Min VRAM: 9.24GBQuant: Q8_0Size: 8.739GBLicense: apache-2.0 - 15
Yi Coder 9B
9B paramsS gradeStrong 9B code model with good reasoning.
Min VRAM: 9.24GBQuant: Q8_0Size: 8.739GBLicense: apache-2.0 - 16
CodeGemma 7B
8.5B paramsS gradeGoogle's instruction-tuned code model. Strong code generation and understanding.
Min VRAM: 8.95GBQuant: Q8_0Size: 8.454GBLicense: gemma - 17
DeepSeek R1 Distill 8B
8B paramsS gradeCompact reasoning model. Good reasoning capabilities in a small package.
Min VRAM: 8.45GBQuant: Q8_0Size: 7.954GBLicense: mit - 18
Llama 3.1 8B Instruct
8B paramsS gradeMeta's 8B parameter instruction-tuned model. Great balance of performance and efficiency for local deployment.
Min VRAM: 8.45GBQuant: Q8_0Size: 7.954GBLicense: llama3.1