Best LLMs for 24GB VRAM

24GB unlocks 30B-class models at Q4 and 13B models at high quality. With 24GB you can also push context lengths past 32K. Top picks below are ranked against a 24GB reference card.

1
Gemma 3 27B
27B paramsS grade
Google's flagship open model. Near GPT-4 quality. Needs 20GB+ RAM.
Min VRAM: 15.91GBQuant: Q4_K_MSize: 15.41GBLicense: gemma
2
Mistral Small 22B
22B paramsS grade
22B parameter model. Strong reasoning and multilingual. Needs 16GB+ RAM.
Min VRAM: 12.93GBQuant: Q4_K_MSize: 12.425GBLicense: apache-2.0
3
Phi-4
14B paramsS grade
Microsoft's 14B parameter model. Punches well above its weight on reasoning.
Min VRAM: 15.01GBQuant: Q8_0Size: 14.51GBLicense: mit
4
Qwen 2.5 14B
14B paramsS grade
Strong 14B model with excellent coding and reasoning. iPad Pro recommended.
Min VRAM: 15.12GBQuant: Q8_0Size: 14.623GBLicense: apache-2.0
5
Gemma 3 12B
12B paramsS grade
High quality 12B model. Excellent for iPad Pro and Mac.
Min VRAM: 12.15GBQuant: Q8_0Size: 11.651GBLicense: gemma
6
Mistral Nemo 12B
12B paramsS grade
Mistral's 12B model with excellent instruction following.
Min VRAM: 12.63GBQuant: Q8_0Size: 12.128GBLicense: apache-2.0
7
Solar 10.7B
10.7B paramsS grade
Depth-upscaled 10.7B model. Strong reasoning.
Min VRAM: 11.12GBQuant: Q8_0Size: 10.621GBLicense: apache-2.0
8
Falcon 3 10B
10B paramsS grade
10B Falcon model. Good iPad model.
Min VRAM: 10.7GBQuant: Q8_0Size: 10.203GBLicense: apache-2.0
9
Gemma 2 9B Instruct
9.2B paramsS grade
Google's efficient 9B model. Great performance-to-size ratio.
Min VRAM: 9.65GBQuant: Q8_0Size: 9.152GBLicense: gemma
10
Yi 1.5 9B Chat
9B paramsS grade
9B bilingual model with strong reasoning.
Min VRAM: 9.24GBQuant: Q8_0Size: 8.739GBLicense: apache-2.0
11
DeepSeek R1 Distill 8B
8B paramsS grade
Compact reasoning model. Good reasoning capabilities in a small package.
Min VRAM: 8.45GBQuant: Q8_0Size: 7.954GBLicense: mit
12
Llama 3.1 8B Instruct
8B paramsS grade
Meta's 8B parameter instruction-tuned model. Great balance of performance and efficiency for local deployment.
Min VRAM: 8.45GBQuant: Q8_0Size: 7.954GBLicense: llama3.1
13
Granite 3.3 8B
8B paramsS grade
IBM's 8B instruction model. Enterprise quality.
Min VRAM: 8.59GBQuant: Q8_0Size: 8.088GBLicense: apache-2.0
14
EXAONE 3.5 7.8B
7.8B paramsS grade
7.8B model from LG. Strong bilingual Korean/English.
Min VRAM: 8.24GBQuant: Q8_0Size: 7.741GBLicense: other
15
InternLM 2.5 7B
7.7B paramsS grade
Strong 7B model from China. Good at tool use and math.
Min VRAM: 8.16GBQuant: Q8_0Size: 7.659GBLicense: apache-2.0

Related

All models for 24GB VRAM RTX 4090 compatibility