DeepSeek R1 Distill Qwen 14B
By DeepSeek · China
Overview
DeepSeek R1's reasoning distilled into Qwen 2.5 14B and released under the MIT license. It scores 69.7 on AIME 2024 and 93.9 on MATH-500, ahead of o1-mini on several reasoning benchmarks.
When to pick this model
- Math, coding, and STEM reasoning on a single 24GB GPU
- Local alternative to o1-mini-class APIs
- Workloads needing MIT-licensed reasoning
- Agentic planners that benefit from explicit CoT
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 9 GB |
| Q5_K_M | 11 GB |
| Q8_0 | 16 GB |
| FP16 (no quantization) | 28 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
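To turn the table into a quick decision, the sketch below picks the highest-fidelity quantization that fits a given GPU. The VRAM numbers are copied from the table above; the function name `pick_quantization` and the 1 GB default headroom are illustrative assumptions, not part of any tool.

```python
# VRAM figures (GB) copied from the table above, highest fidelity first.
VRAM_GB = [
    ("FP16", 28),
    ("Q8_0", 16),
    ("Q5_K_M", 11),
    ("Q4_K_M", 9),
]


def pick_quantization(gpu_vram_gb, headroom_gb=1.0):
    """Return the highest-fidelity quantization that fits, or None.

    headroom_gb is an assumed safety margin for longer contexts and
    other processes sharing the GPU; tune it for your setup.
    """
    budget = gpu_vram_gb - headroom_gb
    for name, required_gb in VRAM_GB:
        if required_gb <= budget:
            return name
    return None


if __name__ == "__main__":
    for gpu in (8, 12, 16, 24, 32):
        print(f"{gpu} GB GPU -> {pick_quantization(gpu)}")
```

With the default headroom, a 24 GB card lands on Q8_0 and a 12 GB card on Q5_K_M. Raise `headroom_gb` if you plan to run contexts well beyond 8k.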
Published benchmark scores
| Benchmark | Score |
|---|---|
| AIME 2024 | 69.7 |
| MATH-500 | 93.9 |
| GPQA | 59.1 |
Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.
Strengths
- AIME24 69.7 and MATH-500 93.9
- Beats o1-mini on multiple reasoning benchmarks
- MIT license — no usage restrictions
- 131k context
Limitations
- Verbose CoT inflates token costs
- Slower than non-reasoning 14B for simple queries
- No vision or tool-use specialization
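To gauge how much of a response is chain-of-thought, a small helper can separate the reasoning from the final answer. This is a minimal sketch that assumes the model wraps its reasoning in `<think>...</think>` tags, as R1-style models commonly do; the helper names are hypothetical, and character counts are only a rough proxy for tokens.

```python
import re

# Assumption: reasoning is emitted inside a single <think>...</think> block.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split a raw response into (reasoning, answer).

    If no <think> block is found, the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def reasoning_share(text):
    """Fraction of characters spent on reasoning (rough proxy for token share)."""
    reasoning, answer = split_reasoning(text)
    total = len(reasoning) + len(answer)
    return len(reasoning) / total if total else 0.0
```

Logging `reasoning_share` across a sample of real prompts gives a rough sense of how much of the output budget goes to reasoning before committing to this model for simple queries.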
Architecture & training
Architecture: Dense Qwen 2.5 14B · SFT on R1 traces
Training: Distillation of R1 671B.
The best 14B reasoner under a permissive license today, and a serious local alternative to o1-mini for STEM workloads.
Quick start
ollama run deepseek-r1:14b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
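Once the model is pulled, it can also be called from code through Ollama's local HTTP API. The sketch below assumes Ollama is running on its default port (11434) and uses only the Python standard library; `build_payload` and `generate` are illustrative helper names, and the long timeout is an assumption to allow for lengthy reasoning output.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(prompt, model="deepseek-r1:14b"):
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt, model="deepseek-r1:14b", url=OLLAMA_URL, timeout=600):
    """Send the prompt to a local Ollama server and return the generated text."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, the full completion is returned in "response".
    return body["response"]


# Example (requires a running Ollama server with the model pulled):
# print(generate("What is the sum of the first 50 odd numbers?"))
```

Setting `"stream": False` keeps the example simple. For interactive use, streaming is usually preferable because reasoning models can take a while to reach the final answer.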