Model fiche

DeepSeek R1 Distill Qwen 1.5B

By DeepSeek · China

reasoning small
Parameters: 1.5B
License: MIT
Context: 128k
VRAM (Q4): 1 GB
Released: January 2025

Overview

DeepSeek's R1 reasoning distilled into a 1.5B-parameter, MIT-licensed model with visible chain-of-thought. It scores 83.9 on MATH-500 and needs about 1 GB of VRAM at Q4, so it runs on almost any laptop.

When to pick this model

  • Teaching and demos showing CoT reasoning on minimal hardware
  • Math tutoring apps on edge devices
  • Research baselines for distillation experiments
  • Battery-constrained mobile deployments

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      1 GB
Q5_K_M                    1.2 GB
Q8_0                      2 GB
FP16 (no quantization)    3 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
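
The note above describes the estimate as weights plus KV cache plus a fixed overhead. A minimal sketch of that arithmetic follows, so the effect of longer contexts can be explored. The function name, the 4.5 bits-per-weight figure, and the layer and head counts are illustrative assumptions rather than values taken from this page, so the results will not necessarily match the rounded table entries.

```python
def estimate_vram_gb(n_params, bits_per_weight, context_tokens,
                     n_layers, n_kv_heads, head_dim,
                     kv_bytes_per_value=2, overhead_gb=0.6):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes).

    Sums quantized weights, the KV cache for the given context length,
    and a fixed runtime overhead (0.6 GB, following the note above).
    """
    weight_bytes = n_params * bits_per_weight / 8
    # Keys and values (the factor 2) each hold n_kv_heads * head_dim
    # values per layer for every cached token.
    kv_cache_bytes = (2 * n_layers * n_kv_heads * head_dim
                      * kv_bytes_per_value * context_tokens)
    return (weight_bytes + kv_cache_bytes) / 1e9 + overhead_gb


if __name__ == "__main__":
    # Assumed model shape (not from this page): 28 layers, 2 KV heads,
    # head dimension 128. Check the model's config.json before relying on it.
    for ctx in (8_192, 32_768, 131_072):
        gb = estimate_vram_gb(1.5e9, 4.5, ctx, 28, 2, 128)
        print(f"{ctx:>7} tokens: ~{gb:.2f} GB")
```

Under these assumptions the KV cache term grows linearly with context length, which is why the note recommends extra headroom beyond 8k tokens.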

Published benchmark scores

Benchmark    Score
MATH-500     83.9

Scores are published by the model author or aggregated from public leaderboards, and are re-measured monthly by our editorial team.

Strengths

  • About 1 GB of VRAM at Q4, so it fits on almost any laptop
  • Visible chain-of-thought reasoning
  • MIT license: permissive, with commercial use and modification allowed
  • 128k context in a 1.5B model
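
The visible chain-of-thought listed above usually arrives inline with the answer. A minimal sketch for separating the two follows, assuming the raw output wraps its reasoning in `<think>...</think>` tags as R1-family models commonly do. Some runtimes strip or separate the reasoning themselves, so treat the tag format as an assumption to verify against your own output. The helper name `split_reasoning` is hypothetical.

```python
import re

# Non-greedy match across newlines for the first <think>...</think> span.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer) from raw model output.

    If no <think> block is present, reasoning is the empty string and
    the whole text is treated as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is kept as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

Showing only the answer to end users while logging the reasoning separately is one way to keep the model's verbosity out of the interface.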

Limitations

  • Reasoning depth is genuinely limited at 1.5B despite CoT
  • Highly verbose — token costs add up fast
  • Outclassed by the 14B distill on anything non-trivial

Architecture & training

Architecture: DeepSeek R1 distillation into Qwen2.5-Math-1.5B · chain-of-thought

Training: Distilled from DeepSeek R1 (671B) by supervised fine-tuning on R1-generated reasoning samples. This is the smallest model in the R1 distill family.

Verdict

A fun MIT-licensed reasoning model that fits anywhere, but the 1.5B ceiling shows on real problems.

Quick start

ollama run deepseek-r1:1.5b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
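
The Ollama command above opens an interactive session. To call the model from a script instead, a minimal sketch using Ollama's local HTTP API is shown here. It assumes an Ollama server is running on the default port 11434 and that the `deepseek-r1:1.5b` tag has already been pulled. The helper names `build_request` and `ask` are hypothetical.

```python
import json
import urllib.request

# Default local endpoint for Ollama's generate API (assumes default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="deepseek-r1:1.5b"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, timeout=120):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `ask("What is 17 * 23? Show your reasoning.")` returns the model's full reply as a string. A generous timeout is used because reasoning output tends to be long.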
