BestLLMfor EN Your hardware. Your LLM. Your call.
APIOpen data Find my LLM
Model fiche

SmolLM2 1.7B Instruct

By HuggingFace · France

chat general small
Parameters
1.7B
License
Apache 2.0
Context
8k
VRAM (Q4)
1.2 GB
Released
November 2024

Overview

HuggingFace's 1.7B Apache 2.0 instruct model trained on 11T tokens. Beats Qwen2.5-1.5B by roughly 6 points on MMLU-Pro, making it a top pick at the sub-2B tier.

When to pick this model

  • On-device assistants where every megabyte counts
  • Edge inference on CPUs or low-end GPUs
  • Building permissively licensed downstream products
  • Fine-tuning experiments on a single consumer GPU
  • Latency-critical autocomplete or classification tasks

VRAM requirements by quantization

QuantizationVRAM required
Q4_K_M (recommended)1.2 GB
Q5_K_M1.5 GB
Q8_02.2 GB
FP16 (no quantization)3.5 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.

Published benchmark scores

BenchmarkScore
BFCL (function calling)27

Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.

Strengths

  • Best-in-class quality for its size on MMLU-Pro
  • Clean Apache 2.0 license with no commercial strings
  • Massive 11T-token training corpus for a small model
  • One of the most downloaded small models on Hugging Face

Limitations

  • English-centric, weak on non-English languages
  • 8K context window is tight for modern RAG workflows
  • BFCL function-calling score of 27% trails larger peers

Architecture & training

Architecture: Dense Llama 2-style · SFT + DPO (UltraFeedback)

Training: 11T tokens.

Verdict

If you need an Apache-licensed sub-2B model that punches above its weight, SmolLM2 is the default choice.

Quick start

ollama run smollm2:1.7b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.

Tools

Is SmolLM2 1.7B Instruct the right pick for you?

Compute self-hosted ROI → Back to catalog