BestLLMfor EN Your hardware. Your LLM. Your call.
APIOpen data Find my LLM
Model fiche

Mistral Small 3

By Mistral AI · France

chat general code
Parameters
24B
License
Apache 2.0
Context
32k
VRAM (Q4)
14 GB
Released
January 2025

Overview

Mistral AI's 24B dense model that closes most of the gap with 70B-class models. Best quality-per-parameter we've measured at this size in 2025.

When to pick this model

  • Self-hosting a near-frontier assistant on a single 24GB GPU
  • Agentic workflows and tool calling where latency matters
  • Long-context RAG with up to 128k tokens
  • Commercial deployments needing Apache 2.0
  • Replacing Llama 3 70B to cut VRAM and inference cost

VRAM requirements by quantization

QuantizationVRAM required
Q4_K_M (recommended)14 GB
Q5_K_M17 GB
Q8_026 GB
FP16 (no quantization)48 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.

Published benchmark scores

BenchmarkScore
MMLU81
GPQA42.2
HumanEval84.8

Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.

Strengths

  • Quality approaching Llama 3 70B at a third the size
  • Low latency relative to peers
  • 128k context window
  • Strong tool use and agent behavior
  • Apache 2.0 license

Limitations

  • Needs ~16GB VRAM at Q4, more for higher precision
  • Trails Qwen 2.5 Coder on dedicated coding tasks
  • No native vision (see Small 3.1 for that)

Architecture & training

Architecture: Dense Transformer · 40 layers · GQA + sliding window

Training: Enriched multilingual corpus, strong focus on FR + scientific English.

Verdict

The 2025 sweet spot for open-weight chat — frontier-adjacent quality at a tractable size.

Quick start

ollama run mistral-small:24b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.

Tools

Is Mistral Small 3 the right pick for you?

Compute self-hosted ROI → Back to catalog