Model card

Mistral Small 4

By Mistral AI · France

Tags: chat, general, code, vision, reasoning, multilingual, fr, moe
Parameters: 119B
License: Apache 2.0
Context: 256k
VRAM (Q4): 72 GB
Released: March 2026

Overview

Mistral AI's 2026 flagship MoE with 119B total and 6.5B active parameters, unifying chat, reasoning, vision, and code in a single Apache 2.0 model.

When to pick this model

  • Consolidating multiple Mistral deployments into one model
  • Vision plus reasoning workloads on a prosumer rig
  • Long-context analysis up to 256K tokens
  • European-data-sovereignty deployments
  • Apache-licensed commercial products

VRAM requirements by quantization

Quantization            VRAM required
Q4_K_M (recommended)    72 GB
Q5_K_M                  86 GB
Q8_0                    128 GB
FP16 (no quantization)  238 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
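The figures listed for each quantization can be turned into a small lookup that picks the highest-precision option fitting a given VRAM budget. The following is a minimal sketch in Python, assuming the numbers on this page (which already include the 8k KV cache and runtime overhead). The function name `best_quantization` and the `headroom_gb` parameter are hypothetical, introduced here only for illustration.

```python
from typing import Optional

# VRAM figures in GB, copied from this page and ordered from
# highest to lowest precision.
VRAM_GB = [
    ("FP16", 238),
    ("Q8_0", 128),
    ("Q5_K_M", 86),
    ("Q4_K_M", 72),
]


def best_quantization(available_gb: float, headroom_gb: float = 0.0) -> Optional[str]:
    """Return the highest-precision quantization that fits, or None.

    headroom_gb is a hypothetical knob: memory to keep free for longer
    contexts, since the listed figures only cover an 8k KV cache.
    """
    budget = available_gb - headroom_gb
    for name, required_gb in VRAM_GB:
        if required_gb <= budget:
            return name
    return None
```

For example, `best_quantization(96)` selects Q5_K_M, while `best_quantization(48)` returns `None` because even Q4_K_M needs 72 GB. How much headroom to reserve for longer contexts is not stated on this page and would need to be measured on the target runtime.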

Strengths

  • Unifies chat, reasoning, vision, and code in one model
  • Only 6.5B active parameters for fast inference
  • 256K context window
  • Apache 2.0 license
  • European lab with strong French and EU-language support

Limitations

  • 72 GB or more at Q4 requires a prosumer multi-GPU setup
  • Breaks continuity with the Small 3.x line
  • Newer release means thinner ecosystem
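To gauge what the multi-GPU requirement means in practice, the 72 GB Q4 figure can be divided by the memory of each card. This is a rough sketch only: the helper `min_gpus` and its `usable_fraction` parameter are hypothetical, and the calculation assumes the model can be split evenly across cards, which real runtimes only approximate.

```python
import math

Q4_VRAM_GB = 72  # Q4_K_M figure from this page


def min_gpus(required_gb: float, per_gpu_gb: float, usable_fraction: float = 1.0) -> int:
    """Lower bound on the number of identical GPUs needed.

    usable_fraction is an assumed share of each card's memory that is
    actually available to the model (1.0 means all of it).
    """
    if per_gpu_gb <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("per_gpu_gb must be > 0 and usable_fraction in (0, 1]")
    return math.ceil(required_gb / (per_gpu_gb * usable_fraction))
```

Under these assumptions, `min_gpus(Q4_VRAM_GB, 24)` gives 3 cards and `min_gpus(Q4_VRAM_GB, 48)` gives 2. Treat the result as a floor rather than a guarantee, since splitting a model across devices adds its own overhead.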

Architecture & training

Architecture: Mixture-of-Experts (MoE) with 119B total and 6.5B active parameters · 256k context · unifies instruct, reasoning, vision, and code
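The split between total and active parameters can be made concrete with a short calculation. The sketch below uses the two parameter counts from this page. The rule of roughly 2 FLOPs per active parameter per token is a common approximation for forward-pass compute, not a figure from this page, and it ignores attention cost and memory bandwidth.

```python
TOTAL_PARAMS_B = 119.0   # total parameters, in billions (from this page)
ACTIVE_PARAMS_B = 6.5    # parameters active per token, in billions (from this page)


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def approx_forward_gflops_per_token(params_b: float) -> float:
    """Rough forward-pass cost: ~2 FLOPs per parameter per token (assumed rule of thumb)."""
    return 2.0 * params_b


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    moe = approx_forward_gflops_per_token(ACTIVE_PARAMS_B)
    dense = approx_forward_gflops_per_token(TOTAL_PARAMS_B)
    print(f"active share: {frac:.1%}")
    print(f"approx GFLOPs/token: {moe:.0f} (MoE) vs {dense:.0f} (dense equivalent)")
```

Under this approximation, about 5.5% of the weights are used per token. All 119B parameters still have to be resident in memory, which is why the VRAM requirement follows the total count rather than the active one.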

Training: Replaces Small 3.x and Pixtral in a single model.

Verdict

Mistral's most ambitious open release yet, ideal if you want one model covering four product lines.

Quick start

# Hugging Face: mistralai/Mistral-Small-4 (no official Ollama tag yet)

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
