Model fiche

Mistral Medium 3.5 128B

By Mistral AI · France

chat general code reasoning vision multilingual fr
Parameters: 128B
License: Modified MIT
Context: 256k
VRAM (Q4): 74 GB
Released: 29 April 2026

Overview

Mistral AI's first merged flagship: a dense 128B model with vision, 256k context, and configurable reasoning. It scores 77.6% on SWE-Bench Verified and consolidates Medium 3.1, Magistral, and Devstral 2 into one model.

When to pick this model

  • Agentic coding workflows demanding state-of-the-art SWE-Bench performance
  • Customer-support automation needing top τ³-Telecom scores
  • Long-document analysis up to 256k tokens
  • Multilingual vision tasks across 24 languages
  • Production deployments wanting reasoning toggleable per request

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      74 GB
Q5_K_M                    91 GB
Q8_0                      137 GB
FP16 (no quantization)    256 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
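The breakdown above (weights plus KV cache plus runtime overhead) can be sketched as simple arithmetic. This is a rough estimate, not the method used to produce the table: the function name and parameters are illustrative, 1 GB is taken as 10^9 bytes, and the bits-per-weight value is something you supply. FP16 is 16 bits per weight and llama.cpp's Q8_0 is 8.5 (32 weights of 8 bits plus one 16-bit scale per block). K-quants such as Q4_K_M mix block types, so their effective bits per weight depend on the model and are left as an input here.

```python
def estimate_vram_gb(params_billion, bits_per_weight, kv_cache_gb=0.0, overhead_gb=0.6):
    """Estimate VRAM in GB (1 GB = 1e9 bytes): weights + KV cache + runtime overhead.

    params_billion  -- parameter count in billions (128 for this model)
    bits_per_weight -- average storage cost per weight for the chosen format
    kv_cache_gb     -- KV cache size at your context length (model-dependent, caller-supplied)
    overhead_gb     -- runtime overhead; 0.6 mirrors the ~600 MB baseline quoted above
    """
    weights_gb = params_billion * bits_per_weight / 8  # bits -> bytes
    return weights_gb + kv_cache_gb + overhead_gb


if __name__ == "__main__":
    # Weights-only view for the two formats whose bit widths are fixed by the format.
    for name, bits in (("FP16", 16.0), ("Q8_0", 8.5)):
        weights_only = estimate_vram_gb(128, bits, overhead_gb=0.0)
        print(f"{name}: about {weights_only:.0f} GB of weights")
```

The KV cache grows with context length, so pass a larger `kv_cache_gb` when planning for contexts well beyond 8k.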

Published benchmark scores

Benchmark             Score (%)
SWE-Bench Verified    77.6
τ³-Telecom            91.4

Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.

Strengths

  • SWE-Bench Verified 77.6% — best-in-class for open weights
  • τ³-Telecom 91.4% for tool-using agents
  • 256k context with strong long-context retention
  • Vision-enabled and multilingual across 24 languages
  • Modified MIT — permissive for most commercial use

Limitations

  • ~74 GB at Q4: needs a 4-GPU box (for example, 4 × 24 GB cards) for comfortable serving
  • Revenue clause kicks in for large enterprises
  • Single-model consolidation means no separate specialized variants
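
To see how the ~74 GB Q4 footprint translates into hardware, the sketch below computes the smallest number of identical GPUs whose combined memory covers a requirement plus some headroom. The helper name and the 10% default headroom are assumptions for illustration. It also ignores per-GPU runtime overhead and any imbalance from splitting layers across cards, so treat the result as a lower bound.

```python
import math


def gpus_needed(required_gb, per_gpu_gb, headroom=0.10):
    """Smallest GPU count whose combined memory covers required_gb plus headroom."""
    if per_gpu_gb <= 0:
        raise ValueError("per_gpu_gb must be positive")
    return math.ceil(required_gb * (1 + headroom) / per_gpu_gb)


if __name__ == "__main__":
    # 74 GB is the Q4_K_M figure from this page; card sizes are common examples.
    for card_gb in (24, 48, 80):
        print(f"{card_gb} GB cards: {gpus_needed(74, card_gb)}")
```

With 24 GB cards this gives four GPUs, which matches the 4-GPU box mentioned in the list above.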

Architecture & training

Architecture: Dense 128B · vision encoder · 256k ctx · configurable reasoning · integrated EAGLE draft head

Training: First merged Mistral flagship: replaces Medium 3.1, Magistral and Devstral 2 in Le Chat / Vibe.

Verdict

The first Mistral flagship that bundles coding, reasoning, and vision into one model — and it's competitive on every axis.

Quick start

ollama run mistral-medium-3.5:128b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
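
Once the model is pulled, it can also be called programmatically through Ollama's local REST API. The following is a minimal sketch, assuming an Ollama server is running on its default port (11434) and that the model tag from the command above is available locally. The helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "mistral-medium-3.5:128b"               # tag taken from the quick-start command


def build_request(prompt, model=MODEL_TAG):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model=MODEL_TAG, timeout=600):
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]

# Example (needs the server and the model to be available):
#   print(generate("Explain what a KV cache is in two sentences."))
```

Setting `"stream": False` returns one JSON object rather than a stream of chunks, which keeps the example short. A large model can take a while to load, hence the generous timeout.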
