Mixtral 8x22B Instruct
By Mistral AI · France
Overview
Mistral AI's mature sparse mixture-of-experts (MoE) model, with 141B total parameters and 39B active per token, released under Apache 2.0. It scores 77.8 on MMLU and 45.1 on HumanEval, and remains a proven general-purpose workhorse at roughly 80 GB in Q4.
When to pick this model
- Stable, well-understood production deployments
- Apache-licensed commercial products
- Multilingual general chat including French
- Workloads where reliability beats latest benchmarks
- Teams with existing Mixtral infrastructure
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 82 GB |
| Q5_K_M | 100 GB |
| Q8_0 | 150 GB |
| FP16 (no quantization) | 282 GB |
VRAM figures include model weights plus a typical 8k-token KV cache and ~600 MB of runtime overhead (Ollama / llama.cpp baseline). Add headroom for longer context lengths.
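The breakdown above (weights, plus KV cache, plus runtime overhead) can be sketched as a small estimator. This is a rough illustration, not the method used to produce the table: the layer count, KV-head count, and head dimension are not given on this page, so the defaults below are assumptions that should be checked against the model's published configuration, and the 16-bit KV cache is likewise assumed. Results will not necessarily match the table figures.

```python
def kv_cache_gb(context_tokens, n_layers=56, n_kv_heads=8, head_dim=128, kv_bytes=2):
    """Approximate KV cache size in GB (1 GB = 1e9 bytes).

    n_layers, n_kv_heads and head_dim are ASSUMED values for this model;
    verify them against the model's config before relying on the result.
    The leading factor of 2 covers the separate key and value tensors.
    """
    bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes
    return bytes_per_token * context_tokens / 1e9


def estimate_vram_gb(total_params_b, bits_per_weight, context_tokens=8192, overhead_gb=0.6):
    """Weights + KV cache + fixed runtime overhead, in GB.

    total_params_b is the TOTAL parameter count in billions (141 here):
    every expert must be resident in memory, even though only a subset
    is active for any given token.
    """
    weights_gb = total_params_b * bits_per_weight / 8  # billions of params x bytes per param
    return weights_gb + kv_cache_gb(context_tokens) + overhead_gb


# Compare an 8k-token and a 64k-token context at 16 bits per weight.
for ctx in (8192, 65536):
    print(ctx, round(estimate_vram_gb(141, 16, context_tokens=ctx), 1))
```

Quantized formats use an effective bits-per-weight that depends on the scheme, so pass the value reported for the specific file you plan to load rather than the nominal bit width. The KV-cache term grows linearly with context length, which is why long contexts need extra headroom.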
Published benchmark scores
| Benchmark | Score |
|---|---|
| MMLU | 77.8 |
| GSM8K | 78.6 |
Scores are published by the model author or aggregated from public leaderboards, and are re-measured monthly by our editorial team.
Strengths
- Battle-tested mature MoE
- Strong general-purpose performance
- Apache 2.0 license
- Solid multilingual coverage
Limitations
- Roughly 80 GB in Q4 still demands serious hardware
- Coding trails newer specialists
- 64k context lags 2026 competitors
- Outclassed by newer Mistral releases on most benchmarks
Architecture & training
Architecture: Sparse MoE · 8 experts · 141B total / 39B active parameters · grouped-query attention (GQA)
License and context: Apache 2.0 · 64k-token context window
Still a dependable Apache-licensed generalist, but newer Mistral models now beat it on most benchmarks.
Quick start
ollama run mixtral:8x22b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
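Besides the interactive CLI, a locally running Ollama server can also be queried over its HTTP API. The sketch below assumes Ollama is listening on its default local port (11434) and that the `mixtral:8x22b` tag has already been pulled; the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumes a default local Ollama install; change this if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mixtral:8x22b"):
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="mixtral:8x22b", timeout=600):
    """Send the prompt and return the generated text.

    Requires a running Ollama server. The generous timeout allows for
    the time a model of this size can take to load on first use.
    """
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize the Apache 2.0 license in one sentence.")` returns the model's reply as a string. Setting `"stream": False` keeps the example simple by returning a single JSON object instead of a stream of partial responses.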