MiniMax-M2.7
By MiniMax · China
Overview
MiniMax's agentic mixture-of-experts (MoE) model: 229B total parameters with 10B active, released under Apache 2.0. Open-sourced on April 12, 2026, it was the top trending model on Hugging Face at launch.
When to pick this model
- Multi-step agentic workflows and tool use
- Terminal and shell automation pipelines
- SWE-Bench-style autonomous coding agents
- Long-context tasks of 200K+ tokens (context window: 205K)
- Commercial agent products under Apache 2.0
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 138 GB |
| Q5_K_M | 165 GB |
| Q8_0 | 246 GB |
| FP16 (no quantization) | 458 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
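As a rough aid for reading the table, here is a minimal Python sketch that picks the highest-precision quantization fitting a given VRAM budget and estimates raw weight size from a bits-per-weight value. The VRAM numbers are copied from the table above; the function names and the `headroom_gb` parameter are hypothetical, introduced only for illustration.

```python
# VRAM figures (GB) copied from the table above.
VRAM_GB = {
    "Q4_K_M": 138,
    "Q5_K_M": 165,
    "Q8_0": 246,
    "FP16": 458,
}

TOTAL_PARAMS = 229e9  # total parameter count from the overview


def weights_gb(bits_per_weight: float, params: float = TOTAL_PARAMS) -> float:
    """Raw weight size in decimal GB for a given average bits per weight."""
    return params * bits_per_weight / 8 / 1e9


def best_fit(available_gb: float, headroom_gb: float = 0.0):
    """Return the highest-VRAM (highest-precision) quantization that fits.

    headroom_gb is a hypothetical allowance for longer contexts;
    returns None when nothing fits.
    """
    fitting = [
        (need, name)
        for name, need in VRAM_GB.items()
        if need + headroom_gb <= available_gb
    ]
    return max(fitting)[1] if fitting else None
```

For example, `best_fit(192)` selects `Q5_K_M`, while `best_fit(192, headroom_gb=40)` falls back to `Q4_K_M`. Note that `weights_gb(16)` gives 458.0, the same as the FP16 row, so that row leaves little visible room for KV cache; treat all figures as estimates.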
Published benchmark scores
| Benchmark | Score (%) |
|---|---|
| SWE-Bench Pro | 56.22 |
| Terminal Bench | 57 |
Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.
Strengths
- State-of-the-art open agentic performance
- 56.22% SWE-Bench Pro and 57% Terminal Bench
- Only 10B of 229B parameters are active, which keeps inference fast
- Clean Apache 2.0 license
- #1 trending on Hugging Face at launch
Limitations
- 138 GB or more of VRAM at Q4_K_M demands server-class hardware
- Verbose output in agent mode inflates token costs
- Newer release means thinner tooling ecosystem
Architecture & training
Architecture: MoE 229B/10B active · 205k ctx · self-evolving agent
Training: Open-sourced April 12, 2026. Successor to M2.5.
The most exciting agentic open-weight model of 2026, if your hardware can host 229B parameters.
Quick start
# HuggingFace (GGUF): unsloth/MiniMax-M2.7-GGUF
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
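One possible way to fetch a single quantization from the GGUF repository named in the quick start is the `huggingface_hub` library's `snapshot_download` function with an `allow_patterns` filter. The sketch below assumes that file names in the repository contain the quantization tag (for example `Q4_K_M`); that naming, the helper names, and the `local_dir` default are assumptions, so check the repository's file list before relying on them.

```python
from fnmatch import fnmatchcase

REPO_ID = "unsloth/MiniMax-M2.7-GGUF"  # repository named in the quick start


def quant_pattern(quant: str) -> str:
    """Glob for files whose path contains the quantization tag (assumed naming)."""
    return f"*{quant}*"


def matches(path: str, quant: str) -> bool:
    """Check locally whether a repository file path would be selected."""
    return fnmatchcase(path, quant_pattern(quant))


def download_quant(quant: str = "Q4_K_M", local_dir: str = "models/minimax-m2.7") -> str:
    """Download only the files for one quantization; returns the local folder.

    Requires `pip install huggingface_hub`. Imported lazily so the helpers
    above work without the dependency installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[quant_pattern(quant)],
        local_dir=local_dir,
    )
```

Calling `download_quant("Q4_K_M")` would pull roughly the 138 GB listed for that quantization, so confirm free disk space first.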