Model fiche
Command R+ 104B (08-2024)
By Cohere · Canada
chat
general
multilingual
Overview
Cohere's 104B-parameter flagship for RAG and tool use, released in August 2024, with a 128K context window and support for 23 languages. The weights are licensed CC-BY-NC, so commercial use requires a separate agreement with Cohere.
When to pick this model
- You're building research or internal-only RAG systems
- You need top-tier tool-use behavior in an open-weights model
- You need broad multilingual coverage across 23 languages
- You're evaluating before signing a commercial agreement with Cohere
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 60 GB |
| Q5_K_M | 72 GB |
| Q8_0 | 110 GB |
| FP16 (no quantization) | 208 GB |
VRAM figures include model weights, the KV cache for a typical 8K-token context, and about 600 MB of runtime overhead (Ollama / llama.cpp baseline). Add headroom for longer contexts.
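To make the table easier to apply, here is a minimal Python sketch that picks the highest-precision quantization fitting a given VRAM budget. The numbers are copied from the table above; the function name `best_quantization` and the `headroom_gb` parameter are illustrative assumptions, not part of any tool.

```python
# VRAM requirements in GB, copied from the table above.
VRAM_GB = {
    "Q4_K_M": 60,
    "Q5_K_M": 72,
    "Q8_0": 110,
    "FP16": 208,
}


def best_quantization(available_gb, headroom_gb=0.0):
    """Return the largest quantization that fits, or None if none does.

    Assumes a larger VRAM requirement means higher precision, which
    holds for the four rows in the table. headroom_gb is a hypothetical
    reserve for longer contexts than the table's 8K baseline.
    """
    budget = available_gb - headroom_gb
    fitting = [(need, name) for name, need in VRAM_GB.items() if need <= budget]
    return max(fitting)[1] if fitting else None
```

For example, `best_quantization(80)` selects Q5_K_M, while reserving 10 GB of headroom on the same 80 GB budget drops the choice to Q4_K_M. A budget below 60 GB returns `None`.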
Strengths
- Best-in-class RAG and tool use among open-weights models at release
- 128K context window
- Coverage of 23 languages
- Higher throughput and lower latency than the April 2024 release
Limitations
- CC-BY-NC 4.0 — no commercial use without a separate license
- Needs 60 GB or more of VRAM even at Q4_K_M
- Surpassed by newer 100B-class models on general benchmarks
Architecture & training
Architecture: Dense model with grouped-query attention (GQA), optimized for RAG and tool use.
Training: Covers 23 languages. Compared with the April 2024 version, this release delivers about 50% higher throughput and 25% lower latency.
Verdict
Strong RAG and tool-use, but the non-commercial license rules it out of most production deployments.
Quick start
ollama run command-r-plus:104b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
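Besides the interactive `ollama run` session, a locally running Ollama server can be queried over HTTP. The sketch below assumes Ollama is listening on its default port (11434) and that the `command-r-plus:104b` tag has already been pulled. The helper names `build_request` and `generate` are illustrative only.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "command-r-plus:104b"


def build_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize retrieval-augmented generation in one sentence.")` returns the model's reply as a string. Setting `"stream": False` keeps the example simple by requesting a single JSON object rather than a stream of chunks.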