Model fiche
Command R 35B v01
By Cohere · Canada
chat
general
multilingual
Overview
Cohere's original Command R, a 35B-parameter model optimized for RAG and tool use across 10 languages with a 128k context window, but released under CC-BY-NC, which restricts it to non-commercial use.
When to pick this model
- Research projects exploring early open RAG-native models
- Internal evaluations and prototyping with no commercial intent
- Tool-use experiments needing 128k context
- Multilingual RAG benchmarking across 10 languages
- Comparisons against successor Command R+ 104B
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 20 GB |
| Q5_K_M | 25 GB |
| Q8_0 | 37 GB |
| FP16 (no quantization) | 70 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
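To make the table easier to apply, here is a minimal sketch that picks the largest quantization fitting a given VRAM budget. The figures are copied from the table above and therefore assume an 8k context; the function name `pick_quantization` and the `headroom_gb` parameter are illustrative assumptions, not part of any tool.

```python
# VRAM figures in GB, copied from the table above (8k KV cache included).
VRAM_GB = {
    "Q4_K_M": 20,
    "Q5_K_M": 25,
    "Q8_0": 37,
    "FP16": 70,
}


def pick_quantization(available_gb, headroom_gb=0.0):
    """Return the largest quantization that fits, or None if none fits.

    headroom_gb is a hypothetical safety margin to reserve for longer
    contexts; the table itself gives no figure for it.
    """
    budget = available_gb - headroom_gb
    # Keep only the entries whose requirement fits inside the budget.
    fitting = [(gb, name) for name, gb in VRAM_GB.items() if gb <= budget]
    if not fitting:
        return None
    # The entry with the highest VRAM requirement is the least compressed.
    return max(fitting)[1]
```

For example, a 24 GB card maps to Q4_K_M, while a 16 GB card returns `None` because no listed quantization fits.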
Strengths
- First open model designed natively for RAG and tool use
- 128k context for long retrieval pipelines
- 10 evaluated languages, 23 in pretraining
- Strong citation and grounding behavior
Limitations
- CC-BY-NC 4.0 license blocks commercial deployment
- Superseded by Command R+ 104B for production quality
- No multimodal capabilities
Architecture & training
Architecture: Dense 35B · optimized for RAG and tool-use · GQA
Training: evaluated on 10 languages; 23 languages included in pretraining.
Verdict
Historically important but commercially off-limits — choose it only for research, and reach for Command R+ everywhere else.
Quick start
ollama run command-r:35b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
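For scripted access rather than the interactive CLI, the following is a minimal sketch that calls Ollama's local REST endpoint `/api/generate`. It assumes an Ollama server is running on the default port 11434 and that the `command-r:35b` tag has already been pulled; the helper names `build_request` and `generate` are illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "command-r:35b"


def build_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        body = json.loads(resp.read())
    # With "stream": False, the full completion arrives in the "response" field.
    return body["response"]

# Example call, once the server is up:
#   print(generate("Explain retrieval-augmented generation in one sentence."))
```

Keeping request construction separate from the network call makes the payload easy to inspect without a server running.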