Model card
Aya 23 35B
By Cohere For AI · Canada
chat
general
multilingual
Overview
Cohere For AI's 35B multilingual model, released before Aya Expanse and built on the Command base. It covers 23 languages with strong instruction following, but its license restricts it to non-commercial use.
When to pick this model
- Research baselines for multilingual instruction following
- Non-commercial multilingual chat in low-resource languages
- Comparisons against its successor, Aya Expanse 32B
- Academic evaluation across 23 languages
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 20 GB |
| Q5_K_M | 25 GB |
| Q8_0 | 37 GB |
| FP16 (no quantization) | 70 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
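As a rough illustration of how the table might be used, the sketch below picks the largest quantization that fits a given VRAM budget. The figures are copied from the table above; the function name `best_quantization` and the `headroom_gb` parameter are hypothetical, introduced only for this example.

```python
from typing import Optional

# VRAM requirements in GB, copied from the table above
# (weights + typical 8k KV cache + ~600 MB runtime overhead).
VRAM_GB = {
    "Q4_K_M": 20,
    "Q5_K_M": 25,
    "Q8_0": 37,
    "FP16": 70,
}


def best_quantization(available_gb: float, headroom_gb: float = 0.0) -> Optional[str]:
    """Return the largest quantization that fits, or None if none does.

    headroom_gb is a hypothetical safety margin for other processes
    or anything else that shares the GPU.
    """
    fitting = [
        (need, name)
        for name, need in VRAM_GB.items()
        if need + headroom_gb <= available_gb
    ]
    # The entry with the highest VRAM need is the least aggressive quantization.
    return max(fitting)[1] if fitting else None
```

For example, `best_quantization(24)` selects `Q4_K_M`, while a 16 GB card returns `None`, meaning none of the listed quantizations fits entirely in VRAM.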
Strengths
- Strong native quality across 23 languages
- Good instruction following in non-English settings
- Backed by Cohere's Command base architecture
- Competitive multilingual coverage for its era
Limitations
- CC-BY-NC 4.0 license blocks commercial deployment
- ~20 GB VRAM at Q4 with only 8k context
- Reasoning capabilities lag 2025-class open models
Architecture & training
Architecture: Dense · 35B · Cohere Command R backbone · 23 native languages
Training: Cohere For AI — 23 languages including FR/AR/ZH, multilingual instruction data.
Verdict
A strong pre-Expanse multilingual 35B — useful for research, but Aya Expanse and modern peers have moved past it.
Quick start
ollama run aya:35b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
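Once the model has been pulled, it can also be queried programmatically. The following is a minimal sketch, assuming Ollama is running locally on its default port (11434) and exposing its `/api/generate` endpoint; the helper names `build_payload` and `ask` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "aya:35b"  # same tag as in the quick-start command


def build_payload(prompt: str) -> dict:
    """Build a non-streaming generate request for the model."""
    return {"model": MODEL, "prompt": prompt, "stream": False}


def ask(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read())
    # With "stream": False the server replies with a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

Calling `ask("Traduis en arabe : bonjour")` should return the model's reply as a string. The generous timeout is a guess, chosen because a 35B model can take a while to load on first use.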