Model card

Tiny Aya 3.35B

By Cohere For AI · Canada

chat multilingual small
Parameters: 3.35B
License: CC-BY-NC 4.0
Context: 8k
VRAM (Q4): 2.2 GB
Released: February 2026

Overview

A 3.35B-parameter model from Cohere For AI, released in five variants (Base, Global, Earth, Fire, Water) covering 70+ languages. The Water variant is tuned for Europe and APAC. The license is CC-BY-NC 4.0, so use is non-commercial only.

When to pick this model

  • Research on multilingual small models
  • Internal multilingual tooling for non-commercial use
  • Region-specific assistants via specialized variants
  • Educational and academic projects
  • Personal multilingual chat applications

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      2.2 GB
Q5_K_M                    2.7 GB
Q8_0                      3.8 GB
FP16 (no quantization)    7 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
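
As a rough illustration of how the table can be used, the sketch below picks the largest listed quantization that fits a given GPU. The VRAM figures are copied from the table above. The function name `pick_quantization` and the default 0.5 GB of headroom are assumptions made for this example, not values published for the model.

```python
# VRAM figures (GB) copied from the table above, ordered smallest to largest.
VRAM_GB = [
    ("Q4_K_M", 2.2),
    ("Q5_K_M", 2.7),
    ("Q8_0", 3.8),
    ("FP16", 7.0),
]


def pick_quantization(available_gb, headroom_gb=0.5):
    """Return the largest listed quantization that fits, or None if none does.

    headroom_gb is an assumed safety margin (hypothetical default), meant to
    cover longer contexts or other processes sharing the GPU.
    """
    budget = available_gb - headroom_gb
    best = None
    for name, required_gb in VRAM_GB:
        # The list is sorted by size, so the last match is the largest fit.
        if required_gb <= budget:
            best = name
    return best
```

For example, `pick_quantization(4.0)` leaves a 3.5 GB budget and so selects Q5_K_M, while a 2 GB card returns `None` because even Q4_K_M does not fit once headroom is reserved.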

Strengths

  • Best multilingual quality in the tiny tier
  • Five regional variants targeting specific markets
  • Native coverage of 70+ languages
  • Backed by Cohere For AI research

Limitations

  • CC-BY-NC 4.0 blocks commercial deployment
  • 8k context is limiting for long-form work
  • Quality varies across regional variants

Architecture & training

Architecture: Dense 3.35B · 5 regional variants (Base/Global/Earth/Fire/Water)

Training: 70+ languages. Water = Europe + APAC (FR).

Verdict

The strongest tiny multilingual model when commercial use is off the table.

Quick start

# Hugging Face: CohereLabs/tiny-aya-base (no official Ollama tag yet)
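
Since there is no Ollama tag yet, one possible route is to load the checkpoint through the Hugging Face `transformers` library. The following is a minimal sketch, not an official recipe: it assumes that `transformers` and PyTorch are installed, that the repository loads with the generic `AutoTokenizer` and `AutoModelForCausalLM` classes, and that any license terms on the repository have been accepted. The helper name `complete` is hypothetical.

```python
MODEL_ID = "CohereLabs/tiny-aya-base"  # repository name from the line above


def complete(prompt, max_new_tokens=64):
    """Generate a plain-text continuation of `prompt` (base model, no chat template)."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint is compatible with the generic Auto classes.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("Bonjour, je m'appelle")` downloads the weights on first use and runs in full precision on CPU by default, so expect memory use closer to the unquantized row of the VRAM table than to the Q4 figure.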

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
