Model fiche
Helium 1 2B
By Kyutai · France
Tags: chat · general · multilingual · fr · small
Overview
Kyutai's 2B-parameter multilingual base model covers all 24 EU languages. It was distilled from Gemma 2, so the Gemma Terms apply on top of the CC-BY-SA 4.0 license. It beats Qwen 2.5 1.5B, Gemma 2B, and Llama 3.2 3B at its scale.
When to pick this model
- You need a small multilingual base for fine-tuning across EU languages
- You're building edge or embedded deployments with French as a priority
- You want a European base model with strong sub-3B performance
- You're doing pre-training research and need a clean small foundation
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 1.5 GB |
| Q5_K_M | 2 GB |
| Q8_0 | 3 GB |
| FP16 (no quantization) | 5 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
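As a rough aid for reading the table, the sketch below picks the highest-precision quantization whose listed figure fits a given VRAM budget. The numbers are copied from the table. The function name `pick_quantization` and the `headroom_gb` parameter are hypothetical, introduced only for illustration.

```python
# VRAM figures (GB) copied from the table above, ordered from smallest to largest.
VRAM_GB = {
    "Q4_K_M": 1.5,
    "Q5_K_M": 2.0,
    "Q8_0": 3.0,
    "FP16": 5.0,
}


def pick_quantization(available_gb, headroom_gb=0.0):
    """Return the highest-precision option whose listed VRAM fits, or None.

    headroom_gb is an illustrative knob for reserving extra memory,
    e.g. for context lengths beyond the 8k assumed by the table.
    """
    budget = available_gb - headroom_gb
    fitting = [name for name, need in VRAM_GB.items() if need <= budget]
    # Dicts keep insertion order, so the last fitting entry is the largest one.
    return fitting[-1] if fitting else None


print(pick_quantization(4.0))
print(pick_quantization(4.0, headroom_gb=1.5))
```

With a 4 GB budget this selects Q8_0. Reserving 1.5 GB of headroom drops the choice to Q5_K_M.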
Strengths
- Compact multilingual base from a European lab
- Covers all 24 EU languages
- Beats Qwen 2.5 1.5B, Gemma 2B, and Llama 3.2 3B at its scale
- Built by Kyutai
Limitations
- Dual licensing: CC-BY-SA 4.0 plus the Gemma Terms, which carry over through distillation from Gemma 2
- Base model — not instruction-tuned
- No official Ollama support
Architecture & training
Architecture: Dense · GQA · RoPE · distilled from Gemma 2
Training: 2.5T tokens, 24 EU languages.
Verdict
A strong European small base for fine-tuning — just budget for the dual-license obligations.
Quick start
# Hugging Face: kyutai/helium-1-2b
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
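For local use through Hugging Face, a minimal sketch with the `transformers` library might look like the following. It assumes a `transformers` release recent enough to include Helium support, plus `torch`. The helper name `complete` is hypothetical. Because this is a base model, the prompt is a plain text prefix to be continued rather than a chat message.

```python
MODEL_ID = "kyutai/helium-1-2b"  # repository name from the quick-start line


def complete(prompt, max_new_tokens=40):
    """Continue `prompt` with the base model (plain completion, no chat template)."""
    # Imported lazily so the heavy dependencies load only when generation is requested.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 halves memory relative to float32; this is an assumption about
    # what the hardware supports, not a documented requirement.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("La capitale de la France est")` downloads the weights on first use and returns the prompt followed by the model's continuation.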