Model card
Rakuten AI 3.0
By Rakuten · Japan
chat
general
multilingual
moe
Overview
Rakuten's flagship ~700B-parameter mixture-of-experts (MoE) model, built under Japan's GENIAC program and released under Apache 2.0. It targets top-tier Japanese-language performance and reflects Rakuten's enterprise e-commerce focus.
When to pick this model
- Japanese-language production workloads (support, content, search)
- Bilingual JP/EN e-commerce and retail applications
- Enterprise deployments needing Apache 2.0 licensing
- Replacing closed Japanese-language models served through proprietary APIs (for example, Rakuten or LINE)
- Localization pipelines targeting the Japanese market
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 420 GB |
| Q5_K_M | 500 GB |
| Q8_0 | 745 GB |
| FP16 (no quantization) | 1400 GB |
VRAM figures cover model weights, a typical 8k-token KV cache, and ~600 MB of runtime overhead (Ollama / llama.cpp baseline). Longer contexts need extra headroom because the KV cache grows with context length.
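As a rough cross-check of the table above, the weight footprint can be approximated as parameters times bits per weight. The sketch below is illustrative only: the bits-per-weight values are assumed ballpark averages for llama.cpp formats, `estimate_vram_gb` is a hypothetical helper, and the KV cache is left as a caller-supplied parameter because the fiche does not give the layer or head counts needed to compute it.

```python
# Assumed approximate average bits per weight for common llama.cpp formats.
# These are ballpark values chosen for illustration, not figures from the fiche.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5, "FP16": 16.0}


def estimate_vram_gb(params_billion, bits_per_weight, kv_cache_gb=0.0, overhead_gb=0.6):
    """Return a rough VRAM estimate in GB (1 GB = 1e9 bytes).

    params_billion:  total parameter count in billions (700 for this model)
    bits_per_weight: average storage cost of one weight in bits
    kv_cache_gb:     KV cache size, supplied by the caller (architecture-dependent)
    overhead_gb:     runtime overhead; 0.6 mirrors the ~600 MB baseline quoted above
    """
    weights_gb = params_billion * bits_per_weight / 8.0
    return weights_gb + kv_cache_gb + overhead_gb


# Print a rough estimate for each format at 700B total parameters.
for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimate_vram_gb(700, bpw):.0f} GB")
```

With these assumed values the estimates land close to the table figures. Pass a larger `kv_cache_gb` to model the extra headroom needed for longer contexts.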
Strengths
- Top-tier Japanese fluency, beating most open models on JP benchmarks
- 700B total parameters give it broad knowledge depth
- Apache 2.0: commercial use permitted, subject to the license's notice and attribution terms
- Backed by Rakuten's massive e-commerce and fintech corpus
- Built under Japan's GENIAC sovereign-AI initiative
Limitations
- Roughly 420 GB of VRAM at Q4_K_M, so multi-GPU datacenter hardware only
- 32k context is tight versus modern 128k+ flagships
- Heavily skewed toward Japanese and commerce; weaker on global general tasks
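To make the "datacenter-only" point concrete, the sketch below turns the VRAM figures from the quantization table into a minimum GPU count. The 80 GB card size and the 90% usable fraction are assumptions for illustration, `gpus_needed` is a hypothetical helper, and real deployments also depend on interconnect and how the runtime shards the model.

```python
import math

# VRAM requirements in GB, copied from the quantization table in this fiche.
TABLE_VRAM_GB = {"Q4_K_M": 420, "Q5_K_M": 500, "Q8_0": 745, "FP16": 1400}


def gpus_needed(vram_gb, gpu_memory_gb=80.0, usable_fraction=0.9):
    """Return the minimum number of GPUs whose usable memory covers vram_gb.

    gpu_memory_gb and usable_fraction are assumed values: an 80 GB card with
    roughly 10% of memory reserved for buffers and fragmentation.
    """
    usable_gb = gpu_memory_gb * usable_fraction
    return math.ceil(vram_gb / usable_gb)


# Print the minimum GPU count for each quantization level.
for name, vram_gb in TABLE_VRAM_GB.items():
    print(f"{name}: at least {gpus_needed(vram_gb)} GPUs")
```

Under these assumptions, even the recommended Q4_K_M build needs at least 6 such GPUs, and FP16 needs 20.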
Architecture & training
Architecture: MoE · 700B total · Rakuten AI 3 · 32k context
Training: by Rakuten, on a large Japanese/English corpus oriented toward e-commerce and enterprise use.
Verdict
The default open-weight choice for Japanese enterprise; overkill and underspecialized for anyone else.
Quick start
# Heavy infrastructure required; not available on a standard local setup
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.