Model fiche
Jais Adapted 70B Chat
By MBZUAI / Core42 · UAE
chat
general
multilingual
Overview
MBZUAI and Core42's Llama-2 70B extended with 32k Arabic tokens and GQA — the strongest open-weight Arabic LLM, reaching GPT-4-class quality in Arabic.
When to pick this model
- Production Arabic workloads needing top open quality
- Arabic legal, medical, or technical content generation
- Bilingual Arabic-English assistants at enterprise scale
- MENA sovereign deployments with multi-GPU budgets
- Replacing GPT-4 for Arabic-heavy regulated use cases
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 40 GB |
| Q5_K_M | 48 GB |
| Q8_0 | 75 GB |
| FP16 (no quantization) | 140 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
Strengths
- Strongest open-weight Arabic model available
- GPT-4-level performance in Arabic
- Jais license permits commercial use
- GQA improves inference efficiency at 70B scale
Limitations
- ~40 GB VRAM at Q4
- 4096-token context is restrictive for long documents
- Limited capability outside Arabic and English
Architecture & training
Architecture: Dense · 70B · specialized in Arabic + English · MBZUAI/Core42 UAE
Training: MBZUAI/Core42 — 395B token corpus of native Arabic + high-quality English.
Verdict
The clear top pick for Arabic at 70B — choose it when GPT-4-grade Arabic must run on your own hardware.
Quick start
ollama pull hf.co/inceptionai/jais-instruct-GGUFOr use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.