BestLLMfor EN Your hardware. Your LLM. Your call.
APIOpen data Find my LLM
Model fiche

Sarvam-M 24B

By Sarvam AI · India

chat general reasoning multilingual
Parameters
24B
License
Apache 2.0
Context
32k
VRAM (Q4)
14 GB
Released
May 2025

Overview

Sarvam AI's 24B built on Mistral Small 3.1 with hybrid think/no-think modes, gaining +86% on romanized GSM-8K Indic and covering 11 Indian languages plus English.

When to pick this model

  • Indic-language chat and content across 11 Indian languages
  • Math-heavy workloads in romanized Indic scripts
  • Hybrid reasoning where toggleable thinking helps
  • Sovereign Indian deployments needing open weights
  • Replacing closed APIs for Indian-market products

VRAM requirements by quantization

QuantizationVRAM required
Q4_K_M (recommended)14 GB
Q5_K_M17 GB
Q8_026 GB
FP16 (no quantization)48 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.

Strengths

  • +86% gain on romanized Indic GSM-8K
  • Hybrid think/no-think mode toggle
  • 11 Indian languages plus English
  • Apache 2.0 with permissive commercial use
  • Mistral Small 3.1 base brings solid general quality

Limitations

  • No official Ollama distribution yet
  • Strong Indic focus limits broader multilingual use
  • Smaller community ecosystem than Mistral mainline

Architecture & training

Architecture: Dense 24B · Mistral Small 3.1 base · hybrid think/non-think

Training: 11 Indian languages + EN.

Verdict

The top open model for Indic markets — pick it when you need real Indian-language coverage with hybrid reasoning.

Quick start

# HuggingFace : sarvamai/sarvam-m

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.

Tools

Is Sarvam-M 24B the right pick for you?

Compute self-hosted ROI → Back to catalog