BestLLMfor EN Your hardware. Your LLM. Your call.
APIOpen data Find my LLM
Model fiche

Hunyuan-A13B Instruct

By Tencent · China

chat general reasoning moe
Parameters
80B
License
Tencent Hunyuan License
Context
256k
VRAM (Q4)
48 GB
Released
June 2025

Overview

Tencent's fine-grained MoE activating 13B of 80B parameters, with dual fast/slow thinking modes and a 256k context. Released under Tencent's custom Hunyuan license.

When to pick this model

  • Reasoning-heavy tasks needing toggleable thinking modes
  • Long-context analysis up to 256k tokens
  • Cost-sensitive deployment of a frontier-class MoE
  • Chinese-language production workloads

VRAM requirements by quantization

QuantizationVRAM required
Q4_K_M (recommended)48 GB
Q5_K_M57 GB
Q8_085 GB
FP16 (no quantization)160 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.

Strengths

  • Competitive with o1 and DeepSeek on mainstream benchmarks
  • Native 256k context
  • Dual fast/slow thinking for latency-quality tradeoffs
  • Only 13B active parameters keeps inference cheap

Limitations

  • Tencent Hunyuan license has commercial restrictions
  • No official Ollama distribution
  • Tooling support trails Qwen and Llama

Architecture & training

Architecture: Fine-grained MoE · 80B/13B active · dual fast/slow thinking

Training: 256k native ctx.

Verdict

Frontier-tier MoE reasoning at a manageable active-parameter count, held back mainly by the custom Tencent license.

Quick start

# HuggingFace : tencent/Hunyuan-A13B-Instruct

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.

Tools

Is Hunyuan-A13B Instruct the right pick for you?

Compute self-hosted ROI → Back to catalog