Model fiche

Pangu Pro MoE 72B

By Huawei · China

Tags: chat, general, moe
Parameters: 72B
License: Pangu Model License
Context: 32k
VRAM (Q4): 42 GB
Released: April 2025

Overview

Huawei's first open-weight release, a 72B MoE optimized for Ascend silicon. Strong on enterprise code and Chinese business scenarios, but the custom Pangu license needs careful review.

When to pick this model

  • Deployments already running on Huawei Ascend hardware
  • Enterprise code and business workflows in Chinese markets
  • Research on non-NVIDIA training and inference stacks
  • Workloads where Huawei's ecosystem integration matters

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      42 GB
Q5_K_M                    50 GB
Q8_0                      78 GB
FP16 (no quantization)    144 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
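
The breakdown above (weights plus KV cache plus runtime overhead) can be approximated with a few lines of Python. This is a rough sketch, not the method used to produce the table: the bits-per-weight values are rounded assumptions for llama.cpp-style quantization formats, the function name is hypothetical, and the KV cache size is left as an input because it depends on layer count, KV heads, and context length, none of which are given on this page.

```python
# Rough VRAM estimate: weights + KV cache + runtime overhead.
# The bits-per-weight values are rounded assumptions for illustration,
# not figures taken from this page.
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.5,
    "Q5_K_M": 5.5,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def estimate_vram_gb(params_billion, bits_per_weight, kv_cache_gb=0.0, overhead_gb=0.6):
    """Return an approximate VRAM need in GB (1 GB = 1e9 bytes)."""
    # Billions of parameters times bytes per parameter gives GB of weights.
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + kv_cache_gb + overhead_gb


if __name__ == "__main__":
    for name, bpw in APPROX_BITS_PER_WEIGHT.items():
        # kv_cache_gb=1.0 is a placeholder; measure or compute it for your setup.
        print(f"{name}: ~{estimate_vram_gb(72, bpw, kv_cache_gb=1.0):.1f} GB")
```

The results should land near the table values but will not match them exactly, since real quantized files mix tensor types and the actual KV cache size differs from the placeholder.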

Strengths

  • First-class optimization for Ascend NPUs
  • Solid enterprise code and business reasoning
  • Open weights from a major hyperscaler
  • MoE design activates only a subset of parameters per token, which keeps inference compute tractable

Limitations

  • Needs around 42 GB VRAM even at Q4, more than a single 24 GB or 32 GB consumer GPU offers
  • 32k context trails modern flagships
  • Custom Pangu license requires legal review
  • Tooling outside Huawei's stack is thin

Architecture & training

Architecture: MoE · 72B · Huawei PanGu Pro · proprietary architecture

Training: by Huawei, specialized in enterprise code and Chinese (CN) business scenarios.

Verdict

A reasonable pick if you're on Ascend; on NVIDIA hardware, Qwen 3.5 or DeepSeek will serve you better.

Quick start

ollama pull hf.co/huawei/pangu-pro-moe-72b-GGUF
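
Once the pull has finished, the model can be queried through Ollama's local REST API. The following is a minimal sketch, assuming an Ollama server running on its default address (localhost:11434) and the model tag from the pull command above; the helper names build_request and generate are hypothetical.

```python
import json
import urllib.request

# Model tag taken from the pull command above; host and port are Ollama's defaults.
MODEL = "hf.co/huawei/pangu-pro-moe-72b-GGUF"
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming /api/generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Calling generate("Summarize this contract clause in one sentence.") returns the model's reply as a string, provided the server is running and the model is loaded.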

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
