Model fiche

HunyuanOCR 1B

By Tencent · China

vision · chat · small
Parameters: 1B
License: Tencent Hunyuan License
Context: 8k
VRAM (Q4): 0.8 GB
Released: March 2025

Overview

Tencent's 1B end-to-end OCR model that outperforms 235B general VLMs on document tasks. Engineered for edge and mobile deployment.

When to pick this model

  • On-device or mobile OCR with strict memory budgets
  • High-throughput batch OCR where latency matters
  • Receipt, invoice, and form processing at scale
  • Embedded systems and edge gateways
  • Cost-sensitive OCR pipelines replacing cloud APIs

VRAM requirements by quantization

Quantization            VRAM required
Q4_K_M (recommended)    0.8 GB
Q5_K_M                  1 GB
Q8_0                    1.5 GB
FP16 (no quantization)  2 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
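
As a rough illustration of how the figures above might be used, the sketch below picks the highest-precision quantization that fits a given amount of free VRAM. The numbers are copied from the table; the function name `pick_quantization` and the `headroom_gb` parameter are hypothetical and not part of any tool mentioned on this page.

```python
# VRAM figures copied from the table above, in GB
# (weights + 8k KV cache + runtime overhead, as stated on this page).
VRAM_GB = {
    "Q4_K_M": 0.8,
    "Q5_K_M": 1.0,
    "Q8_0": 1.5,
    "FP16": 2.0,
}


def pick_quantization(free_vram_gb, headroom_gb=0.0):
    """Return the highest-precision quantization that fits, or None if none fit.

    headroom_gb is a hypothetical safety margin reserved for other processes.
    """
    budget = free_vram_gb - headroom_gb
    # Keep every quantization whose listed requirement fits in the budget.
    fitting = [(need, name) for name, need in VRAM_GB.items() if need <= budget]
    if not fitting:
        return None
    # In this table a larger footprint means higher precision, so take the largest.
    return max(fitting)[1]


print(pick_quantization(1.2))  # → Q5_K_M
```

With 2 GB free and 0.6 GB reserved as headroom, the same call would fall back to Q5_K_M, since Q8_0 needs 1.5 GB.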

Strengths

  • Runs in under 1 GB VRAM at Q4
  • Beats 200B+ general VLMs on document benchmarks
  • End-to-end model — no separate detection/recognition stages
  • Latency low enough for real-time mobile use

Limitations

  • The 1B parameter count limits accuracy on noisy or complex layouts
  • 8k context limits multi-page workflows
  • Tencent Hunyuan License is custom — review before commercial use

Architecture & training

Architecture: Dense vision · 1B · Tencent Hunyuan OCR ultra-compact

Training: Trained by Tencent for text extraction from scanned documents and images; this is the ultra-compact version.

Verdict

The OCR model to pick when every megabyte counts; for messy real-world documents, step up to DeepSeek-OCR.

Quick start

ollama pull hf.co/tencent/Hunyuan-OCR-1B-GGUF

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
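
Once the model is pulled, a request to a local Ollama server might look like the sketch below. It assumes Ollama is running at its default address (`http://localhost:11434`), that the model tag matches the pull command above, and that this GGUF build accepts image input through the `images` field of Ollama's `/api/generate` endpoint; the page does not confirm the last point. The prompt text and the helper names `build_payload` and `run_ocr` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: default local Ollama endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the pull command above.
MODEL = "hf.co/tencent/Hunyuan-OCR-1B-GGUF"


def build_payload(image_bytes, prompt="Extract all text from this image."):
    """Build the JSON body for /api/generate with one base64-encoded image."""
    return {
        "model": MODEL,
        "prompt": prompt,  # hypothetical prompt; the page does not specify one
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # request a single JSON response instead of a stream
    }


def run_ocr(image_path):
    """Send one image to the local Ollama server and return the generated text."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `run_ocr("receipt.png")` (a hypothetical file name) would return the model's text output for that image, provided the server is running and the model loads.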
