Granite 4.0 3B Vision

By IBM · United States

Tags: vision, chat, small

Parameters: 3B
License: Apache 2.0
Context: 16k
VRAM (Q4): 2.2 GB
Released: March 2026

Overview

IBM's 3B vision-language model, purpose-built for enterprise document extraction: OCR, table parsing, and form understanding. It is released under Apache 2.0 and is small enough to run on a laptop.

When to pick this model

  • Enterprise document and form extraction pipelines
  • OCR replacement for invoices, receipts, and PDFs
  • Table structure understanding at scale
  • Apache-licensed on-prem document AI
  • Edge deployment for sensitive enterprise data

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      2.2 GB
Q5_K_M                    2.7 GB
Q8_0                      3.8 GB
FP16 (no quantization)    6.5 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
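
As a rough illustration of how the figures above might be used, the sketch below picks the highest-precision quantization that fits a given amount of free VRAM. The VRAM values are copied from the table; the function name `pick_quantization` and the `headroom_gb` parameter (default 0.5 GB) are hypothetical choices for this example, not part of any tool.

```python
# VRAM figures in GB, copied from the table above
# (weights + typical 8k KV cache + ~600 MB runtime overhead).
VRAM_GB = {
    "Q4_K_M": 2.2,
    "Q5_K_M": 2.7,
    "Q8_0": 3.8,
    "FP16": 6.5,
}


def pick_quantization(available_gb, headroom_gb=0.5):
    """Return the largest listed quantization that fits, or None.

    headroom_gb is an assumed safety margin for longer contexts;
    0.5 GB is an illustrative default, not a measured value.
    """
    fitting = [
        (required, name)
        for name, required in VRAM_GB.items()
        if required + headroom_gb <= available_gb
    ]
    # max() compares on required VRAM first, so the most demanding
    # quantization that still fits is returned.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for free in (2.0, 4.0, 8.0):
        print(f"{free} GB free: {pick_quantization(free)}")
```

Raising `headroom_gb` is the simplest way to account for contexts longer than the 8k baseline the table assumes.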

Strengths

  • Fast, accurate enterprise OCR
  • Strong table and form-field extraction
  • Apache 2.0 license
  • Runs comfortably on a laptop

Limitations

  • 16K context limits multi-page documents
  • English-first, weak on non-Latin scripts
  • Narrow scope, not a general-purpose VLM
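
One common way to work around the context limit on multi-page documents is to process pages in batches that each fit the window. The sketch below is a minimal greedy batcher under stated assumptions: the per-page token counts are supplied by the caller (how they are measured depends on the runtime), `context_limit=16_000` is a conservative reading of the 16k figure, and `reserved=2_000` is an arbitrary allowance for the prompt and the model's output. The function name `batch_pages` is hypothetical.

```python
def batch_pages(page_tokens, context_limit=16_000, reserved=2_000):
    """Greedily group consecutive pages so each batch fits the budget.

    page_tokens: list of estimated token counts, one per page.
    Returns a list of batches, each a list of page indices.
    """
    budget = context_limit - reserved
    batches, current, used = [], [], 0
    for index, tokens in enumerate(page_tokens):
        if tokens > budget:
            # A single page that exceeds the budget cannot be batched.
            raise ValueError(
                f"page {index} needs {tokens} tokens, budget is {budget}"
            )
        if used + tokens > budget:
            # Current batch is full: close it and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += tokens
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    print(batch_pages([6000, 6000, 6000, 3000]))
```

Batching keeps pages in order, but fields that span a batch boundary (for example a table continued on the next page) would still need to be merged downstream.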

Architecture & training

Architecture: Dense 3B VLM · specialized for enterprise documents

Training: IBM Granite 4.0 family.

Verdict

The best small VLM for enterprise document workflows under an Apache license.

Quick start

# Hugging Face: ibm-granite/granite-4.0-3b-vision

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
