
Qwen 3 VL 235B-A22B

By Alibaba · China

Tags: vision, chat, general, moe, multilingual

Parameters: 235B
License: Apache 2.0
Context: 256k
VRAM (Q4): 142 GB
Released: May 2025

Overview

Alibaba's flagship Qwen 3 vision model: 235B MoE with 22B active parameters and a native 256k context that extends to 1M. The current open-weight vision leader.

When to pick this model

  • Best-in-class open vision performance
  • Long-context multimodal analysis (256k native, 1M extended)
  • Document, chart, and video understanding at scale
  • Apache-licensed alternative to closed multimodal APIs

VRAM requirements by quantization

Quantization             VRAM required
Q4_K_M (recommended)     142 GB
Q5_K_M                   170 GB
Q8_0                     250 GB
FP16 (no quantization)   470 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
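As a rough cross-check of the table, the figures can be converted back into an effective bits-per-weight value, and the "add headroom" advice can be sized with the standard KV-cache formula for grouped-query attention. This is a minimal sketch, not the site's actual method. It treats each table figure as if it were all weights, so the result is an upper bound. The layer count, KV-head count, and head dimension in the example call are illustrative placeholders, not values confirmed from the model's published config.

```python
PARAMS = 235e9  # total parameters, from the fiche

# VRAM figures from the table above, in decimal GB.
TABLE_GB = {"Q4_K_M": 142, "Q5_K_M": 170, "Q8_0": 250, "FP16": 470}


def effective_bits_per_weight(vram_gb: float, params: float = PARAMS) -> float:
    """Upper bound on bits per weight: treats the whole figure as weights."""
    return vram_gb * 1e9 * 8 / params


def kv_cache_bytes(tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """KV-cache size: one K and one V tensor per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens


for name, gb in TABLE_GB.items():
    print(f"{name}: ~{effective_bits_per_weight(gb):.2f} bits/weight")

# Placeholder architecture values (assumptions), FP16 cache entries.
for ctx in (8_192, 262_144):
    gb = kv_cache_bytes(ctx, n_layers=94, n_kv_heads=4, head_dim=128) / 1e9
    print(f"{ctx} tokens: ~{gb:.1f} GB of KV cache")
```

The cache grows linearly with context length, so moving from the 8k baseline to the full native window multiplies the KV-cache share by 32 under these assumptions.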

Strengths

  • Top open-weight vision model as of May 2025
  • 256k (262,144-token) native context, extensible to 1M tokens
  • Apache 2.0 license
  • Only 22B of the 235B parameters are active per token, which keeps inference compute tractable

Limitations

  • Around 142 GB VRAM at Q4 — multi-GPU required
  • Heavier operational lift than dense alternatives
  • Overkill for simple captioning workloads
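The multi-GPU requirement can be sized with simple arithmetic. The sketch below divides a VRAM figure by the usable memory per card. The `usable_fraction` parameter is an assumption introduced here to leave room for activations and fragmentation, and the calculation ignores any extra overhead from splitting the model across devices.

```python
import math


def gpus_needed(vram_gb: float, gpu_gb: float, usable_fraction: float = 0.9) -> int:
    """Minimum number of identical GPUs whose usable memory covers vram_gb.

    usable_fraction is a hypothetical safety margin, not a measured value.
    """
    if vram_gb <= 0 or gpu_gb <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("sizes must be positive and usable_fraction in (0, 1]")
    return math.ceil(vram_gb / (gpu_gb * usable_fraction))


# Q4 figure from the fiche (142 GB) on 80 GB and 24 GB cards.
print(gpus_needed(142, 80))
print(gpus_needed(142, 24))
```

Treat the result as a lower bound: longer contexts add KV cache on top of the 142 GB baseline.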

Architecture & training

Architecture: MoE vision · 235B total / 22B active · Qwen3-VL flagship

Training: text, images, and video, with a 256k (262,144-token) native context.

Verdict

The open-vision benchmark to beat — if you can afford the GPUs, this is the model to deploy.

Quick start

ollama run qwen3-vl:235b
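Once the model is pulled, it can also be queried programmatically. The following is a minimal sketch that sends one image and a prompt to Ollama's local `/api/generate` endpoint using only the standard library. It assumes Ollama is listening on its default port (11434) and that the tag from the command above is available locally. The file name `chart.png` in the usage comment is hypothetical.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint (assumed)
MODEL_TAG = "qwen3-vl:235b"  # tag from the quick-start command


def build_payload(prompt: str, image_bytes: bytes) -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str, prompt: str = "Describe this image.") -> str:
    """Send the image at `path` to the local Ollama server and return its reply."""
    with open(path, "rb") as f:
        payload = build_payload(prompt, f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server and a local image file):
#   print(describe_image("chart.png", "Summarize the trend in this chart."))
```

Setting `"stream": False` returns a single JSON object instead of a stream of chunks, which keeps the example short.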

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
