Model fiche

Qwen 2.5 Coder 1.5B Instruct

By Alibaba · China

Tags: code, small
Parameters: 1.5B
License: Apache 2.0
Context: 32k
VRAM (Q4): 1 GB
Released: November 2024

Overview

One of Alibaba's smallest Qwen 2.5 Coder models, at 1.5B parameters under Apache 2.0, covering 92 programming languages. A HumanEval score of 70.7 makes it a serious on-device completion model.

When to pick this model

  • Local inline completion in IDE plugins
  • Edge devices and laptops without dedicated GPUs
  • Latency-critical code suggestions where 7B is too slow
  • Fallback model when bigger coders are unavailable

VRAM requirements by quantization

Quantization            VRAM required
Q4_K_M (recommended)    1 GB
Q5_K_M                  1.2 GB
Q8_0                    2 GB
FP16 (no quantization)  3 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
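
To make the "add headroom for higher context lengths" advice concrete, here is a rough sketch of how KV cache size grows with context. The layer count, KV head count, head dimension, and 16-bit cache precision below are assumptions for illustration and are not stated on this page; check the model's published config before relying on them.

```python
MIB = 1024 ** 2


def kv_cache_bytes(context_tokens, n_layers=28, n_kv_heads=2,
                   head_dim=128, bytes_per_value=2):
    """Estimate KV cache size in bytes for a given context length.

    All architecture defaults are assumed values, not taken from this page.
    The leading factor of 2 accounts for storing both keys and values.
    """
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return context_tokens * per_token


# Compare the 8k baseline used in the table with the full 32k context
# (assuming "8k" and "32k" mean 8 * 1024 and 32 * 1024 tokens).
cache_8k = kv_cache_bytes(8 * 1024)
cache_32k = kv_cache_bytes(32 * 1024)

print(f"KV cache at 8k:  {cache_8k / MIB:.0f} MiB")
print(f"KV cache at 32k: {cache_32k / MIB:.0f} MiB")
print(f"Extra headroom:  {(cache_32k - cache_8k) / MIB:.0f} MiB")
```

Under these assumed values the cache scales linearly with context, so moving from 8k to 32k roughly quadruples it. Runtimes that quantize the KV cache would need less.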

Strengths

  • Around 1 GB VRAM at Q4, so it runs nearly anywhere
  • Strong inline completion for a 1.5B model
  • Apache 2.0 license
  • 92 programming languages covered

Limitations

  • The 1.5B parameter count caps code quality; not suited to complex generation
  • Context limited to 32k tokens
  • Outclassed on harder tasks by 7B+ coders

Architecture & training

Architecture: Dense · 1.5B · Qwen 2.5 Coder · compact code-specialized

Training: trained on a code corpus spanning 92 programming languages. The 1.5B size suits lightweight completion.

Verdict

An impressively capable 1.5B coder — keep it for on-device completion, not for whole-feature generation.

Quick start

ollama run qwen2.5-coder:1.5b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
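
Once the `ollama run` command above has pulled the model, a local Ollama server can also be queried over HTTP. The sketch below assumes Ollama is listening on its default port (11434) and uses its `/api/generate` endpoint; the helper names `build_request` and `complete` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Default local Ollama endpoint (assumes an unmodified local install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="qwen2.5-coder:1.5b"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `complete("Write a Python function that reverses a string.")` should return the model's answer as a string. Setting `stream` to `False` keeps the example short; an editor plugin would more likely stream tokens.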
