Model fiche

MedGemma 1.5 4B

By Google · United States

Tags: chat · vision · multilingual · small
Parameters: 4B
License: Gemma
Context: 128k
VRAM (Q4): 2.3 GB
Released: May 2026

Overview

Google's v1.5 update to MedGemma — a 4B vision-and-text model fine-tuned on clinical literature, radiology imagery, and medical reports. 128k context, Gemma license.

When to pick this model

  • Upgrading existing MedGemma 1.0 deployments without re-architecting
  • Drafting and summarizing clinical reports with image grounding
  • Research workflows in radiology and medical imaging
  • On-prem clinical assistants where sending data to external APIs isn't an option

VRAM requirements by quantization

Quantization            VRAM required
Q4_K_M (recommended)    2.3 GB
Q5_K_M                  2.8 GB
Q8_0                    4.3 GB
FP16 (no quantization)  8 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
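
To get a rough idea of how much headroom a longer context needs, the KV cache can be estimated from the model's attention layout. The sketch below is a generic full-attention estimate, not a measurement of this model. The function names and the values in EXAMPLE_ARCH are hypothetical placeholders, so read the real layer count, KV head count, and head dimension from the model's config. Models that use sliding-window attention on some layers will need less than this estimate.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Full-attention KV cache size: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def extra_headroom_gib(target_context, baseline_context=8192, **arch):
    """Extra memory (GiB) needed beyond the baseline context already counted."""
    extra = kv_cache_bytes(context_len=target_context, **arch) - kv_cache_bytes(
        context_len=baseline_context, **arch
    )
    return max(extra, 0) / 2**30


# Hypothetical architecture values, for illustration only.
# Replace them with the numbers from the model's own config file.
EXAMPLE_ARCH = {"n_layers": 34, "n_kv_heads": 4, "head_dim": 256}
```

Calling extra_headroom_gib(32768, **EXAMPLE_ARCH) returns the additional GiB to budget on top of the table figure for a 32k context, under the assumptions above (FP16 cache, full attention on every layer).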

Strengths

  • Iterative refinement over MedGemma 1.0 with the same footprint
  • Compact 4B (~2.3 GB VRAM at Q4)
  • Multimodal — text plus medical imagery
  • 128k context for long patient histories and literature

Limitations

  • Decision-support tool only — not for direct clinical use
  • Narrow medical focus, weak general performance
  • Gated on Hugging Face

Architecture & training

Architecture: Gemma · 4B parameters · multimodal text + image · 128k context

Training: v1.5 iteration of Google's medical fine-tuning of Gemma, using clinical literature, radiological imaging, and medical reports.

Verdict

A drop-in upgrade to MedGemma 1.0 with sharper clinical performance at the same compact size.

Quick start

ollama run medgemma1.5

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
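
For scripted use, the model can also be queried over Ollama's local HTTP API. The following is a minimal sketch, assuming an Ollama server is running on its default port (11434) and that the medgemma1.5 tag from the command above has been pulled. The helper names build_payload and ask are hypothetical, and whether the Ollama build of this model accepts image input is an assumption to verify before use.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "medgemma1.5"  # tag taken from the quick-start command


def build_payload(prompt, image_bytes=None):
    """Build the JSON body for a single, non-streaming generation request."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Ollama expects images as base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def ask(prompt, image_bytes=None, timeout=120):
    """Send the request to the local server and return the generated text."""
    body = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running Ollama server):
# print(ask("Summarize the key findings in this radiology report: ..."))
```

Because the request goes to localhost only, no data leaves the machine, which matches the on-prem use case listed above.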
