MedGemma 1.5 4B
By Google · United States
Overview
Google's v1.5 update to MedGemma — a 4B vision-and-text model fine-tuned on clinical literature, radiology imagery, and medical reports. 128k context, Gemma license.
When to pick this model
- Upgrading existing MedGemma 1.0 deployments without re-architecting
- Drafting and summarizing clinical reports with image grounding
- Research workflows in radiology and medical imaging
- On-prem clinical assistants where API calls aren't an option
VRAM requirements by quantization
| Quantization | VRAM required |
|---|---|
| Q4_K_M (recommended) | 2.3 GB |
| Q5_K_M | 2.8 GB |
| Q8_0 | 4.3 GB |
| FP16 (no quantization) | 8 GB |
VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
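As a rough aid for reading the table, the sketch below picks the highest-precision quantization that fits a given VRAM budget. The figures are copied from the table above; the function name `pick_quantization` and the `headroom_gb` parameter are hypothetical, introduced only for illustration. Because the table assumes an 8k KV cache, treat the result as a starting point rather than a guarantee.

```python
# VRAM figures in GB, copied from the table above (8k KV cache, ~600 MB overhead).
# Ordered from highest precision to lowest so the first fit is the best fit.
VRAM_GB = {
    "FP16": 8.0,
    "Q8_0": 4.3,
    "Q5_K_M": 2.8,
    "Q4_K_M": 2.3,
}


def pick_quantization(available_gb, headroom_gb=0.0):
    """Return the highest-precision quantization that fits, or None.

    headroom_gb is a hypothetical knob: VRAM to hold back for longer
    contexts or other processes sharing the GPU.
    """
    budget = available_gb - headroom_gb
    for name, required_gb in VRAM_GB.items():
        if required_gb <= budget:
            return name
    return None
```

For example, `pick_quantization(6.0)` selects `Q8_0`, while a 2 GB budget returns `None` because even Q4_K_M needs 2.3 GB.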
Strengths
- Iterative refinement over MedGemma 1.0 with the same footprint
- Compact 4B (~2.3 GB VRAM at Q4)
- Multimodal — text plus medical imagery
- 128k context for long patient histories and literature
Limitations
- Decision-support tool only — not for direct clinical use
- Narrow medical focus; weaker on general-purpose tasks
- Gated on Hugging Face
Architecture & training
Architecture: Gemma · 4B parameters · multimodal text + image · 128k context
Training: the v1.5 iteration of Google's medical fine-tuning of Gemma, covering clinical literature, radiological imaging, and medical reports.
A drop-in upgrade to MedGemma 1.0 with sharper clinical performance at the same compact size.
Quick start
ollama run medgemma1.5
Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
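For scripted use, the same model can probably be reached through Ollama's local REST API instead of the interactive prompt. The sketch below is a minimal example under several assumptions: an Ollama server is running on its default port (11434), the `medgemma1.5` tag from the command above has been pulled, and the build served by Ollama accepts image input. The helper names `build_payload` and `generate` are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the `ollama run` command above.
MODEL_TAG = "medgemma1.5"


def build_payload(prompt, image_bytes=None):
    """Build the JSON body for a single non-streaming generate request."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    if image_bytes is not None:
        # Ollama expects images as a list of base64-encoded strings.
        payload["images"] = [base64.b64encode(image_bytes).decode("ascii")]
    return payload


def generate(prompt, image_bytes=None, timeout=120):
    """Send the request and return the generated text."""
    body = json.dumps(build_payload(prompt, image_bytes)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

A call such as `generate("Summarize the findings.", open("scan.png", "rb").read())` would send one image with the prompt; `scan.png` is a placeholder filename. Output is for decision support only, in line with the limitations listed above.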