Model card

Llama 3.1 405B Instruct

By Meta · United States

chat · general · reasoning
Parameters: 405B
License: Llama 3.1 Community
Context: 128k
VRAM (Q4): 240 GB
Released: July 2024

Overview

Meta's reference open dense model at 405B parameters, scoring 88.6 on MMLU and 89.0 on HumanEval. The weights are gated on Hugging Face, and the model needs about 240 GB of VRAM even at Q4.

When to pick this model

  • Self-hosted alternative to closed frontier APIs when you have the hardware
  • Reproducible research baseline for large dense models
  • Long-running batch inference where weight licensing matters more than speed
  • Distillation source for smaller specialist models

VRAM requirements by quantization

Quantization | VRAM required
Q4_K_M (recommended) | 240 GB
Q5_K_M | 288 GB
Q8_0 | 435 GB
FP16 (no quantization) | 810 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
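To see how much headroom longer contexts need, the KV cache can be estimated from the model's attention layout. The sketch below is a rough back-of-the-envelope calculation, not the method used to produce the table: the layer count, KV head count, and head dimension are assumptions that are not stated on this page, the cache is assumed to be stored in FP16, and `weights_gb` is a hypothetical input you supply yourself.

```python
# Rough VRAM estimate: quantized weights + KV cache + fixed runtime overhead.
# Architecture numbers below are assumptions (not stated on this page).
N_LAYERS = 126      # transformer layers (assumed)
N_KV_HEADS = 8      # grouped-query attention key/value heads (assumed)
HEAD_DIM = 128      # dimension of each attention head (assumed)
KV_BYTES = 2        # bytes per cache entry, assuming an FP16 cache
OVERHEAD_GB = 0.6   # ~600 MB runtime overhead, as quoted in the note above


def kv_cache_gb(context_tokens: int) -> float:
    """KV cache size in GB (10^9 bytes) for a single sequence."""
    # Factor 2 covers keys and values, stored for every layer and KV head.
    bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return context_tokens * bytes_per_token / 1e9


def total_vram_gb(weights_gb: float, context_tokens: int) -> float:
    """Weights plus KV cache plus the fixed runtime overhead, in GB."""
    return weights_gb + kv_cache_gb(context_tokens) + OVERHEAD_GB


if __name__ == "__main__":
    for ctx in (8_192, 32_768, 131_072):
        print(f"{ctx:>7} tokens: KV cache ~ {kv_cache_gb(ctx):.1f} GB")
```

Under these assumptions the cache grows linearly with context, from roughly 4 GB at 8k tokens to roughly 68 GB at 128k tokens, so a full-context deployment would need on the order of 60 GB more than the figures in the table. Runtimes that quantize the KV cache would need less.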

Published benchmark scores

Benchmark | Score
MMLU | 88.6
HumanEval | 89.0
GSM8K | 96.8

Scores are published by the model author or aggregated from public leaderboards, and are re-measured monthly by our editorial team.

Strengths

  • The reference dense open model — widely benchmarked and well-understood
  • MMLU 88.6, HumanEval 89.0
  • 128k context
  • Mature ecosystem support across all serving frameworks

Limitations

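  • About 240 GB at Q4, so it needs a multi-GPU server (for example, at least four 80 GB cards)
  • Gated access on Hugging Face: you must accept the license before downloading the weights
  • Llama 3.1 Community License, whose monthly-active-user (MAU) clause requires a separate license from Meta above 700 million MAU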
  • Largely superseded by MoE alternatives at similar quality
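As a rough illustration of the hardware limitation above, the number of GPUs can be estimated by dividing the required VRAM by the usable memory per card. This is only a sizing sketch: the function name `gpus_needed` and the 90% usable fraction are illustrative assumptions, and real deployments may need more cards depending on the serving framework and how it splits the model.

```python
import math


def gpus_needed(required_gb: float, gpu_gb: float, usable_fraction: float = 0.9) -> int:
    """Smallest number of identical GPUs whose usable memory covers required_gb.

    usable_fraction is an assumed safety margin for fragmentation and
    framework buffers; it is not a measured value.
    """
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(required_gb / (gpu_gb * usable_fraction))


if __name__ == "__main__":
    # VRAM figures taken from the quantization table on this page.
    for label, need in (("Q4_K_M", 240), ("Q8_0", 435), ("FP16", 810)):
        print(f"{label}: {gpus_needed(need, 80)} x 80 GB GPUs")
```

With the 240 GB Q4 figure, this gives four 80 GB cards or twelve 24 GB cards. Some serving frameworks prefer GPU counts that divide the model's attention heads evenly, which can push the practical number higher.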

Architecture & training

Architecture: Dense 405B · GQA

Training: Meta trained the model on roughly 15T tokens.

Verdict

Still the canonical dense open model, but MoE alternatives now deliver comparable quality at a fraction of the inference cost.

Quick start

ollama run llama3.1:405b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
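Once the model has been pulled with the command above, a local Ollama server also accepts requests over HTTP. The sketch below assumes Ollama is running on its default address (`localhost:11434`) and uses its `/api/generate` endpoint with streaming disabled. The helper names `build_request` and `generate` and the ten-minute timeout are illustrative choices, not part of Ollama.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3.1:405b") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout_s: float = 600.0) -> str:
    """Send the prompt and return the generated text.

    The long default timeout is a guess: a model this size can take
    a while to load and to respond.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Explain grouped-query attention in two sentences.")` returns the completion as a string, provided the server is running and the model fits in memory.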
