BestLLMfor: Your hardware. Your LLM. Your call.
Model fiche

Tülu 3 70B

By Allen AI · United States

Tags: chat · general · reasoning
Parameters: 70B
License: Llama 3.1 Community
Context: 125k
VRAM (Q4): 40 GB
Released: November 2024

Overview

Tülu 3 70B is Allen AI's fully open post-training stack applied to Llama 3.1 70B. It is reported to beat Claude Haiku, GPT-3.5 Turbo, and GPT-4o-mini on standard reasoning and code benchmarks.

When to pick this model

  • Self-hosted alternative to closed mid-tier APIs
  • Math-heavy chat (GSM8K score of 93.5)
  • Code assistance where HumanEval+ matters more than agentic loops
  • Research projects that need a fully documented post-training pipeline
  • Workloads that justify a 2x A100 footprint

VRAM requirements by quantization

Quantization             VRAM required
Q4_K_M (recommended)     40 GB
Q5_K_M                   48 GB
Q8_0                     75 GB
FP16 (no quantization)   140 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
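
The note above can be turned into a rough back-of-the-envelope estimate. The sketch below is an approximation, not the method used to produce the table: it assumes nominal bit widths per weight (4, 5, 8, 16), an FP16 KV cache, and the published Llama 3.1 70B attention layout (80 layers, 8 KV heads, head dimension 128). The function names are illustrative. Real K-quant formats use somewhat more bits per weight than their nominal value, so expect the results to differ from the table by a few GB.

```python
def kv_cache_bytes(tokens, layers=80, kv_heads=8, head_dim=128, bytes_per_value=2):
    """Size of the KV cache: one key and one value vector per layer, per KV head, per token.

    Defaults are assumptions based on the Llama 3.1 70B config with an FP16 cache.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


def estimate_vram_gb(params_billion, bits_per_weight, context_tokens=8192, overhead_gb=0.6):
    """Approximate VRAM in decimal GB: weights + KV cache + fixed runtime overhead."""
    weights_gb = params_billion * bits_per_weight / 8  # 1e9 params * (bits / 8) bytes
    kv_gb = kv_cache_bytes(context_tokens) / 1e9
    return weights_gb + kv_gb + overhead_gb


# Nominal bit widths are an assumption; actual quantized files vary.
for name, bits in [("Q4", 4), ("Q5", 5), ("Q8", 8), ("FP16", 16)]:
    print(f"{name}: ~{estimate_vram_gb(70, bits):.1f} GB")
```

Under these assumptions the Q4 estimate comes out at roughly 38 GB, in line with the 40 GB listed. Passing `context_tokens=32768` adds about 8 GB of KV cache on top of the 8k baseline, which is the headroom the note refers to.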

Published benchmark scores

Benchmark     Score
GSM8K         93.5
HumanEval+    92.4
IFEval        83.2

Scores are as published by the model author or aggregated from public leaderboards, and are re-measured monthly by our editorial team.

Strengths

  • Beats Claude Haiku, GPT-3.5 Turbo, and GPT-4o-mini on key evals
  • GSM8K 93.5 and HumanEval+ 92.4 at open weights
  • Fully open SFT + DPO + RLVR recipe
  • Strong instruction following and refusal calibration
  • Stable, well-documented behavior for production deploys

Limitations

  • ~40 GB VRAM even at Q4, more than a single 24 GB consumer GPU provides
  • Bound by Llama 3.1 Community License
  • No multimodal capabilities

Architecture & training

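Architecture: Dense Llama 3.1 70B · full Tülu 3 recipe

Training: supervised fine-tuning (SFT), then Direct Preference Optimization (DPO), then reinforcement learning with verifiable rewards (RLVR), applied at the 70B scale.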

Verdict

The strongest fully open post-trained 70B available — a credible self-hosted replacement for closed mid-tier chat APIs.

Quick start

ollama run tulu3:70b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
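
Once the model has been pulled with the command above, it can also be queried programmatically through Ollama's local HTTP API. The following is a minimal sketch using only the Python standard library. It assumes Ollama is running on its default address (`http://localhost:11434`) and that the model tag is `tulu3:70b` as shown above. The helper names `build_request` and `ask` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="tulu3:70b"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="tulu3:70b", timeout=600):
    """Send the prompt and return the generated text (requires a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(ask("Solve 17 * 23 step by step."))` would print the model's reply. The generous timeout is a guess to allow for loading a 70B model on first use and may need adjusting for your hardware.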
