BestLLMfor: Your hardware. Your LLM. Your call.
Model sheet

Seed-OSS 36B Instruct

By ByteDance · China

chat general
Parameters: 36B
License: Apache 2.0
Context: 512k (524,288 tokens)
VRAM (Q4): 22 GB
Released: August 2025

Overview

ByteDance's first major open release: a dense 36B model with a native 524k-token context, roughly 4× the 128k window common among comparable open models. Licensed under Apache 2.0.

When to pick this model

  • Extreme long-document analysis (codebases, books, transcripts)
  • RAG-free workflows that load everything into context
  • Dense-model deployments preferring predictable behavior
  • Apache-licensed commercial use
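Before committing to a RAG-free workflow, it helps to check whether the material plausibly fits in the window. The sketch below is a rough pre-flight check, not a tokenizer: the 4-characters-per-token ratio and the output reserve are assumptions for illustration, and the function names are hypothetical. Real counts depend on the model's own tokenizer and on the language of the text.

```python
CONTEXT_TOKENS = 524_288   # native context stated on this page
CHARS_PER_TOKEN = 4.0      # assumption: rough average for English prose, not the model's tokenizer


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Rough token estimate from character count (rounded up by one)."""
    return int(len(text) / chars_per_token) + 1


def fits_in_context(texts, reserve_for_output: int = 4_096,
                    context_tokens: int = CONTEXT_TOKENS) -> bool:
    """True if the estimated prompt size leaves room for the reserved output tokens."""
    budget = context_tokens - reserve_for_output
    return sum(estimate_tokens(t) for t in texts) <= budget
```

If the estimate lands close to the budget, count tokens with the model's actual tokenizer before relying on the result.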

VRAM requirements by quantization

Quantization              VRAM required
Q4_K_M (recommended)      22 GB
Q5_K_M                    26 GB
Q8_0                      40 GB
FP16 (no quantization)    72 GB

VRAM figures include model weights plus a typical 8k KV cache and ~600 MB runtime overhead (Ollama / llama.cpp baseline). Add headroom for higher context lengths.
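To get a feel for how much headroom a longer context needs, the sketch below scales the KV cache from the 8k baseline used in the table. The architecture values (64 layers, 8 KV heads, head dimension 128, FP16 cache) are assumptions not stated on this page; check them against the model's published config before using the numbers. The function names are hypothetical.

```python
# Assumed architecture values (NOT from this page; verify against the model config).
N_LAYERS = 64
N_KV_HEADS = 8
HEAD_DIM = 128
BYTES_PER_ELEM = 2          # FP16 KV cache; a quantized cache would be smaller
BASELINE_CONTEXT = 8_192    # the "typical 8k KV cache" included in the table


def kv_cache_bytes(tokens: int) -> int:
    """KV cache size: keys and values (x2) for every layer, KV head, and token."""
    return 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * BYTES_PER_ELEM * tokens


def estimate_vram_gb(table_vram_gb: float, context_tokens: int) -> float:
    """Table figure plus the extra KV cache beyond the 8k baseline (1 GB = 1e9 bytes)."""
    extra = kv_cache_bytes(context_tokens) - kv_cache_bytes(BASELINE_CONTEXT)
    return table_vram_gb + max(extra, 0) / 1e9
```

Under these assumptions the cache costs 256 KiB per token: 2 GiB at 8,192 tokens and 128 GiB (about 137 GB) at the full 524,288 tokens, which is far more than the quantized weights themselves.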

Strengths

  • 524k-token native context, among the longest for openly available dense models
  • Dense 36B is easier to deploy than equivalent MoEs
  • Strong long-document comprehension
  • Apache 2.0

Limitations

  • About 22 GB VRAM at Q4 with an 8k context; far more at full context, because the KV cache grows linearly with the number of tokens
  • ByteDance license terms need a careful read
  • Limited fine-tune ecosystem at launch

Architecture & training

Architecture: Dense · 36B · ByteDance Seed-OSS · 524k native context

Training: by ByteDance, with very long context (524k tokens) supported natively.

Verdict

Unmatched long-context for a dense open model — the pick when you genuinely need to load 500k+ tokens at once.

Quick start

ollama pull hf.co/ByteDance/seed-oss-36b-GGUF
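Once the model is pulled, it can be queried through Ollama's local HTTP API. The following is a minimal sketch using only the standard library, assuming Ollama is running on its default port (11434) and that the model is registered under the same name as in the pull command. The helper names and the default `num_ctx` are illustrative choices, not values from this page.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"   # Ollama's default local endpoint
MODEL = "hf.co/ByteDance/seed-oss-36b-GGUF"          # name taken from the pull command


def build_request(prompt: str, num_ctx: int = 8192) -> urllib.request.Request:
    """Build a non-streaming generate request with an explicit context length."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,                   # return one JSON object instead of a stream
        "options": {"num_ctx": num_ctx},   # larger values need more VRAM for the KV cache
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, num_ctx: int = 8192) -> str:
    """Send the request to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, num_ctx)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize this transcript: ...", num_ctx=32768)` requires a running Ollama server; raise `num_ctx` only as far as available VRAM allows.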

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
