Model fiche

Falcon Mamba 7B

By TII · UAE

chat · general
Parameters: 7B
License: TII Falcon-LLM License 2.0
Context: 8k
VRAM (Q4): 5 GB
Released: August 2024

Overview

TII's first pure Mamba state-space model (SSM) at the 7B scale. Its inference state stays a fixed size regardless of sequence length, so it sidesteps transformer attention costs entirely.

When to pick this model

  • Streaming workloads needing constant memory per token
  • Research on state-space models versus transformers
  • Throughput-bound inference where attention is the bottleneck
  • Long-running generation where context grows unboundedly
  • Edge inference on memory-constrained devices

VRAM requirements by quantization

Quantization            VRAM required
Q4_K_M (recommended)    5 GB
Q5_K_M                  6 GB
Q8_0                    9 GB
FP16 (no quantization)  14 GB

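VRAM figures include model weights, a cache allowance sized for 8k context, and ~600 MB runtime overhead (Ollama / llama.cpp baseline). As a pure SSM, the model keeps a fixed-size recurrent state rather than a growing KV cache, so longer contexts should need little extra headroom.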
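
As a rough sanity check on figures like these, weight memory can be approximated as parameter count times bits per weight, plus the fixed runtime overhead. The sketch below is illustrative only: the bits-per-weight values are approximate averages assumed for each quantization type, and `estimate_vram_gb` is a hypothetical helper, not part of Ollama or llama.cpp. It will not reproduce the table exactly.

```python
# Approximate average bits per weight for each format.
# Assumed round figures for illustration, not exact values.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def estimate_vram_gb(params_billion: float, quant: str, overhead_gb: float = 0.6) -> float:
    """Rough VRAM estimate in GB: quantized weights plus a fixed runtime overhead."""
    # (params_billion * 1e9 weights) * bits / 8 bits-per-byte / 1e9 bytes-per-GB
    weight_gb = params_billion * BITS_PER_WEIGHT[quant] / 8
    return weight_gb + overhead_gb


if __name__ == "__main__":
    for quant in BITS_PER_WEIGHT:
        print(f"{quant}: ~{estimate_vram_gb(7.0, quant):.1f} GB")
```

Treat the result as a lower bound: actual usage also depends on the exact parameter count, the runtime, and any cache or state buffers it allocates.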

Published benchmark scores

Benchmark  Score
MMLU       62

Scores published by the model author or aggregated from public leaderboards. Re-measured monthly by our editorial team.

Strengths

  • Constant-size inference state: memory does not grow with sequence length
  • No practical context limit imposed by attention
  • Permissive, Apache 2.0-based license (TII Falcon-LLM License 2.0)
  • Demonstrates Mamba viability at production scale
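
To make the memory claim concrete, the sketch below compares how inference cache size scales with context for a transformer KV cache versus a Mamba-style recurrent state. All dimensions are assumptions chosen for illustration (a generic 32-layer transformer and a 64-layer SSM with state size 16). They are not taken from this page and should be checked against the actual model config.

```python
def kv_cache_bytes(tokens: int, layers: int = 32, kv_heads: int = 32,
                   head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """KV cache size for a transformer: grows linearly with token count."""
    # The factor 2 covers keys and values.
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


def ssm_state_bytes(layers: int = 64, d_inner: int = 8192, d_state: int = 16,
                    conv_kernel: int = 4, bytes_per_value: int = 2) -> int:
    """Recurrent state size for a Mamba-style SSM: independent of token count."""
    ssm_state = d_inner * d_state       # hidden state per layer
    conv_state = d_inner * conv_kernel  # rolling convolution buffer per layer
    return layers * (ssm_state + conv_state) * bytes_per_value


if __name__ == "__main__":
    ssm_mib = ssm_state_bytes() / 2**20
    for tokens in (1_000, 8_000, 64_000):
        kv_mib = kv_cache_bytes(tokens) / 2**20
        print(f"{tokens:>6} tokens: KV cache ~{kv_mib:,.0f} MiB, SSM state ~{ssm_mib:,.0f} MiB")
```

Under these assumed dimensions the KV cache scales with every added token while the SSM state stays at the same size, which is the property the first two strengths describe.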

Limitations

  • Weaker in-context learning than transformers of equal size
  • No vision or multimodal support
  • Trained context is only 8k despite architectural headroom

Architecture & training

Architecture: Mamba (SSM) · 7B · no attention layers · constant-size inference state

Training: TII (UAE) · 5.5T-token corpus
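
The fixed-state behaviour of an SSM can be illustrated with a toy linear recurrence, h_t = a * h_{t-1} + b * x_t with readout y_t = c · h_t. This is a deliberately simplified sketch with made-up coefficients. Real Mamba layers use input-dependent (selective) parameters and an optimized scan, neither of which is shown here.

```python
from typing import Iterable, Iterator, List


def ssm_stream(xs: Iterable[float], a: List[float], b: List[float],
               c: List[float]) -> Iterator[float]:
    """Run a diagonal linear SSM one input at a time.

    The only memory carried between steps is `h`, whose size is len(a)
    no matter how many inputs have been consumed.
    """
    h = [0.0] * len(a)
    for x in xs:
        # State update (elementwise): h_t = a * h_{t-1} + b * x_t
        h = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, h, b)]
        # Readout: y_t = c . h_t
        yield sum(c_i * h_i for c_i, h_i in zip(c, h))


if __name__ == "__main__":
    a = [0.5, 0.9]  # hypothetical decay per state dimension
    b = [1.0, 1.0]
    c = [1.0, 0.5]
    for y in ssm_stream([1.0, 0.0, 0.0], a, b, c):
        print(round(y, 4))
```

Because `ssm_stream` is a generator over any iterable, it can consume an arbitrarily long input stream while holding only the two-element state in this example.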

Verdict

The reference pure-Mamba 7B: pick it to study SSMs or to serve streaming workloads where attention costs hurt most.

Quick start

ollama run falcon-mamba:7b

Or use the open-source MCP server to query this model from Claude Desktop, Cursor, or any MCP-compatible client.
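
Once the model is running locally, it can also be queried from code. The sketch below targets Ollama's HTTP `/api/generate` endpoint on its default port 11434, using only the Python standard library. It assumes a local Ollama server is running and that the `falcon-mamba:7b` tag from the command above is available on your machine. `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_request(prompt: str, model: str = "falcon-mamba:7b") -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Explain state-space models in one sentence.")` should return the model's reply as a string. Setting `"stream": False` keeps the example simple by requesting one JSON object instead of a token stream.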
