Llama 4 Maverick

Name: Llama 4 Maverick
Price: 0.5 USD
Author: Meta

Metafrontier

Meta's most capable open-weight model. A 400B MoE (17B active) with native multimodal support and a 1M token context window. Approaches closed-source frontier quality on many benchmarks while being fully open-weight. Competitive with GPT-5.4-mini and Claude Sonnet 4.6.

Released 2025-11-14Knowledge cutoff: 2025-08

Needs review|Updated 116d ago|88% source confidence

Specifications

Context Window

1M tokens

Max Output

64K tokens

Input Price

$0.500 / 1M tokens

Output Price

$1.50 / 1M tokens

Latency Tier

Moderate (speed score: 6.5/10)

Capability Profile

Long Context

9/10

Reasoning

8/10

Coding

8/10

Cost Efficiency

8/10

Factuality

8/10

Structured Output

7.5/10

Multimodal

7.5/10

Creativity

7.5/10

Instruction Following

7.5/10

Tool Use

7.5/10

Conversational

7.5/10

Speed

6.5/10

Safety & Enterprise

6.5/10

Feature Support

Vision Yes

Audio In No

Audio Out No

Video No

Image Generation No

Image Editing No

Function Calling Yes

JSON Mode Yes

Structured Output Yes

Streaming Yes

Reasoning No

Realtime No

Computer Use No

Web Search No

Best Use Cases

Open-weight deployments needing frontier-class quality

Fine-tuning for enterprise or domain-specific applications

Long-context multimodal analysis at open-weight pricing

Research requiring model weight access for interpretability or customization

Cost-effective frontier alternative to GPT-5.4-mini or Sonnet 4.6

Not Ideal For

Simple self-hosted inference on a single GPU (use Scout instead)

Enterprise deployments requiring the strictest safety alignment

Audio or video processing

Applications where the absolute best coding quality is needed (use Claude Opus)

Strengths

Best open-weight model available — approaches closed-source frontier quality

1M context with strong recall — best long-context open model

Native multimodal (text + images) with strong vision understanding

Full model weights available for customization and fine-tuning

Excellent value through hosted providers at ~$0.50/$1.50

Weaknesses

Requires multi-GPU infrastructure for self-hosting (4-8 H100s recommended)

Quality gap vs Claude Opus 4.6 and GPT-5.4 is measurable on hard tasks

Instruction following less precise than Anthropic or OpenAI models

Safety alignment is basic compared to closed-source competitors

Structured output compliance is good but not best-in-class

Edge Cases & Notes

17B active parameters per token despite 400B total — MoE efficiency is key

Pricing varies significantly by hosted provider

Fine-tuned variants from the community can significantly improve domain-specific performance

Self-hosting requires expertise in distributed inference (vLLM, TGI, etc.)

Provider Notes

Open-weight under Meta's Llama license. Available through Together AI, Fireworks, Replicate, and self-hosted. Self-hosting requires multi-GPU setup. The best open-weight option for teams needing frontier-class quality with weight access.

Benchmarks

MMLU88.5%

HumanEval88%

Arena Elo1340

Benchmark Notes

MMLU-Pro 88.5%. Competitive with GPT-5.4-mini on many benchmarks. SWE-bench ~48%. Best open-weight model on LMSYS Arena. Long-context performance is strong throughout 1M window.

Research Meta

Last Evaluated

2026-03-15

Source Confidence

88%

Evaluation Method

Open LLM Leaderboard, LMSYS Arena, SWE-bench, long-context evaluation

Needs Re-evaluation

Sources

Meta Llama 4 technical report
Open LLM Leaderboard
LMSYS Chatbot Arena
SWE-bench leaderboard

Continue exploring

Route a prompt

See how Llama 4 Maverick ranks

Compare models

Side-by-side analysis

Browse registry

Explore all 24 models