Grok 4.20
xAI
frontier
xAI's flagship model, with a 2M-token context window, strong reasoning capabilities, and vision support. Known for its straightforward, less filtered conversational style and for real-time information access through X integration. Competitive with GPT-5.4 on reasoning benchmarks.
Specifications
Context Window
2M tokens
Max Output
128K tokens
Input Price
$2.00 / 1M tokens
Output Price
$6.00 / 1M tokens
Speed
Fast (speed score: 7/10)
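As a rough illustration of how these figures combine, the sketch below estimates the cost of a single request. It rests on several assumptions that the card does not confirm: that $2.00 is the input price and $6.00 the output price, that 128K is the maximum output length, that "2M" and "128K" mean 2,000,000 and 128,000 tokens, and that input and output share the context window. The function name estimate_cost is hypothetical and is not part of any xAI SDK.

```python
from decimal import Decimal

# Assumptions (not confirmed by the card): the lower price applies to input
# tokens, the higher price to output tokens, and the limits are decimal counts.
INPUT_PRICE_PER_M = Decimal("2.00")   # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = Decimal("6.00")  # USD per 1M output tokens (assumed)
CONTEXT_WINDOW = 2_000_000            # "2M tokens" (assumed decimal)
MAX_OUTPUT = 128_000                  # "128K tokens" (assumed output cap)
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request under the assumptions above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if output_tokens > MAX_OUTPUT:
        raise ValueError("output exceeds the assumed 128K output limit")
    # Assumes input and output tokens together must fit in the context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the 2M-token context window")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A long-context request: 1M tokens in, 100K tokens out.
    print(f"${estimate_cost(1_000_000, 100_000):.2f}")
```

Under these assumptions, a 1M-token prompt with a 100K-token reply costs $2.00 + $0.60 = $2.60. Decimal is used instead of float so that per-request costs sum without rounding drift.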
Capability Profile
Feature Support
Best Use Cases
Not Ideal For
Strengths
Weaknesses
Edge Cases & Notes
Provider Notes
Available through the xAI API. X/Twitter integration is available but optional. The API is maturing but still trails OpenAI's and Anthropic's in features such as batching and caching.
Benchmarks
Benchmark Notes
MMLU-Pro: 91%. SWE-bench: ~48%. Strong LMSYS Arena showing, especially in the reasoning and conversational categories. Long-context benchmarks are its standout result, with near-perfect needle-in-a-haystack retrieval at 1M tokens.
Research Meta
Last Evaluated
2026-03-15
Source Confidence
82%
Evaluation Method
LMSYS Arena, MMLU-Pro, long-context benchmarks, SWE-bench, conversational evaluation
Needs Re-evaluation
No
Sources
- xAI Grok 4.20 announcement
- LMSYS Chatbot Arena
- Independent long-context evaluations