Mistral Small 4
Mistral · budget
Mistral's efficient mixture-of-experts (MoE) model, with 119B total parameters but only 6B active per token. It features a 256K context window, a reasoning mode, and vision support, and is released as open weights under Apache 2.0. It is designed for self-hosting on modest hardware while providing strong reasoning capabilities.
Specifications
Context window: 256K tokens
Max output: 16.4K tokens
Input price: $0.100 / 1M tokens
Output price: $0.300 / 1M tokens
Ultra Fast (speed score: 9/10)
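The per-token prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not an official pricing calculator. It assumes that the first listed price ($0.100 / 1M tokens) is the input rate and the second ($0.300 / 1M tokens) is the output rate, which the listing does not state explicitly. The function name `estimate_cost_usd` and the constant names are hypothetical.

```python
from decimal import Decimal

# Prices copied from the Specifications list. Treating the first as the
# input rate and the second as the output rate is an ASSUMPTION.
INPUT_USD_PER_MTOK = Decimal("0.100")
OUTPUT_USD_PER_MTOK = Decimal("0.300")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in USD of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    )
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # Hypothetical request: a 200,000-token prompt and a 16,000-token reply.
    print(estimate_cost_usd(200_000, 16_000))
```

Under those assumptions, a 200,000-token prompt with a 16,000-token completion comes to about $0.0248. `Decimal` is used instead of floats so that small per-request amounts do not pick up binary rounding noise when summed over many requests.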
Provider Notes
Open-weight under Apache 2.0. Available through Mistral's La Plateforme, through Ollama, and as a self-hosted deployment. One of the best options for self-hosted reasoning on consumer hardware.
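For self-hosting, a rough sizing check helps put the parameter counts in context. The sketch below estimates weight storage only, from the 119B total and 6B active figures in the model description. It assumes that all expert weights must be resident in memory even though only a fraction is active per token, and it ignores the KV cache, activations, and runtime overhead. The bit widths are illustrative choices, not a statement about which quantized formats are actually published. The function names are hypothetical.

```python
TOTAL_PARAMS = 119e9   # total parameters, from the model description
ACTIVE_PARAMS = 6e9    # parameters active per token, from the model description


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_param / 8 / 1e9


def active_fraction(total: float = TOTAL_PARAMS,
                    active: float = ACTIVE_PARAMS) -> float:
    """Share of the parameters that take part in computing each token."""
    return active / total


if __name__ == "__main__":
    # Illustrative precisions only: 16-bit, 8-bit, and 4-bit weights.
    for bits in (16, 8, 4):
        size = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{bits}-bit weights: ~{size:.1f} GB")
    print(f"active fraction per token: {active_fraction():.1%}")
```

The point of separating the two numbers is that per-token compute scales with the roughly 5% of parameters that are active, while memory capacity is driven by the full parameter count.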
Benchmarks
Benchmark Notes
Scores 78.5% on MMLU-Pro, which is impressive for a model with only 6B active parameters. Benchmarks run in reasoning mode show a meaningful improvement over non-reasoning mode. Multilingual benchmark scores are good.
Research Meta
Last Evaluated
2026-03-15
Source Confidence
82%
Evaluation Method
Open LLM Leaderboard, LMSYS Arena, self-hosting evaluation, cost analysis
Needs Re-evaluation
No
Sources
- Mistral Small 4 technical report
- Open LLM Leaderboard
- LMSYS Chatbot Arena