Gemini 2.5 Flash

Name: Gemini 2.5 Flash
Price: 0.15 USD per 1M input tokens
Author: Google

Provider: Google | Tier: mid

Google's fast and affordable thinking model with native multimodal support and a 1M token context window. Combines reasoning capabilities with exceptional speed and low cost. One of the best value models available for multimodal and long-context workloads.

Released: 2025-10-08
Knowledge cutoff: 2025-06

Needs review | Updated 101d ago | 88% source confidence

Specifications

Context Window

1.0M tokens

Max Output

65.5K tokens

Input Price

$0.150 / 1M tokens

Output Price

$0.600 / 1M tokens

Latency Tier

Ultra Fast (speed score: 9.5/10)
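
The listed limits and prices translate directly into a per-request cost estimate. The following is a minimal sketch, assuming the figures shown above (rounded as they appear on this page) and that billing is a simple linear function of input and output tokens. The constant and function names are hypothetical and are not part of any SDK.

```python
# Figures copied from the Specifications above; names are illustrative only.
CONTEXT_WINDOW_TOKENS = 1_000_000   # "1.0M tokens" (rounded as listed)
MAX_OUTPUT_TOKENS = 65_500          # "65.5K tokens" (rounded as listed)
INPUT_USD_PER_MTOK = 0.150          # listed input price per 1M tokens
OUTPUT_USD_PER_MTOK = 0.600         # listed output price per 1M tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-1M-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("input exceeds the listed context window")
    if output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("output exceeds the listed max output")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200K-token document summarized into 2K tokens.
    print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

At these prices, input cost dominates long-context requests: filling the listed context window costs about $0.15 before any output is generated.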

Capability Profile

Speed

9.5/10

Long Context

9/10

Cost Efficiency

9/10

Multimodal

8.5/10

Structured Output

8/10

Instruction Following

8/10

Tool Use

8/10

Reasoning

7.5/10

Coding

7.5/10

Factuality

7.5/10

Safety & Enterprise

7.5/10

Conversational

7.5/10

Creativity

6.5/10

Feature Support

Vision Yes

Audio In Yes

Audio Out No

Video Yes

Image Generation No

Image Editing No

Function Calling Yes

JSON Mode Yes

Structured Output Yes

Streaming Yes

Reasoning Yes (built-in thinking mode; can be disabled via API)

Realtime No

Computer Use No

Web Search No

Best Use Cases

High-volume multimodal processing — images, audio, video — at low cost

Long document analysis with 1M token context at budget pricing

Real-time applications needing multimodal understanding with low latency

Agentic workflows requiring speed and tool use at scale

Video and audio content analysis and summarization

Not Ideal For

The hardest reasoning or coding problems where Pro/frontier models are needed

Nuanced creative writing requiring depth

Enterprise deployments requiring the strongest safety alignment

Tasks where you need the absolute best structured output compliance

Strengths

Extraordinary value — multimodal + reasoning + 1M context at $0.15/$0.60

Built-in thinking mode brings reasoning to a mid-tier price point

Native video and audio understanding at Flash-tier pricing

Very fast inference even with thinking enabled

1M token context with reasonable recall quality

Weaknesses

Reasoning quality below Gemini 2.5 Pro on hard problems

Structured output compliance is good but not Claude-level

Thinking overhead can be wasteful on simple tasks

Safety filtering is less predictable than Anthropic models

Edge Cases & Notes

Thinking can be disabled via API for pure speed on simple tasks

Video analysis works well on short clips but degrades on long content

Quality at the extremes of 1M context is lower than Gemini 2.5 Pro

Free tier is available with generous rate limits for development
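
To picture the note about disabling thinking, the sketch below builds a request body with thinking turned off. It assumes the Gemini API's REST `generateContent` request shape, where setting `generationConfig.thinkingConfig.thinkingBudget` to 0 disables thinking; check the field names against the current API reference before relying on them. The helper name `build_request` is hypothetical, and nothing is sent over the network.

```python
import json


def build_request(prompt: str, thinking: bool = True) -> dict:
    """Build a generateContent-style request body (field names are assumed)."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if not thinking:
        # Assumption: a thinking budget of 0 tokens turns thinking off.
        body["generationConfig"] = {"thinkingConfig": {"thinkingBudget": 0}}
    return body


if __name__ == "__main__":
    # A simple classification task where thinking overhead is likely wasted.
    request = build_request("Classify the sentiment: 'great product'", thinking=False)
    print(json.dumps(request, indent=2))
```

Leaving `thinking=True` omits the config entirely, so the model's default behavior applies.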

Provider Notes

Among the best-value multimodal models on the market at $0.15/$0.60 per 1M input/output tokens. Available through the Gemini API and Vertex AI, with a free tier. Recommended as the default for cost-sensitive multimodal workloads.

Benchmarks

MMLU 86.5%

HumanEval 84.2%

Arena Elo 1310

Benchmark Notes

MMLU-Pro 86.5%. Impressive for its price tier. Multimodal benchmarks are especially strong relative to cost. SWE-bench ~40%.

Research Meta

Last Evaluated

2026-04-01

Source Confidence

88%

Evaluation Method

LMSYS Arena, MMLU-Pro, multimodal evaluations, cost-quality Pareto analysis

Needs Re-evaluation

Sources

Google Gemini 2.5 Flash technical report
LMSYS Chatbot Arena
Artificial Analysis
