Gemini 2.5 Flash Lite

Name: Gemini 2.5 Flash Lite
Price: 0.075 USD per 1M input tokens
Author: Google

Tags: Google, budget

Google's ultra-budget multimodal model. At $0.075 per 1M input tokens, it is the cheapest available model with native vision, audio, and video understanding. It is designed for extreme-volume workloads where cost is the primary constraint.

Released: 2026-01-10
Knowledge cutoff: 2025-06

Needs review | Updated 118d ago | 82% source confidence

Specifications

Context Window: 1.0M tokens
Max Output: 16.4K tokens
Input Price: $0.075 / 1M tokens
Output Price: $0.300 / 1M tokens
Latency Tier: Ultra Fast (speed score: 10/10)
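The input and output prices above can be turned into a rough budget estimate. The sketch below is illustrative only: the function name and the workload figures (1M items, about 800 input and 50 output tokens each) are assumptions, and it ignores any free-tier allowance or media-specific token accounting.

```python
# Prices per 1M tokens, taken from the Specifications above.
INPUT_PRICE_PER_M = 0.075
OUTPUT_PRICE_PER_M = 0.300


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical workload: 1M items, ~800 input and ~50 output tokens each.
items = 1_000_000
total = estimate_cost_usd(items * 800, items * 50)
print(f"${total:,.2f}")
```

Under these assumed numbers the input side dominates the bill, which is typical for classification-style workloads with short outputs.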

Capability Profile

Speed: 10/10
Cost Efficiency: 10/10
Long Context: 8/10
Multimodal: 7.5/10
Structured Output: 7/10
Instruction Following: 7/10
Safety & Enterprise: 7/10
Tool Use: 6.5/10
Factuality: 6/10
Conversational: 6/10
Reasoning: 5.5/10
Coding: 5/10
Creativity: 5/10

Feature Support

Vision: Yes
Audio In: Yes
Audio Out: No
Video: Yes
Image Generation: No
Image Editing: No
Function Calling: Yes
JSON Mode: Yes
Structured Output: Yes
Streaming: Yes
Reasoning: No
Realtime: No
Computer Use: No
Web Search: No

Best Use Cases

Bulk video and image classification/tagging at massive scale

Content moderation pipelines processing millions of items

First-pass triage before escalating to a more capable model

Simple multimodal extraction tasks at the lowest possible cost

Audio transcription and basic summarization at scale

Not Ideal For

Complex reasoning or analysis of any kind

Code generation beyond simple snippets

Nuanced creative writing

Enterprise-critical decision making

Tasks requiring high factual accuracy

Strengths

Cheapest multimodal model available — $0.075/M input tokens

Native video, audio, and image understanding at budget pricing

1M token context window even at this price tier

Extremely fast inference — the fastest Google model

Weaknesses

Noticeably lower quality than Gemini 2.5 Flash across all task types

Reasoning and coding capabilities are quite limited

Produces shallow, generic responses on complex topics

Hallucination rate is higher than that of any other model in this list

Structured output can be unreliable on complex schemas
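Since structured output can be unreliable on complex schemas, callers may want to validate each response before trusting it. The following is a minimal sketch using only the Python standard library; `REQUIRED_FIELDS` and its field names are hypothetical, and a real pipeline might use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema for a tagging task: field name -> expected Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float, "tags": list}


def parse_structured(raw: str):
    """Return the parsed dict if it matches REQUIRED_FIELDS, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        value = data.get(name)
        # Accept ints where a float is expected (JSON has one number type),
        # but reject booleans, which Python treats as a subclass of int.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            continue
        if not isinstance(value, expected):
            return None
    return data
```

A `None` result can be used as the signal to retry the request or to hand the item to a more capable model.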

Edge Cases & Notes

Best used as a filter/router — let it classify and route to a better model when needed

Quality in non-English languages is significantly weaker than with Gemini 2.5 Flash

1M context is supported, but quality degrades noticeably past 200K tokens
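The filter/router pattern and the long-context caveat above can be combined into one routing function. This is a sketch under stated assumptions: `cheap_model` and `strong_model` are hypothetical callables that return a label and a confidence score (how that score is obtained is left to the caller), the 0.8 threshold is an arbitrary placeholder, and the 200K limit is taken from the note above.

```python
from typing import Callable, Tuple

# The 200K figure comes from the note above; the confidence threshold is
# an assumed placeholder that should be tuned per workload.
SOFT_CONTEXT_LIMIT = 200_000
CONFIDENCE_THRESHOLD = 0.8

# A classifier takes a prompt and returns (label, confidence).
Classifier = Callable[[str], Tuple[str, float]]


def route(
    prompt: str,
    prompt_tokens: int,
    cheap_model: Classifier,
    strong_model: Classifier,
) -> Tuple[str, str]:
    """Return (label, tier), where tier is 'cheap' or 'strong'."""
    # Try the cheap model only for inputs under the soft context limit.
    if prompt_tokens <= SOFT_CONTEXT_LIMIT:
        label, confidence = cheap_model(prompt)
        if confidence >= CONFIDENCE_THRESHOLD:
            return label, "cheap"
    # Low confidence or oversized input: escalate to the stronger model.
    label, _ = strong_model(prompt)
    return label, "strong"
```

Injecting the two models as callables keeps the routing logic independent of any particular SDK and makes it easy to test with stubs.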

Provider Notes

Available through the Gemini API and Vertex AI. A free tier with generous limits is offered. The go-to model for teams needing multimodal processing at scale on a minimal budget.

Benchmarks

MMLU: 74.5%
HumanEval: 68%
Arena Elo: 1150

Benchmark Notes

Modest benchmarks overall, but exceptional on cost-normalized metrics. Best evaluated as a cost-efficiency champion rather than an absolute quality leader.
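One possible reading of "cost-normalized" is benchmark points per dollar of input. The metric below is an illustrative assumption, not a definition used by this registry, and it is only meaningful when compared against the same ratio computed for other models.

```python
# Figures taken from this card; "points per dollar" is an assumed,
# illustrative metric rather than the registry's own formula.
MMLU_PERCENT = 74.5
INPUT_PRICE_PER_M = 0.075  # USD per 1M input tokens


def points_per_dollar(score: float, price_per_m: float) -> float:
    """Benchmark points per USD spent on 1M input tokens."""
    return score / price_per_m


mmlu_per_dollar = points_per_dollar(MMLU_PERCENT, INPUT_PRICE_PER_M)
print(round(mmlu_per_dollar, 1))
```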

Research Meta

Last Evaluated: 2026-03-15
Source Confidence: 82%
Evaluation Method: Public benchmarks, cost-efficiency analysis, multimodal evaluation

Needs Re-evaluation

Sources

Google Gemini 2.5 Flash Lite documentation
LMSYS Chatbot Arena
Artificial Analysis
