NexusRoute

Gemini 2.5 Flash Lite

Google | budget

Google's ultra-budget multimodal model. The cheapest model with native vision, audio, and video understanding available anywhere. Designed for extreme-volume workloads where cost is the primary constraint.

Released: 2026-01-10 | Knowledge cutoff: 2025-06
Medium confidence | Updated 72d ago | 82% source confidence

Specifications

Context Window

1.0M tokens

Max Output

16.4K tokens

Input Price

$0.075 / 1M tokens

Output Price

$0.300 / 1M tokens

Latency Tier

Ultra Fast (speed score: 10/10)
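
To make the listed prices concrete, here is a minimal sketch that estimates the cost of a batch job from the input and output rates above. The function name and the example workload (1M items at roughly 500 input and 20 output tokens each) are illustrative assumptions, not figures from the provider.

```python
# Listed prices from the specifications above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.075
OUTPUT_PRICE_PER_M = 0.300


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD for a given token volume at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical workload: 1M items, ~500 input and ~20 output tokens each.
items = 1_000_000
total = estimate_cost_usd(items * 500, items * 20)
print(f"${total:.2f}")  # prints "$43.50"
```

In this assumed workload the input side accounts for most of the bill ($37.50 of $43.50), which is typical for classification-style jobs with short outputs.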

Capability Profile

Speed 10/10
Cost Efficiency 10/10
Long Context 8/10
Multimodal 7.5/10
Structured Output 7/10
Instruction Following 7/10
Safety & Enterprise 7/10
Tool Use 6.5/10
Factuality 6/10
Conversational 6/10
Reasoning 5.5/10
Coding 5/10
Creativity 5/10

Feature Support

Vision Yes
Audio In Yes
Audio Out No
Video Yes
Image Generation No
Image Editing No
Function Calling Yes
JSON Mode Yes
Structured Output Yes
Streaming Yes
Reasoning No
Realtime No
Computer Use No
Web Search No

Best Use Cases

Bulk video and image classification/tagging at massive scale
Content moderation pipelines processing millions of items
First-pass triage before escalating to a more capable model
Simple multimodal extraction tasks at the lowest possible cost
Audio transcription and basic summarization at scale

Not Ideal For

Complex reasoning or analysis of any kind
Code generation beyond simple snippets
Nuanced creative writing
Enterprise-critical decision making
Tasks requiring high factual accuracy

Strengths

Cheapest multimodal model available, at $0.075 per 1M input tokens
Native video, audio, and image understanding at budget pricing
1M token context window even at this price tier
Extremely fast inference; the fastest Google model

Weaknesses

Noticeable quality drop vs. Gemini 2.5 Flash across the board
Reasoning and coding capabilities are quite limited
Produces shallow, generic responses on complex topics
Hallucination rate is higher than that of any other model in this list
Structured output can be unreliable on complex schemas
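
Because structured output can be unreliable on complex schemas, one defensive pattern is to validate every response before using it. The sketch below uses only the Python standard library; the function name, the flat key-to-type schema format, and the sample payloads are assumptions for illustration, not part of any provider SDK.

```python
import json
from typing import Any, Optional


def parse_structured(raw: str, required: dict[str, type]) -> Optional[dict[str, Any]]:
    """Return the parsed object if `raw` is a JSON object containing every
    required key with the expected type; otherwise return None so the
    caller can retry or escalate to a more capable model."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in required.items():
        if key not in data or not isinstance(data[key], expected_type):
            return None
    return data


# Hypothetical schema for a classification response.
SCHEMA = {"label": str, "confidence": float}

good = parse_structured('{"label": "spam", "confidence": 0.93}', SCHEMA)
bad = parse_structured('{"label": "spam"', SCHEMA)  # truncated JSON
```

Here `good` holds the parsed dictionary and `bad` is `None`, which the caller can treat as a signal to retry.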

Edge Cases & Notes

Best used as a filter/router — let it classify and route to a better model when needed
Quality on non-English languages is significantly weaker than with Gemini 2.5 Flash
1M-token context is supported, but quality degrades noticeably past 200K tokens
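
The filter/router pattern and the long-context caveat above can be combined into one routing decision. This is a sketch under stated assumptions: the `Triage` type, the confidence score, the 0.8 threshold, and treating 200K tokens as a hard cutoff are all hypothetical choices, and `fake_cheap_pass` stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable

# Assumed cutoff, based on the note that quality degrades past 200K tokens.
MAX_RELIABLE_CONTEXT_TOKENS = 200_000


@dataclass
class Triage:
    label: str
    confidence: float  # hypothetical 0.0-1.0 score from the cheap pass


def route(
    prompt_tokens: int,
    cheap_pass: Callable[[], Triage],
    confidence_threshold: float = 0.8,
) -> str:
    """Return "lite" to keep the cheap result, or "escalate" to hand off."""
    if prompt_tokens > MAX_RELIABLE_CONTEXT_TOKENS:
        return "escalate"  # skip the cheap pass on very long inputs
    result = cheap_pass()
    if result.confidence < confidence_threshold:
        return "escalate"
    return "lite"


# Stand-in for a real model call, used only to exercise the routing logic.
def fake_cheap_pass() -> Triage:
    return Triage(label="safe", confidence=0.95)


decision_short = route(1_200, fake_cheap_pass)    # "lite"
decision_long = route(350_000, fake_cheap_pass)   # "escalate"
```

Checking the token count first means very long inputs never incur the cheap call at all.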

Provider Notes

Available through Gemini API and Vertex AI. Free tier with generous limits. The go-to model for teams needing multimodal at scale with minimal budget.

Benchmarks

MMLU 74.5%
HumanEval 68%
Arena Elo 1150

Benchmark Notes

Modest benchmarks overall, but exceptional on cost-normalized metrics. Best evaluated as a cost-efficiency champion rather than an absolute quality leader.

Research Meta

Last Evaluated

2026-03-15

Source Confidence

82%

Evaluation Method

Public benchmarks, cost-efficiency analysis, multimodal evaluation

Needs Re-evaluation

No

Sources

  • Google Gemini 2.5 Flash Lite documentation
  • LMSYS Chatbot Arena
  • Artificial Analysis