Google family

Gemini 3

Gemini 3: Gemini 3.1 Pro ranks #5 of 186 with $2/$12 per 1M tokens. Compare Gemini 3 Pro, Flash, and Lite by workload.

Top in this family

Gemini 3.1 Pro ranks #5 of 186 on overall quality (QS 104.3) at $2/$12 per 1M tokens.

Practical pick

Gemini 3 Flash (Preview) at $0.5/$3 per 1M tokens (rank #36 of 186).

Variants: 3
License: Closed weights
Provider: Google

★ Most teams should start here

Gemini 3 Flash

Variant: Preview

The practical default. Carries Gemini 3's quality ceiling for everyday API workloads at a fraction of Pro pricing. Step up to 3 Pro only when the workload visibly benefits.

Quality Score: 87.3
Input: $0.500/1M
Output: $3.00/1M
Context: not declared
License: Closed · API

Best variant by workload

One pick per common job. Pick by what you need to ship — not by which variant has the highest score on a leaderboard you don't use.

Note — picks are framed for direct API usage where cost per million tokens is load-bearing. If you're inside an agent harness (Claude Code, Cursor, etc.) the calculus changes: the harness sets the model, the per-task cost is usually negligible, and the flagship variant tends to win. See our piece on Claude Code for the harness-vs-API framing.

Workload | Best pick | Why
General API workhorse | Gemini 3 Flash (Preview), $0.500/1M input / $3.00/1M output | Best quality-per-dollar in the family for chat, summarization, and tool-augmented assistants.
Long-context RAG | Gemini 3 Pro (3.1), $2.00/1M input / $12.00/1M output | Strongest long-context recall in the family. Pick when document scale and faithful retrieval over long inputs dominate.
High-volume chat | Gemini 3.1 Flash Lite (Latest), $0.250/1M input / $1.50/1M output | Cheapest production-grade tier in the current generation. Use for high-volume chat where per-token cost compounds.
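
To make the per-token trade-off concrete, here is a minimal sketch that turns the list prices above into a monthly bill. The prices are copied from this page; the token volumes and the `monthly_cost` helper are hypothetical, so substitute your own traffic numbers.

```python
# List prices in USD per 1M tokens (input, output), copied from this page.
PRICES = {
    "Gemini 3 Pro 3.1": (2.00, 12.00),
    "Gemini 3 Flash": (0.50, 3.00),
    "Gemini 3.1 Flash Lite": (0.25, 1.50),
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD for a given token volume."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload: 2B input tokens and 500M output tokens per month.
for name in PRICES:
    cost = monthly_cost(name, 2_000_000_000, 500_000_000)
    print(f"{name}: ${cost:,.2f}")
```

On that assumed volume the three tiers come to $10,000.00, $2,500.00, and $1,250.00 per month at list price, which is the gap a quality improvement has to justify.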

All variants

20 variants: 5 across the 3 Gemini models, plus 15 from 2 cross-family models shown for context. Sorted by quality score (descending).

Benchmark columns in the source: GPQA, HLE, SWE, SWE-Pro, Terminal, Tau, MCP, AIME. Scores are listed in that column order with empty cells omitted; "-" marks a value that is not shown.

Variant | Model | QS | Rank | Benchmark scores | In $/M | Out $/M | Context | Released
3.1 | Gemini 3 Pro | 104.3 | #5/186 | 94.3, 44.4, 80.6, 54.2, 68.5, 90.8, 73.9 | $2 | $12 | - | Nov 18, 2025
3.0 | Gemini 3 Pro | 95.0 | #20/186 | 91.9, 37.5, 76.2, 43.3, 54.2, 85.3, 54.1, 95.0 | $2 | $12 | - | Nov 18, 2025
3.0 | Gemini 3 Flash | 88.9 | #32/186 | 90.4, 33.7, 78.0, 49.6, 47.6, 62.0 | $0.5 | $3 | - | Dec 17, 2025
Preview | Gemini 3 Flash | 87.3 | #36/186 | 62.0 | $0.5 | $3 | - | Dec 17, 2025
Latest | Gemini 3.1 Flash Lite | 81.1 | #59/186 | 86.9, 16.0, 57.1 | $0.25 | $1.5 | - | -
4.8 Thinking (cross-family) | Anthropic Claude Opus 4 | 108.6 | #2/186 | 93.6, 49.8, 88.6, 69.2, 82.2 | $5 | $25 | 200K | May 22, 2025
4.7 Thinking (cross-family) | Anthropic Claude Opus 4 | 107.8 | #3/186 | 94.2, 46.9, 87.6, 64.3, 69.4, 77.3 | $5 | $25 | 200K | May 22, 2025
4.6 Thinking (cross-family) | Anthropic Claude Opus 4 | 104.1 | #6/186 | 91.3, 40.0, 80.8, 53.4, 65.4, 91.9, 59.5, 95.6 | $5 | $25 | 1.0M | May 22, 2025
4.5 Thinking (cross-family) | Anthropic Claude Opus 4 | 98.6 | #13/186 | 87.0, 30.8, 80.9, 59.3, 88.9, 62.3, 92.8 | $5 | $25 | 200K | May 22, 2025
V4 Pro Thinking (cross-family) | DeepSeek V4 | 98.0 | #15/186 | 90.1, 37.7, 80.6, 55.4, 73.6 | $0.435 | $0.87 | 1.0M | Apr 24, 2026
4.6 Non-thinking (cross-family) | Anthropic Claude Opus 4 | 93.1 | #23/186 | 19.0 | $5 | $25 | 200K | May 22, 2025
V4 Flash Thinking (cross-family) | DeepSeek V4 | 92.0 | #27/186 | 88.1, 34.8, 79.0, 52.6, 69.0 | $0.098 | $0.197 | 1.0M | Apr 24, 2026
4.1 Thinking (cross-family) | Anthropic Claude Opus 4 | 83.1 | #50/186 | 81.0, 11.7, 74.5, 38.0, 86.8, 40.9, 78.0 | $15 | $75 | 200K | May 22, 2025
V4 Pro (cross-family) | DeepSeek V4 | 80.9 | #61/186 | 72.9, 7.7, 73.6, 52.1, 69.4 | $0.435 | $0.87 | 1.0M | Apr 24, 2026
4.5 Non-thinking (cross-family) | Anthropic Claude Opus 4 | 80.7 | #63/186 | 14.2, 45.9 | $5 | $25 | 200K | May 22, 2025
4.0 Thinking (cross-family) | Anthropic Claude Opus 4 | 80.7 | #64/186 | 79.6, 10.7, 72.5, 81.4, 75.5 | $15 | $75 | 200K | May 22, 2025
4.0 Non-thinking (cross-family) | Anthropic Claude Opus 4 | 79.1 | #73/186 | 74.9, 6.7, 72.5, 81.8, 33.9 | $15 | $75 | 200K | May 22, 2025
V4 Flash (cross-family) | DeepSeek V4 | 78.1 | #78/186 | 71.2, 8.1, 73.7, 49.1, 64.0 | $0.098 | $0.197 | 1.0M | Apr 24, 2026
4.1 Non-thinking (cross-family) | Anthropic Claude Opus 4 | 70.4 | #115/186 | 7.9 | $15 | $75 | 200K | May 22, 2025
4.7 Non-thinking (cross-family) | Anthropic Claude Opus 4 | - | - | - | $5 | $25 | 200K | May 22, 2025

Benchmark evidence

Every benchmark we track for this family, across capabilities. The headline Quality Score draws from a deliberately narrow, governed panel (58 of 186 rows here feed it); the rest is tracked evidence — recorded and comparable, but not folded into one synthetic score.

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | LiveCodeBench · v6 | 90.7 | 1 / 40 | In Quality Score
Gemini 3 Pro · 3.0 | MMLU Pro | 90.1 | 1 / 86 | In Quality Score
Gemini 3 Pro · 3.1 | SimpleBench | 79.6 | 1 / 61 | In Quality Score
Gemini 3 Pro · 3.0 | τ²-bench · airline | 73 | 1 / 29 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · verified | 48 | 1 / 5 | In Quality Score
Gemini 3 Pro · 3.1 | Humanity's Last Exam · hle_text | 47.3 | 1 / 56 | In Quality Score
Gemini 3 Pro · 3.0 | AIME 2025 · code_exec | 100 | 2 / 4 | In Quality Score
Gemini 3 Pro · 3.1 | τ²-bench · telecom | 99.3 | 2 / 28 | In Quality Score

Reasoning

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | MMLU Pro | 90.1 | 1 / 86 | In Quality Score
Gemini 3 Pro · 3.1 | SimpleBench | 79.6 | 1 / 61 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · verified | 48 | 1 / 5 | In Quality Score
Gemini 3 Pro · 3.1 | Humanity's Last Exam · hle_text | 47.3 | 1 / 56 | In Quality Score
Gemini 3 Pro · 3.0 | AIME 2025 · code_exec | 100 | 2 / 4 | In Quality Score
Gemini 3 Flash · 3.0 | AIME 2025 · no_tools | 95.2 | 2 / 15 | In Quality Score
Gemini 3 Pro · 3.1 | GPQA Diamond | 94.3 | 2 / 143 | In Quality Score
Gemini 3 Pro · 3.1 | Humanity's Last Exam · search_code | 51.4 | 2 / 6 | In Quality Score
Gemini 3 Flash · 3.0 | AIME 2025 · code_exec | 99.7 | 3 / 4 | In Quality Score
Gemini 3 Pro · 3.0 | AIME 2025 | 95 | 3 / 88 | In Quality Score
Gemini 3 Pro · 3.0 | AIME 2025 · no_tools | 95 | 3 / 15 | In Quality Score
Gemini 3 Pro · 3.0 | SimpleBench | 76.4 | 3 / 61 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · hle_text | 37.5 | 3 / 56 | In Quality Score
Gemini 3 Pro · 3.1 | LiveBench | 79.9 | 4 / 110 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · search_code | 45.8 | 4 / 6 | In Quality Score
Gemini 3 Pro · 3.1 | Humanity's Last Exam · hle | 44.4 | 4 / 90 | In Quality Score
Gemini 3 Pro · 3.1 | Arena Elo | 1487 | 6 / 158 | In Quality Score
Gemini 3 Flash · 3.0 | Humanity's Last Exam · search_code | 43.5 | 6 / 6 | In Quality Score
Gemini 3 Pro · 3.0 | Arena Elo | 1486 | 7 / 158 | In Quality Score
Gemini 3 Pro · 3.0 | GPQA Diamond | 91.9 | 9 / 143 | In Quality Score
Gemini 3 Pro · 3.1 | Humanity's Last Exam · tools | 51.4 | 10 / 38 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · hle | 37.5 | 11 / 90 | In Quality Score
Gemini 3 Flash · 3.0 | GPQA Diamond | 90.4 | 12 / 143 | In Quality Score
Gemini 3 Flash · 3.0 | SimpleBench | 61.1 | 12 / 61 | In Quality Score
Gemini 3 Flash · 3.0 | Humanity's Last Exam · hle | 33.7 | 14 / 90 | In Quality Score
Gemini 3 Flash · 3.0 | Arena Elo | 1473 | 17 / 158 | In Quality Score
Gemini 3 Pro · 3.0 | Humanity's Last Exam · tools | 45.8 | 21 / 38 | In Quality Score
Gemini 3 Pro · 3.0 | LiveBench | 73.4 | 23 / 110 | In Quality Score
Gemini 3 Flash · Preview | Arena Elo | 1461 | 25 / 158 | In Quality Score
Gemini 3.1 Flash Lite · Latest | GPQA Diamond | 86.9 | 26 / 143 | In Quality Score
Gemini 3 Flash · Preview | LiveBench | 72.4 | 26 / 110 | In Quality Score
Gemini 3.1 Flash Lite · Latest | Humanity's Last Exam · hle_text | 8.0 | 35 / 56 | In Quality Score
Gemini 3.1 Flash Lite · Latest | Humanity's Last Exam · hle | 16 | 51 / 90 | In Quality Score
Gemini 3.1 Flash Lite · Latest | Arena Elo | 1433 | 52 / 158 | In Quality Score
Gemini 3.1 Flash Lite · Latest | LiveBench | 61.7 | 55 / 110 | In Quality Score
Gemini 3 Pro · 3.0 | VendingBench2 | 5478.2 | 1 / 4 | Tracked evidence
Gemini 3 Pro · 3.0 | Global PIQA | 93.4 | 1 / 26 | Tracked evidence
Gemini 3 Pro · 3.0 | GlobalPIQA | 93.4 | 1 / 4 | Tracked evidence
Gemini 3 Flash · 3.0 | MMMLU | 91.8 | 1 / 38 | Tracked evidence
Gemini 3 Pro · 3.1 | BrowseComp · context_manage | 85.9 | 1 / 15 | Tracked evidence
Gemini 3 Pro · 3.0 | WMT24++ | 80.7 | 1 / 6 | Tracked evidence
Gemini 3 Pro · 3.0 | SimpleQA | 72.1 | 1 / 40 | Tracked evidence
Gemini 3 Pro · 3.0 | FACTS Benchmark Suite | 70.5 | 1 / 12 | Tracked evidence
Gemini 3 Pro · 3.1 | SciCode | 59 | 1 / 24 | Tracked evidence
Gemini 3 Pro · 3.1 | AIME 2026 | 98.2 | 2 / 19 | Tracked evidence
Gemini 3 Flash · 3.0 | Global PIQA | 92.8 | 2 / 26 | Tracked evidence
Gemini 3 Pro · 3.1 | MMLU | 92.6 | 2 / 33 | Tracked evidence
Gemini 3 Pro · 3.0 | MMMLU | 91.8 | 2 / 38 | Tracked evidence
Gemini 3 Pro · 3.1 | IPhO 2025 (Theory) | 87.7 | 2 / 3 | Tracked evidence
Gemini 3 Pro · 3.1 | BrowseComp | 85.9 | 2 / 51 | Tracked evidence
Gemini 3 Flash · 3.0 | MMMU PRO | 81.2 | 2 / 52 | Tracked evidence
Gemini 3 Flash · 3.0 | SimpleQA | 68.7 | 2 / 40 | Tracked evidence
Gemini 3 Pro · 3.0 | SciCode | 56 | 2 / 24 | Tracked evidence
Gemini 3 Pro · 3.0 | HMMT Feb 2025 | 97.3 | 3 / 44 | Tracked evidence
Gemini 3 Pro · 3.0 | MMLU | 91.8 | 3 / 33 | Tracked evidence
Gemini 3 Pro · 3.1 | MRCR · v2_128k | 84.9 | 3 / 23 | Tracked evidence
Gemini 3 Flash · 3.0 | FACTS Benchmark Suite | 61.9 | 3 / 12 | Tracked evidence
Gemini 3 Pro · 3.0 | MathArenaApex | 23.4 | 3 / 8 | Tracked evidence
Gemini 3 Pro · 3.1 | HealthBench · hard | 20.6 | 3 / 5 | Tracked evidence
Gemini 3 Pro · 3.1 | Frontier Science Research | 23.3 | 4 / 4 | Tracked evidence
Gemini 3 Pro · 3.1 | HMMT Nov 2025 | 94.8 | 5 / 31 | Tracked evidence
Gemini 3 Pro · 3.0 | MAXIFE | 87.5 | 5 / 21 | Tracked evidence
Gemini 3 Pro · 3.0 | MMMU PRO | 81 | 5 / 52 | Tracked evidence
Gemini 3 Pro · 3.1 | FinanceAgent | 59.7 | 5 / 15 | Tracked evidence
Gemini 3 Pro · 3.1 | FrontierMath · tier1_3 | 36.9 | 5 / 5 | Tracked evidence
Gemini 3 Pro · 3.1 | FrontierMath · tier4 | 16.7 | 5 / 5 | Tracked evidence
Gemini 3 Pro · 3.1 | HMMT Feb 2026 | 87.3 | 6 / 16 | Tracked evidence
Gemini 3 Pro · 3.1 | MMMU PRO | 80.5 | 6 / 52 | Tracked evidence
Gemini 3 Pro · 3.0 | BrowseComp_zh | 66.8 | 6 / 20 | Tracked evidence
Gemini 3 Pro · 3.1 | FinanceAgent · v2 | 43 | 6 / 7 | Tracked evidence
Gemini 3 Pro · 3.0 | MRCR · v2_1m | 26.3 | 6 / 14 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | MMMLU | 88.9 | 7 / 38 | Tracked evidence
Gemini 3 Pro · 3.0 | MRCR · v2_128k | 77 | 7 / 23 | Tracked evidence
Gemini 3 Flash · 3.0 | FinanceAgent · v2 | 42.6 | 7 / 7 | Tracked evidence
Gemini 3 Pro · 3.1 | MRCR · v2_1m | 26.3 | 7 / 14 | Tracked evidence
Gemini 3 Flash · 3.0 | MRCR · v2_128k | 67.2 | 8 / 23 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | SimpleQA | 43.3 | 8 / 40 | Tracked evidence
Gemini 3 Flash · 3.0 | MRCR · v2_1m | 22.1 | 8 / 14 | Tracked evidence
Gemini 3 Pro · 3.0 | IMO AnswerBench | 83.3 | 9 / 28 | Tracked evidence
Gemini 3 Pro · 3.0 | IFBench | 70 | 9 / 28 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | FACTS Benchmark Suite | 40.6 | 9 / 12 | Tracked evidence
Gemini 3 Pro · 3.0 | HMMT Nov 2025 | 93 | 10 / 31 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | MRCR · v2_128k | 60.1 | 11 / 23 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | MRCR · v2_1m | 12.3 | 11 / 14 | Tracked evidence
Gemini 3 Pro · 3.1 | IMO AnswerBench | 81 | 13 / 28 | Tracked evidence
Gemini 3 Pro · 3.0 | BrowseComp · context_manage | 59.2 | 13 / 15 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | MMMU PRO | 76.8 | 14 / 52 | Tracked evidence
Gemini 3 Pro · 3.0 | FinanceAgent | 44.1 | 14 / 15 | Tracked evidence
Gemini 3 Pro · 3.0 | AIME 2026 | 90.6 | 18 / 19 | Tracked evidence
Gemini 3 Pro · 3.0 | BrowseComp | 37.8 | 34 / 51 | Tracked evidence

Coding

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | LiveCodeBench · v6 | 90.7 | 1 / 40 | In Quality Score
Gemini 3 Pro · 3.1 | LiveCodeBench · pro | 82.9 | 2 / 5 | In Quality Score
Gemini 3 Pro · 3.1 | SWE-bench Verified | 80.6 | 7 / 68 | In Quality Score
Gemini 3 Pro · 3.1 | GSO (Global Software Optimization) · opt_at_1 | 21.6 | 7 / 24 | In Quality Score
Gemini 3 Pro · 3.0 | GSO (Global Software Optimization) · opt_at_1 | 17.6 | 8 / 24 | In Quality Score
Gemini 3.1 Flash Lite · Latest | LiveCodeBench | 72 | 10 / 69 | In Quality Score
Gemini 3 Flash · 3.0 | GSO (Global Software Optimization) · opt_at_1 | 7.8 | 11 / 24 | In Quality Score
Gemini 3 Flash · 3.0 | SWE-bench Verified | 78 | 13 / 68 | In Quality Score
Gemini 3 Pro · 3.0 | SWE-bench Verified | 76.2 | 24 / 68 | In Quality Score
Gemini 3 Pro · 3.0 | SecCodeBench | 62.4 | 4 / 6 | Tracked evidence
Gemini 3 Pro · 3.1 | NL2Repo | 33.4 | 7 / 9 | Tracked evidence
Gemini 3 Pro · 3.0 | SWE-bench Multilingual | 65 | 18 / 18 | Tracked evidence

Agentic

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | τ²-bench · airline | 73 | 1 / 29 | In Quality Score
Gemini 3 Pro · 3.1 | τ²-bench · telecom | 99.3 | 2 / 28 | In Quality Score
Gemini 3 Pro · 3.0 | τ²-bench · average | 90.7 | 2 / 30 | In Quality Score
Gemini 3 Pro · 3.1 | τ²-bench · retail | 90.8 | 3 / 34 | In Quality Score
Gemini 3 Flash · 3.0 | τ²-bench · average | 90.2 | 3 / 30 | In Quality Score
Gemini 3 Pro · 3.1 | MCP Atlas · public_set | 69.2 | 4 / 13 | In Quality Score
Gemini 3 Pro · 3.0 | τ²-bench · telecom | 98 | 6 / 28 | In Quality Score
Gemini 3 Pro · 3.0 | τ²-bench · retail | 85.3 | 7 / 34 | In Quality Score
Gemini 3 Pro · 3.1 | MCP Atlas | 73.9 | 7 / 33 | In Quality Score
Gemini 3 Pro · 3.0 | MCP Atlas · public_set | 66.6 | 8 / 13 | In Quality Score
Gemini 3 Flash · Preview | MCP Atlas | 62 | 16 / 33 | In Quality Score
Gemini 3 Flash · 3.0 | MCP Atlas | 62 | 17 / 33 | In Quality Score
Gemini 3.1 Flash Lite · Latest | MCP Atlas | 57.1 | 23 / 33 | In Quality Score
Gemini 3 Pro · 3.0 | MCP Atlas | 54.1 | 25 / 33 | In Quality Score
Gemini 3 Pro · 3.0 | VendingBench · v2 | 5478 | 1 / 7 | Tracked evidence
Gemini 3 Pro · 3.0 | FinSearchComp · t2_t3 | 49.9 | 2 / 2 | Tracked evidence
Gemini 3 Pro · 3.0 | BFCL v4 | 72.5 | 3 / 18 | Tracked evidence
Gemini 3 Pro · 3.0 | MCPMark | 53.9 | 3 / 8 | Tracked evidence
Gemini 3 Flash · 3.0 | VendingBench · v2 | 3635 | 4 / 7 | Tracked evidence
Gemini 3 Pro · 3.1 | Automation Bench | 9.6 | 5 / 5 | Tracked evidence
Gemini 3 Pro · 3.1 | OSWorld · verified | 76.2 | 6 / 27 | Tracked evidence
Gemini 3 Pro · 3.1 | DeepSearchQA | 69.7 | 6 / 7 | Tracked evidence
Gemini 3 Pro · 3.1 | GDPVal | 67.3 | 6 / 6 | Tracked evidence
Gemini 3 Flash · 3.0 | Toolathlon | 49.4 | 7 / 31 | Tracked evidence
Gemini 3 Pro · 3.0 | DeepPlanning | 23.3 | 7 / 16 | Tracked evidence
Gemini 3 Pro · 3.1 | Toolathlon | 48.8 | 8 / 31 | Tracked evidence
Gemini 3 Pro · 3.1 | τ³-Bench | 67.1 | 9 / 10 | Tracked evidence
Gemini 3 Pro · 3.0 | Seal-0 | 45.5 | 9 / 16 | Tracked evidence
Gemini 3 Pro · 3.0 | CyberGym | 39.9 | 10 / 12 | Tracked evidence
Gemini 3 Pro · 3.0 | WideSearch | 57 | 11 / 13 | Tracked evidence
Gemini 3 Pro · 3.1 | GDPVal-AA | 1314 | 13 / 17 | Tracked evidence
Gemini 3 Flash · 3.0 | OSWorld · verified | 65.1 | 14 / 27 | Tracked evidence
Gemini 3 Flash · 3.0 | GDPVal-AA | 1204 | 15 / 17 | Tracked evidence
Gemini 3 Pro · 3.0 | GDPVal-AA | 1201 | 16 / 17 | Tracked evidence
Gemini 3 Pro · 3.0 | Toolathlon | 36.4 | 22 / 31 | Tracked evidence

Multimodal

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | AI2D · test | 94.1 | 1 / 33 | Tracked evidence
Gemini 3 Pro · 3.0 | VideoMME · with_sub | 88.4 | 1 / 22 | Tracked evidence
Gemini 3 Pro · 3.0 | VideoMME · without_sub | 87.7 | 1 / 21 | Tracked evidence
Gemini 3 Pro · 3.0 | Video-MMMU | 87.6 | 1 / 28 | Tracked evidence
Gemini 3 Pro · 3.1 | MedXpertQA · mm | 81.3 | 1 / 31 | Tracked evidence
Gemini 3 Pro · 3.0 | LVBench | 76.2 | 1 / 18 | Tracked evidence
Gemini 3 Pro · 3.0 | SimpleVQA | 73.2 | 1 / 29 | Tracked evidence
Gemini 3 Pro · 3.1 | MedXpertQA · text | 71.5 | 1 / 5 | Tracked evidence
Gemini 3 Pro · 3.0 | ERQA | 70.5 | 1 / 27 | Tracked evidence
Gemini 3 Pro · 3.0 | WorldVQA | 47.4 | 1 / 5 | Tracked evidence
Gemini 3 Pro · 3.0 | MMBench · en_dev_v1_1 | 93.7 | 2 / 24 | Tracked evidence
Gemini 3 Flash · 3.0 | Video-MMMU | 86.9 | 2 / 28 | Tracked evidence
Gemini 3 Pro · 3.0 | MMStar | 83.1 | 2 / 33 | Tracked evidence
Gemini 3 Pro · 3.0 | ScreenSpot-Pro | 72.7 | 2 / 24 | Tracked evidence
Gemini 3 Pro · 3.1 | SimpleVQA | 72.4 | 2 / 29 | Tracked evidence
Gemini 3 Pro · 3.0 | MotionBench | 70.3 | 2 / 4 | Tracked evidence
Gemini 3 Pro · 3.1 | ERQA | 69.4 | 2 / 27 | Tracked evidence
Gemini 3 Pro · 3.0 | BabyVision | 49.7 | 2 / 22 | Tracked evidence
Gemini 3 Pro · 3.0 | ZEROBench · sub | 39 | 2 / 23 | Tracked evidence
Gemini 3 Pro · 3.1 | ZEROBench | 19 | 2 / 27 | Tracked evidence
Gemini 3 Pro · 3.0 | MathVista · mini | 87.9 | 3 / 36 | Tracked evidence
Gemini 3 Pro · 3.0 | MathVision | 86.6 | 3 / 17 | Tracked evidence
Gemini 3 Pro · 3.0 | SLAKE | 81.3 | 3 / 22 | Tracked evidence
Gemini 3 Pro · 3.0 | MMVU | 77.5 | 3 / 20 | Tracked evidence
Gemini 3 Pro · 3.0 | ODinW · 13 | 46.3 | 3 / 13 | Tracked evidence
Gemini 3 Pro · 3.0 | CountBench | 97.3 | 4 / 23 | Tracked evidence
Gemini 3 Pro · 3.0 | MedXpertQA · mm | 76 | 4 / 31 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | Video-MMMU | 84.8 | 5 / 28 | Tracked evidence
Gemini 3 Pro · 3.1 | CharXiv Reasoning | 83.3 | 5 / 48 | Tracked evidence
Gemini 3 Pro · 3.0 | ZEROBench | 10 | 5 / 27 | Tracked evidence
Gemini 3 Pro · 3.0 | DynaMath | 85.1 | 6 / 23 | Tracked evidence
Gemini 3 Flash · 3.0 | ScreenSpot-Pro | 69.1 | 6 / 24 | Tracked evidence
Gemini 3 Pro · 3.0 | RefSpatialBench | 65.5 | 6 / 21 | Tracked evidence
Gemini 3 Pro · 3.0 | RealWorldQA | 83.3 | 7 / 24 | Tracked evidence
Gemini 3 Pro · 3.0 | LingoQA | 72.8 | 8 / 16 | Tracked evidence
Gemini 3 Pro · 3.0 | CharXiv Reasoning | 81.4 | 9 / 48 | Tracked evidence
Gemini 3 Pro · 3.0 | MVBench | 74.1 | 10 / 18 | Tracked evidence
Gemini 3 Pro · 3.0 | HallusionBench | 68.6 | 11 / 33 | Tracked evidence
Gemini 3 Pro · 3.1 | ScreenSpot-Pro | 61 | 11 / 24 | Tracked evidence
Gemini 3 Pro · 3.0 | V* | 88 | 12 / 23 | Tracked evidence
Gemini 3 Pro · 3.0 | MLVU · mavg | 83 | 12 / 22 | Tracked evidence
Gemini 3 Flash · 3.0 | CharXiv Reasoning | 80.3 | 12 / 48 | Tracked evidence
Gemini 3 Pro · 3.0 | RefCOCO · avg | 84.1 | 16 / 18 | Tracked evidence
Gemini 3.1 Flash Lite · Latest | CharXiv Reasoning | 73.2 | 22 / 48 | Tracked evidence
Gemini 3 Pro · 3.0 | EmbSpatialBench | 61.2 | 23 / 24 | Tracked evidence

Document/OCR

Model / Variant | Benchmark | Score | Rank | Scoring
Gemini 3 Pro · 3.0 | MMLongBench-Doc | 60.5 | 3 / 22 | Tracked evidence
Gemini 3 Pro · 3.0 | OCRBench | 90.4 | 5 / 35 | Tracked evidence
Gemini 3 Flash · 3.0 | OmniDocBench · v1_5 | 0.1 | 5 / 6 | Tracked evidence
Gemini 3 Pro · 3.0 | OmniDocBench · v1_5 | 0.1 | 6 / 6 | Tracked evidence

Where this family sits in the market

Gemini 3 Flash and 3.1 Flash Lite take the price-efficiency frontier within the family. 3 Pro trades cost for headroom on the hardest workloads.

[Chart: models plotted by cost and quality, grouped by provider: Anthropic, Cohere, DeepSeek, Google, Meta, Microsoft, MiniMax, Mistral, Moonshot, NVIDIA, OpenAI, Qwen, xAI, Zhipu.]

Dashed line = Pareto frontier (no model both cheaper and better). Thinking/non-thinking pairs of the same model are connected; line length = cost of reasoning.
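
The Pareto-frontier rule ("no model both cheaper and better") can be sketched in a few lines. This is an illustration only: it uses the five Gemini variants and their Quality Scores from this page, and it assumes output price per 1M tokens as the cost axis, which may not be the axis the chart actually uses. The `Point` type and `pareto_frontier` function are hypothetical names.

```python
from typing import NamedTuple


class Point(NamedTuple):
    name: str
    price: float    # assumed cost axis: output USD per 1M tokens
    quality: float  # Quality Score from this page


def pareto_frontier(points: list[Point]) -> list[Point]:
    """Keep points that no other point beats on price and quality at once."""
    frontier = []
    for p in points:
        dominated = any(
            q.price <= p.price
            and q.quality >= p.quality
            and (q.price < p.price or q.quality > p.quality)
            for q in points
        )
        if not dominated:
            frontier.append(p)
    return sorted(frontier, key=lambda p: p.price)


GEMINI = [
    Point("Gemini 3 Pro 3.1", 12.0, 104.3),
    Point("Gemini 3 Pro 3.0", 12.0, 95.0),
    Point("Gemini 3 Flash 3.0", 3.0, 88.9),
    Point("Gemini 3 Flash Preview", 3.0, 87.3),
    Point("Gemini 3.1 Flash Lite", 1.5, 81.1),
]

for point in pareto_frontier(GEMINI):
    print(point.name, point.price, point.quality)
```

Within the family this keeps Flash Lite, Flash 3.0, and Pro 3.1: the other two variants cost the same as a sibling that scores higher. Adding cross-family models to the list would change which points survive.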

The Gemini 3 family

Every variant we track in this family, grouped by license. Use this to orient before drilling into the variant table.

Closed · API only (3)

  • Gemini 3 Pro: 2 variants
  • Gemini 3 Flash: 2 variants
  • Gemini 3.1 Flash Lite: 1 variant

Alternatives to consider

Peer families that solve overlapping problems. Pick by your binding constraint (cost, latency, open weights, vendor lock-in), not by leaderboard order.

Caveats

What this page does not tell you, listed honestly.

  • Context window not declared for: Gemini 3 Pro, Gemini 3 Flash, Gemini 3.1 Flash Lite.
  • Cross-family models (marked "cross-family" in the variants table) are shown for context only. Their canonical page lives on the family that owns them.

Editor's notes

By Boris · Last verified · AI-assisted, human-reviewed

Why this family matters

Gemini 3 is Google's current generation, structured as a three-tier ladder (Pro, Flash, Flash Lite). Two facts pull most teams onto this family. First, the per-token pricing is aggressive against the closed-flagship field: Gemini 3 Flash lists at $0.5 input / $3 output per million tokens and Gemini 3.1 Flash Lite at $0.25 / $1.5, which sits below the equivalent tiers on most competitor pricing tables we track. Second, Gemini 3 Pro 3.1 lands at Quality Score 104.3 (#5 of 186 models in our index), which puts the flagship inside the same cluster as the closed-frontier tier, not below it.

The combination is unusual. Most families are either cheap-and-mid or expensive-and-frontier; Gemini 3 ships both ends of the ladder simultaneously, which makes "which tier" the entire decision.

Which variant to start with

Default to google-gemini-3-flash. At Quality Score 88.9 (#32 of 186) for the 3.0 variant (the Preview variant scores 87.3, #36 of 186) and $0.5 input / $3 output, it is the family's price-quality sweet spot for chat, summarization, and tool-augmented assistants. For most teams shipping API-backed product features, this is the practical default.

Step up to google-gemini-3-pro when the workload visibly benefits from the additional headroom. The 3.1 variant lands at QS 104.3 against Flash's 88.9, a meaningful jump on the hardest reasoning, coding, and multi-step planning evals (GPQA Diamond 94.3, LiveBench 79.93). The price gap is 4x on both input ($2 vs $0.5) and output ($12 vs $3), so the workload needs to be one where the score delta translates to a measurable product win.

Drop to google-gemini-3-1-flash-lite for high-volume chat at scale. Quality Score 81.1 vs Flash's 88.9 is a real gap, but at $0.25 / $1.5 the per-token saving compounds quickly on workloads dominated by repetitive low-stakes turns.

When to deviate:

  • Long-context RAG: use google-gemini-3-pro. The 3.0 variant in our index ships with a 1M-token context window at the same headline pricing. Reach for it when document scale and faithful retrieval over long inputs dominate the workload.
  • Hardest-tier reasoning workloads: Gemini 3 Pro 3.1 is competitive at the very top of our reasoning leaderboards (GPQA Diamond rank 2 of 143, LiveBench rank 4 of 110). Run a side-by-side against your closed-flagship alternative on the specific reasoning benchmark that matters; on the numbers we have, the cost-quality trade-off is genuinely close.
  • You already use Flash for everything: before adding Pro to the rotation, run an A/B on your specific eval. The score gap on benchmarks is real; whether it shows up on your traffic distribution is the question worth answering with data, not vibes.
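
One way to answer the A/B question with data is a paired comparison on your own eval set. The sketch below assumes you have already scored both models on the same prompts; the score lists are made-up placeholders, and `paired_delta_ci` is a hypothetical helper that uses a simple percentile bootstrap, not a method this page prescribes.

```python
import random
from statistics import mean


def paired_delta_ci(scores_a, scores_b, n_boot=2000, seed=0):
    """Mean per-example delta (b - a) with a 95% percentile bootstrap interval."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must be scored on the same examples")
    deltas = [b - a for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)  # fixed seed so reruns give the same interval
    boots = sorted(
        mean(rng.choices(deltas, k=len(deltas))) for _ in range(n_boot)
    )
    low = boots[int(0.025 * n_boot)]
    high = boots[int(0.975 * n_boot) - 1]
    return mean(deltas), low, high


# Hypothetical pass/fail results on the same 10 prompts (1 = pass).
flash_scores = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
pro_scores = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]

delta, low, high = paired_delta_ci(flash_scores, pro_scores)
print(f"mean delta: {delta:+.2f}, 95% interval: [{low:+.2f}, {high:+.2f}]")
```

If the interval comfortably excludes zero on a realistic sample of your traffic, the step up is easier to justify against the 4x price; with only a handful of examples, as here, the interval is usually too wide to decide.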

Where the data is weak

We aggregate benchmark scores from multiple sources but coverage is uneven across this family. Specifically:

  • Gemini 3 Pro has two minor versions in our index (3.0 and 3.1) with substantially different scores. Pro 3.0 sits at QS 95.0; Pro 3.1 at QS 104.3. When the article quotes a number, it is for the specific minor version named; do not collapse the line to a single Pro score.
  • Context windows are partially declared. Our index lists the 1M-token window only on Gemini 3 Pro 3.0; Pro 3.1, Flash, and Flash Lite show the field as unset. That is a coverage gap, not evidence of a smaller window. Verify the limit on the deployment surface you actually use before committing on a long-document workload.
  • Flash Lite has thinner benchmark coverage than the other tiers. Several benchmarks (SWE-bench Verified, AIME 2025) carry numbers for Flash and Pro but not Flash Lite at last verification. Treat its scores as directional outside the headline benchmarks (Quality Score, Arena Elo, LiveBench, GPQA Diamond).
  • Pricing on this page is the published API list price. Vertex AI routing, batch pricing, and enterprise agreements can change the unit economics. List price is a calibration anchor, not the cost ceiling.

If you are making a procurement decision, the variant table on this page is the load-bearing artifact. Cross-check pricing against Google's own docs (and Vertex AI routing if you go through that path) before you commit.

When to reach for which alternative

  • Open-weights deployment is a requirement: Gemini is API-only. The conversation moves to open-weights families (Qwen3, DeepSeek). On the cost-per-quality axis at the chat workhorse tier, DeepSeek V4 Flash ($0.098 / $0.197 with QS 78.1) is the price anchor to beat.
  • Closed-flagship reasoning at the absolute top end: Claude Opus 4.7-thinking lands at QS 107.8 in our index, slightly above Gemini 3 Pro 3.1 (QS 104.3). On any specific benchmark the ranking can flip; compare on the workload that matters before treating the headline QS difference as decisive.
  • Previous-generation Gemini is already in production: the Gemini 2 line is on the sibling gemini-2 surface in our index. For some workloads the migration cost to 3 may not be earned by the score delta, particularly if Vertex routing or fine-tunes are tied to the older line.

How we score

Quality scores combine multiple public benchmarks (LMArena, LiveBench, SWE-bench, Aider and others) into a single comparable number. Pricing is the published API list price; self-hosted cost depends on your own hardware. We do not accept paid placements.

Author: Boris. Read the full methodology.
