Google family

Gemma

Gemma 4 31B IT (Thinking) ranks #34 of 186 with a 262K-token context and $0.12/$0.37 per 1M tokens. Compare Gemma 4 and Gemma 3 by workload.

Top in this family

Gemma 4 31B IT (Thinking) ranks #34 of 186 on overall quality (QS 88.6) at $0.12/$0.37 per 1M tokens.

Practical pick

Gemma 4 26B A4B IT (Thinking) at $0.06/$0.33 per 1M tokens (rank #48 of 186).

Variants: 10
License: Open weights
Provider: Google

★ Most teams should start here

Google Gemma 4 26B A4B IT

Variant: Thinking

The practical default in the current generation. Mixture-of-experts variant: costs and serves like a small model on capable hardware while carrying near-flagship quality. Pick the dense 31B only when you have a specific deployment reason to avoid MoE.

Quality Score: 83.9
Input: $0.060/1M
Output: $0.330/1M
Context: 262K
License: Open weights

Best variant by workload

One pick per common job. Pick by what you need to ship — not by which variant has the highest score on a leaderboard you don't use.

Note — picks are framed for direct API usage where cost per million tokens is load-bearing. If you're inside an agent harness (Claude Code, Cursor, etc.) the calculus changes: the harness sets the model, the per-task cost is usually negligible, and the flagship variant tends to win. See our piece on Claude Code for the harness-vs-API framing.

| Workload | Best pick | Why |
| --- | --- | --- |
| Self-host on 1 GPU | Google Gemma 4 26B A4B IT (Thinking), $0.060/1M input / $0.330/1M output | Mixture-of-experts active-param footprint is closer to a 4B model than a 26B one. Plan for the full parameter count when sizing GPU memory, not the active subset. |
| Edge / on-device | Google Gemma 4 E2B IT (Thinking) | Smallest efficient Gemma 4 variant. Fits CPU and edge inference for local or on-device deployment. |
| General API workhorse | Google Gemma 4 31B IT (Thinking), $0.120/1M input / $0.370/1M output | Dense Gemma 4 flagship. Use when MoE serving complexity is a problem and you want a predictable parameter-count profile. |
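
Because these picks hinge on cost per million tokens, a rough per-request calculation can help when comparing the two priced Gemma 4 variants. The sketch below uses the list prices shown on this page; the function name and the example token counts are illustrative assumptions, not part of our methodology.

```python
# List prices from this page, in USD per 1M tokens: (input, output).
PRICES = {
    "google-gemma-4-26b-a4b-it": (0.06, 0.33),
    "google-gemma-4-31b-it": (0.12, 0.37),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the list-price cost in USD of a single request."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical workload: 8,000 input and 1,000 output tokens per request.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 8_000, 1_000):.5f} per request")
```

At that assumed mix the 26B A4B comes to about $0.00081 per request and the dense 31B to about $0.00133. Substitute your own token counts before drawing conclusions.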

All variants

14 variants across 10 models (+ 1 cross-family for context). Sorted by quality score (descending).

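| Model | Variant | Tag | QS | Rank | GPQA | HLE | SWE | SWE-Pro | MCP | AIME | In $/M | Out $/M | Context | Released |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Gemma 4 31B IT | Thinking | - | 88.6 | #34/186 | 84.3 | 19.5 | - | - | - | - | $0.12 | $0.37 | 262K | Apr 2, 2026 |
| Gemma 4 26B A4B IT | Thinking | - | 83.9 | #48/186 | 82.3 | 8.7 | - | - | - | - | $0.06 | $0.33 | 262K | Apr 2, 2026 |
| Gemma 4 E4B IT | Thinking | - | 69.2 | #119/186 | 58.6 | - | - | - | - | - | - | - | - | Apr 2, 2026 |
| Gemma 4 E2B IT | Thinking | - | 62.5 | #147/186 | 43.4 | - | - | - | - | - | - | - | - | Apr 2, 2026 |
| Gemma 3 27B IT | Non-thinking | Previous | 58.9 | #164/186 | 42.4 | - | - | - | - | 24.0 | $0.08 | $0.16 | 131K | Mar 12, 2025 |
| Gemma 3 12B IT | Non-thinking | Previous | 58.7 | #165/186 | 40.9 | - | - | - | - | 18.8 | $0.04 | $0.13 | 131K | Mar 12, 2025 |
| Gemma 3n 4B | Non-thinking | Previous | 50.4 | #175/186 | 23.7 | - | - | - | - | 11.6 | $0.06 | $0.12 | 33K | Jun 26, 2025 |
| Gemma 3 4B IT | Non-thinking | Previous | 49.9 | #178/186 | 30.8 | - | - | - | - | - | $0.04 | $0.08 | 131K | Mar 12, 2025 |
| Gemma 3n 2B IT | Non-thinking | Previous | 46.9 | #181/186 | 24.8 | - | - | - | - | 6.7 | - | - | - | Jun 26, 2025 |
| Gemma 3 1B IT | Non-thinking | Previous | 32.1 | #186/186 | 19.2 | - | - | - | - | 0.8 | - | - | - | Mar 12, 2025 |
| DeepSeek V4 | V4 Pro Thinking | cross-family | 98.0 | #15/186 | 90.1 | 37.7 | 80.6 | 55.4 | 73.6 | - | $0.435 | $0.87 | 1.0M | Apr 24, 2026 |
| DeepSeek V4 | V4 Flash Thinking | cross-family | 92.0 | #27/186 | 88.1 | 34.8 | 79.0 | 52.6 | 69.0 | - | $0.098 | $0.197 | 1.0M | Apr 24, 2026 |
| DeepSeek V4 | V4 Pro | cross-family | 80.9 | #61/186 | 72.9 | 7.7 | 73.6 | 52.1 | 69.4 | - | $0.435 | $0.87 | 1.0M | Apr 24, 2026 |
| DeepSeek V4 | V4 Flash | cross-family | 78.1 | #78/186 | 71.2 | 8.1 | 73.7 | 49.1 | 64.0 | - | $0.098 | $0.197 | 1.0M | Apr 24, 2026 |

Gemma 3 27B IT also lists a score of 11.4 in one of the HLE, SWE, SWE-Pro or MCP columns; which column it belongs to is not recoverable here. The Lic. column carries no per-row values.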

Benchmark evidence

Every benchmark we track for this family, across capabilities. The headline Quality Score draws from a deliberately narrow, governed panel (59 of 123 rows here feed it); the rest is tracked evidence — recorded and comparable, but not folded into one synthetic score.

Reasoning

| Model / Variant | Benchmark | Score | Rank | Scoring |
| --- | --- | --- | --- | --- |
| Google Gemma 3 27B IT · Non-thinking | MMLU Pro · 5_shot_cot | 67.5 | 1 / 4 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | Humanity's Last Exam · search | 26.5 | 1 / 2 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | GPQA Diamond · 5_shot_cot | 42.4 | 2 / 4 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | Humanity's Last Exam · search | 17.2 | 2 / 2 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | MMLU Pro | 85.2 | 20 / 86 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | Arena Elo | 1452 | 34 / 158 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | GPQA Diamond | 84.3 | 37 / 143 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | MMLU Pro | 82.6 | 38 / 86 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | Humanity's Last Exam · hle | 19.5 | 42 / 90 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | Arena Elo | 1439 | 47 / 158 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | GPQA Diamond | 82.3 | 47 / 143 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | LiveBench | 61.6 | 56 / 110 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | AIME 2025 | 24 | 62 / 88 | In Quality Score |
| Google Gemma 4 E4B IT · Thinking | MMLU Pro | 69.4 | 65 / 86 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | Humanity's Last Exam · hle | 8.7 | 67 / 90 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | MMLU Pro | 67.5 | 68 / 86 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | AIME 2025 | 18.8 | 70 / 88 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | MMLU Pro | 60.6 | 74 / 86 | In Quality Score |
| Google Gemma 4 E2B IT · Thinking | MMLU Pro | 60 | 75 / 86 | In Quality Score |
| Google Gemma 3n 4B · Non-thinking | AIME 2025 | 11.6 | 78 / 88 | In Quality Score |
| Google Gemma 3n 4B · Non-thinking | MMLU Pro | 50.6 | 80 / 86 | In Quality Score |
| Google Gemma 3 4B IT · Non-thinking | MMLU Pro | 43.6 | 81 / 86 | In Quality Score |
| Google Gemma 3n 2B IT · Non-thinking | MMLU Pro | 40.5 | 83 / 86 | In Quality Score |
| Google Gemma 3n 2B IT · Non-thinking | AIME 2025 | 6.7 | 83 / 88 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | LiveBench | 49.2 | 86 / 110 | In Quality Score |
| Google Gemma 3 1B IT · Non-thinking | MMLU Pro | 14.7 | 86 / 86 | In Quality Score |
| Google Gemma 3 1B IT · Non-thinking | AIME 2025 | 0.8 | 88 / 88 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | LiveBench | 43.7 | 94 / 110 | In Quality Score |
| Google Gemma 4 E4B IT · Thinking | GPQA Diamond | 58.6 | 103 / 143 | In Quality Score |
| Google Gemma 3 1B IT · Non-thinking | LiveBench | 14.4 | 110 / 110 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | Arena Elo | 1366 | 113 / 158 | In Quality Score |
| Google Gemma 4 E2B IT · Thinking | GPQA Diamond | 43.4 | 124 / 143 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | GPQA Diamond | 42.4 | 125 / 143 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | Arena Elo | 1342 | 126 / 158 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | GPQA Diamond | 40.9 | 127 / 143 | In Quality Score |
| Google Gemma 3 4B IT · Non-thinking | GPQA Diamond | 30.8 | 135 / 143 | In Quality Score |
| Google Gemma 3n 4B · Non-thinking | Arena Elo | 1318 | 137 / 158 | In Quality Score |
| Google Gemma 3n 2B IT · Non-thinking | GPQA Diamond | 24.8 | 140 / 143 | In Quality Score |
| Google Gemma 3n 4B · Non-thinking | GPQA Diamond | 23.7 | 141 / 143 | In Quality Score |
| Google Gemma 3 1B IT · Non-thinking | GPQA Diamond | 19.2 | 143 / 143 | In Quality Score |
| Google Gemma 3 4B IT · Non-thinking | Arena Elo | 1303 | 144 / 158 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | GSM8K | 95.9 | 1 / 10 | Tracked evidence |
| Google Gemma 4 31B IT · Thinking | AIME 2026 · no_tools | 89.2 | 1 / 4 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | GSM8K | 94.4 | 2 / 10 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | AIME 2026 · no_tools | 88.3 | 2 / 4 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | AIME 2026 · no_tools | 42.5 | 3 / 4 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | AIME 2026 · no_tools | 37.5 | 4 / 4 | Tracked evidence |
| Google Gemma 3 4B IT · Non-thinking | GSM8K | 89.2 | 7 / 10 | Tracked evidence |
| Google Gemma 4 31B IT · Thinking | MMMLU | 88.4 | 9 / 38 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | Multi-IF | 69.8 | 9 / 32 | Tracked evidence |
| Google Gemma 4 31B IT · Thinking | MRCR · v2_128k | 66.4 | 9 / 23 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | GSM8K | 62.8 | 10 / 10 | Tracked evidence |
| Google Gemma 4 31B IT · Thinking | MMMU PRO | 76.9 | 12 / 52 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | Arena-Hard | 86.8 | 13 / 40 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | MMMLU | 86.3 | 14 / 38 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | Multi-IF | 65.6 | 14 / 32 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | Arena-Hard | 82.6 | 19 / 40 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | MRCR · v2_128k | 44.1 | 19 / 23 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MMMU · mmmu_single | 50 | 21 / 22 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | MRCR · v2_128k | 25.4 | 22 / 23 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | MRCR · v2_128k | 19.1 | 23 / 23 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | MMMU PRO | 73.8 | 26 / 52 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | MMMLU | 76.6 | 27 / 38 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | MMLU | 76.9 | 29 / 33 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | MATH 500 | 90 | 30 / 55 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | MMMLU | 67.4 | 31 / 38 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | BFCL v3 | 59.1 | 31 / 49 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | Multi-IF | 32.8 | 31 / 32 | Tracked evidence |
| Google Gemma 3n 4B · Non-thinking | MMLU | 64.9 | 32 / 33 | Tracked evidence |
| Google Gemma 3n 2B IT · Non-thinking | MMLU | 60.1 | 33 / 33 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | SimpleQA | 10 | 35 / 40 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MATH 500 | 85.6 | 36 / 55 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | SimpleQA | 6.3 | 37 / 40 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | Arena-Hard | 17.8 | 38 / 40 | Tracked evidence |
| Google Gemma 3 4B IT · Non-thinking | SimpleQA | 4 | 39 / 40 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | SimpleQA | 2.2 | 40 / 40 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | MMMU PRO | 52.6 | 41 / 52 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | BFCL v3 | 50.6 | 41 / 49 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | MMMU PRO | 48.4 | 45 / 52 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | AIME 2024 | 32.6 | 46 / 69 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | MMMU PRO | 44.2 | 48 / 52 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | BFCL v3 | 16.3 | 49 / 49 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | MATH 500 | 46.4 | 55 / 55 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | AIME 2024 | 22.4 | 55 / 69 | Tracked evidence |
| Google Gemma 3 1B IT · Non-thinking | AIME 2024 | 0.9 | 69 / 69 | Tracked evidence |

Coding

| Model / Variant | Benchmark | Score | Rank | Scoring |
| --- | --- | --- | --- | --- |
| Google Gemma 3n 4B · Non-thinking | LiveCodeBench · v5 | 25.7 | 4 / 5 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | LiveCodeBench | 80 | 5 / 69 | In Quality Score |
| Google Gemma 3n 2B IT · Non-thinking | LiveCodeBench · v5 | 18.6 | 5 / 5 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | LiveCodeBench | 77.1 | 6 / 69 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | LiveCodeBench · 2024_10_01_to_2025_02_01 | 29.7 | 7 / 9 | In Quality Score |
| Google Gemma 4 E4B IT · Thinking | LiveCodeBench | 52 | 33 / 69 | In Quality Score |
| Google Gemma 4 E2B IT · Thinking | LiveCodeBench | 44 | 36 / 69 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | Aider (Polyglot) | 4.9 | 44 / 45 | In Quality Score |
| Google Gemma 3 27B IT · Non-thinking | LiveCodeBench | 29.7 | 48 / 69 | In Quality Score |
| Google Gemma 3 12B IT · Non-thinking | LiveCodeBench | 24.6 | 55 / 69 | In Quality Score |
| Google Gemma 3n 2B IT · Non-thinking | LiveCodeBench | 13.2 | 60 / 69 | In Quality Score |
| Google Gemma 3n 4B · Non-thinking | LiveCodeBench | 13.2 | 61 / 69 | In Quality Score |
| Google Gemma 3 4B IT · Non-thinking | LiveCodeBench | 12.6 | 62 / 69 | In Quality Score |
| Google Gemma 3 1B IT · Non-thinking | LiveCodeBench | 1.9 | 69 / 69 | In Quality Score |
| Google Gemma 4 31B IT · Thinking | Codeforces | 2150 | 5 / 47 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | Codeforces | 1718 | 20 / 47 | Tracked evidence |
| Google Gemma 3 27B IT · Non-thinking | Codeforces | 1063 | 33 / 47 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | Codeforces | 940 | 36 / 47 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | Codeforces | 633 | 44 / 47 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | Codeforces | 462 | 46 / 47 | Tracked evidence |

Agentic

| Model / Variant | Benchmark | Score | Rank | Scoring |
| --- | --- | --- | --- | --- |
| Google Gemma 4 31B IT · Thinking | τ²-bench · average | 76.9 | 20 / 30 | In Quality Score |
| Google Gemma 4 26B A4B IT · Thinking | τ²-bench · average | 68.2 | 22 / 30 | In Quality Score |
| Google Gemma 4 E4B IT · Thinking | τ²-bench · average | 42.2 | 27 / 30 | In Quality Score |
| Google Gemma 4 E2B IT · Thinking | τ²-bench · average | 24.5 | 29 / 30 | In Quality Score |

Multimodal

| Model / Variant | Benchmark | Score | Rank | Scoring |
| --- | --- | --- | --- | --- |
| Google Gemma 3 27B IT · Non-thinking | ChartQA | 76.3 | 8 / 9 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MathVision · mini | 31.9 | 8 / 10 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | ChartQA · test | 39 | 9 / 10 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MathVerse · mini | 29.8 | 10 / 10 | Tracked evidence |
| Google Gemma 4 31B IT · Thinking | MedXpertQA · mm | 61.3 | 14 / 31 | Tracked evidence |
| Google Gemma 4 26B A4B IT · Thinking | MedXpertQA · mm | 58.1 | 15 / 31 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | HallusionBench | 65.3 | 16 / 33 | Tracked evidence |
| Google Gemma 4 E4B IT · Thinking | MedXpertQA · mm | 28.7 | 23 / 31 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | AI2D · test | 80.4 | 28 / 33 | Tracked evidence |
| Google Gemma 4 E2B IT · Thinking | MedXpertQA · mm | 23.5 | 28 / 31 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MMStar | 59.4 | 30 / 33 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | MathVista · mini | 57.4 | 35 / 36 | Tracked evidence |

Document/OCR

| Model / Variant | Benchmark | Score | Rank | Scoring |
| --- | --- | --- | --- | --- |
| Google Gemma 3 27B IT · Non-thinking | DocVQA | 90.4 | 6 / 8 | Tracked evidence |
| Google Gemma 3 12B IT · Non-thinking | OCRBench | 75.3 | 29 / 35 | Tracked evidence |

Where this family sits in the market

Gemma's e2b and e4b efficiency variants take the cost-efficiency frontier for small-footprint self-hosting. The 26B-A4B MoE sits at the family's quality-per-active-param sweet spot.

Chart: price vs. quality scatter plot of tracked models, with a legend by provider: Anthropic, Cohere, DeepSeek, Google, Meta, Microsoft, MiniMax, Mistral, Moonshot, NVIDIA, OpenAI, Qwen, xAI, Zhipu.

Dashed line = Pareto frontier (no model both cheaper and better). Thinking/non-thinking pairs of the same model are connected; line length = cost of reasoning.
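
The Pareto frontier definition used for the chart (no model both cheaper and better) can be sketched in a few lines. This is a minimal illustration restricted to the priced Gemma variants on this page. It assumes cost is the plain sum of input and output list prices, which is a simplification of ours and not necessarily how the chart's cost axis is built; the function name is hypothetical.

```python
# (name, input $/1M, output $/1M, Quality Score) for the priced Gemma variants on this page.
MODELS = [
    ("Gemma 4 31B IT", 0.12, 0.37, 88.6),
    ("Gemma 4 26B A4B IT", 0.06, 0.33, 83.9),
    ("Gemma 3 27B IT", 0.08, 0.16, 58.9),
    ("Gemma 3 12B IT", 0.04, 0.13, 58.7),
    ("Gemma 3n 4B", 0.06, 0.12, 50.4),
    ("Gemma 3 4B IT", 0.04, 0.08, 49.9),
]


def pareto_frontier(models):
    """Return names of models that no other model beats on both cost and quality."""
    # Assumption: cost is the plain sum of input and output list prices.
    ranked = sorted(models, key=lambda m: (m[1] + m[2], -m[3]))
    frontier, best_quality = [], float("-inf")
    for name, price_in, price_out, quality in ranked:
        # Walking from cheapest to priciest, keep a model only if it
        # improves on the best quality seen at a lower or equal cost.
        if quality > best_quality:
            frontier.append(name)
            best_quality = quality
    return frontier


print(pareto_frontier(MODELS))
```

Under that cost assumption, Gemma 3n 4B is the only priced Gemma variant off the in-family frontier, because Gemma 3 12B IT is both cheaper and higher-scoring. Adding other families' models would change the result.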

Self-hosting

These variants ship with open weights, so you can run them on your own hardware or via a hosting provider you control. Pick a variant that fits your GPU memory budget; mixture-of-experts variants are cheaper to serve than their total parameter count suggests, but the full weights still need to fit in memory.

  • Google Gemma 4 31B IT · Thinking · open weights
  • Google Gemma 4 26B A4B IT · Thinking · open weights
  • Google Gemma 4 E4B IT · Thinking · open weights
  • Google Gemma 4 E2B IT · Thinking · open weights
  • Google Gemma 3 27B IT · Non-thinking · open weights
  • Google Gemma 3 12B IT · Non-thinking · open weights
  • Google Gemma 3 4B IT · Non-thinking · open weights
  • Google Gemma 3 1B IT · Non-thinking · open weights
  • Google Gemma 3n 4B · Non-thinking · open weights
  • Google Gemma 3n 2B IT · Non-thinking · open weights
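
To make the memory-budget point concrete, here is a rough weights-only estimate. It is a sketch under stated assumptions: the parameter counts are the nominal figures in the model names (26B and 31B), the bytes-per-parameter values are generic precision sizes, and KV cache, activations and runtime overhead are excluded. The function name is illustrative.

```python
# Assumed bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1e9 parameters at N bytes each is N GB, so billions * bytes gives GB.
    return params_billions * BYTES_PER_PARAM[precision]


for name, total_b in [("Gemma 4 26B A4B IT", 26), ("Gemma 4 31B IT", 31)]:
    sizes = ", ".join(
        f"{p}: {weight_memory_gb(total_b, p):.1f} GB" for p in BYTES_PER_PARAM
    )
    print(f"{name} -> {sizes}")
```

Sizing the 26B A4B by its roughly 4B active parameters would suggest about 8 GB at bf16, while the full weights need about 52 GB at the same precision. That gap is why the total count, not the active subset, drives the GPU choice.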

The Gemma family

Every variant we track in this family, grouped by license. Use this to orient before drilling into the variant table.

Open weights (10)

  • Google Gemma 4 31B IT (1 variant)
  • Google Gemma 4 26B A4B IT (1 variant)
  • Google Gemma 4 E4B IT (1 variant)
  • Google Gemma 4 E2B IT (1 variant)
  • Google Gemma 3 27B IT (1 variant)
  • Google Gemma 3 12B IT (1 variant)
  • Google Gemma 3 4B IT (1 variant)
  • Google Gemma 3 1B IT (1 variant)
  • Google Gemma 3n 4B (1 variant)
  • Google Gemma 3n 2B IT (1 variant)

Alternatives to consider

Peer families that solve overlapping problems. Pick by your binding constraint (cost, latency, open weights, vendor lock-in), not by leaderboard order.

Caveats

What this page does not tell you, listed honestly.

  • No tracked API pricing for: Google Gemma 4 E4B IT, Google Gemma 4 E2B IT, Google Gemma 3 1B IT, Google Gemma 3n 2B IT. Variants without hosted-provider pricing are listed for completeness; cost columns show a dash.
  • Context window not declared for: Google Gemma 4 E4B IT, Google Gemma 4 E2B IT, Google Gemma 3 1B IT, Google Gemma 3n 2B IT.
  • Cross-family models (marked "cross-family" in the variants table) are shown for context only. Their canonical page lives on the family that owns them.

Editor's notes

By Boris · Last verified · AI-assisted, human-reviewed

Why this family matters

Gemma is Google's open-weights line, distinct from the closed Gemini API family. The current generation (Gemma 4) brings a meaningful architectural change: alongside the dense 31B variant, the family now ships a 26B-A4B mixture-of-experts build (26B total parameters, ~4B active per token), plus efficiency-tuned e2b and e4b variants sized for edge and on-device deployment.

The main structural draw of Gemma is self-host fit. The 26B-A4B MoE costs and serves like a small model on capable hardware while landing at Quality Score 83.9 (#48 of 186 models we track), which is competitive with dense models 3 to 5 times its active-parameter footprint. The dense 31B at QS 88.6 is the family's quality ceiling; the e2b and e4b variants are the cost-efficiency frontier for constrained hardware.

Which variant to start with

Default to google-gemma-4-26b-a4b-it. At $0.06 input / $0.33 output per million on hosted routes (or proportional self-host cost) and Quality Score 83.9 with a 262K context window, it is the family's quality-per-active-param sweet spot. Pick this unless you have a specific reason not to.

When to deviate:

  • Maximum quality within the family: use google-gemma-4-31b-it (QS 88.6, dense 31B). The score gap to 26B-A4B is modest (4.7 points of Quality Score, 88.6 vs 83.9), but for deployments that benefit from dense inference characteristics or where MoE serving is awkward, the dense variant is the cleaner option.
  • Self-host on a single GPU: the choice depends on what you are optimising for. The 26B-A4B MoE has the active-param footprint of a 4B model but the full weights still need to fit in memory; plan for the full 26B parameter count when sizing GPU memory, not the active subset. The dense 31B is heavier on memory but simpler to serve. Below those, the e4b variant is the realistic single-GPU consumer-hardware target in the family.
  • Edge or on-device: drop to google-gemma-4-e2b-it (QS 62.5, efficiency-tuned). Smallest Gemma 4 variant; fits CPU and small-footprint inference where round-trip latency rules out hosted APIs. Treat its score as directional, not comparative against the larger variants.
  • You are migrating an existing Gemma 3 deployment: the Gemma 3 line is still in our index, but the 27B at QS 58.9 (and the smaller variants below it) lags the Gemma 4 generation by enough that the score uplift almost certainly repays the migration cost, particularly if the deployment is recent enough that fine-tunes or pinned weights are not the constraint.
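
The starting-point rules above can be written as a small decision helper. This is only a sketch: the function and its flags are hypothetical, the slugs are the ones quoted in these notes, and the e4b single-GPU consumer case is left out because its slug is not given here.

```python
def pick_gemma_variant(
    *, edge: bool = False, max_quality: bool = False, avoid_moe: bool = False
) -> str:
    """Map the deployment constraints described above to a variant slug."""
    if edge:
        # CPU or on-device inference: smallest Gemma 4 variant.
        return "google-gemma-4-e2b-it"
    if max_quality or avoid_moe:
        # Dense flagship: family quality ceiling, no MoE serving.
        return "google-gemma-4-31b-it"
    # Default: the 26B-A4B mixture-of-experts build.
    return "google-gemma-4-26b-a4b-it"


print(pick_gemma_variant())
print(pick_gemma_variant(avoid_moe=True))
print(pick_gemma_variant(edge=True))
```

The order of the checks encodes a priority (edge constraints first, then dense-vs-MoE), which is our reading of the guidance rather than a rule stated in it.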

Where the data is weak

We aggregate benchmark scores from multiple sources but coverage and naming across this family deserve a careful read. Specifically:

  • Gemma 3 vs Gemma 4 is a generation gap, not a minor version bump. Gemma 4 31B at QS 88.6 vs Gemma 3 27B at QS 58.9 is a difference of nearly 30 Quality Score points on otherwise comparable parameter counts. Do not collapse "Gemma 27B" and "Gemma 31B" into one mental category; they are different generations.
  • e2b and e4b efficiency variants have thinner benchmark coverage than the dense and MoE variants. Several benchmarks (Humanity's Last Exam, Arena Elo, LiveBench) carry numbers for the 31B or 26B-A4B but not for the efficiency variants at last verification. Treat their listed scores as directional, particularly outside Quality Score.
  • Context windows on the e-variants are not declared in our index. Gemma 4 31B and 26B-A4B both ship at 262K; the e4b and e2b context fields are unset. That is a coverage gap, not evidence of a smaller window.
  • Hosted-vs-self-host pricing. The pricing on this page is what hosted inference providers charge for Gemma, not the cost of self-hosting (which depends on your own hardware and utilisation). For an open-weights family, list price is the calibration anchor for cross-family comparison; the deployment cost question is on you.
  • The Gemma 3n line. The 3n variants (e2b and e4b, listed on this page as Gemma 3n 2B IT and Gemma 3n 4B) appear alongside Gemma 3 in our index but are an efficiency-tuned derivative, not a drop-in for Gemma 3. Their scores are weaker than the larger Gemma 3 variants and substantially weaker than the Gemma 4 e-variants; they exist for niche on-device use cases, not general workhorse deployment.
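
One practical consequence of the unset context fields: when filtering variants by required context length, an undeclared window should be treated as unknown, not as zero. A minimal sketch, assuming the page's rounded "262K" means 262,000 tokens and using a hypothetical helper name:

```python
from typing import Optional

# Context windows as listed on this page; None means the field is not declared.
# Assumption: "262K" is taken as 262,000 tokens (the page value is rounded).
CONTEXT_TOKENS: dict[str, Optional[int]] = {
    "Gemma 4 31B IT": 262_000,
    "Gemma 4 26B A4B IT": 262_000,
    "Gemma 4 E4B IT": None,
    "Gemma 4 E2B IT": None,
}


def split_by_context(required_tokens: int) -> tuple[list[str], list[str]]:
    """Return (variants known to fit, variants whose window is undeclared)."""
    fits, unknown = [], []
    for name, window in CONTEXT_TOKENS.items():
        if window is None:
            # A coverage gap, not evidence of a smaller window.
            unknown.append(name)
        elif window >= required_tokens:
            fits.append(name)
    return fits, unknown


fits, unknown = split_by_context(200_000)
print("fits:", fits)
print("check the model card:", unknown)
```

Keeping the unknown bucket separate avoids silently excluding the e-variants from a shortlist because of a gap in our index.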

If you are making a procurement decision (or, more often for this family, a deployment-architecture decision), the variant table on this page is the load-bearing artifact. Cross-check the open-weights license terms against your use case before you commit; Gemma's licence is permissive but specific.

When to reach for which alternative

  • Open-weights breadth across model sizes: the Qwen3 family ships dense models from 0.6B to 32B plus MoE variants, which gives a wider spread than Gemma's current lineup, particularly at the smaller end. Pick by which family's smallest deployable variant fits your hardware budget and licence requirements.
  • Cheapest competent open-weights API: DeepSeek V4 Flash ($0.098 / $0.197 with QS 78.1 and 1M context) is the price anchor to beat at the chat workhorse tier. Gemma 4 26B-A4B competes on the open-weights story; DeepSeek wins on hosted-API quality-per-dollar at the chat tier.
  • You need a closed Google option for the same workload: the Gemini 3 surface in our index covers Google's current closed line. Gemma is not a drop-in for Gemini in either direction; the licence and deployment story differ enough that the surfaces are best evaluated separately, not as a tier-ladder.
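
For the hosted-price comparison with DeepSeek V4 Flash, it can help to find the output-to-input token ratio at which the two list prices break even. The sketch below uses only the prices quoted on this page and ignores the Quality Score difference; the function name is an illustrative assumption.

```python
def breakeven_output_ratio(price_a, price_b):
    """Output/input token ratio at which two (input, output) price pairs cost the same."""
    in_a, out_a = price_a
    in_b, out_b = price_b
    # Solve in_a + r * out_a == in_b + r * out_b for r.
    # Assumes the two output prices differ.
    return (in_b - in_a) / (out_a - out_b)


gemma_26b_a4b = (0.06, 0.33)        # $ per 1M tokens, from this page
deepseek_v4_flash = (0.098, 0.197)  # $ per 1M tokens, from this page

ratio = breakeven_output_ratio(gemma_26b_a4b, deepseek_v4_flash)
print(f"Break-even at {ratio:.3f} output tokens per input token")
```

On list price alone, Gemma 4 26B-A4B is the cheaper of the two only for input-heavy workloads that generate fewer than roughly 0.29 output tokens per input token; above that ratio DeepSeek V4 Flash costs less.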

How we score

Quality scores combine multiple public benchmarks (LMArena, LiveBench, SWE-bench, Aider and others) into a single comparable number. Pricing is the published API list price; self-hosted cost depends on your own hardware. We do not accept paid placements.

Author: Boris. Read the full methodology.
