
AI BENCHY Compare

Google: Gemini 3.1 Pro Preview vs Google: Gemini 3.5 Flash

Summary

Gemini 3.1 Pro Preview vs Gemini 3.5 Flash benchmark comparison: Gemini 3.1 Pro Preview leads narrowly on average score (9.2 vs 9.1) and on attempt pass rate (90.5% vs 87.3%). Gemini 3.5 Flash has the lower total benchmark cost ($0.582 vs $1.054) and the faster average response time (4.94s vs 20.14s).

Recommended model: Gemini 3.5 Flash - Its score stays close to the best score here (9.1 vs 9.2), while its total benchmark cost is about 1.8 times lower than that of Gemini 3.1 Pro Preview ($0.582 vs $1.054).
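The cost and speed gaps quoted above can be reproduced from the summary figures. The snippet below is a minimal sketch of that arithmetic; the variable names and the `ratio` helper are illustrative and not part of the benchmark site.

```python
# Figures copied from the summary on this page.
pro_cost, flash_cost = 1.054, 0.582   # total benchmark cost, USD
pro_time, flash_time = 20.14, 4.94    # average response time, seconds


def ratio(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    return larger / smaller


cost_ratio = ratio(pro_cost, flash_cost)          # how much more the Pro run cost
speed_ratio = ratio(pro_time, flash_time)         # how much slower the Pro run was
savings_pct = (1 - flash_cost / pro_cost) * 100   # Flash saving relative to Pro

print(f"cost ratio:  {cost_ratio:.2f}x")
print(f"speed ratio: {speed_ratio:.2f}x")
print(f"savings:     {savings_pct:.1f}%")
```

Expressed as a percentage, the same cost gap means the Flash run cost roughly 45% less than the Pro run.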

Last updated at: 2026-07-02

| Metric | Gemini 3.1 Pro Preview (medium) | Gemini 3.5 Flash (medium) |
| --- | --- | --- |
| Release | 2026-02-19 | 2026-05-19 |
| Score | 9.2 | 9.1 |
| Rank | #7 | #8 |
| Reliability | 10.0 | 10.0 |
| Consistency | 10.0 | 9.6 |
| Tests Correct | | |
| Attempt pass rate | 90.5% | 87.3% |
| Flaky tests | 0 | 1 |
| Total Runs | 63 | 63 |
| Cost per result | 5.546 | 3.229 |
| Total Cost | $1.054 | $0.582 |
| Input Price | $2.000 / 1M | $1.500 / 1M |
| Output Price | $12.000 / 1M | $9.000 / 1M |
| Total Input Tokens | 41,617 | 36,936 |
| Output Tokens | 1,977 | 2,001 |
| Reasoning Tokens | 78,896 | 56,408 |
| Response Time (avg) | 20.14s | 4.94s |
| Response Time (max) | 88.68s | 18.07s |
| Response Time (total) | 281.92s | 103.79s |
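The page does not say how Total Cost is derived. As a rough cross-check, the sketch below estimates it from the listed prices and token counts under one assumption: reasoning tokens are billed at the output price. The `Usage` class and `estimated_cost` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    input_price: float      # USD per 1M input tokens
    output_price: float     # USD per 1M output tokens
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int


def estimated_cost(u: Usage) -> float:
    """Estimate total cost in USD.

    Assumption (not stated on the page): reasoning tokens are billed
    at the same rate as output tokens.
    """
    billed_output = u.output_tokens + u.reasoning_tokens
    total = u.input_tokens * u.input_price + billed_output * u.output_price
    return total / 1_000_000


# Values copied from the comparison table.
pro = Usage(2.0, 12.0, 41_617, 1_977, 78_896)
flash = Usage(1.5, 9.0, 36_936, 2_001, 56_408)

for name, usage, reported in [("Gemini 3.1 Pro Preview", pro, 1.054),
                              ("Gemini 3.5 Flash", flash, 0.582)]:
    print(f"{name}: estimated ${estimated_cost(usage):.4f}, reported ${reported:.3f}")
```

Under that assumption both estimates land within about a tenth of a cent of the reported totals, which suggests reasoning tokens account for most of the spend for both models.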

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#7 Gemini 3.1 Pro Preview (medium): Cost $0.115, Time 87.2s, Tokens 9,629

#8 Gemini 3.5 Flash (medium): Cost $0.201, Time 112.9s, Tokens 22,371

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 7.90s | 498 | 112 | 3,218 |
| Anti-AI Tricks | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 2.09s | 492 | 171 | 3,385 |
| Coding | Gemini 3.1 Pro Preview | 7.9 | 9.9 | 66.7% | 0 | | 40.17s | 8,124 | 435 | 41,247 |
| Coding | Gemini 3.5 Flash | 7.9 | 7.5 | 77.8% | 1 | | 12.63s | 8,118 | 461 | 24,939 |
| Combined | Gemini 3.1 Pro Preview | 9.5 | 10.0 | 100.0% | 0 | | 40.61s | 17,240 | 432 | 9,281 |
| Combined | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 12.05s | 12,873 | 351 | 7,807 |
| Data parsing and extraction | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 7.72s | 7,265 | 279 | 3,904 |
| Data parsing and extraction | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 4.07s | 7,548 | 279 | 3,784 |
| Domain specific | Gemini 3.1 Pro Preview | 7.7 | 10.0 | 66.7% | 0 | | 32.73s | 635 | 18 | 12,424 |
| Domain specific | Gemini 3.5 Flash | 7.7 | 10.0 | 66.7% | 0 | | 5.24s | 633 | 12 | 8,047 |
| General Intelligence | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 11.77s | 490 | 108 | 1,179 |
| General Intelligence | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 2.52s | 486 | 115 | 1,144 |
| Instructions following | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 9.56s | 621 | 72 | 2,236 |
| Instructions following | Gemini 3.5 Flash | 9.9 | 10.0 | 100.0% | 0 | | 2.70s | 615 | 71 | 2,855 |
| Puzzle Solving | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 6.90s | 570 | 235 | 3,128 |
| Puzzle Solving | Gemini 3.5 Flash | 7.7 | 10.0 | 66.7% | 0 | | 2.38s | 558 | 295 | 2,747 |
| Tool Calling | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 23.15s | 6,018 | 274 | 982 |
| Tool Calling | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 3.81s | 5,457 | 234 | 455 |
| Trivia | Gemini 3.1 Pro Preview | 10.0 | 10.0 | 100.0% | 0 | | 6.27s | 156 | 12 | 1,297 |
| Trivia | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 2.75s | 156 | 12 | 1,245 |
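
As a consistency check, the per-category token counts above can be summed and compared with the totals in the main comparison table. The sketch below does this for both models; the row tuples are copied from the category breakdown in page order, and `column_sums` is an illustrative helper, not part of the site.

```python
# Each tuple is (input tokens, output tokens, reasoning tokens) for one category,
# in the order the categories appear on the page.
pro_rows = [
    (498, 112, 3_218), (8_124, 435, 41_247), (17_240, 432, 9_281),
    (7_265, 279, 3_904), (635, 18, 12_424), (490, 108, 1_179),
    (621, 72, 2_236), (570, 235, 3_128), (6_018, 274, 982), (156, 12, 1_297),
]
flash_rows = [
    (492, 171, 3_385), (8_118, 461, 24_939), (12_873, 351, 7_807),
    (7_548, 279, 3_784), (633, 12, 8_047), (486, 115, 1_144),
    (615, 71, 2_855), (558, 295, 2_747), (5_457, 234, 455), (156, 12, 1_245),
]

# Totals reported in the main comparison table.
pro_totals = (41_617, 1_977, 78_896)
flash_totals = (36_936, 2_001, 56_408)


def column_sums(rows):
    """Sum each column across all category rows."""
    return tuple(sum(col) for col in zip(*rows))


print("Pro matches totals:  ", column_sums(pro_rows) == pro_totals)
print("Flash matches totals:", column_sums(flash_rows) == flash_totals)
```

The same sums also show where the reasoning budget goes: the Coding category alone accounts for 41,247 of 78,896 reasoning tokens for Gemini 3.1 Pro Preview and 24,939 of 56,408 for Gemini 3.5 Flash.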
