AI BENCHY Compare

Anthropic: Claude Sonnet 5 vs Google: Gemini 3.1 Flash Lite

Summary

Claude Sonnet 5 vs Gemini 3.1 Flash Lite benchmark comparison: Gemini 3.1 Flash Lite leads on average score, 6.1 vs 5.7. It also has the lower total benchmark cost ($0.013 vs $0.287), the faster average response time (1.33s vs 4.74s), and the higher attempt pass rate (54.0% vs 42.9%).

Recommended model: Gemini 3.1 Flash Lite - It has the higher score of the two (6.1) and costs about 22.1x less than Claude Sonnet 5 ($0.013 vs $0.287 in total).
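
The "22.1x" figure follows directly from the two total costs quoted in the summary. A minimal Python sketch of that arithmetic is below; the variable names are chosen here for illustration, and the speed ratio is an extra derived figure the page does not state.

```python
# Totals quoted in the summary (USD for the whole benchmark run).
claude_total_cost = 0.287
gemini_total_cost = 0.013

# Average response times quoted in the summary (seconds).
claude_avg_time = 4.74
gemini_avg_time = 1.33

# How many times more expensive / slower Claude Sonnet 5 was in this run.
cost_ratio = claude_total_cost / gemini_total_cost
speed_ratio = claude_avg_time / gemini_avg_time

print(f"Cost ratio:  {cost_ratio:.1f}x")   # Cost ratio:  22.1x
print(f"Speed ratio: {speed_ratio:.1f}x")  # Speed ratio: 3.6x
```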

Last updated at: 2026-06-30

| Metric | Claude Sonnet 5 (none) | Gemini 3.1 Flash Lite (minimal) |
| --- | --- | --- |
| Release | 2026-06-30 | 2026-05-08 |
| Score | 5.7 | 6.1 |
| Rank | #117 | #96 |
| Reliability | 10.0 | 10.0 |
| Consistency | 8.6 | 8.8 |
| Tests Correct | | |
| Attempt pass rate | 42.9% | 54.0% |
| Flaky tests | 4 | 3 |
| Total Runs | 63 | 63 |
| Cost per result | 4.098 | 0.130 |
| Total Cost | $0.287 | $0.013 |
| Input Price | $2.000 / 1M | $0.250 / 1M |
| Output Price | $10.000 / 1M | $1.500 / 1M |
| Total Input Tokens | 76,797 | 36,973 |
| Output Tokens | 13,325 | 2,487 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 4.74s | 1.33s |
| Response Time (max) | 29.46s | 4.49s |
| Response Time (total) | 99.46s | 27.91s |
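
The reported total costs can be cross-checked against the token counts and per-million-token prices listed above. The sketch below assumes cost is simply tokens multiplied by list price, with no caching discounts or other adjustments; the page does not say how it computes cost, so treat this as a plausibility check and not as the site's method. The dictionary layout and the `estimated_cost` helper are illustrative names.

```python
# Prices (USD per 1M tokens) and token totals copied from the metrics above.
MODELS = {
    "Claude Sonnet 5": {
        "input_price": 2.000, "output_price": 10.000,
        "input_tokens": 76_797, "output_tokens": 13_325,
        "reported_cost": 0.287,
    },
    "Gemini 3.1 Flash Lite": {
        "input_price": 0.250, "output_price": 1.500,
        "input_tokens": 36_973, "output_tokens": 2_487,
        "reported_cost": 0.013,
    },
}


def estimated_cost(m: dict) -> float:
    """Assumed formula: tokens times list price, summed over input and output."""
    return (
        m["input_tokens"] * m["input_price"]
        + m["output_tokens"] * m["output_price"]
    ) / 1_000_000


for name, m in MODELS.items():
    print(f"{name}: estimated ${estimated_cost(m):.3f}, "
          f"reported ${m['reported_cost']:.3f}")
```

Under that assumption both estimates round to the reported totals, which suggests the gap in cost comes from both the price difference and the much larger token volume on the Claude Sonnet 5 side.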

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#117 Claude Sonnet 5 (none): Cost $0.061, Time 53.7s, Tokens 6,172

#96 Gemini 3.1 Flash Lite (minimal): Cost $0.001, Time 3.7s, Tokens 635

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Claude Sonnet 5 | 5.3 | 10.0 | 25.0% | 0 | | 3.60s | 834 | 1,813 | 0 |
| Anti-AI Tricks | Gemini 3.1 Flash Lite | 8.3 | 10.0 | 75.0% | 0 | | 1.10s | 500 | 639 | 0 |
| Coding | Claude Sonnet 5 | 4.6 | 7.9 | 22.2% | 1 | | 3.67s | 10,590 | 1,864 | 0 |
| Coding | Gemini 3.1 Flash Lite | 5.5 | 10.0 | 33.3% | 0 | | 0.831s | 8,126 | 666 | 0 |
| Combined | Claude Sonnet 5 | 3.0 | 10.0 | 0.0% | 0 | | 29.46s | 38,775 | 6,340 | 0 |
| Combined | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 2.53s | 12,870 | 357 | 0 |
| Data parsing and extraction | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 3.01s | 10,503 | 309 | 0 |
| Data parsing and extraction | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 1.04s | 7,552 | 279 | 0 |
| Domain specific | Claude Sonnet 5 | 5.3 | 7.2 | 44.4% | 1 | | 3.28s | 975 | 933 | 0 |
| Domain specific | Gemini 3.1 Flash Lite | 2.9 | 7.2 | 11.1% | 1 | | 1.02s | 641 | 15 | 0 |
| General Intelligence | Claude Sonnet 5 | 4.7 | 3.1 | 33.3% | 1 | | 2.81s | 708 | 272 | 0 |
| General Intelligence | Gemini 3.1 Flash Lite | 4.0 | 10.0 | 0.0% | 0 | | 0.791s | 490 | 63 | 0 |
| Instructions following | Claude Sonnet 5 | 6.4 | 10.0 | 50.0% | 0 | | 2.58s | 909 | 103 | 0 |
| Instructions following | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 0.932s | 615 | 72 | 0 |
| Puzzle Solving | Claude Sonnet 5 | 6.0 | 7.4 | 55.6% | 1 | | 3.22s | 894 | 778 | 0 |
| Puzzle Solving | Gemini 3.1 Flash Lite | 6.0 | 4.6 | 66.7% | 2 | | 2.15s | 564 | 153 | 0 |
| Tool Calling | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 6.80s | 12,351 | 522 | 0 |
| Tool Calling | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 3.51s | 5,457 | 234 | 0 |
| Trivia | Claude Sonnet 5 | 3.0 | 10.0 | 0.0% | 0 | | 4.31s | 258 | 391 | 0 |
| Trivia | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 0.724s | 158 | 9 | 0 |
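
The per-category rows can be checked against the overall metrics and tallied head to head. The sketch below copies score, input tokens, output tokens and flaky-test counts from the category breakdown, sums them per model, and counts which model has the higher score in each category. The `CATEGORIES` layout and the `totals` helper are illustrative names, not part of the benchmark.

```python
# Per category and model: (score, input_tokens, output_tokens, flaky_tests),
# copied from the category breakdown.
CATEGORIES = {
    "Anti-AI Tricks": {"claude": (5.3, 834, 1_813, 0), "gemini": (8.3, 500, 639, 0)},
    "Coding": {"claude": (4.6, 10_590, 1_864, 1), "gemini": (5.5, 8_126, 666, 0)},
    "Combined": {"claude": (3.0, 38_775, 6_340, 0), "gemini": (3.0, 12_870, 357, 0)},
    "Data parsing and extraction": {"claude": (10.0, 10_503, 309, 0), "gemini": (10.0, 7_552, 279, 0)},
    "Domain specific": {"claude": (5.3, 975, 933, 1), "gemini": (2.9, 641, 15, 1)},
    "General Intelligence": {"claude": (4.7, 708, 272, 1), "gemini": (4.0, 490, 63, 0)},
    "Instructions following": {"claude": (6.4, 909, 103, 0), "gemini": (10.0, 615, 72, 0)},
    "Puzzle Solving": {"claude": (6.0, 894, 778, 1), "gemini": (6.0, 564, 153, 2)},
    "Tool Calling": {"claude": (10.0, 12_351, 522, 0), "gemini": (10.0, 5_457, 234, 0)},
    "Trivia": {"claude": (3.0, 258, 391, 0), "gemini": (3.0, 158, 9, 0)},
}


def totals(model: str) -> tuple:
    """Sum input tokens, output tokens and flaky tests across all categories."""
    rows = [c[model] for c in CATEGORIES.values()]
    return (
        sum(r[1] for r in rows),
        sum(r[2] for r in rows),
        sum(r[3] for r in rows),
    )


# Count categories where each model has the strictly higher score.
wins = {"claude": 0, "gemini": 0, "tie": 0}
for row in CATEGORIES.values():
    c_score, g_score = row["claude"][0], row["gemini"][0]
    if c_score > g_score:
        wins["claude"] += 1
    elif g_score > c_score:
        wins["gemini"] += 1
    else:
        wins["tie"] += 1

print(totals("claude"))  # (76797, 13325, 4)
print(totals("gemini"))  # (36973, 2487, 3)
print(wins)              # {'claude': 2, 'gemini': 3, 'tie': 5}
```

The summed tokens and flaky-test counts match the overall figures for both models. The tally shows that half of the categories are ties, so the overall score gap rests on a handful of categories: Gemini 3.1 Flash Lite is ahead in Anti-AI Tricks, Coding and Instructions following, while Claude Sonnet 5 is ahead in Domain specific and General Intelligence.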
