AI BENCHY Compare

DeepSeek: DeepSeek V4 Flash vs Google: Gemini 3.1 Flash Lite

Summary

DeepSeek V4 Flash vs Gemini 3.1 Flash Lite benchmark comparison: Gemini 3.1 Flash Lite leads on average score, 6.1 vs 5.5. DeepSeek V4 Flash has the lower total benchmark cost, $0.007 vs $0.013. Gemini 3.1 Flash Lite is faster, averaging 1.06s per response vs 26.75s, and has the higher attempt pass rate, 52.4% vs 30.2%.

Recommended model: Gemini 3.1 Flash Lite - It has the best score here (6.1), while responding about 25.1x faster than DeepSeek V4 Flash.
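The speed-up figure can be sanity-checked from the response times reported on this page. The sketch below assumes the ratio is simply the slower time divided by the faster time; the page does not state how it computes the figure, and the variable and function names are illustrative only.

```python
# Response times copied from this page (seconds).
# Assumption: "Nx faster" means slow_time / fast_time.
avg_s = {"DeepSeek V4 Flash": 26.75, "Gemini 3.1 Flash Lite": 1.06}
total_s = {"DeepSeek V4 Flash": 561.82, "Gemini 3.1 Flash Lite": 22.35}


def speedup(times: dict[str, float], slow: str, fast: str) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return times[slow] / times[fast]


ratio_from_avg = speedup(avg_s, "DeepSeek V4 Flash", "Gemini 3.1 Flash Lite")
ratio_from_total = speedup(total_s, "DeepSeek V4 Flash", "Gemini 3.1 Flash Lite")

print(f"{ratio_from_avg:.1f}x from averages, {ratio_from_total:.1f}x from totals")
```

The rounded averages give roughly 25.2x and the totals give roughly 25.1x, so the quoted figure is consistent with a ratio taken on unrounded timings.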

Last updated at: 2026-06-18

| Metric | DeepSeek V4 Flash | Gemini 3.1 Flash Lite |
|---|---|---|
| Release | 2026-04-24 | 2026-05-08 |
| Score | 5.5 | 6.1 |
| Rank | #117 | #96 |
| Reliability | 10.0 | 10.0 |
| Consistency | 8.9 | 8.6 |
| Tests Correct | | |
| Attempt pass rate | 30.2% | 52.4% |
| Flaky tests | 3 | 4 |
| Total Runs | 63 | 63 |
| Cost per result | 0.203 | 0.144 |
| Total Cost | $0.007 | $0.013 |
| Input Price | $0.090 / 1M | $0.250 / 1M |
| Output Price | $0.180 / 1M | $1.500 / 1M |
| Total Input Tokens | 50,127 | 36,710 |
| Output Tokens | 13,710 | 2,484 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 26.75s | 1.06s |
| Response Time (max) | 111.96s | 2.97s |
| Response Time (total) | 561.82s | 22.35s |
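The Total Cost row can be approximately reproduced from the token counts and per-million prices listed above. This is a minimal sketch that assumes cost is just input tokens times input price plus output tokens times output price, with no caching discounts or other fees; the page does not document its billing formula, and `estimated_cost` is a hypothetical helper.

```python
PER_MILLION = 1_000_000

# Prices (USD per 1M tokens) and token totals copied from the table above.
models = {
    "DeepSeek V4 Flash": {
        "in_price": 0.090, "out_price": 0.180,
        "in_tok": 50_127, "out_tok": 13_710,
    },
    "Gemini 3.1 Flash Lite": {
        "in_price": 0.250, "out_price": 1.500,
        "in_tok": 36_710, "out_tok": 2_484,
    },
}


def estimated_cost(m: dict[str, float]) -> float:
    """Assumed formula: tokens * price per million, summed over input and output."""
    return (m["in_tok"] * m["in_price"] + m["out_tok"] * m["out_price"]) / PER_MILLION


costs = {name: estimated_cost(m) for name, m in models.items()}
for name, cost in costs.items():
    print(f"{name}: ${cost:.3f}")
```

Under that assumption both estimates round to the reported totals ($0.007 and $0.013). The gap comes largely from output pricing: DeepSeek V4 Flash produced about 5.5 times as many output tokens but pays far less per output token.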

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

| Rank | Model | Cost | Time | Tokens |
|---|---|---|---|---|
| #117 | DeepSeek V4 Flash | $0.004 | 157.6s | 11,297 |
| #96 | Gemini 3.1 Flash Lite | $0.001 | 4.5s | 727 |

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.
Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|---|
| Anti-AI Tricks | DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 | | 20.18s | 540 | 174 | 0 |
| Anti-AI Tricks | Gemini 3.1 Flash Lite | 7.5 | 8.4 | 66.7% | 1 | | 1.07s | 506 | 639 | 0 |
| Coding | DeepSeek V4 Flash | 4.2 | 7.4 | 11.1% | 1 | | 17.13s | 7,279 | 9,717 | 0 |
| Coding | Gemini 3.1 Flash Lite | 5.5 | 10.0 | 33.3% | 0 | | 938ms | 8,128 | 666 | 0 |
| Combined | DeepSeek V4 Flash | 4.5 | 2.1 | 66.7% | 1 | | 111.96s | 24,398 | 2,664 | 0 |
| Combined | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 2.73s | 12,870 | 357 | 0 |
| Data parsing and extraction | DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 | | 23.79s | 7,290 | 195 | 0 |
| Data parsing and extraction | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 843ms | 7,267 | 279 | 0 |
| Domain specific | DeepSeek V4 Flash | 5.3 | 10.0 | 33.3% | 0 | | 19.73s | 666 | 18 | 0 |
| Domain specific | Gemini 3.1 Flash Lite | 2.9 | 7.2 | 11.1% | 1 | | 762ms | 647 | 15 | 0 |
| General Intelligence | DeepSeek V4 Flash | 4.2 | 9.9 | 0.0% | 0 | | 23.74s | 471 | 67 | 0 |
| General Intelligence | Gemini 3.1 Flash Lite | 4.0 | 10.0 | 0.0% | 0 | | 992ms | 486 | 63 | 0 |
| Instructions following | DeepSeek V4 Flash | 6.5 | 10.0 | 50.0% | 0 | | 17.54s | 627 | 321 | 0 |
| Instructions following | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 859ms | 619 | 72 | 0 |
| Puzzle Solving | DeepSeek V4 Flash | 3.1 | 7.3 | 11.1% | 1 | | 23.72s | 594 | 207 | 0 |
| Puzzle Solving | Gemini 3.1 Flash Lite | 6.3 | 4.8 | 66.7% | 2 | | 720ms | 570 | 150 | 0 |
| Tool Calling | DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 | | 77.93s | 8,079 | 327 | 0 |
| Tool Calling | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 2.97s | 5,457 | 234 | 0 |
| Trivia | DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 | | 3.07s | 183 | 20 | 0 |
| Trivia | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 733ms | 160 | 9 | 0 |
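To see how the category scores above split between the two models, the short sketch below tallies which model has the higher score in each category. The scores are copied from the breakdown; treating equal scores as a tie and counting every category equally are assumptions of this sketch, not how the site derives its overall score.

```python
from collections import Counter

# Category scores copied from the breakdown: (DeepSeek V4 Flash, Gemini 3.1 Flash Lite).
scores = {
    "Anti-AI Tricks": (3.0, 7.5),
    "Coding": (4.2, 5.5),
    "Combined": (4.5, 3.0),
    "Data parsing and extraction": (10.0, 10.0),
    "Domain specific": (5.3, 2.9),
    "General Intelligence": (4.2, 4.0),
    "Instructions following": (6.5, 10.0),
    "Puzzle Solving": (3.1, 6.3),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}


def winner(deepseek: float, gemini: float) -> str:
    """Return the model with the higher score, or 'Tie' when scores are equal."""
    if deepseek > gemini:
        return "DeepSeek V4 Flash"
    if gemini > deepseek:
        return "Gemini 3.1 Flash Lite"
    return "Tie"


tally = Counter(winner(ds, gm) for ds, gm in scores.values())
for name, count in tally.most_common():
    print(f"{name}: {count}")
```

By this count Gemini 3.1 Flash Lite leads in four categories, DeepSeek V4 Flash in three, and three are tied, which is broadly in line with the narrow overall score gap.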
