AI BENCHY Compare

ByteDance Seed: Seed-2.0-Lite vs Google: Gemini 3.1 Flash Lite

Last updated: 2026-05-08

| Metric | Seed-2.0-Lite (medium) | Gemini 3.1 Flash Lite (low) |
| --- | --- | --- |
| Release | 2026-02-14 | 2026-05-08 |
| Score | 8.3 | 7.6 |
| Rank | #11 | #44 |
| Reliability | 10.0 | 10.0 |
| Consistency | 8.9 | 9.2 |
| Tests Correct | | |
| Attempt pass rate | 79.0% | 68.4% |
| Flaky tests | 3 | 2 |
| Total Runs | 57 | 57 |
| Cost per result | 0.958 | 0.203 |
| Total Cost | $0.125 | $0.025 |
| Input Price | $0.250 / 1M | $0.250 / 1M |
| Output Price | $2.000 / 1M | $1.500 / 1M |
| Output Tokens | 3,266 | 2,702 |
| Reasoning Tokens | 54,082 | 8,596 |
| Response Time (avg) | 31.32s | 1.92s |
| Response Time (max) | 168.71s | 5.66s |
| Response Time (total) | 595.04s | 36.49s |
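
The page does not define how its summary figures relate to one another, so the following is only a rough sanity check under two stated assumptions: that the attempt pass rate is passing runs divided by total runs, and that total response time is the average multiplied by the number of timed items. The helper names (`passing_attempts`, `timed_items`, `avg_time_ratio`) are hypothetical and not part of the benchmark.

```python
# Figures copied from the summary table; nothing here is measured independently.
SUMMARY = {
    "Seed-2.0-Lite": {
        "pass_rate": 0.790, "runs": 57, "avg_s": 31.32, "total_s": 595.04,
    },
    "Gemini 3.1 Flash Lite": {
        "pass_rate": 0.684, "runs": 57, "avg_s": 1.92, "total_s": 36.49,
    },
}


def passing_attempts(row):
    """Assumption: attempt pass rate = passing runs / total runs."""
    return round(row["pass_rate"] * row["runs"])


def timed_items(row):
    """Assumption: total response time = average time x number of timed items."""
    return round(row["total_s"] / row["avg_s"])


def avg_time_ratio(slow, fast):
    """How many times longer the slower model's average response takes."""
    return slow["avg_s"] / fast["avg_s"]


if __name__ == "__main__":
    for name, row in SUMMARY.items():
        print(f"{name}: {passing_attempts(row)} passing attempts, "
              f"{timed_items(row)} timed items")
    ratio = avg_time_ratio(SUMMARY["Seed-2.0-Lite"],
                           SUMMARY["Gemini 3.1 Flash Lite"])
    print(f"Average response time ratio: {ratio:.1f}x")
```

Under these assumptions the published figures are consistent with 45 and 39 passing attempts out of 57 runs, and with 19 timed items per model, which would suggest three runs per test. That reading is an inference from the numbers, not something the page states.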

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Seed-2.0-Lite | 8.3 | 10.0 | 75.0% | 0 | | 17.99s | 996 | 7,142 |
| Anti-AI Tricks | Gemini 3.1 Flash Lite | 7.3 | 6.2 | 75.0% | 2 | | 1.84s | 1,013 | 1,548 |
| Coding | Seed-2.0-Lite | 10.0 | 10.0 | 100.0% | 0 | | 74.49s | 436 | 7,319 |
| Coding | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 1.46s | 441 | 408 |
| Combined | Seed-2.0-Lite | 10.0 | 10.0 | 100.0% | 0 | | 37.67s | 506 | 4,299 |
| Combined | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 4.48s | 348 | 975 |
| Data parsing and extraction | Seed-2.0-Lite | 10.0 | 10.0 | 100.0% | 0 | | 9.07s | 246 | 1,742 |
| Data parsing and extraction | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 1.44s | 291 | 697 |
| Domain specific | Seed-2.0-Lite | 5.9 | 7.2 | 55.6% | 1 | | 88.74s | 15 | 23,897 |
| Domain specific | Gemini 3.1 Flash Lite | 5.3 | 10.0 | 33.3% | 0 | | 1.52s | 15 | 1,214 |
| General Intelligence | Seed-2.0-Lite | 6.7 | 3.6 | 66.7% | 1 | | 18.25s | 304 | 1,620 |
| General Intelligence | Gemini 3.1 Flash Lite | 4.0 | 10.0 | 0.0% | 0 | | 1.37s | 69 | 438 |
| Instructions following | Seed-2.0-Lite | 10.0 | 10.0 | 100.0% | 0 | | 7.26s | 71 | 1,480 |
| Instructions following | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 1.52s | 72 | 760 |
| Puzzle Solving | Seed-2.0-Lite | 9.0 | 7.9 | 88.9% | 1 | | 11.03s | 461 | 3,532 |
| Puzzle Solving | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 1.40s | 210 | 1,191 |
| Tool Calling | Seed-2.0-Lite | 10.0 | 10.0 | 100.0% | 0 | | 12.38s | 222 | 1,011 |
| Tool Calling | Gemini 3.1 Flash Lite | 10.0 | 10.0 | 100.0% | 0 | | 5.66s | 234 | 945 |
| Trivia | Seed-2.0-Lite | 3.0 | 10.0 | 0.0% | 0 | | 48.32s | 9 | 2,040 |
| Trivia | Gemini 3.1 Flash Lite | 3.0 | 10.0 | 0.0% | 0 | | 1.46s | 9 | 420 |
