AI BENCHY Compare

ByteDance Seed: Seed-2.0-Lite vs Qwen: Qwen3.5 Plus 2026-02-15

Last updated at: 2026-03-12

Metric                 Seed-2.0-Lite (medium)   Qwen3.5 Plus 2026-02-15 (none)
Release                2026-02-14               2026-02-15
Rank                   #3                       #31
Avg Score              8.5                      6.2
Consistency            8.7                      9.6
Cost per result        0.870                    0.172
Total Cost             $0.105                   $0.016
Tests Correct
Attempt pass rate      87.5%                    58.3%
Flaky tests            3                        1
Total Runs             48                       48
Output Tokens          2,815                    2,015
Reasoning Tokens       44,618                   0
Response Time (avg)    29.39s                   2.65s
Response Time (max)    168.71s                  6.65s
Response Time (total)  470.29s                  26.52s
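
The summary figures above can be turned into a few derived comparisons. The following is a minimal sketch, not part of the benchmark itself: it assumes that "Attempt pass rate" means passing attempts divided by "Total Runs", and the names `SUMMARY`, `passed_attempts`, and `ratio` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the summary metrics above. The derived figures below are
# illustrative arithmetic only; they are not metrics reported by AI BENCHY.
SUMMARY = {
    "Seed-2.0-Lite": {
        "pass_rate": 0.875, "total_runs": 48,
        "total_cost_usd": 0.105, "avg_response_s": 29.39,
    },
    "Qwen3.5 Plus 2026-02-15": {
        "pass_rate": 0.583, "total_runs": 48,
        "total_cost_usd": 0.016, "avg_response_s": 2.65,
    },
}


def passed_attempts(row):
    """Approximate passing attempts (assumes pass rate = passes / total runs)."""
    return round(row["pass_rate"] * row["total_runs"])


def ratio(key, a="Seed-2.0-Lite", b="Qwen3.5 Plus 2026-02-15"):
    """How many times larger model a's value is than model b's for one metric."""
    return SUMMARY[a][key] / SUMMARY[b][key]


for name, row in SUMMARY.items():
    print(f"{name}: about {passed_attempts(row)} of {row['total_runs']} attempts passed")
print(f"Total cost ratio: {ratio('total_cost_usd'):.1f}x")
print(f"Avg response time ratio: {ratio('avg_response_s'):.1f}x")
```

Under these assumptions the sketch gives roughly 42 and 28 passing attempts out of 48. Seed-2.0-Lite's total cost comes out at about 6.6 times that of Qwen3.5 Plus 2026-02-15, and its average response time at about 11 times.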

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Avg Score vs Response Time (avg); Total Output Tokens; Avg Score vs Total Output Tokens]

Category Breakdown

Anti-AI Tricks
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            10.0   10.0         100.0%             0                           23.34s               990            7,037
Qwen3.5 Plus 2026-02-15  4.0    10.0         33.3%              0                           2.74s                514            0

Combined
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            10.0   10.0         100.0%             0                           37.67s               506            4,299
Qwen3.5 Plus 2026-02-15  10.0   10.0         0.0%               0                           6.65s                314            0

Data parsing and extraction
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            9.9    10.0         100.0%             0                           9.07s                246            1,742
Qwen3.5 Plus 2026-02-15  9.9    10.0         100.0%             0                           1.89s                243            0

Domain specific
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            4.0    7.2          55.6%              1                           88.74s               15             23,897
Qwen3.5 Plus 2026-02-15  4.0    10.0         33.3%              0                           1.17s                17             0

General Intelligence
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            7.0    3.6          66.7%              1                           18.25s               304            1,620
Qwen3.5 Plus 2026-02-15  4.0    3.0          33.3%              1                           2.26s                117            0

Instructions following
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            10.0   10.0         100.0%             0                           7.26s                71             1,480
Qwen3.5 Plus 2026-02-15  10.0   10.0         100.0%             0                           1.67s                72             0

Puzzle Solving
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            9.3    7.9          88.9%              1                           11.03s               461            3,532
Qwen3.5 Plus 2026-02-15  7.0    10.0         66.7%              0                           2.82s                516            0

Tool Calling
Model                    Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Seed-2.0-Lite            10.0   10.0         100.0%             0                           12.38s               222            1,011
Qwen3.5 Plus 2026-02-15  10.0   10.0         100.0%             0                           3.33s                222            0
