AI BENCHY Compare

Google: Gemini 3.1 Flash Lite Preview vs Qwen: Qwen3.5 Plus 2026-02-15

Last updated at: 2026-03-15

Metric                  Gemini 3.1 Flash Lite Preview (medium)  Qwen3.5 Plus 2026-02-15 (medium)
Release                 2026-03-03                              2026-02-15
Rank                    #16                                     #4
Score                   8.0                                     8.8
Consistency             10.0                                    9.5
Cost per result         0.443                                   1.264
Total Cost              $0.049                                  $0.165
Tests Correct
Attempt pass rate       68.8%                                   85.4%
Flaky tests             0                                       1
Total Runs              48                                      48
Output Tokens           1,731                                   1,735
Reasoning Tokens        25,821                                  77,212
Response Time (avg)     3.83s                                   34.45s
Response Time (max)     14.93s                                  79.86s
Response Time (total)   61.25s                                  310.09s
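
To put the headline numbers side by side, the following minimal Python sketch computes how much more Qwen3.5 Plus costs, reasons, and waits per unit of the Gemini figure. The values are copied from the summary table above; the dictionary layout and the `ratio` helper are illustrative assumptions, not part of AI Benchy.

```python
# Summary figures copied from the comparison table above.
# The structure and key names are hypothetical, chosen for this sketch only.
summary = {
    "Gemini 3.1 Flash Lite Preview": {
        "score": 8.0,
        "total_cost_usd": 0.049,
        "reasoning_tokens": 25_821,
        "avg_response_s": 3.83,
    },
    "Qwen3.5 Plus 2026-02-15": {
        "score": 8.8,
        "total_cost_usd": 0.165,
        "reasoning_tokens": 77_212,
        "avg_response_s": 34.45,
    },
}


def ratio(metric: str,
          numerator: str = "Qwen3.5 Plus 2026-02-15",
          denominator: str = "Gemini 3.1 Flash Lite Preview") -> float:
    """Return numerator's value divided by denominator's value for one metric."""
    return summary[numerator][metric] / summary[denominator][metric]


for metric in ("score", "total_cost_usd", "reasoning_tokens", "avg_response_s"):
    print(f"{metric}: Qwen / Gemini = {ratio(metric):.2f}x")
```

On these figures, Qwen3.5 Plus scores 0.8 points higher while costing roughly 3.4 times as much, using about 3 times the reasoning tokens, and taking about 9 times as long per response on average.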

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Model                          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens

Anti-AI Tricks
Gemini 3.1 Flash Lite Preview  8.8    10.0         66.7%              0                           2.53s                564            3,780
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           10.37s               186            5,926

Combined
Gemini 3.1 Flash Lite Preview  10.0   10.0         100.0%             0                           14.93s               327            7,347
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           46.85s               421            7,906

Data parsing and extraction
Gemini 3.1 Flash Lite Preview  10.0   10.0         100.0%             0                           2.29s                279            2,952
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           46.91s               270            14,916

Domain specific
Gemini 3.1 Flash Lite Preview  3.0    10.0         0.0%               0                           4.21s                18             5,325
Qwen3.5 Plus 2026-02-15        5.3    10.0         33.3%              0                           17.50s               35             16,680

General Intelligence
Gemini 3.1 Flash Lite Preview  10.0   10.0         100.0%             0                           3.16s                96             1,488
Qwen3.5 Plus 2026-02-15        4.7    1.6          66.7%              1                           79.86s               73             8,675

Instructions following
Gemini 3.1 Flash Lite Preview  10.0   10.0         100.0%             0                           1.91s                72             2,121
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           31.93s               101            7,704

Puzzle Solving
Gemini 3.1 Flash Lite Preview  7.7    10.0         66.7%              0                           3.58s                141            1,896
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           34.57s               340            14,496

Tool Calling
Gemini 3.1 Flash Lite Preview  10.0   10.0         100.0%             0                           3.80s                234            912
Qwen3.5 Plus 2026-02-15        10.0   10.0         100.0%             0                           7.54s                309            909
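
As a sanity check on the breakdown, the per-category token counts can be summed and compared against the totals in the summary table. The sketch below is a minimal Python illustration; the short keys `gemini` and `qwen`, the dictionary layout, and the `summed` helper are assumptions made for this example, while the numbers are copied from the tables above.

```python
# (output_tokens, reasoning_tokens) per category, copied from the Category Breakdown.
# The short model keys "gemini" and "qwen" are hypothetical labels for this sketch.
categories = {
    "Anti-AI Tricks":              {"gemini": (564, 3_780), "qwen": (186, 5_926)},
    "Combined":                    {"gemini": (327, 7_347), "qwen": (421, 7_906)},
    "Data parsing and extraction": {"gemini": (279, 2_952), "qwen": (270, 14_916)},
    "Domain specific":             {"gemini": (18, 5_325),  "qwen": (35, 16_680)},
    "General Intelligence":        {"gemini": (96, 1_488),  "qwen": (73, 8_675)},
    "Instructions following":      {"gemini": (72, 2_121),  "qwen": (101, 7_704)},
    "Puzzle Solving":              {"gemini": (141, 1_896), "qwen": (340, 14_496)},
    "Tool Calling":                {"gemini": (234, 912),   "qwen": (309, 909)},
}

# (output_tokens, reasoning_tokens) totals from the summary table.
totals = {"gemini": (1_731, 25_821), "qwen": (1_735, 77_212)}


def summed(model: str) -> tuple[int, int]:
    """Sum output and reasoning tokens across all categories for one model."""
    output = sum(row[model][0] for row in categories.values())
    reasoning = sum(row[model][1] for row in categories.values())
    return output, reasoning


for model, expected in totals.items():
    print(model, summed(model), "matches summary:", summed(model) == expected)
```

For both models the category sums agree with the summary totals, which suggests the Output Tokens and Reasoning Tokens rows in the summary are plain sums over the eight categories.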