AI BENCHY Compare

Qwen: Qwen3.5-35B-A3B vs Z.ai: GLM 5 Turbo

Last updated at: 2026-03-15

Metric                  Qwen3.5-35B-A3B (medium)  GLM 5 Turbo (none)
Release                 2026-02-24                2026-03-15
Rank                    #33                       #53
Score                   7.1                       5.7
Consistency             6.3                       9.5
Cost per result         4.251                     0.467
Total Cost              $0.341                    $0.028
Tests Correct           -                         -
Attempt pass rate       77.1%                     39.6%
Flaky tests             7                         1
Total Runs              48                        48
Output Tokens           5,495                     1,264
Reasoning Tokens        169,266                   0
Response Time (avg)     43.93s                    2.92s
Response Time (max)     106.00s                   8.21s
Response Time (total)   702.85s                   46.72s
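
To make the trade-off in the summary figures easier to read, the sketch below turns a few of them into head-to-head ratios and gaps. It is only an illustration: the dictionary keys and the `ratio` helper are hypothetical names chosen here, not part of AI BENCHY, and the numbers are copied by hand from the table above.

```python
# Figures copied from the summary table; key names are illustrative only.
summary = {
    "Qwen3.5-35B-A3B": {
        "score": 7.1,
        "pass_rate_pct": 77.1,
        "total_cost_usd": 0.341,
        "avg_time_s": 43.93,
    },
    "GLM 5 Turbo": {
        "score": 5.7,
        "pass_rate_pct": 39.6,
        "total_cost_usd": 0.028,
        "avg_time_s": 2.92,
    },
}

QWEN, GLM = "Qwen3.5-35B-A3B", "GLM 5 Turbo"


def ratio(metric: str) -> float:
    """How many times larger the Qwen figure is than the GLM figure."""
    return summary[QWEN][metric] / summary[GLM][metric]


def gap(metric: str) -> float:
    """Qwen figure minus GLM figure, in the metric's own units."""
    return summary[QWEN][metric] - summary[GLM][metric]


cost_ratio = round(ratio("total_cost_usd"), 2)  # about 12.18x the total cost
time_ratio = round(ratio("avg_time_s"), 2)      # about 15.04x the average latency
score_gap = round(gap("score"), 1)              # 1.4 points higher score
pass_rate_gap = round(gap("pass_rate_pct"), 1)  # 37.5 percentage points higher

print(f"cost x{cost_ratio}, latency x{time_ratio}, "
      f"score +{score_gap}, pass rate +{pass_rate_gap} pts")
```

Read this way, Qwen3.5-35B-A3B's higher score and pass rate come at roughly 12 times the total cost and 15 times the average response time of GLM 5 Turbo on this run.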

Charts (not reproduced): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

Category / Model      Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens

Anti-AI Tricks
  Qwen3.5-35B-A3B     10.0   10.0         100.0%             0            -              21.75s               429            36,235
  GLM 5 Turbo         3.0    10.0         0.0%               0            -              3.01s                376            0

Combined
  Qwen3.5-35B-A3B     4.7    1.6          66.7%              1            -              75.34s               775            12,485
  GLM 5 Turbo         3.0    10.0         0.0%               0            -              4.89s                144            0

Data parsing and extraction
  Qwen3.5-35B-A3B     7.3    5.9          83.3%              1            -              59.33s               235            19,493
  GLM 5 Turbo         10.0   10.0         100.0%             0            -              2.47s                204            0

Domain specific
  Qwen3.5-35B-A3B     4.1    4.4          44.5%              2            -              88.34s               41             46,368
  GLM 5 Turbo         5.3    10.0         33.3%              0            -              1.97s                25             0

General Intelligence
  Qwen3.5-35B-A3B     2.8    1.6          33.3%              1            -              30.30s               20             3,753
  GLM 5 Turbo         4.2    9.9          0.0%               0            -              2.18s                48             0

Instructions following
  Qwen3.5-35B-A3B     10.0   10.0         100.0%             0            -              24.45s               97             17,361
  GLM 5 Turbo         6.5    10.0         50.0%              0            -              2.13s                65             0

Puzzle Solving
  Qwen3.5-35B-A3B     6.4    4.4          77.8%              2            -              31.58s               3,589          32,206
  GLM 5 Turbo         5.5    7.4          44.4%              1            -              2.43s                180            0

Tool Calling
  Qwen3.5-35B-A3B     10.0   10.0         100.0%             0            -              4.65s                309            1,365
  GLM 5 Turbo         10.0   10.0         100.0%             0            -              8.21s                222            0
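
As a sanity check on the breakdown, the per-category output tokens, reasoning tokens, and flaky-test counts can be summed and compared with the totals in the summary table. The sketch below does that with values copied by hand from this page; the variable names and the `column_sums` helper are illustrative, not part of AI BENCHY.

```python
# Per-category (output tokens, reasoning tokens, flaky tests), in the order the
# categories are listed: Anti-AI Tricks, Combined, Data parsing and extraction,
# Domain specific, General Intelligence, Instructions following,
# Puzzle Solving, Tool Calling.
categories = {
    "Qwen3.5-35B-A3B": [
        (429, 36235, 0), (775, 12485, 1), (235, 19493, 1), (41, 46368, 2),
        (20, 3753, 1), (97, 17361, 0), (3589, 32206, 2), (309, 1365, 0),
    ],
    "GLM 5 Turbo": [
        (376, 0, 0), (144, 0, 0), (204, 0, 0), (25, 0, 0),
        (48, 0, 0), (65, 0, 0), (180, 0, 1), (222, 0, 0),
    ],
}

# Totals from the summary table: (output tokens, reasoning tokens, flaky tests).
reported = {
    "Qwen3.5-35B-A3B": (5495, 169266, 7),
    "GLM 5 Turbo": (1264, 0, 1),
}


def column_sums(rows):
    """Sum each column across all category rows."""
    return tuple(sum(col) for col in zip(*rows))


for model, rows in categories.items():
    sums = column_sums(rows)
    status = "matches" if sums == reported[model] else "differs from"
    print(f"{model}: summed {sums} {status} reported {reported[model]}")
```

For both models the category sums agree with the reported totals, so the token and flaky-test totals in the summary are plain sums over the eight categories.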
