AI BENCHY

AI BENCHY Compare

Qwen: Qwen3.7 Plus vs Z.ai: GLM 5

Last updated at: 2026-06-03

| Metric | Qwen3.7 Plus (medium, released 2026-06-03) | GLM 5 (medium, released 2026-02-12) |
|---|---|---|
| Score | 8.4 | 8.2 |
| Rank | #16 | #18 |
| Reliability | 9.9 | 10.0 |
| Consistency | 9.2 | 8.4 |
| Tests Correct | | |
| Attempt pass rate | 80.0% | 81.7% |
| Flaky tests | 2 | 4 |
| Total Runs | 60 | 60 |
| Cost per result | 1.324 | 1.676 |
| Total Cost | $0.199 | $0.212 |
| Input Price | $0.400 / 1M | $0.600 / 1M |
| Output Price | $1.600 / 1M | $1.920 / 1M |
| Total Input Tokens | 38,104 | 32,626 |
| Output Tokens | 2,107 | 21,558 |
| Reasoning Tokens | 112,479 | 95,772 |
| Response Time (avg) | 36.84s | 32.67s |
| Response Time (max) | 178.04s | 99.85s |
| Response Time (total) | 736.86s | 392.01s |
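
The prices and token counts above can be combined into a rough cost estimate. The sketch below is an illustration only: it assumes reasoning tokens are billed at the output price, which the page does not state, and `estimate_cost` is a hypothetical helper, not part of the benchmark. Under that assumption the estimate lands on the reported $0.199 for Qwen3.7 Plus, but comes out higher than the reported $0.212 for GLM 5, so the site's own cost accounting likely differs in some way.

```python
def estimate_cost(input_tokens, output_tokens, reasoning_tokens,
                  input_price, output_price):
    """Estimate total cost in USD from token counts and per-1M-token prices.

    Assumption (not confirmed by the source): reasoning tokens are
    billed at the same rate as output tokens.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * input_price + billed_output * output_price) / 1_000_000


# Token counts and prices copied from the comparison table.
qwen_cost = estimate_cost(38_104, 2_107, 112_479, 0.400, 1.600)
glm_cost = estimate_cost(32_626, 21_558, 95_772, 0.600, 1.920)

print(f"Qwen3.7 Plus estimate: ${qwen_cost:.3f} (reported $0.199)")
print(f"GLM 5 estimate:        ${glm_cost:.3f} (reported $0.212)")
```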

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Anti-AI Tricks

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 8.58s | 672 | 195 | 5,065 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 23.66s | 555 | 480 | 7,056 |

Coding

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 6.5 | 5.9 | 66.7% | 1 | | 122.40s | 3,637 | 396 | 30,301 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 89.47s | 4,656 | 2,985 | 45,706 |

Combined

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 65.24s | 14,934 | 366 | 10,132 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 28.96s | 12,804 | 662 | 3,242 |

Data parsing and extraction

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 21.75s | 7,782 | 270 | 6,713 |
| GLM 5 | 7.1 | 5.6 | 83.3% | 1 | | 8.90s | 5,508 | 567 | 3,734 |

Domain specific

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 3.6 | 7.2 | 22.2% | 1 | | 45.35s | 771 | 57 | 27,073 |
| GLM 5 | 3.5 | 4.4 | 33.3% | 2 | | 0ms | 260 | 13,176 | 14,137 |

General Intelligence

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 25.48s | 516 | 123 | 3,998 |
| GLM 5 | 6.1 | 3.1 | 66.7% | 1 | | 14.69s | 477 | 2,020 | 2,248 |

Instructions following

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 16.13s | 699 | 102 | 5,013 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 7.25s | 636 | 1,001 | 2,129 |

Puzzle Solving

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 16.38s | 696 | 280 | 7,312 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 11.33s | 609 | 33 | 4,076 |

Tool Calling

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 10.0 | 10.0 | 100.0% | 0 | | 15.02s | 8,193 | 292 | 1,831 |
| GLM 5 | 10.0 | 10.0 | 100.0% | 0 | | 15.93s | 6,935 | 233 | 994 |

Trivia

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|
| Qwen3.7 Plus | 3.0 | 10.0 | 0.0% | 0 | | 91.07s | 204 | 26 | 15,041 |
| GLM 5 | 3.0 | 10.0 | 0.0% | 0 | | 67.37s | 186 | 401 | 12,450 |
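
As a sanity check on the category breakdown, the per-category token counts can be summed and compared with the overall totals reported at the top of the page. The sketch below is a minimal illustration using the figures from the tables; the variable names and the `sum_tokens` helper are hypothetical and not part of the benchmark.

```python
# Per-category (input, output, reasoning) token counts, in page order:
# Anti-AI Tricks, Coding, Combined, Data parsing and extraction,
# Domain specific, General Intelligence, Instructions following,
# Puzzle Solving, Tool Calling, Trivia.
CATEGORY_TOKENS = {
    "Qwen3.7 Plus": [
        (672, 195, 5_065), (3_637, 396, 30_301), (14_934, 366, 10_132),
        (7_782, 270, 6_713), (771, 57, 27_073), (516, 123, 3_998),
        (699, 102, 5_013), (696, 280, 7_312), (8_193, 292, 1_831),
        (204, 26, 15_041),
    ],
    "GLM 5": [
        (555, 480, 7_056), (4_656, 2_985, 45_706), (12_804, 662, 3_242),
        (5_508, 567, 3_734), (260, 13_176, 14_137), (477, 2_020, 2_248),
        (636, 1_001, 2_129), (609, 33, 4_076), (6_935, 233, 994),
        (186, 401, 12_450),
    ],
}

# Overall (input, output, reasoning) totals from the summary table.
REPORTED_TOTALS = {
    "Qwen3.7 Plus": (38_104, 2_107, 112_479),
    "GLM 5": (32_626, 21_558, 95_772),
}


def sum_tokens(rows):
    """Sum each column of a list of (input, output, reasoning) tuples."""
    return tuple(sum(column) for column in zip(*rows))


totals = {model: sum_tokens(rows) for model, rows in CATEGORY_TOKENS.items()}

for model, computed in totals.items():
    status = "match" if computed == REPORTED_TOTALS[model] else "MISMATCH"
    print(f"{model}: computed {computed}, reported {REPORTED_TOTALS[model]} ({status})")
```

With the figures above, the category sums agree with the reported totals for both models.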
