AI BENCHY

AI BENCHY Compare

Z.ai: GLM 5.1 vs Z.ai: GLM 5 Turbo

Last updated at: 2026-04-07

| Metric | GLM 5.1 (medium, release 2026-04-07) | GLM 5 Turbo (medium, release 2026-03-15) |
| --- | --- | --- |
| Score | 8.0 | 8.0 |
| Rank | #23 | #20 |
| Consistency | 9.0 | 7.9 |
| Tests Correct | | |
| Attempt pass rate | 76.5% | 76.5% |
| Flaky tests | 2 | 5 |
| Total Runs | 51 | 51 |
| Cost per result | 1.270 | 1.509 |
| Total Cost | $0.153 | $0.166 |
| Input Price (per 1M tokens) | $1.000 | $1.200 |
| Output Price (per 1M tokens) | $3.200 | $4.000 |
| Output Tokens | 6,666 | 11,865 |
| Reasoning Tokens | 35,313 | 35,632 |
| Response Time (avg) | 18.23s | 17.98s |
| Response Time (max) | 43.11s | 194.23s |
| Response Time (total) | 291.73s | 305.72s |
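
The headline score is tied at 8.0, so the practical differences sit in cost, token usage, and latency. As a rough sketch of how to quantify those gaps, the snippet below computes the relative difference of GLM 5.1 against GLM 5 Turbo for a few rows of the summary table. The `SUMMARY` dictionary and the `relative_difference` helper are hypothetical names introduced here for illustration; they are not part of AI Benchy, and only the numbers come from the table.

```python
# Values copied from the summary table. The dictionary layout and the helper
# name are illustrative assumptions, not an AI Benchy API.
SUMMARY = {
    "GLM 5.1": {
        "total_cost_usd": 0.153,
        "output_tokens": 6666,
        "avg_response_s": 18.23,
        "max_response_s": 43.11,
    },
    "GLM 5 Turbo": {
        "total_cost_usd": 0.166,
        "output_tokens": 11865,
        "avg_response_s": 17.98,
        "max_response_s": 194.23,
    },
}


def relative_difference(metric, baseline="GLM 5 Turbo", candidate="GLM 5.1"):
    """Return (candidate - baseline) / baseline for one metric.

    A negative result means the candidate's value is lower than the baseline's.
    """
    base = SUMMARY[baseline][metric]
    cand = SUMMARY[candidate][metric]
    return (cand - base) / base


if __name__ == "__main__":
    # Print each metric as a signed percentage relative to GLM 5 Turbo.
    for metric in SUMMARY["GLM 5.1"]:
        print(f"{metric}: {relative_difference(metric):+.1%}")
```

On these figures GLM 5.1 comes out roughly 8% cheaper in total cost and uses roughly 44% fewer output tokens, while its average response time is marginally higher.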

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

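| Anti-AI Tricks | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 8.31s | 401 | 5,122 |
| GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 4.82s | 362 | 3,137 |

| Combined | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 9.5 | 10.0 | 100.0% | 0 | | 43.11s | 327 | 4,206 |
| GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 13.88s | 390 | 2,037 |

| Data parsing and extraction | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 9.33s | 991 | 4,552 |
| GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 6.19s | 577 | 3,632 |

| Domain specific | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 5.3 | 10.0 | 33.3% | 0 | | 29.77s | 969 | 11,314 |
| GLM 5 Turbo | 2.9 | 4.4 | 22.2% | 2 | | 71.07s | 9,665 | 19,279 |

| General Intelligence | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 20.95s | 2,875 | 2,875 |
| GLM 5 Turbo | 6.1 | 3.1 | 66.7% | 1 | | 10.05s | 60 | 2,216 |

| Instructions following | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 6.4 | 5.8 | 66.7% | 1 | | 7.47s | 204 | 1,617 |
| GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 5.38s | 255 | 2,183 |

| Puzzle Solving | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 8.2 | 7.2 | 88.9% | 1 | | 23.85s | 899 | 5,627 |
| GLM 5 Turbo | 7.3 | 5.8 | 55.6% | 2 | | 5.44s | 315 | 2,702 |

| Tool Calling | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GLM 5.1 | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 9.84s | 241 | 446 |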
