AI BENCHY Compare

Anthropic: Claude Opus 4.7 vs Z.ai: GLM 5 Turbo

Summary

Claude Opus 4.7 vs GLM 5 Turbo benchmark comparison: Claude Opus 4.7 leads on average score (8.7 vs 8.4) and on attempt pass rate (82.5% vs 74.6%). It is also faster, with an average response time of 4.73s vs 23.00s. GLM 5 Turbo has the lower total benchmark cost at $0.323 vs $0.679.

Recommended model: Claude Opus 4.7 - It has the best score here (8.7), while responding about 4.9x faster than GLM 5 Turbo.
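
The "about 4.9x faster" figure follows directly from the two average response times. As a minimal sketch, the snippet below recomputes that ratio, along with the cost ratio and score gap, from the numbers quoted in the summary. The variable names are illustrative and not part of AI Benchy.

```python
# Figures copied from the summary above; names are illustrative only.
claude = {"score": 8.7, "avg_time_s": 4.73, "total_cost_usd": 0.679}
glm = {"score": 8.4, "avg_time_s": 23.00, "total_cost_usd": 0.323}

# How many times longer GLM 5 Turbo takes per response on average.
speedup = glm["avg_time_s"] / claude["avg_time_s"]

# How many times more the Claude Opus 4.7 benchmark run cost in total.
cost_ratio = claude["total_cost_usd"] / glm["total_cost_usd"]

# Absolute difference in average score.
score_gap = claude["score"] - glm["score"]

print(f"Speedup:    {speedup:.1f}x")     # 4.9x
print(f"Cost ratio: {cost_ratio:.1f}x")  # 2.1x
print(f"Score gap:  {score_gap:.1f}")    # 0.3
```

So the recommended model is roughly 4.9x faster and scores 0.3 points higher, at about 2.1x the total benchmark cost.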

Last updated at: 2026-06-12

| Metric | Claude Opus 4.7 (medium) | GLM 5 Turbo (medium) |
| --- | --- | --- |
| Release | 2026-04-16 | 2026-03-15 |
| Score | 8.7 | 8.4 |
| Rank | #17 | #24 |
| Reliability | 10.0 | 10.0 |
| Consistency | 9.6 | 8.5 |
| Tests Correct | | |
| Attempt pass rate | 82.5% | 74.6% |
| Flaky tests | 1 | 4 |
| Total Runs | 63 | 63 |
| Cost per result | 3.991 | 2.011 |
| Total Cost | $0.679 | $0.323 |
| Input Price | $5.000 / 1M | $1.200 / 1M |
| Output Price | $25.000 / 1M | $4.000 / 1M |
| Total Input Tokens | 65,406 | 35,593 |
| Output Tokens | 11,858 | 12,245 |
| Reasoning Tokens | 2,198 | 62,277 |
| Response Time (avg) | 4.73s | 23.00s |
| Response Time (max) | 23.18s | 194.23s |
| Response Time (total) | 94.51s | 482.97s |
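
The total cost can be roughly reproduced from the token counts and per-token prices above. The sketch below does this under one assumption that the page does not confirm: reasoning tokens are billed at the output price and are not already included in the output token count. The function and dictionary names are hypothetical.

```python
# Prices (USD per 1M tokens) and token counts copied from the table above.
MODELS = {
    "Claude Opus 4.7": {
        "input_price": 5.000, "output_price": 25.000,
        "input_tokens": 65_406, "output_tokens": 11_858,
        "reasoning_tokens": 2_198, "reported_total": 0.679,
    },
    "GLM 5 Turbo": {
        "input_price": 1.200, "output_price": 4.000,
        "input_tokens": 35_593, "output_tokens": 12_245,
        "reasoning_tokens": 62_277, "reported_total": 0.323,
    },
}

def estimate_cost(m):
    """Estimate total cost in USD.

    ASSUMPTION (not stated on the page): reasoning tokens are billed at the
    output price, in addition to the listed output tokens.
    """
    billed_output = m["output_tokens"] + m["reasoning_tokens"]
    total = m["input_tokens"] * m["input_price"] + billed_output * m["output_price"]
    return total / 1_000_000

for name, m in MODELS.items():
    est = estimate_cost(m)
    print(f"{name}: estimated ${est:.3f}, reported ${m['reported_total']:.3f}")
```

Under this assumption the estimate comes out at about $0.678 for Claude Opus 4.7 (reported: $0.679) and about $0.341 for GLM 5 Turbo (reported: $0.323). The page does not explain the remaining difference, so the billing rule here should be read as a guess. Either way, GLM 5 Turbo's 62,277 reasoning tokens account for most of its estimated cost.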

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#17 Claude Opus 4.7 (medium)
Cost: $0.059
Time: 26.8s
Tokens: 2,475 tok

#24 GLM 5 Turbo (medium)
Cost: $0.074
Time: 206.0s
Tokens: 18,549 tok
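
In this showcase GLM 5 Turbo costs more than Claude Opus 4.7 despite its lower per-token prices, because it uses about 7.5x as many tokens. A small sketch of the per-run arithmetic follows. It assumes the "Tokens" figure counts tokens generated during the run, which the page does not specify, and the helper names are illustrative.

```python
# Showcase figures copied from the cards above.
RUNS = {
    "Claude Opus 4.7": {"cost_usd": 0.059, "time_s": 26.8, "tokens": 2_475},
    "GLM 5 Turbo": {"cost_usd": 0.074, "time_s": 206.0, "tokens": 18_549},
}

def tokens_per_second(run):
    # Throughput, assuming "tokens" are the tokens generated in this run.
    return run["tokens"] / run["time_s"]

def cost_per_1k_tokens(run):
    # Effective USD cost per 1,000 tokens for this single run.
    return run["cost_usd"] / run["tokens"] * 1000

for name, run in RUNS.items():
    print(f"{name}: {tokens_per_second(run):.1f} tok/s, "
          f"${cost_per_1k_tokens(run):.4f} per 1k tokens")
```

Both models come out at roughly 90 tokens per second here, so the gap in wall-clock time is driven by token volume rather than generation speed.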

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Claude Opus 4.7 | 8.3 | 10.0 | 75.0% | 0 | | 1.85s | 894 | 348 | 0 |
| Anti-AI Tricks | GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 4.82s | 555 | 362 | 3,137 |
| Coding | Claude Opus 4.7 | 7.6 | 7.2 | 77.8% | 1 | | 12.96s | 10,635 | 7,629 | 1,114 |
| Coding | GLM 5 Turbo | 8.2 | 9.3 | 66.7% | 0 | | 45.90s | 5,941 | 363 | 25,381 |
| Combined | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 21.45s | 24,501 | 2,369 | 1,084 |
| Combined | GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 13.88s | 12,714 | 390 | 2,037 |
| Data parsing and extraction | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.37s | 10,533 | 324 | 0 |
| Data parsing and extraction | GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 6.19s | 7,107 | 577 | 3,632 |
| Domain specific | Claude Opus 4.7 | 7.7 | 10.0 | 66.7% | 0 | | 1.17s | 630 | 51 | 0 |
| Domain specific | GLM 5 Turbo | 2.9 | 4.4 | 22.2% | 2 | | 71.07s | 489 | 9,665 | 19,279 |
| General Intelligence | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.87s | 723 | 256 | 0 |
| General Intelligence | GLM 5 Turbo | 6.1 | 3.1 | 66.7% | 1 | | 10.05s | 477 | 60 | 2,216 |
| Instructions following | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 1.57s | 939 | 114 | 0 |
| Instructions following | GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 5.38s | 636 | 255 | 2,183 |
| Puzzle Solving | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 2.43s | 939 | 370 | 0 |
| Puzzle Solving | GLM 5 Turbo | 8.7 | 7.9 | 77.8% | 1 | | 5.23s | 609 | 312 | 2,647 |
| Tool Calling | Claude Opus 4.7 | 10.0 | 10.0 | 100.0% | 0 | | 4.17s | 15,339 | 373 | 0 |
| Tool Calling | GLM 5 Turbo | 10.0 | 10.0 | 100.0% | 0 | | 9.84s | 6,879 | 241 | 446 |
| Trivia | Claude Opus 4.7 | 3.0 | 10.0 | 0.0% | 0 | | 2.25s | 273 | 24 | 0 |
| Trivia | GLM 5 Turbo | 3.0 | 10.0 | 0.0% | 0 | | 40.17s | 186 | 20 | 1,319 |
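
To see where the two models actually differ, the per-category scores can be ranked by gap. The sketch below uses only the Score values from the category breakdown. The list layout and the sign convention (Claude Opus 4.7 minus GLM 5 Turbo) are choices made for illustration.

```python
# (category, Claude Opus 4.7 score, GLM 5 Turbo score) from the breakdown above.
SCORES = [
    ("Anti-AI Tricks", 8.3, 10.0),
    ("Coding", 7.6, 8.2),
    ("Combined", 10.0, 10.0),
    ("Data parsing and extraction", 10.0, 10.0),
    ("Domain specific", 7.7, 2.9),
    ("General Intelligence", 10.0, 6.1),
    ("Instructions following", 10.0, 10.0),
    ("Puzzle Solving", 10.0, 8.7),
    ("Tool Calling", 10.0, 10.0),
    ("Trivia", 3.0, 3.0),
]

# Positive gap: Claude Opus 4.7 leads. Negative gap: GLM 5 Turbo leads.
gaps = sorted(
    ((category, round(claude - glm, 1)) for category, claude, glm in SCORES),
    key=lambda item: item[1],
    reverse=True,
)
tied = [category for category, gap in gaps if gap == 0]

for category, gap in gaps:
    if gap > 0:
        leader = "Claude Opus 4.7"
    elif gap < 0:
        leader = "GLM 5 Turbo"
    else:
        leader = "tie"
    print(f"{category:<28} {gap:+.1f}  {leader}")
```

Claude Opus 4.7's largest leads are in Domain specific (+4.8) and General Intelligence (+3.9). GLM 5 Turbo leads in Anti-AI Tricks (-1.7) and Coding (-0.6), and five of the ten categories are tied.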
