AI BENCHY Compare

Google: Gemini 3.5 Flash vs Z.ai: GLM 5.1

Summary

Gemini 3.5 Flash vs GLM 5.1 benchmark comparison: GLM 5.1 leads narrowly on average score (7.1 vs 7.0) and has the lower total benchmark cost ($0.292 vs $1.079). Gemini 3.5 Flash is faster, averaging 9.93s vs 33.67s per response, and has the higher attempt pass rate (77.8% vs 68.3%).

Recommended model: Gemini 3.5 Flash. Its score stays close to the best score here (7.0 vs 7.1), and it responds about 3.4x faster than GLM 5.1 on average.
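The "about 3.4x faster" figure can be checked from the two average response times quoted in the summary. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the benchmark.

```python
# Average response times in seconds, copied from the summary.
gemini_avg_s = 9.93
glm_avg_s = 33.67

# Ratio of average latencies: how many times longer GLM 5.1 takes per response.
speedup = glm_avg_s / gemini_avg_s

# Difference between the two average scores (GLM 5.1 minus Gemini 3.5 Flash).
score_gap = round(7.1 - 7.0, 1)

print(f"speedup: {speedup:.1f}x, score gap: {score_gap}")
```

The ratio rounds to 3.4, while the score gap is only 0.1 points, which is the trade-off the recommendation rests on.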

Last updated at: 2026-06-12

| Metric | Gemini 3.5 Flash (none) | GLM 5.1 (medium) |
| --- | --- | --- |
| Release | 2026-05-19 | 2026-04-07 |
| Score | 7.0 | 7.1 |
| Rank | #66 | #64 |
| Reliability | 10.0 | 6.7 |
| Consistency | 8.9 | 8.3 |
| Tests Correct | | |
| Attempt pass rate | 77.8% | 68.3% |
| Flaky tests | 3 | 4 |
| Total Runs | 63 | 63 |
| Cost per result | 7.190 | 2.496 |
| Total Cost | $1.079 | $0.292 |
| Input Price | $1.500 / 1M | $0.980 / 1M |
| Output Price | $9.000 / 1M | $3.080 / 1M |
| Total Input Tokens | 13,843 | 32,995 |
| Output Tokens | 117,518 | 11,655 |
| Reasoning Tokens | 0 | 75,421 |
| Response Time (avg) | 9.93s | 33.67s |
| Response Time (max) | 64.36s | 172.60s |
| Response Time (total) | 178.68s | 673.41s |
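The total cost figures can be roughly cross-checked against the token counts and per-million prices listed above. The sketch below assumes that reasoning tokens are billed at the output price and that no caching or other discounts apply; the page does not state either, and the `Usage` class and `estimate_cost` function are hypothetical helpers, not part of the benchmark.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    input_price_per_m: float   # USD per 1M input tokens
    output_price_per_m: float  # USD per 1M output tokens


def estimate_cost(u: Usage) -> float:
    """Estimate total cost in USD.

    Assumption: reasoning tokens are billed at the output rate.
    """
    billed_output = u.output_tokens + u.reasoning_tokens
    total_micro_usd = (u.input_tokens * u.input_price_per_m
                       + billed_output * u.output_price_per_m)
    return total_micro_usd / 1_000_000


# Values copied from the comparison table.
gemini = Usage(13_843, 117_518, 0, 1.500, 9.000)
glm = Usage(32_995, 11_655, 75_421, 0.980, 3.080)

for name, usage in [("Gemini 3.5 Flash", gemini), ("GLM 5.1", glm)]:
    print(f"{name}: ${estimate_cost(usage):.3f}")
```

Under these assumptions the estimates come out at about $1.078 for Gemini 3.5 Flash and about $0.301 for GLM 5.1, close to but not identical with the reported $1.079 and $0.292, so the benchmark's own accounting probably differs in some detail. The estimate also shows where the gap comes from: Gemini's cost is dominated by its 117,518 output tokens at $9.000 per million.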

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#66 Gemini 3.5 Flash (none)
Cost: $0.225
Time: 125.5s
Tokens: 25,004 tok

#64 GLM 5.1 (medium)
Invalid SVG
Cost: $0.000
Time: 300.0s
Tokens: 0 tok

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.
Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 2.53s | 492 | 5,101 | 0 |
| Anti-AI Tricks | GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 8.31s | 555 | 401 | 5,122 |
| Coding | Gemini 3.5 Flash | 8.8 | 7.8 | 88.9% | 1 | | 34.69s | 8,122 | 75,927 | 0 |
| Coding | GLM 5.1 | 4.6 | 3.7 | 44.5% | 2 | | 109.63s | 5,702 | 4,871 | 37,826 |
| Combined | Gemini 3.5 Flash | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 | 0 |
| Combined | GLM 5.1 | 9.5 | 10.0 | 100.0% | 0 | | 43.11s | 17,298 | 327 | 4,206 |
| Data parsing and extraction | Gemini 3.5 Flash | 6.5 | 10.0 | 50.0% | 0 | | 8.10s | 2,781 | 5,895 | 0 |
| Data parsing and extraction | GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 9.33s | 7,107 | 991 | 4,552 |
| Domain specific | Gemini 3.5 Flash | 7.6 | 7.2 | 77.8% | 1 | | 10.64s | 633 | 17,910 | 0 |
| Domain specific | GLM 5.1 | 5.3 | 10.0 | 33.3% | 0 | | 29.77s | 489 | 969 | 11,314 |
| General Intelligence | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 3.46s | 486 | 1,620 | 0 |
| General Intelligence | GLM 5.1 | 10.0 | 10.0 | 100.0% | 0 | | 20.95s | 477 | 2,875 | 2,875 |
| Instructions following | Gemini 3.5 Flash | 9.8 | 10.0 | 100.0% | 0 | | 3.38s | 615 | 3,928 | 0 |
| Instructions following | GLM 5.1 | 6.4 | 5.8 | 66.7% | 1 | | 7.47s | 634 | 204 | 1,617 |
| Puzzle Solving | Gemini 3.5 Flash | 10.0 | 10.0 | 100.0% | 0 | | 3.13s | 558 | 4,640 | 0 |
| Puzzle Solving | GLM 5.1 | 8.2 | 7.2 | 88.9% | 1 | | 31.64s | 609 | 935 | 5,730 |
| Tool Calling | Gemini 3.5 Flash | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 | 0 |
| Tool Calling | GLM 5.1 | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 | 0 |
| Trivia | Gemini 3.5 Flash | 2.8 | 1.6 | 33.3% | 1 | | 4.87s | 156 | 2,497 | 0 |
| Trivia | GLM 5.1 | 3.0 | 10.0 | 0.0% | 0 | | 29.40s | 124 | 82 | 2,179 |
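
The near-identical overall scores hide large per-category differences. As a rough illustration, the sketch below tallies which model has the higher score in each category, using the scores from the breakdown above; `tally_wins` is a hypothetical helper, and equal scores are counted as ties.

```python
# Per-category scores as (Gemini 3.5 Flash, GLM 5.1), copied from the breakdown.
scores = {
    "Anti-AI Tricks": (10.0, 10.0),
    "Coding": (8.8, 4.6),
    "Combined": (3.0, 9.5),
    "Data parsing and extraction": (6.5, 10.0),
    "Domain specific": (7.6, 5.3),
    "General Intelligence": (10.0, 10.0),
    "Instructions following": (9.8, 6.4),
    "Puzzle Solving": (10.0, 8.2),
    "Tool Calling": (3.0, 3.0),
    "Trivia": (2.8, 3.0),
}


def tally_wins(scores):
    """Group categories by which model scores higher; equal scores are ties."""
    wins = {"Gemini 3.5 Flash": [], "GLM 5.1": [], "tie": []}
    for category, (gemini, glm) in scores.items():
        if gemini > glm:
            wins["Gemini 3.5 Flash"].append(category)
        elif glm > gemini:
            wins["GLM 5.1"].append(category)
        else:
            wins["tie"].append(category)
    return wins


wins = tally_wins(scores)
for label, categories in wins.items():
    print(f"{label}: {len(categories)} ({', '.join(categories)})")
```

By this count Gemini 3.5 Flash is ahead in four categories, GLM 5.1 in three, and three are tied. Note that Gemini's Combined row and both Tool Calling rows show 0ms and zero tokens, which may mean no response was recorded for those tests, so those scores should be read with caution.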
