
AI BENCHY Compare

Mistral: Mistral Small 4 vs Z.ai: GLM 4.7 Flash

Last updated at: 2026-03-17

| Metric | Mistral Small 4 (none) | GLM 4.7 Flash (none) |
| --- | --- | --- |
| Release | 2026-03-16 | 2026-01-19 |
| Rank | #61 | #57 |
| Score | 5.3 | 5.6 |
| Consistency | 9.5 | 8.5 |
| Cost per result | 0.108 | 0.053 |
| Total Cost | $0.006 | $0.003 |
| Tests Correct | | |
| Attempt pass rate | 33.3% | 39.2% |
| Flaky tests | 1 | 3 |
| Total Runs | 51 | 51 |
| Output Tokens | 1,624 | 1,863 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 629ms | 3.13s |
| Response Time (max) | 1.72s | 7.05s |
| Response Time (total) | 10.70s | 31.33s |
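The headline figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the attempt pass rate is passing attempts divided by total runs (the page does not state the formula), and the names `summary` and `implied_passes` are made up for this example.

```python
# Headline figures copied from the comparison table (times converted to seconds).
summary = {
    "Mistral Small 4": {"pass_rate": 0.333, "total_runs": 51, "avg_response_s": 0.629},
    "GLM 4.7 Flash": {"pass_rate": 0.392, "total_runs": 51, "avg_response_s": 3.13},
}


def implied_passes(pass_rate: float, total_runs: int) -> int:
    """Assumption: attempt pass rate = passing attempts / total runs."""
    return round(pass_rate * total_runs)


# Passing attempts implied by each model's pass rate.
passes = {
    name: implied_passes(m["pass_rate"], m["total_runs"])
    for name, m in summary.items()
}

# How many times slower GLM 4.7 Flash is on average response time.
latency_ratio = (
    summary["GLM 4.7 Flash"]["avg_response_s"]
    / summary["Mistral Small 4"]["avg_response_s"]
)

print(passes)
print(f"GLM 4.7 Flash avg response is about {latency_ratio:.1f}x Mistral Small 4")
```

Under that assumption, the reported rates correspond to whole numbers of passing attempts out of 51 runs, which suggests the percentages are internally consistent.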

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

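Anti-AI Tricks

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 3.4 | 7.9 | 16.7% | 1 | | 395ms | 182 | 0 |
| GLM 4.7 Flash | 5.2 | 7.9 | 41.7% | 1 | | 5.51s | 438 | 0 |

Combined

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 3.0 | 10.0 | 0.0% | 0 | | 1.72s | 496 | 0 |
| GLM 4.7 Flash | 3.0 | 10.0 | 0.0% | 0 | | 3.22s | 704 | 0 |

Data parsing and extraction

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 822ms | 261 | 0 |
| GLM 4.7 Flash | 7.3 | 5.8 | 83.3% | 1 | | 4.82s | 196 | 0 |

Domain specific

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 5.3 | 10.0 | 33.3% | 0 | | 367ms | 28 | 0 |
| GLM 4.7 Flash | 7.7 | 10.0 | 66.7% | 0 | | 744ms | 19 | 0 |

General Intelligence

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 4.0 | 10.0 | 0.0% | 0 | | 729ms | 205 | 0 |
| GLM 4.7 Flash | 4.0 | 10.0 | 0.0% | 0 | | 1.59s | 134 | 0 |

Instructions following

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 6.5 | 10.0 | 50.0% | 0 | | 380ms | 69 | 0 |
| GLM 4.7 Flash | 6.5 | 10.0 | 50.0% | 0 | | 888ms | 62 | 0 |

Puzzle Solving

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 3.1 | 9.9 | 0.0% | 0 | | 589ms | 170 | 0 |
| GLM 4.7 Flash | 4.4 | 10.0 | 0.0% | 0 | | 1.00s | 98 | 0 |

Tool Calling

| Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 1.40s | 213 | 0 |
| GLM 4.7 Flash | 2.8 | 1.6 | 33.3% | 1 | | 7.05s | 212 | 0 |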
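To summarize the category breakdown at a glance, one option is to tally which model has the higher score in each category. This is a minimal sketch, not something the page itself computes: the names `category_scores` and `winner` are hypothetical, and comparing scores directly ignores consistency, cost, and response time.

```python
from collections import Counter

# Per-category scores copied from the breakdown:
# (Mistral Small 4, GLM 4.7 Flash).
category_scores = {
    "Anti-AI Tricks": (3.4, 5.2),
    "Combined": (3.0, 3.0),
    "Data parsing and extraction": (10.0, 7.3),
    "Domain specific": (5.3, 7.7),
    "General Intelligence": (4.0, 4.0),
    "Instructions following": (6.5, 6.5),
    "Puzzle Solving": (3.1, 4.4),
    "Tool Calling": (10.0, 2.8),
}


def winner(mistral: float, glm: float) -> str:
    """Return the model with the higher score, or 'tie' if they are equal."""
    if mistral > glm:
        return "Mistral Small 4"
    if glm > mistral:
        return "GLM 4.7 Flash"
    return "tie"


# Count category wins per model, plus ties.
tally = Counter(winner(m, g) for m, g in category_scores.values())

for name, count in tally.most_common():
    print(f"{name}: {count}")
```

A tally like this treats a narrow lead (Puzzle Solving) the same as a wide one (Tool Calling), so it is best read alongside the per-category scores rather than instead of them.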
