Granite 4.1 8B vs GLM 4.7 Flash (medium)

Recommended model: Granite 4.1 8B

Its overall score is close to the best score in this comparison (4.0 vs 4.3), and it costs about 26.4x less than GLM 4.7 Flash (medium).

Detailed comparison

Metric	Granite 4.1 8B (none; release 2026-05-01)	GLM 4.7 Flash (medium; release 2026-01-19)
Score	4.0	4.3
Rank	#224	#217
Reliability	10.0	7.8
Consistency	10.0	7.0
Tests Correct
Attempt pass rate	9.1%	31.8%
Flaky tests	0	8
Total Runs	66	66
Cost per result	0.315	4.147
Total Cost	$0.007	$0.166
Input Price	$0.050 / 1M	$0.060 / 1M
Output Price	$0.100 / 1M	$0.400 / 1M
Total Input Tokens	113,827	79,051
Output Tokens	5,996	43,754
Reasoning Tokens	0	374,109
Response Time (avg)	1.45s	142.59s
Response Time (max)	16.67s	1539.97s
Response Time (total)	31.96s	1996.21s
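
The cost figures above can be roughly cross-checked from the token counts and per-million-token prices in the same table. The sketch below is an estimate under stated assumptions, not the benchmark's own billing formula: it assumes cost is input tokens times input price plus output tokens times output price, and that reasoning tokens are billed at the output price. The function name `estimate_cost` and the dictionary layout are illustrative only.

```python
# Token counts, prices (USD per 1M tokens), and reported totals copied from the table above.
MODELS = {
    "Granite 4.1 8B": {
        "input_tokens": 113_827, "output_tokens": 5_996, "reasoning_tokens": 0,
        "input_price": 0.050, "output_price": 0.100, "reported_total": 0.007,
    },
    "GLM 4.7 Flash": {
        "input_tokens": 79_051, "output_tokens": 43_754, "reasoning_tokens": 374_109,
        "input_price": 0.060, "output_price": 0.400, "reported_total": 0.166,
    },
}


def estimate_cost(model: dict, bill_reasoning: bool = True) -> float:
    """Estimate total cost in USD.

    Assumption (not confirmed by the page): reasoning tokens are billed
    at the output price when bill_reasoning is True.
    """
    billed_output = model["output_tokens"]
    if bill_reasoning:
        billed_output += model["reasoning_tokens"]
    total = (model["input_tokens"] * model["input_price"]
             + billed_output * model["output_price"])
    return total / 1_000_000


granite_cost = estimate_cost(MODELS["Granite 4.1 8B"])
glm_cost = estimate_cost(MODELS["GLM 4.7 Flash"])

# Ratio of GLM's reported total to the unrounded Granite estimate.
ratio_vs_reported = MODELS["GLM 4.7 Flash"]["reported_total"] / granite_cost

print(f"Granite estimate: ${granite_cost:.5f}")
print(f"GLM estimate:     ${glm_cost:.5f}")
print(f"Reported GLM total / Granite estimate: {ratio_vs_reported:.1f}x")
```

Under these assumptions the Granite estimate comes to about $0.0063 and the GLM estimate to about $0.172, which is close to but not identical to the reported $0.166, so the page's exact billing rule likely differs in some detail. Dividing the reported GLM total by the unrounded Granite estimate gives roughly 26.4, which matches the ratio quoted in the recommendation.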

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

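[Rendered outputs for the two configurations (none, medium) are not reproduced here. The page displays "Invalid SVG" in this section.]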

Category:

Anti-AI Tricks	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	4.9	10.0	25.0%	0		844ms	645	903	0
GLM 4.7 Flash	4.7	5.9	41.7%	2		14.95s	555	1,122	6,110

Coding	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	4.5	10.0	0.0%	0		775ms	8,344	525	0
GLM 4.7 Flash	3.2	7.4	11.1%	1		55.33s	3,106	4,981	22,387

Combined	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.0	10.0	0.0%	0		9.28s	86,631	3,481	0
GLM 4.7 Flash	2.9	6.0	16.7%	1		802.77s	59,030	2,585	305,678

Data parsing and extraction	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.0	10.0	0.0%	0		575ms	7,617	195	0
GLM 4.7 Flash	6.3	10.0	50.0%	0		1.51s	7,107	584	2,755

Domain specific	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.0	10.0	0.0%	0		357ms	768	24	0
GLM 4.7 Flash	3.5	4.4	33.3%	2		174.55s	643	33,000	25,394

General Intelligence	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	4.0	10.0	0.0%	0		499ms	528	115	0
GLM 4.7 Flash	3.6	9.7	0.0%	0		18.14s	318	18	2,138

Instructions following	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.6	9.9	0.0%	0		344ms	687	66	0
GLM 4.7 Flash	6.2	5.8	66.7%	1		2.97s	636	388	2,181

Puzzle Solving	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.2	10.0	0.0%	0		608ms	672	432	0
GLM 4.7 Flash	2.9	7.2	11.1%	1		12.93s	521	781	5,255

Tool Calling	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	10.0	10.0	100.0%	0		2.17s	7,719	243	0
GLM 4.7 Flash	10.0	10.0	100.0%	0		15.95s	6,949	224	1,014

Trivia	Score	Consistency	Attempt pass rate	Flaky tests	Tests Correct	Response Time (avg)	Input Tokens	Output Tokens	Reasoning Tokens
Granite 4.1 8B	3.0	10.0	0.0%	0		306ms	216	12	0
GLM 4.7 Flash	3.0	10.0	0.0%	0		11.13s	186	71	1,197
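
To summarize the per-category tables, the scores can be tallied by which model leads each category. This is a minimal sketch that uses only the Score column copied from the tables above; the `tally` helper and its output layout are illustrative, and a higher score is assumed to be better.

```python
# (Granite 4.1 8B score, GLM 4.7 Flash score) per category, copied from the tables above.
SCORES = {
    "Anti-AI Tricks": (4.9, 4.7),
    "Coding": (4.5, 3.2),
    "Combined": (3.0, 2.9),
    "Data parsing and extraction": (3.0, 6.3),
    "Domain specific": (3.0, 3.5),
    "General Intelligence": (4.0, 3.6),
    "Instructions following": (3.6, 6.2),
    "Puzzle Solving": (3.2, 2.9),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}


def tally(scores: dict) -> dict:
    """Group categories by which model has the higher score (assumes higher is better)."""
    result = {"Granite 4.1 8B": [], "GLM 4.7 Flash": [], "tie": []}
    for category, (granite, glm) in scores.items():
        if granite > glm:
            result["Granite 4.1 8B"].append(category)
        elif glm > granite:
            result["GLM 4.7 Flash"].append(category)
        else:
            result["tie"].append(category)
    return result


wins = tally(SCORES)
for leader, categories in wins.items():
    print(f"{leader}: {len(categories)} ({', '.join(categories)})")
```

A tally like this only counts leads by score. It ignores the margin of each lead and the pass-rate column, where GLM 4.7 Flash is ahead in most categories.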
