AI BENCHY Compare

DeepSeek: DeepSeek V4 Flash vs IBM: Granite 4.1 8B

Last updated at: 2026-05-29

Metric | DeepSeek V4 Flash (none, Release: 2026-04-24, Free Available) | Granite 4.1 8B (none, Release: 2026-05-01)
Score | 5.1 | 4.1
Rank | #137 | #158
Reliability | 10.0 | 10.0
Consistency | 8.8 | 10.0
Tests Correct |  | 
Attempt pass rate | 31.7% | 10.0%
Flaky tests | 3 | 0
Total Runs | 60 | 60
Cost per result | 0.198 | 0.122
Total Cost | $0.010 | $0.003
Input Price | $0.100 / 1M | $0.050 / 1M
Output Price | $0.200 / 1M | $0.100 / 1M
Output Tokens | 13,700 | 2,743
Reasoning Tokens | 0 | 0
Response Time (avg) | 27.97s | 719ms
Response Time (max) | 111.96s | 2.17s
Response Time (total) | 559.36s | 14.37s
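
The cost and timing rows above can be cross-checked with a little arithmetic. The following is a minimal sketch, not the site's own method: it assumes output cost is simply output tokens times the listed output price, and that dividing total response time by average response time gives the number of timed results. The dictionary layout and the function names (`output_cost`, `timed_results`, `speed_ratio`) are hypothetical; the figures are copied from the table.

```python
# Figures copied from the comparison table; the structure and names are illustrative.
MODELS = {
    "DeepSeek V4 Flash": {
        "output_price_per_1m": 0.200,  # USD per 1M output tokens
        "output_tokens": 13_700,
        "avg_time_s": 27.97,
        "total_time_s": 559.36,
    },
    "Granite 4.1 8B": {
        "output_price_per_1m": 0.100,
        "output_tokens": 2_743,
        "avg_time_s": 0.719,  # listed as 719ms
        "total_time_s": 14.37,
    },
}


def output_cost(model: dict) -> float:
    """Estimated output-token cost in USD (assumes tokens * listed price)."""
    return model["output_tokens"] * model["output_price_per_1m"] / 1_000_000


def timed_results(model: dict) -> float:
    """Implied number of timed results (assumes total time / average time)."""
    return model["total_time_s"] / model["avg_time_s"]


def speed_ratio(slow: dict, fast: dict) -> float:
    """How many times longer the slow model's average response takes."""
    return slow["avg_time_s"] / fast["avg_time_s"]


if __name__ == "__main__":
    for name, model in MODELS.items():
        print(
            f"{name}: output cost ~${output_cost(model):.5f}, "
            f"implied timed results ~{timed_results(model):.1f}"
        )
    ratio = speed_ratio(MODELS["DeepSeek V4 Flash"], MODELS["Granite 4.1 8B"])
    print(f"Average response time ratio: ~{ratio:.1f}x")
```

Because the page rounds Total Cost to three decimals, these estimates are only useful for rough comparison. Under the stated assumptions, the output-token cost comes out well below the listed Total Cost for both models, which suggests that input tokens account for most of the spend.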

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Anti-AI Tricks | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 |  | 20.18s | 174 | 0
Granite 4.1 8B | 4.9 | 10.0 | 25.0% | 0 |  | 844ms | 903 | 0

Coding | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 4.8 | 6.0 | 16.7% | 1 |  | 24.47s | 9,707 | 0
Granite 4.1 8B | 5.2 | 10.0 | 0.0% | 0 |  | 706ms | 357 | 0

Combined | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 4.5 | 2.1 | 66.7% | 1 |  | 111.96s | 2,664 | 0
Granite 4.1 8B | 3.0 | 10.0 | 0.0% | 0 |  | 1.88s | 396 | 0

Data parsing and extraction | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 |  | 23.79s | 195 | 0
Granite 4.1 8B | 3.0 | 10.0 | 0.0% | 0 |  | 575ms | 195 | 0

Domain specific | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 5.3 | 10.0 | 33.3% | 0 |  | 19.73s | 18 | 0
Granite 4.1 8B | 3.0 | 10.0 | 0.0% | 0 |  | 357ms | 24 | 0

General Intelligence | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 4.2 | 9.9 | 0.0% | 0 |  | 23.74s | 67 | 0
Granite 4.1 8B | 4.0 | 10.0 | 0.0% | 0 |  | 499ms | 115 | 0

Instructions following | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 6.5 | 10.0 | 50.0% | 0 |  | 17.54s | 321 | 0
Granite 4.1 8B | 3.6 | 9.9 | 0.0% | 0 |  | 344ms | 66 | 0

Puzzle Solving | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 3.1 | 7.3 | 11.1% | 1 |  | 23.72s | 207 | 0
Granite 4.1 8B | 3.2 | 10.0 | 0.0% | 0 |  | 608ms | 432 | 0

Tool Calling | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 |  | 77.93s | 327 | 0
Granite 4.1 8B | 10.0 | 10.0 | 100.0% | 0 |  | 2.17s | 243 | 0

Trivia | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 |  | 3.07s | 20 | 0
Granite 4.1 8B | 3.0 | 10.0 | 0.0% | 0 |  | 306ms | 12 | 0
