
AI BENCHY Compare

Qwen: Qwen3.5-9B vs xAI: Grok 4.1 Fast

Last updated: 2026-04-04

Metric                 Qwen3.5-9B (none)   Grok 4.1 Fast (none)
Release                2026-03-02          2025-11-19
Score                  4.8                 4.4
Rank                   #82                 #86
Consistency            10.0                9.0
Tests Correct
Attempt pass rate      23.5%               23.5%
Flaky tests            0                   2
Total Runs             51                  51
Cost per result        0.111               0.251
Total Cost             $0.005              $0.008
Input Price            $0.050 / 1M         $0.200 / 1M
Output Price           $0.150 / 1M         $0.500 / 1M
Output Tokens          2,945               1,154
Reasoning Tokens       0                   0
Response Time (avg)    1.22s               1.76s
Response Time (max)    5.91s               5.51s
Response Time (total)  20.74s              17.56s

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Anti-AI Tricks
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     3.1    9.9          0.0%               0                           1.71s                582            0
Grok 4.1 Fast  3.2    10.0         0.0%               0                           1.07s                235            0

Combined
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     3.0    10.0         0.0%               0                           5.91s                1,255          0
Grok 4.1 Fast  3.0    10.0         0.0%               0                           3.33s                105            0

Data parsing and extraction
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     10.0   10.0         100.0%             0                           847ms                249            0
Grok 4.1 Fast  10.0   10.0         100.0%             0                           943ms                180            0

Domain specific
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     3.0    10.0         0.0%               0                           464ms                24             0
Grok 4.1 Fast  5.9    7.2          55.6%              1                           1.06s                15             0

General Intelligence
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     4.4    9.9          0.0%               0                           552ms                99             0
Grok 4.1 Fast  4.4    9.9          0.0%               0                           1.08s                112            0

Instructions following
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     6.5    10.0         50.0%              0                           514ms                75             0
Grok 4.1 Fast  3.0    10.0         0.0%               0                           923ms                56             0

Puzzle Solving
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     3.2    9.9          0.0%               0                           683ms                388            0
Grok 4.1 Fast  3.2    10.0         0.0%               0                           1.28s                243            0

Tool Calling
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Qwen3.5-9B     10.0   10.0         100.0%             0                           1.27s                273            0
Grok 4.1 Fast  2.8    1.6          33.3%              1                           5.51s                208            0
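
The per-category Output Tokens figures above appear to add up to the overall Output Tokens row in the summary table. A minimal sketch of that cross-check follows; the variable and function names are illustrative and not part of AI BENCHY, and the numbers are copied by hand from the tables on this page.

```python
# Output tokens per category as (Qwen3.5-9B, Grok 4.1 Fast),
# copied from the Category Breakdown tables on this page.
CATEGORY_OUTPUT_TOKENS = {
    "Anti-AI Tricks": (582, 235),
    "Combined": (1255, 105),
    "Data parsing and extraction": (249, 180),
    "Domain specific": (24, 15),
    "General Intelligence": (99, 112),
    "Instructions following": (75, 56),
    "Puzzle Solving": (388, 243),
    "Tool Calling": (273, 208),
}

# Overall Output Tokens row from the summary table.
REPORTED_TOTALS = {"Qwen3.5-9B": 2945, "Grok 4.1 Fast": 1154}


def sum_output_tokens(table):
    """Sum the per-category token counts for each model."""
    qwen = sum(q for q, _ in table.values())
    grok = sum(g for _, g in table.values())
    return {"Qwen3.5-9B": qwen, "Grok 4.1 Fast": grok}


totals = sum_output_tokens(CATEGORY_OUTPUT_TOKENS)
for model, total in totals.items():
    # Prints each model, its summed tokens, and whether the sum
    # matches the reported overall figure.
    print(model, total, total == REPORTED_TOTALS[model])
```

Both sums match the reported totals, which suggests the overall Output Tokens figure is a plain sum across categories rather than an average.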
