AI BENCHY
AI BENCHY Compare

StepFun: Step 3.5 Flash vs xAI: Grok 4.1 Fast

Last updated at: 2026-04-11

| Metric | Step 3.5 Flash (none) | Grok 4.1 Fast (none) |
| --- | --- | --- |
| Release | 2026-02-01 | 2025-11-19 |
| Score | 3.0 | 4.5 |
| Rank | #93 | #89 |
| Consistency | 10.0 | 8.7 |
| Tests Correct | | |
| Attempt pass rate | 0.0% | 24.1% |
| Flaky tests | 0 | 3 |
| Total Runs | 3 | 54 |
| Cost per result | 0.000 | 0.269 |
| Total Cost | $0.000 | $0.009 |
| Input Price | $0.100 / 1M | $0.200 / 1M |
| Output Price | $0.300 / 1M | $0.500 / 1M |
| Output Tokens | 0 | 1,721 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 0ms | 1.76s |
| Response Time (max) | 0ms | 5.51s |
| Response Time (total) | 0ms | 19.35s |
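
The price and token rows above can be combined into a rough per-model output cost. The sketch below is illustrative only: the helper name `token_cost` is hypothetical, and it assumes the listed prices are US dollars per one million tokens. Input token counts are not reported on this page, so the Total Cost row cannot be reproduced in full; only the output-token share is estimated here.

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Return the dollar cost of `tokens` at a per-1M-token price.

    Hypothetical helper; assumes the price is quoted in USD per 1M tokens.
    """
    return tokens * price_per_million / 1_000_000


# Figures taken from the comparison table for Grok 4.1 Fast.
grok_output_tokens = 1_721
grok_output_price = 0.500  # USD per 1M output tokens

# Step 3.5 Flash reports 0 output tokens, so its output cost is zero.
step_output_tokens = 0
step_output_price = 0.300  # USD per 1M output tokens

grok_output_cost = token_cost(grok_output_tokens, grok_output_price)
step_output_cost = token_cost(step_output_tokens, step_output_price)

print(f"Grok 4.1 Fast output cost:  ${grok_output_cost:.7f}")
print(f"Step 3.5 Flash output cost: ${step_output_cost:.7f}")
```

Under these assumptions the output tokens account for under a tenth of a cent, so most of the reported $0.009 for Grok 4.1 Fast would have to come from input tokens, which this page does not list.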

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Coding | Step 3.5 Flash | 3.0 | 10.0 | 0.0% | 0 | | 0ms | 0 | 0 |
| Coding | Grok 4.1 Fast | 5.3 | 3.4 | 33.3% | 1 | | 1.79s | 567 | 0 |
| Anti-AI Tricks | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Anti-AI Tricks | Grok 4.1 Fast | 3.2 | 10.0 | 0.0% | 0 | | 1.07s | 235 | 0 |
| Combined | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Combined | Grok 4.1 Fast | 3.0 | 10.0 | 0.0% | 0 | | 3.33s | 105 | 0 |
| Data parsing and extraction | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Data parsing and extraction | Grok 4.1 Fast | 10.0 | 10.0 | 100.0% | 0 | | 943ms | 180 | 0 |
| Domain specific | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Domain specific | Grok 4.1 Fast | 5.9 | 7.2 | 55.6% | 1 | | 1.06s | 15 | 0 |
| General Intelligence | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| General Intelligence | Grok 4.1 Fast | 4.4 | 9.9 | 0.0% | 0 | | 1.08s | 112 | 0 |
| Instructions following | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Instructions following | Grok 4.1 Fast | 3.0 | 10.0 | 0.0% | 0 | | 923ms | 56 | 0 |
| Puzzle Solving | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Puzzle Solving | Grok 4.1 Fast | 3.2 | 10.0 | 0.0% | 0 | | 1.28s | 243 | 0 |
| Tool Calling | Step 3.5 Flash | - | - | - | - | - | - | - | - |
| Tool Calling | Grok 4.1 Fast | 2.8 | 1.6 | 33.3% | 1 | | 5.51s | 208 | 0 |
