AI BENCHY Compare

StepFun: Step 3.7 Flash vs Xiaomi: MiMo-V2.5-Pro

Last updated at: 2026-05-29

Metric                 Step 3.7 Flash (low)    MiMo-V2.5-Pro (medium)
Release                2026-05-29              2026-04-22
Score                  7.4                     7.6
Rank                   #60                     #48
Reliability            10.0                    10.0
Consistency            8.7                     8.9
Tests Correct          -                       -
Attempt pass rate      68.3%                   68.3%
Flaky tests            3                       3
Total Runs             60                      60
Cost per result        2.796                   2.408
Total Cost             $0.336                  $0.289
Input Price            $0.200 / 1M             $0.435 / 1M
Output Price           $1.150 / 1M             $0.870 / 1M
Output Tokens          285,209                 5,004
Reasoning Tokens       0                       80,295
Response Time (avg)    16.06s                  21.79s
Response Time (max)    124.75s                 130.77s
Response Time (total)  321.11s                 435.79s
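
The summary figures can be cross-checked with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the page does not state: that "Attempt pass rate" is passing attempts divided by "Total Runs", and that "Response Time (avg)" is "Response Time (total)" divided by the number of timed results. The function names are hypothetical.

```python
# Figures copied from the summary metrics on this page.
TOTAL_RUNS = 60
PASS_RATE_PCT = 68.3  # identical for both models

# (total seconds, average seconds) per model.
RESPONSE_TIMES = {
    "Step 3.7 Flash": (321.11, 16.06),
    "MiMo-V2.5-Pro": (435.79, 21.79),
}


def implied_passes(total_runs: int, pass_rate_pct: float) -> list[int]:
    """Return every pass count whose rate rounds to the published percentage.

    Assumption: pass rate = passing attempts / total runs.
    """
    return [
        k
        for k in range(total_runs + 1)
        if round(100 * k / total_runs, 1) == pass_rate_pct
    ]


def implied_result_count(total_seconds: float, avg_seconds: float) -> int:
    """Estimate how many timed results the average was taken over.

    Assumption: average = total / number of timed results.
    """
    return round(total_seconds / avg_seconds)


if __name__ == "__main__":
    print("passing attempts consistent with 68.3%:",
          implied_passes(TOTAL_RUNS, PASS_RATE_PCT))
    for model, (total, avg) in RESPONSE_TIMES.items():
        print(model, "timed results:", implied_result_count(total, avg))
```

Under those assumptions, only one pass count out of 60 runs is consistent with the published 68.3%, and both models' response times imply the same number of timed results, which would fit 60 runs spread over three attempts per test.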

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

Category / Model             Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Anti-AI Tricks
  Step 3.7 Flash             8.7    7.9          91.7%              1            -              4.02s                10,896         0
  MiMo-V2.5-Pro              10.0   10.0         100.0%             0            -              3.26s                323            1,179
Coding
  Step 3.7 Flash             10.0   10.0         100.0%             0            -              9.43s                14,569         0
  MiMo-V2.5-Pro              7.0    6.2          66.7%              1            -              81.67s               769            33,771
Combined
  Step 3.7 Flash             10.0   10.0         100.0%             0            -              7.98s                6,426          0
  MiMo-V2.5-Pro              10.0   10.0         100.0%             0            -              53.36s               348            11,870
Data parsing and extraction
  Step 3.7 Flash             7.3    5.8          83.3%              1            -              2.29s                2,667          0
  MiMo-V2.5-Pro              7.3    5.8          83.3%              1            -              18.81s               260            8,383
Domain specific
  Step 3.7 Flash             5.3    7.2          44.4%              1            -              43.31s               104,487        0
  MiMo-V2.5-Pro              5.3    10.0         33.3%              0            -              37.87s               275            17,023
General Intelligence
  Step 3.7 Flash             3.4    9.3          0.0%               0            -              7.00s                4,604          0
  MiMo-V2.5-Pro              5.5    10.0         0.0%               0            -              4.02s                155            163
Instructions following
  Step 3.7 Flash             9.8    10.0         100.0%             0            -              1.58s                1,857          0
  MiMo-V2.5-Pro              9.9    10.0         100.0%             0            -              2.77s                82             803
Puzzle Solving
  Step 3.7 Flash             5.5    9.9          33.3%              0            -              1.84s                3,564          0
  MiMo-V2.5-Pro              6.7    7.9          55.6%              1            -              5.31s                540            2,181
Tool Calling
  Step 3.7 Flash             10.0   10.0         100.0%             0            -              3.25s                1,360          0
  MiMo-V2.5-Pro              10.0   10.0         100.0%             0            -              16.87s               311            2,908
Trivia
  Step 3.7 Flash             3.0    10.0         0.0%               0            -              124.75s              134,779        0
  MiMo-V2.5-Pro              3.0    10.0         0.0%               0            -              12.46s               1,941          2,014
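
The per-category token counts above can be reconciled against the headline "Output Tokens" and "Reasoning Tokens" totals. The following is a minimal sketch that uses only numbers copied from this page; the helper names `sum_tokens` and `top_share` are hypothetical and not part of any AI BENCHY tooling.

```python
# Per-category token counts copied from the Category Breakdown figures.
# Each tuple is (output_tokens, reasoning_tokens).
CATEGORY_TOKENS = {
    "Step 3.7 Flash": {
        "Anti-AI Tricks": (10_896, 0),
        "Coding": (14_569, 0),
        "Combined": (6_426, 0),
        "Data parsing and extraction": (2_667, 0),
        "Domain specific": (104_487, 0),
        "General Intelligence": (4_604, 0),
        "Instructions following": (1_857, 0),
        "Puzzle Solving": (3_564, 0),
        "Tool Calling": (1_360, 0),
        "Trivia": (134_779, 0),
    },
    "MiMo-V2.5-Pro": {
        "Anti-AI Tricks": (323, 1_179),
        "Coding": (769, 33_771),
        "Combined": (348, 11_870),
        "Data parsing and extraction": (260, 8_383),
        "Domain specific": (275, 17_023),
        "General Intelligence": (155, 163),
        "Instructions following": (82, 803),
        "Puzzle Solving": (540, 2_181),
        "Tool Calling": (311, 2_908),
        "Trivia": (1_941, 2_014),
    },
}

# Headline totals copied from the summary metrics.
REPORTED_TOTALS = {
    "Step 3.7 Flash": (285_209, 0),
    "MiMo-V2.5-Pro": (5_004, 80_295),
}


def sum_tokens(per_category: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Add up output and reasoning tokens across all categories."""
    output = sum(o for o, _ in per_category.values())
    reasoning = sum(r for _, r in per_category.values())
    return output, reasoning


def top_share(per_category: dict[str, tuple[int, int]], n: int = 2) -> float:
    """Fraction of output tokens produced by the n heaviest categories."""
    outputs = sorted((o for o, _ in per_category.values()), reverse=True)
    return sum(outputs[:n]) / sum(outputs)


if __name__ == "__main__":
    for model, cats in CATEGORY_TOKENS.items():
        matches = sum_tokens(cats) == REPORTED_TOTALS[model]
        print(f"{model}: totals match = {matches}, "
              f"top-2 output share = {top_share(cats):.1%}")
```

The category sums match the headline totals for both models. For Step 3.7 Flash, the two heaviest categories (Trivia and Domain specific) account for roughly 84% of its output tokens, which is worth keeping in mind when reading its Total Cost and maximum response time.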