AI BENCHY Compare

StepFun: Step 3.7 Flash vs Xiaomi: MiMo-V2-Pro

Last updated at: 2026-05-29

Metric                  Step 3.7 Flash low    MiMo-V2-Pro medium
Release                 2026-05-29            2026-03-18
Score                   7.4                   7.6
Rank                    #60                   #52
Reliability             10.0                  9.6
Consistency             8.7                   7.9
Attempt pass rate       68.3%                 76.7%
Flaky tests             3                     5
Total Runs              60                    60
Cost per result         2.796                 2.450
Total Cost              $0.336                $0.294
Input Price             $0.200 / 1M tokens    $1.000 / 1M tokens
Output Price            $1.150 / 1M tokens    $3.000 / 1M tokens
Output Tokens           285,209               2,518
Reasoning Tokens        0                     81,801
Response Time (avg)     16.06s                22.16s
Response Time (max)     124.75s               136.29s
Response Time (total)   321.11s               443.22s
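
The token counts and prices above can be related to Total Cost with a little arithmetic. The following is a minimal sketch, not the site's own cost formula: it assumes reasoning tokens are billed at the output price, and the names `models` and `output_side_cost` are hypothetical. Input-token counts are not listed on the page, so only the output-side share is estimated.

```python
# Sketch: estimate output-side spend from the token counts and prices in the table.
# ASSUMPTION (not stated in the source): reasoning tokens are billed at the output price.

PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens

# Values copied from the summary table.
models = {
    "Step 3.7 Flash": {
        "output_price": 1.150,
        "output_tokens": 285_209,
        "reasoning_tokens": 0,
        "total_cost": 0.336,
    },
    "MiMo-V2-Pro": {
        "output_price": 3.000,
        "output_tokens": 2_518,
        "reasoning_tokens": 81_801,
        "total_cost": 0.294,
    },
}


def output_side_cost(m):
    """Dollars spent on output plus reasoning tokens, under the assumption above."""
    billed_tokens = m["output_tokens"] + m["reasoning_tokens"]
    return billed_tokens * m["output_price"] / PRICE_UNIT


for name, m in models.items():
    est = output_side_cost(m)
    print(f"{name}: ~${est:.3f} of ${m['total_cost']:.3f} total")
```

Under that assumption the estimate comes to about $0.328 for Step 3.7 Flash and about $0.253 for MiMo-V2-Pro, both below the reported Total Cost. The remainder would be input-token spend, which the table does not list.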

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

Category / Model    Score  Consistency  Attempt pass rate  Flaky tests  Response Time (avg)  Output Tokens  Reasoning Tokens

Anti-AI Tricks
  Step 3.7 Flash    8.7    7.9          91.7%              1            4.02s                10,896         0
  MiMo-V2-Pro       10.0   10.0         100.0%             0            2.86s                251            1,154

Coding
  Step 3.7 Flash    10.0   10.0         100.0%             0            9.43s                14,569         0
  MiMo-V2-Pro       7.5    6.0          83.3%              1            94.21s               527            37,424

Combined
  Step 3.7 Flash    10.0   10.0         100.0%             0            7.98s                6,426          0
  MiMo-V2-Pro       4.7    1.6          66.7%              1            64.71s               380            14,186

Data parsing and extraction
  Step 3.7 Flash    7.3    5.8          83.3%              1            2.29s                2,667          0
  MiMo-V2-Pro       7.3    5.8          83.3%              1            17.20s               260            7,484

Domain specific
  Step 3.7 Flash    5.3    7.2          44.4%              1            43.31s               104,487        0
  MiMo-V2-Pro       5.3    10.0         33.3%              0            8.82s                170            2,158

General Intelligence
  Step 3.7 Flash    3.4    9.3          0.0%               0            7.00s                4,604          0
  MiMo-V2-Pro       10.0   10.0         100.0%             0            4.92s                184            400

Instructions following
  Step 3.7 Flash    9.8    10.0         100.0%             0            1.58s                1,857          0
  MiMo-V2-Pro       9.9    10.0         100.0%             0            3.36s                83             667

Puzzle Solving
  Step 3.7 Flash    5.5    9.9          33.3%              0            1.84s                3,564          0
  MiMo-V2-Pro       6.4    4.4          77.8%              2            5.08s                372            1,622

Tool Calling
  Step 3.7 Flash    10.0   10.0         100.0%             0            3.25s                1,360          0
  MiMo-V2-Pro       10.0   10.0         100.0%             0            8.19s                263            864

Trivia
  Step 3.7 Flash    3.0    10.0         0.0%               0            124.75s              134,779        0
  MiMo-V2-Pro       3.0    10.0         0.0%               0            82.71s               28             15,842
