AI BENCHY

AI BENCHY Compare

xAI: Grok 4.20 vs Xiaomi: MiMo-V2-Flash

Last updated at: 2026-05-21

Metric                 Grok 4.20 (none)  MiMo-V2-Flash (none)
Release                2026-03-31        2025-12-16
Score                  5.4               4.5
Rank                   #122              #146
Reliability            N/A               10.0
Consistency            9.5               7.9
Tests Correct
Attempt pass rate      35.2%             26.3%
Flaky tests            1                 5
Total Runs             54                57
Cost per result        1.574             0.754
Total Cost             $0.095            $0.023
Input Price            $1.250 / 1M       $0.100 / 1M
Output Price           $2.500 / 1M       $0.300 / 1M
Output Tokens          1,967             68,534
Reasoning Tokens       0                 0
Response Time (avg)    1.11s             2.73s
Response Time (max)    6.04s             19.68s
Response Time (total)  20.02s            40.90s
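
The pricing and token rows above allow a rough cost check. The sketch below is a minimal illustration, assuming that the listed Output Price is charged per 1M output tokens and that Total Cost is the sum of input and output charges; the page does not state how Total Cost is computed, and the names `MODELS`, `output_cost`, and `output_share` are hypothetical.

```python
# Figures copied from the comparison table; the cost model itself is an assumption.
MODELS = {
    "Grok 4.20": {
        "output_price_per_million": 2.50,  # USD per 1M output tokens
        "output_tokens": 1_967,
        "total_cost": 0.095,               # USD, as listed (rounded)
    },
    "MiMo-V2-Flash": {
        "output_price_per_million": 0.30,
        "output_tokens": 68_534,
        "total_cost": 0.023,
    },
}


def output_cost(price_per_million: float, tokens: int) -> float:
    """Estimated USD spent on output tokens at a per-1M-token price."""
    return price_per_million * tokens / 1_000_000


def output_share(model: dict) -> float:
    """Fraction of the listed total cost explained by output tokens alone."""
    cost = output_cost(model["output_price_per_million"], model["output_tokens"])
    return cost / model["total_cost"]


if __name__ == "__main__":
    for name, model in MODELS.items():
        cost = output_cost(model["output_price_per_million"], model["output_tokens"])
        print(f"{name}: output cost ~ ${cost:.4f}, "
              f"{output_share(model):.1%} of listed total cost")
```

Under these assumptions, output tokens would account for only a small part of Grok 4.20's listed total cost and for most of MiMo-V2-Flash's, which is consistent with MiMo-V2-Flash's much larger output token count. Because the listed totals are rounded, the shares are approximate.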

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Anti-AI Tricks
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      4.8    10.0         25.0%              0                           501ms                267            0
MiMo-V2-Flash  3.2    8.0          8.3%               1                           1.19s                865            0

Coding
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      3.4    9.3          0.0%               0                           1.22s                312            0
MiMo-V2-Flash  6.3    3.7          33.3%              1                           2.79s                726            0

Combined
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      3.0    10.0         0.0%               0                           6.04s                282            0
MiMo-V2-Flash  3.0    10.0         0.0%               0                           2.87s                330            0

Data parsing and extraction
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      10.0   10.0         100.0%             0                           522ms                207            0
MiMo-V2-Flash  2.9    5.8          16.7%              1                           19.68s               161            0

Domain specific
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      3.0    10.0         0.0%               0                           687ms                325            0
MiMo-V2-Flash  5.3    7.2          44.4%              1                           564ms                24             0

General Intelligence
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      4.8    10.0         0.0%               0                           659ms                83             0
MiMo-V2-Flash  4.6    10.0         0.0%               0                           1.67s                104            0

Instructions following
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      6.3    10.0         50.0%              0                           455ms                60             0
MiMo-V2-Flash  6.5    10.0         50.0%              0                           857ms                69             0

Puzzle Solving
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      5.3    7.2          44.4%              1                           487ms                242            0
MiMo-V2-Flash  3.6    7.2          22.2%              1                           1.38s                65,971         0

Tool Calling
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      10.0   10.0         100.0%             0                           4.63s                189            0
MiMo-V2-Flash  10.0   10.0         100.0%             0                           2.28s                272            0

Trivia
Model          Score  Consistency  Attempt pass rate  Flaky tests  Tests Correct  Response Time (avg)  Output Tokens  Reasoning Tokens
Grok 4.20      -      -            -                  -            -              -                    -              -
MiMo-V2-Flash  3.0    10.0         0.0%               0                           1.82s                12             0
