AI BENCHY

AI BENCHY Compare

xAI: Grok 4.20 vs Xiaomi: MiMo-V2-Omni

Last updated at: 2026-05-10

| Metric | Grok 4.20 (medium) | MiMo-V2-Omni (none) |
| --- | --- | --- |
| Release | 2026-03-31 | 2026-03-18 |
| Score | 6.9 | 6.3 |
| Rank | #68 | #81 |
| Reliability | 10.0 | 10.0 |
| Consistency | 8.3 | 9.7 |
| Tests Correct | | |
| Attempt pass rate | 63.2% | 43.9% |
| Flaky tests | 4 | 1 |
| Total Runs | 57 | 49 |
| Cost per result | 7.559 | 0.241 |
| Total Cost | $0.756 | $0.020 |
| Input Price | $1.250 / 1M | $0.400 / 1M |
| Output Price | $2.500 / 1M | $2.000 / 1M |
| Output Tokens | 1,784 | 2,254 |
| Reasoning Tokens | 128,233 | 0 |
| Response Time (avg) | 14.53s | 2.37s |
| Response Time (max) | 63.48s | 6.81s |
| Response Time (total) | 276.06s | 45.03s |
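
As a rough illustration of how the headline figures relate, the sketch below computes the cost and latency ratios and the score gap between the two models. The numbers are copied from the summary metrics; the dictionary layout and the `ratio` helper are hypothetical names introduced only for this example. Total Cost is shown rounded to three decimals, so the cost ratio is approximate.

```python
# Figures copied from the summary metrics; the structure and names here
# are illustrative assumptions, not part of the benchmark itself.
models = {
    "Grok 4.20": {"score": 6.9, "total_cost_usd": 0.756, "avg_response_s": 14.53},
    "MiMo-V2-Omni": {"score": 6.3, "total_cost_usd": 0.020, "avg_response_s": 2.37},
}


def ratio(metric, numerator="Grok 4.20", denominator="MiMo-V2-Omni"):
    """Return how many times larger `metric` is for `numerator` than for `denominator`."""
    return models[numerator][metric] / models[denominator][metric]


cost_ratio = ratio("total_cost_usd")       # relative total cost of the benchmark run
latency_ratio = ratio("avg_response_s")    # relative average response time
score_gap = models["Grok 4.20"]["score"] - models["MiMo-V2-Omni"]["score"]

print(f"Total cost ratio (Grok / MiMo): {cost_ratio:.1f}x")
print(f"Avg response time ratio (Grok / MiMo): {latency_ratio:.1f}x")
print(f"Score gap (Grok - MiMo): {score_gap:.1f}")
```

Read this way, the listed totals put Grok 4.20 at roughly 38 times the cost and about 6 times the average response time of MiMo-V2-Omni, for a 0.6-point higher overall score.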

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Grok 4.20 | 8.2 | 7.9 | 83.3% | 1 | | 3.95s | 287 | 8,312 |
| Anti-AI Tricks | MiMo-V2-Omni | 3.6 | 8.4 | 8.3% | 1 | | 1.63s | 773 | 0 |
| Coding | Grok 4.20 | 4.3 | 1.1 | 66.7% | 1 | | 24.33s | 250 | 12,804 |
| Coding | MiMo-V2-Omni | 6.6 | 10.0 | 0.0% | 0 | | 1.72s | 399 | 0 |
| Combined | Grok 4.20 | 10.0 | 10.0 | 100.0% | 0 | | 17.40s | 232 | 9,556 |
| Combined | MiMo-V2-Omni | 3.0 | 10.0 | 0.0% | 0 | | 5.96s | 387 | 0 |
| Data parsing and extraction | Grok 4.20 | 10.0 | 10.0 | 100.0% | 0 | | 4.17s | 180 | 5,333 |
| Data parsing and extraction | MiMo-V2-Omni | 10.0 | 10.0 | 100.0% | 0 | | 1.76s | 147 | 0 |
| Domain specific | Grok 4.20 | 5.3 | 10.0 | 33.3% | 0 | | 27.03s | 375 | 49,339 |
| Domain specific | MiMo-V2-Omni | 5.3 | 10.0 | 33.3% | 0 | | 2.10s | 24 | 0 |
| General Intelligence | Grok 4.20 | 3.9 | 2.6 | 33.3% | 1 | | 24.48s | 65 | 6,440 |
| General Intelligence | MiMo-V2-Omni | 4.1 | 10.0 | 0.0% | 0 | | 2.33s | 103 | 0 |
| Instructions following | Grok 4.20 | 7.3 | 6.0 | 83.3% | 1 | | 4.42s | 40 | 5,474 |
| Instructions following | MiMo-V2-Omni | 6.5 | 10.0 | 50.0% | 0 | | 4.26s | 30 | 0 |
| Puzzle Solving | Grok 4.20 | 7.7 | 10.0 | 66.7% | 0 | | 6.20s | 149 | 7,913 |
| Puzzle Solving | MiMo-V2-Omni | 10.0 | 10.0 | 100.0% | 0 | | 1.16s | 148 | 0 |
| Tool Calling | Grok 4.20 | 3.0 | 10.0 | 0.0% | 0 | | 13.68s | 197 | 6,620 |
| Tool Calling | MiMo-V2-Omni | 10.0 | 10.0 | 100.0% | 0 | | 5.40s | 231 | 0 |
| Trivia | Grok 4.20 | 3.0 | 10.0 | 0.0% | 0 | | 63.48s | 9 | 16,442 |
| Trivia | MiMo-V2-Omni | 3.0 | 10.0 | 0.0% | 0 | | 1.30s | 12 | 0 |
