AI BENCHY Compare

OpenAI: GPT-5.4 Nano vs Xiaomi: MiMo-V2-Flash

Last updated at: 2026-04-04

| Metric | GPT-5.4 Nano (none) | MiMo-V2-Flash (none) |
| --- | --- | --- |
| Release | 2026-03-17 | 2025-12-16 |
| Score | 4.3 | 4.4 |
| Rank | #88 | #87 |
| Consistency | 7.3 | 8.0 |
| Tests Correct | | |
| Attempt pass rate | 29.4% | 27.5% |
| Flaky tests | 6 | 4 |
| Total Runs | 51 | 51 |
| Cost per result | 0.404 | 0.743 |
| Total Cost | $0.009 | $0.023 |
| Input Price | $0.200 / 1M | $0.090 / 1M |
| Output Price | $1.250 / 1M | $0.290 / 1M |
| Output Tokens | 2,185 | 67,796 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 1.39s | 2.79s |
| Response Time (max) | 3.84s | 19.68s |
| Response Time (total) | 23.70s | 36.29s |
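
The Output Price and Output Tokens rows allow a rough check of how much of each Total Cost is spent on output. The sketch below assumes cost is simply tokens multiplied by the listed per-million price; the page does not state how Total Cost is computed, and it does not report input token counts, so the remainder is only presumed to be input cost. The names `MODELS` and `output_cost` are illustrative.

```python
# Figures copied from the comparison table; Total Cost is rounded to 3 decimals on the page.
MODELS = {
    "GPT-5.4 Nano": {"output_price_per_1m": 1.250, "output_tokens": 2_185, "total_cost": 0.009},
    "MiMo-V2-Flash": {"output_price_per_1m": 0.290, "output_tokens": 67_796, "total_cost": 0.023},
}


def output_cost(price_per_1m: float, tokens: int) -> float:
    """Dollar cost of `tokens` output tokens at `price_per_1m` dollars per million tokens.

    Assumption: linear list pricing, with no caching discounts or minimum charges.
    """
    return price_per_1m * tokens / 1_000_000


for name, m in MODELS.items():
    cost = output_cost(m["output_price_per_1m"], m["output_tokens"])
    print(f"{name}: output ~${cost:.4f} of ${m['total_cost']:.3f} total")
```

Under that assumption, MiMo-V2-Flash's lower output price is more than offset by its far larger output token count, which would account for most of its higher Total Cost.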

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | GPT-5.4 Nano | 3.5 | 8.0 | 16.7% | 1 | | 1.18s | 800 | 0 |
| Anti-AI Tricks | MiMo-V2-Flash | 3.2 | 8.0 | 8.3% | 1 | | 1.19s | 865 | 0 |
| Combined | GPT-5.4 Nano | 3.0 | 10.0 | 0.0% | 0 | | 3.84s | 280 | 0 |
| Combined | MiMo-V2-Flash | 3.0 | 10.0 | 0.0% | 0 | | 2.87s | 330 | 0 |
| Data parsing and extraction | GPT-5.4 Nano | 6.5 | 10.0 | 50.0% | 0 | | 1.11s | 219 | 0 |
| Data parsing and extraction | MiMo-V2-Flash | 2.9 | 5.8 | 16.7% | 1 | | 19.68s | 161 | 0 |
| Domain specific | GPT-5.4 Nano | 2.9 | 4.4 | 22.2% | 2 | | 926ms | 52 | 0 |
| Domain specific | MiMo-V2-Flash | 5.3 | 7.2 | 44.4% | 1 | | 564ms | 24 | 0 |
| General Intelligence | GPT-5.4 Nano | 3.8 | 2.5 | 33.3% | 1 | | 1.31s | 180 | 0 |
| General Intelligence | MiMo-V2-Flash | 4.6 | 10.0 | 0.0% | 0 | | 1.67s | 104 | 0 |
| Instructions following | GPT-5.4 Nano | 5.0 | 6.8 | 33.3% | 1 | | 787ms | 84 | 0 |
| Instructions following | MiMo-V2-Flash | 6.5 | 10.0 | 50.0% | 0 | | 857ms | 69 | 0 |
| Puzzle Solving | GPT-5.4 Nano | 3.7 | 7.3 | 22.2% | 1 | | 1.29s | 348 | 0 |
| Puzzle Solving | MiMo-V2-Flash | 3.6 | 7.2 | 22.2% | 1 | | 1.38s | 65,971 | 0 |
| Tool Calling | GPT-5.4 Nano | 10.0 | 10.0 | 100.0% | 0 | | 3.40s | 222 | 0 |
| Tool Calling | MiMo-V2-Flash | 10.0 | 10.0 | 100.0% | 0 | | 2.28s | 272 | 0 |
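
The per-category Output Tokens figures can be cross-checked against the overall totals (2,185 and 67,796), and they show where each model's output is concentrated. The following is a minimal sketch using only numbers from the Category Breakdown; the names `OUTPUT_TOKENS`, `totals`, and `share` are illustrative, not part of the benchmark.

```python
# Output tokens per category as (GPT-5.4 Nano, MiMo-V2-Flash), copied from the Category Breakdown.
OUTPUT_TOKENS = {
    "Anti-AI Tricks": (800, 865),
    "Combined": (280, 330),
    "Data parsing and extraction": (219, 161),
    "Domain specific": (52, 24),
    "General Intelligence": (180, 104),
    "Instructions following": (84, 69),
    "Puzzle Solving": (348, 65_971),
    "Tool Calling": (222, 272),
}


def totals(table: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum output tokens across all categories for each model."""
    gpt = sum(g for g, _ in table.values())
    mimo = sum(m for _, m in table.values())
    return gpt, mimo


def share(table: dict[str, tuple[int, int]], category: str, model_index: int) -> float:
    """Fraction of a model's total output tokens spent in one category (0 = GPT, 1 = MiMo)."""
    return table[category][model_index] / totals(table)[model_index]


print(totals(OUTPUT_TOKENS))
print(f"Puzzle Solving share, MiMo-V2-Flash: {share(OUTPUT_TOKENS, 'Puzzle Solving', 1):.1%}")
print(f"Puzzle Solving share, GPT-5.4 Nano: {share(OUTPUT_TOKENS, 'Puzzle Solving', 0):.1%}")
```

The category sums match the overall Output Tokens row for both models, and nearly all of MiMo-V2-Flash's output comes from Puzzle Solving, so its overall token count says little about its verbosity in the other categories.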
