AI BENCHY Compare

OpenAI: GPT-5.2 Chat vs Xiaomi: MiMo-V2-Flash

Last updated at: 2026-03-06

| Metric | OpenAI: GPT-5.2 Chat (none) | Xiaomi: MiMo-V2-Flash (medium) |
| --- | --- | --- |
| Release | 2025-12-11 | 2025-12-16 |
| Avg Score | 7.7 | 7.5 |
| Rank | #11 | #17 |
| Tests Correct | | |
| Consistency | 9.5 | 9.4 |
| Cost per result | 2.389 | 0.314 |
| Total Cost | $0.263 | $0.035 |
| Attempt pass rate | 77.8% | 77.8% |
| Flaky tests | 1 | 1 |
| Total runs | 45 (15 x 3) | 45 (15 x 3) |
| Output Tokens | 15,510 | 11,526 |
| Reasoning Tokens | 0 | 106,226 |
| Response Time (avg) | 7.29s | 27.68s |
| Response Time (max) | 38.52s | 96.01s |
| Response Time (total) | 109.31s | 249.14s |
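
The headline figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes the attempt pass rate is the number of passing attempts divided by total runs, which the page does not state explicitly, and the function and variable names are hypothetical. All numbers are copied from the comparison metrics.

```python
def passed_attempts(pass_rate_percent: float, total_runs: int) -> int:
    """Recover the passing-attempt count from a rounded pass rate.

    Assumption: pass rate = passing attempts / total runs.
    """
    return round(pass_rate_percent / 100 * total_runs)


# Figures copied from the comparison metrics.
TOTAL_RUNS = 45   # 15 tests x 3 attempts each
PASS_RATE = 77.8  # percent, identical for both models

gpt = {"total_cost_usd": 0.263, "avg_time_s": 7.29}
mimo = {"total_cost_usd": 0.035, "avg_time_s": 27.68}

# How many of the 45 attempts passed, under the stated assumption.
print(passed_attempts(PASS_RATE, TOTAL_RUNS))

# How many times more GPT-5.2 Chat cost in total than MiMo-V2-Flash.
print(round(gpt["total_cost_usd"] / mimo["total_cost_usd"], 2))

# How many times slower MiMo-V2-Flash was on average.
print(round(mimo["avg_time_s"] / gpt["avg_time_s"], 2))
```

Under that assumption, 77.8% of 45 runs corresponds to 35 passing attempts for each model. GPT-5.2 Chat cost roughly 7.5 times as much in total, while MiMo-V2-Flash took about 3.8 times as long per response on average.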

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Avg Score vs Response Time (avg)

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 3.97s | 1,651 | 0 |
| Anti-AI Tricks | Xiaomi: MiMo-V2-Flash | 9.7 | 10.0 | 100.0% | 0 | | 16.79s | 1,328 | 18,739 |
| Combined | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 9.12s | 1,243 | 0 |
| Combined | Xiaomi: MiMo-V2-Flash | 9.0 | 10.0 | 100.0% | 0 | | 75.68s | 442 | 26,859 |
| Data parsing and extraction | OpenAI: GPT-5.2 Chat | 9.9 | 10.0 | 100.0% | 0 | | 3.05s | 980 | 0 |
| Data parsing and extraction | Xiaomi: MiMo-V2-Flash | 5.5 | 10.0 | 50.0% | 0 | | 0ms | 153 | 0 |
| Domain specific | OpenAI: GPT-5.2 Chat | 4.0 | 10.0 | 33.3% | 0 | | 17.78s | 7,810 | 0 |
| Domain specific | Xiaomi: MiMo-V2-Flash | 4.0 | 7.2 | 55.6% | 1 | | 96.01s | 8,374 | 42,461 |
| Instructions following | OpenAI: GPT-5.2 Chat | 6.0 | 6.1 | 83.3% | 1 | | 5.46s | 1,528 | 0 |
| Instructions following | Xiaomi: MiMo-V2-Flash | 10.0 | 10.0 | 100.0% | 0 | | 4.28s | 75 | 3,504 |
| Puzzle Solving | OpenAI: GPT-5.2 Chat | 7.0 | 10.0 | 66.7% | 0 | | 4.42s | 1,743 | 0 |
| Puzzle Solving | Xiaomi: MiMo-V2-Flash | 7.0 | 10.0 | 66.7% | 0 | | 3.77s | 833 | 1,948 |
| Tool Calling | OpenAI: GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 4.68s | 555 | 0 |
| Tool Calling | Xiaomi: MiMo-V2-Flash | 10.0 | 10.0 | 100.0% | 0 | | 27.78s | 321 | 12,715 |
