AI BENCHY

AI BENCHY Compare

MiniMax: MiniMax M3 vs OpenAI: GPT-5.2 Chat

Last updated at: 2026-06-01

| Metric | MiniMax M3 (medium) | GPT-5.2 Chat (none) |
| --- | --- | --- |
| Release | 2026-06-01 | 2025-12-11 |
| Score | 7.5 | 7.9 |
| Rank | #54 | #32 |
| Reliability | 9.6 | 10.0 |
| Consistency | 8.4 | 8.9 |
| Tests Correct | | |
| Attempt pass rate | 75.0% | 73.3% |
| Flaky tests | 4 | 3 |
| Total Runs | 60 | 60 |
| Cost per result | 1.083 | 2.703 |
| Total Cost | $0.120 | $0.352 |
| Input Price | $0.300 / 1M | $1.750 / 1M |
| Output Price | $1.200 / 1M | $14.000 / 1M |
| Output Tokens | 46,884 | 21,144 |
| Reasoning Tokens | 85,935 | 0 |
| Response Time (avg) | 68.44s | 6.82s |
| Response Time (max) | 431.03s | 38.52s |
| Response Time (total) | 1300.32s | 136.34s |
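
A few derived comparisons can be read off the summary figures. The following is a minimal sketch, not something the site publishes: the dictionary layout and the helper names `passing_attempts` and `ratio` are illustrative, and it assumes that the attempt pass rate is simply passing attempts divided by total runs, which the page does not state.

```python
# Figures copied from the summary table. The dictionary layout and helper
# names are illustrative and not part of the benchmark site.
SUMMARY = {
    "MiniMax M3": {
        "pass_rate_pct": 75.0,
        "total_runs": 60,
        "total_cost_usd": 0.120,
        "avg_time_s": 68.44,
    },
    "GPT-5.2 Chat": {
        "pass_rate_pct": 73.3,
        "total_runs": 60,
        "total_cost_usd": 0.352,
        "avg_time_s": 6.82,
    },
}


def passing_attempts(pass_rate_pct: float, total_runs: int) -> int:
    """Estimate passing attempts.

    Assumption: pass rate = passing attempts / total runs.
    """
    return round(pass_rate_pct / 100 * total_runs)


def ratio(a: float, b: float) -> float:
    """Return how many times larger a is than b."""
    return a / b


m3 = SUMMARY["MiniMax M3"]
gpt = SUMMARY["GPT-5.2 Chat"]

print(passing_attempts(m3["pass_rate_pct"], m3["total_runs"]))    # 45
print(passing_attempts(gpt["pass_rate_pct"], gpt["total_runs"]))  # 44

# GPT-5.2 Chat total cost relative to MiniMax M3
print(f"{ratio(gpt['total_cost_usd'], m3['total_cost_usd']):.2f}")  # 2.93

# MiniMax M3 average response time relative to GPT-5.2 Chat
print(f"{ratio(m3['avg_time_s'], gpt['avg_time_s']):.2f}")  # 10.04
```

Under that assumption the two models differ by one passing attempt out of 60, while GPT-5.2 Chat costs roughly 2.9 times as much in total and MiniMax M3 takes roughly 10 times as long per response on average.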

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | MiniMax M3 | 5.5 | 3.7 | 66.7% | 3 | | 14.95s | 874 | 3,414 |
| Anti-AI Tricks | GPT-5.2 Chat | 8.7 | 7.9 | 91.7% | 1 | | 3.40s | 1,807 | 0 |
| Coding | MiniMax M3 | 8.3 | 10.0 | 100.0% | 0 | | 185.58s | 4,071 | 26,059 |
| Coding | GPT-5.2 Chat | 8.2 | 6.7 | 83.3% | 1 | | 8.05s | 4,131 | 0 |
| Combined | MiniMax M3 | 10.0 | 10.0 | 100.0% | 0 | | 65.30s | 1,306 | 6,253 |
| Combined | GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 9.12s | 1,243 | 0 |
| Data parsing and extraction | MiniMax M3 | 10.0 | 10.0 | 100.0% | 0 | | 14.92s | 514 | 3,164 |
| Data parsing and extraction | GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 3.05s | 980 | 0 |
| Domain specific | MiniMax M3 | 6.5 | 10.0 | 66.7% | 0 | | 233.13s | 16,254 | 19,070 |
| Domain specific | GPT-5.2 Chat | 5.3 | 10.0 | 33.3% | 0 | | 17.78s | 7,810 | 0 |
| General Intelligence | MiniMax M3 | 5.1 | 3.4 | 33.3% | 1 | | 33.25s | 2,487 | 2,523 |
| General Intelligence | GPT-5.2 Chat | 4.4 | 3.0 | 33.3% | 1 | | 3.20s | 335 | 0 |
| Instructions following | MiniMax M3 | 9.8 | 10.0 | 100.0% | 0 | | 6.14s | 103 | 920 |
| Instructions following | GPT-5.2 Chat | 9.8 | 10.0 | 100.0% | 0 | | 5.51s | 1,441 | 0 |
| Puzzle Solving | MiniMax M3 | 7.9 | 9.9 | 66.7% | 0 | | 49.91s | 11,946 | 13,761 |
| Puzzle Solving | GPT-5.2 Chat | 7.7 | 10.0 | 66.7% | 0 | | 4.10s | 1,603 | 0 |
| Tool Calling | MiniMax M3 | 10.0 | 10.0 | 100.0% | 0 | | 11.91s | 281 | 555 |
| Tool Calling | GPT-5.2 Chat | 10.0 | 10.0 | 100.0% | 0 | | 4.68s | 555 | 0 |
| Trivia | MiniMax M3 | 3.0 | 10.0 | 0.0% | 0 | | 100.80s | 9,048 | 10,216 |
| Trivia | GPT-5.2 Chat | 3.0 | 10.0 | 0.0% | 0 | | 6.89s | 1,239 | 0 |
