AI BENCHY
❤️ Made by XCS

Model Name

OpenAI: GPT-4o-mini

Last updated: Feb 19, 2026

Metrics for OpenAI: GPT-4o-mini

Rank: #19
Company: OpenAI
Score: 4.00
Consistency: 9.98
Cost per result: 0.0576
Total Cost: $0.00173
Tests Correct: 3/12
Attempt pass rate: 25.0%
Flaky tests: 0
Output Tokens: 570
Reasoning Tokens: 0

Category Breakdown

| Category | Fully passed tests | Score | Consistency | Attempt pass rate | Flaky tests | Reasoning score | Cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | 0/2 | 1.00 | 10.00 | 0.0% | 0 | - | $0.00005 |
| Data parsing and extraction | 2/2 | 10.00 | 10.00 | 100.0% | 0 | - | $0.00115 |
| Domain specific | 0/3 | 1.00 | 10.00 | 0.0% | 0 | - | $0.00012 |
| Instructions following | 1/2 | 5.50 | 10.00 | 50.0% | 0 | - | $0.00015 |
| Puzzle Solving | 0/3 | 4.00 | 9.92 | 0.0% | 0 | - | $0.00028 |
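
The page does not say how the headline metrics are derived from the category rows. As a rough cross-check, the sketch below assumes each headline figure is the mean of the category values weighted by the number of tests in the category. That weighting is an assumption, not something the page documents, and the names `CATEGORIES`, `weighted_mean`, and `summarize` are hypothetical.

```python
# Category rows copied from the breakdown above:
# (name, fully_passed, total_tests, score, consistency, attempt_pass_rate_percent)
CATEGORIES = [
    ("Anti-AI Tricks", 0, 2, 1.00, 10.00, 0.0),
    ("Data parsing and extraction", 2, 2, 10.00, 10.00, 100.0),
    ("Domain specific", 0, 3, 1.00, 10.00, 0.0),
    ("Instructions following", 1, 2, 5.50, 10.00, 50.0),
    ("Puzzle Solving", 0, 3, 4.00, 9.92, 0.0),
]


def weighted_mean(rows, column):
    """Mean of one column, weighted by each category's test count.

    The test-count weighting is an assumption made for this check.
    """
    total_tests = sum(row[2] for row in rows)
    return sum(row[column] * row[2] for row in rows) / total_tests


def summarize(rows):
    """Aggregate the category rows into headline-style figures."""
    return {
        "tests_correct": sum(row[1] for row in rows),
        "tests_total": sum(row[2] for row in rows),
        "score": round(weighted_mean(rows, 3), 2),
        "consistency": round(weighted_mean(rows, 4), 2),
        "attempt_pass_rate": round(weighted_mean(rows, 5), 1),
    }


if __name__ == "__main__":
    print(summarize(CATEGORIES))
```

Under this assumption the aggregates come out to 3 of 12 tests, a score of 4.00, a consistency of 9.98, and an attempt pass rate of 25.0%, which agrees with the headline figures listed for the model. Costs are left out of the check because the per-category values are rounded.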

Compared models

#18 · StepFun

StepFun: Step 3.5 Flash

Reasoning (medium)

Score: 4.92

Consistency: 7.34

Attempt pass rate: 58.3%

Flaky tests: 4

Cost per result: 0.0000

Tests Correct: 5/12

Total Cost: $0.00000

#20 · Z.ai

Z.ai: GLM 4.7 Flash

Reasoning (medium)

Score: 3.92

Consistency: 6.51

Attempt pass rate: 50.0%

Flaky tests: 5

Cost per result: 0.2253

Tests Correct: 4/12

Total Cost: $0.00902

#17 · MiniMax

MiniMax: MiniMax M2.5

Reasoning (medium)

Score: 5.08

Consistency: 6.00

Attempt pass rate: 61.1%

Flaky tests: 6

Cost per result: 4.0276

Tests Correct: 5/12

Total Cost: $0.20138
