AI BENCHY Compare

Mistral: Mistral Small 4 vs OpenAI: GPT-5 Nano

Last updated at: 2026-03-17

| Metric | Mistral Small 4 (none; release 2026-03-16) | GPT-5 Nano (medium; release 2025-08-07) |
| --- | --- | --- |
| Rank | #61 | #43 |
| Score | 5.3 | 6.2 |
| Consistency | 9.5 | 6.7 |
| Cost per result | 0.108 | 0.864 |
| Total Cost | $0.006 | $0.061 |
| Tests Correct | | |
| Attempt pass rate | 33.3% | 58.8% |
| Flaky tests | 1 | 7 |
| Total Runs | 51 | 51 |
| Output Tokens | 1,624 | 4,500 |
| Reasoning Tokens | 0 | 143,296 |
| Response Time (avg) | 629ms | 44.47s |
| Response Time (max) | 1.72s | 204.02s |
| Response Time (total) | 10.70s | 444.74s |
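
To put the gaps in the summary figures into proportion, the short Python sketch below divides GPT-5 Nano's value by Mistral Small 4's for three of the metrics. The `results` dictionary and the `ratio` helper are illustrative names, not part of AI BENCHY; the numbers are copied from the summary table with times converted to seconds. The published costs are rounded, so the cost ratio is only approximate.

```python
# Figures copied from the summary table; times converted to seconds.
# `results` and `ratio` are hypothetical names used only for this sketch.
results = {
    "Mistral Small 4": {"score": 5.3, "total_cost_usd": 0.006, "avg_response_s": 0.629},
    "GPT-5 Nano": {"score": 6.2, "total_cost_usd": 0.061, "avg_response_s": 44.47},
}


def ratio(metric: str,
          numerator: str = "GPT-5 Nano",
          denominator: str = "Mistral Small 4") -> float:
    """Return how many times larger `metric` is for `numerator` than for `denominator`."""
    return results[numerator][metric] / results[denominator][metric]


if __name__ == "__main__":
    for metric in ("score", "total_cost_usd", "avg_response_s"):
        print(f"{metric}: GPT-5 Nano / Mistral Small 4 = {ratio(metric):.1f}x")
```

On these figures, GPT-5 Nano's higher score comes with roughly ten times the total cost and an average response time about seventy times longer.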

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Mistral Small 4 | 3.4 | 7.9 | 16.7% | 1 | | 395ms | 182 | 0 |
| Anti-AI Tricks | GPT-5 Nano | 6.5 | 7.9 | 58.3% | 1 | | 25.50s | 1,221 | 21,184 |
| Combined | Mistral Small 4 | 3.0 | 10.0 | 0.0% | 0 | | 1.72s | 496 | 0 |
| Combined | GPT-5 Nano | 10.0 | 10.0 | 100.0% | 0 | | 65.96s | 578 | 17,984 |
| Data parsing and extraction | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 822ms | 261 | 0 |
| Data parsing and extraction | GPT-5 Nano | 3.7 | 1.7 | 50.0% | 2 | | 21.42s | 453 | 10,560 |
| Domain specific | Mistral Small 4 | 5.3 | 10.0 | 33.3% | 0 | | 367ms | 28 | 0 |
| Domain specific | GPT-5 Nano | 5.2 | 4.4 | 55.6% | 2 | | 204.02s | 237 | 64,448 |
| General Intelligence | Mistral Small 4 | 4.0 | 10.0 | 0.0% | 0 | | 729ms | 205 | 0 |
| General Intelligence | GPT-5 Nano | 4.1 | 10.0 | 0.0% | 0 | | 17.51s | 202 | 4,608 |
| Instructions following | Mistral Small 4 | 6.5 | 10.0 | 50.0% | 0 | | 380ms | 69 | 0 |
| Instructions following | GPT-5 Nano | 8.5 | 6.8 | 83.3% | 1 | | 11.90s | 382 | 4,096 |
| Puzzle Solving | Mistral Small 4 | 3.1 | 9.9 | 0.0% | 0 | | 589ms | 170 | 0 |
| Puzzle Solving | GPT-5 Nano | 5.3 | 7.2 | 44.4% | 1 | | 19.81s | 869 | 13,440 |
| Tool Calling | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 1.40s | 213 | 0 |
| Tool Calling | GPT-5 Nano | 10.0 | 10.0 | 100.0% | 0 | | 33.30s | 558 | 6,976 |
