AI BENCHY Compare

Mistral: Mistral Small 4 vs NVIDIA: Nemotron 3 Super

Last updated at: 2026-05-01

| Metric | Mistral Small 4 (none, Release: 2026-03-16) | Nemotron 3 Super (none, Release: 2026-03-11, Free Available) |
| --- | --- | --- |
| Score | 5.2 | 5.2 |
| Rank | #115 | #113 |
| Reliability | N/A | N/A |
| Consistency | 9.5 | 8.6 |
| Tests Correct | | |
| Attempt pass rate | 31.5% | 37.0% |
| Flaky tests | 1 | 3 |
| Total Runs | 54 | 52 |
| Cost per result | 0.118 | 0.000 |
| Total Cost | $0.006 | $0.000 |
| Input Price | $0.150 / 1M | $0.090 / 1M |
| Output Price | $0.600 / 1M | $0.450 / 1M |
| Output Tokens | 2,207 | 4,760 |
| Reasoning Tokens | 0 | 0 |
| Response Time (avg) | 665ms | 8.54s |
| Response Time (max) | 1.72s | 24.97s |
| Response Time (total) | 11.97s | 153.69s |
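
The response times above mix units (ms and s), so comparing them means normalizing to one unit first. A minimal sketch, assuming the values are copied by hand from the table; the helper name `to_seconds` is hypothetical and not part of AI BENCHY:

```python
def to_seconds(value: str) -> float:
    """Convert a response-time string such as '665ms' or '8.54s' to seconds."""
    value = value.strip()
    # Check 'ms' before 's', since every 'ms' string also ends with 's'.
    if value.endswith("ms"):
        return float(value[:-2]) / 1000.0
    if value.endswith("s"):
        return float(value[:-1])
    raise ValueError(f"unrecognized time format: {value!r}")


# Values copied from the summary table: (Mistral Small 4, Nemotron 3 Super).
response_times = {
    "avg": ("665ms", "8.54s"),
    "max": ("1.72s", "24.97s"),
    "total": ("11.97s", "153.69s"),
}

# Ratio of Nemotron 3 Super's time to Mistral Small 4's time for each metric.
ratios = {
    name: to_seconds(nemotron) / to_seconds(mistral)
    for name, (mistral, nemotron) in response_times.items()
}

for name, ratio in ratios.items():
    print(f"{name}: Nemotron 3 Super took {ratio:.1f}x as long as Mistral Small 4")
```

On these figures, Nemotron 3 Super takes roughly 13 times as long as Mistral Small 4 on average and in total, and about 14.5 times as long in the worst case.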

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Mistral Small 4 | 3.4 | 7.9 | 16.7% | 1 | | 395ms | 182 | 0 |
| Anti-AI Tricks | Nemotron 3 Super | 4.8 | 10.0 | 25.0% | 0 | | 7.43s | 2,174 | 0 |
| Coding | Mistral Small 4 | 4.5 | 9.0 | 0.0% | 0 | | 1.28s | 583 | 0 |
| Coding | Nemotron 3 Super | 3.3 | 1.6 | 33.3% | 1 | | 2.99s | 535 | 0 |
| Combined | Mistral Small 4 | 3.0 | 10.0 | 0.0% | 0 | | 1.72s | 496 | 0 |
| Combined | Nemotron 3 Super | 3.0 | 10.0 | 0.0% | 0 | | 19.98s | 124 | 0 |
| Data parsing and extraction | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 822ms | 261 | 0 |
| Data parsing and extraction | Nemotron 3 Super | 10.0 | 10.0 | 100.0% | 0 | | 7.92s | 249 | 0 |
| Domain specific | Mistral Small 4 | 5.3 | 10.0 | 33.3% | 0 | | 367ms | 28 | 0 |
| Domain specific | Nemotron 3 Super | 3.6 | 7.2 | 22.2% | 1 | | 6.23s | 26 | 0 |
| General Intelligence | Mistral Small 4 | 4.0 | 10.0 | 0.0% | 0 | | 729ms | 205 | 0 |
| General Intelligence | Nemotron 3 Super | 4.2 | 9.9 | 0.0% | 0 | | 24.97s | 170 | 0 |
| Instructions following | Mistral Small 4 | 6.5 | 10.0 | 50.0% | 0 | | 380ms | 69 | 0 |
| Instructions following | Nemotron 3 Super | 6.3 | 10.0 | 50.0% | 0 | | 1.50s | 66 | 0 |
| Puzzle Solving | Mistral Small 4 | 3.1 | 9.9 | 0.0% | 0 | | 589ms | 170 | 0 |
| Puzzle Solving | Nemotron 3 Super | 5.7 | 10.0 | 33.3% | 0 | | 7.50s | 1,135 | 0 |
| Tool Calling | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 1.40s | 213 | 0 |
| Tool Calling | Nemotron 3 Super | 4.7 | 1.6 | 66.7% | 1 | | 16.00s | 281 | 0 |
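
Both models share the same overall Score, so it can help to see which one leads in each category. A minimal sketch that tallies category leaders from the Score column above; the scores are copied by hand, the `leader` helper is hypothetical, and this tally is only an illustration, not how AI BENCHY derives its overall Score:

```python
from collections import Counter

# Per-category Score values copied from the Category Breakdown:
# (Mistral Small 4, Nemotron 3 Super)
category_scores = {
    "Anti-AI Tricks": (3.4, 4.8),
    "Coding": (4.5, 3.3),
    "Combined": (3.0, 3.0),
    "Data parsing and extraction": (10.0, 10.0),
    "Domain specific": (5.3, 3.6),
    "General Intelligence": (4.0, 4.2),
    "Instructions following": (6.5, 6.3),
    "Puzzle Solving": (3.1, 5.7),
    "Tool Calling": (10.0, 4.7),
}


def leader(mistral: float, nemotron: float) -> str:
    """Return the name of the higher-scoring model, or 'tie' if equal."""
    if mistral > nemotron:
        return "Mistral Small 4"
    if nemotron > mistral:
        return "Nemotron 3 Super"
    return "tie"


# Count how many categories each model leads.
tally = Counter(leader(m, n) for m, n in category_scores.values())

for name, count in tally.most_common():
    print(f"{name}: {count}")
```

On the scores listed, Mistral Small 4 leads four categories, Nemotron 3 Super leads three, and two are tied.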
