AI BENCHY Compare

Mistral: Mistral Small 4 vs Hunter Alpha

Last updated at: 2026-03-17

Metric | Mistral Small 4 (none) | Hunter Alpha (medium)
Release | 2026-03-16 | Unknown
Rank | #61 | #35
Score | 5.3 | 7.0
Consistency | 9.5 | 7.2
Cost per result | 0.108 | 0.000
Total Cost | $0.006 | $0.000
Tests Correct | |
Attempt pass rate | 33.3% | 68.6%
Flaky tests | 1 | 6
Total Runs | 51 | 51
Output Tokens | 1,624 | 4,724
Reasoning Tokens | 0 | 17,921
Response Time (avg) | 629ms | 10.33s
Response Time (max) | 1.72s | 30.53s
Response Time (total) | 10.70s | 175.60s

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens
Anti-AI Tricks | Mistral Small 4 | 3.4 | 7.9 | 16.7% | 1 | | 395ms | 182 | 0
Anti-AI Tricks | Hunter Alpha | 7.3 | 5.8 | 83.3% | 2 | | 4.75s | 479 | 1,103
Combined | Mistral Small 4 | 3.0 | 10.0 | 0.0% | 0 | | 1.72s | 496 | 0
Combined | Hunter Alpha | 4.7 | 1.6 | 66.7% | 1 | | 30.53s | 792 | 3,456
Data parsing and extraction | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 822ms | 261 | 0
Data parsing and extraction | Hunter Alpha | 10.0 | 10.0 | 100.0% | 0 | | 23.16s | 1,488 | 8,017
Domain specific | Mistral Small 4 | 5.3 | 10.0 | 33.3% | 0 | | 367ms | 28 | 0
Domain specific | Hunter Alpha | 3.0 | 10.0 | 0.0% | 0 | | 10.52s | 892 | 2,406
General Intelligence | Mistral Small 4 | 4.0 | 10.0 | 0.0% | 0 | | 729ms | 205 | 0
General Intelligence | Hunter Alpha | 7.0 | 3.7 | 66.7% | 1 | | 6.44s | 116 | 260
Instructions following | Mistral Small 4 | 6.5 | 10.0 | 50.0% | 0 | | 380ms | 69 | 0
Instructions following | Hunter Alpha | 9.9 | 10.0 | 100.0% | 0 | | 4.18s | 208 | 465
Puzzle Solving | Mistral Small 4 | 3.1 | 9.9 | 0.0% | 0 | | 589ms | 170 | 0
Puzzle Solving | Hunter Alpha | 6.1 | 4.7 | 66.7% | 2 | | 5.36s | 441 | 1,310
Tool Calling | Mistral Small 4 | 10.0 | 10.0 | 100.0% | 0 | | 1.40s | 213 | 0
Tool Calling | Hunter Alpha | 10.0 | 10.0 | 100.0% | 0 | | 17.33s | 308 | 904
