
AI BENCHY Compare

Anthropic: Claude Sonnet 4.6 vs Anthropic: Claude Sonnet 5

Summary

Claude Sonnet 4.6 vs Claude Sonnet 5 benchmark comparison: the average scores are effectively tied at 7.8 (Claude Sonnet 4.6) vs 7.9 (Claude Sonnet 5). Claude Sonnet 5 has the lower benchmark cost, $0.550 vs $1.418, and the faster average response time, 9.94s vs 17.06s. Its attempt pass rate is also higher, 79.4% vs 65.1%.

Recommended model: Claude Sonnet 5 - It has the best score here (7.9) and costs about 2.6x less than Claude Sonnet 4.6 ($0.550 vs $1.418).
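
The ratios behind the summary can be recomputed from the quoted figures. The sketch below is only illustrative arithmetic on the numbers stated above; the dictionary layout and the `ratio` helper are assumptions of this example, not part of the benchmark.

```python
# Figures copied from the summary above.
cost = {"Claude Sonnet 4.6": 1.418, "Claude Sonnet 5": 0.550}       # total cost, USD
avg_time = {"Claude Sonnet 4.6": 17.06, "Claude Sonnet 5": 9.94}    # avg response, seconds
pass_rate = {"Claude Sonnet 4.6": 65.1, "Claude Sonnet 5": 79.4}    # attempt pass rate, %


def ratio(values, numerator, denominator):
    """Return values[numerator] / values[denominator] (hypothetical helper)."""
    return values[numerator] / values[denominator]


cost_ratio = ratio(cost, "Claude Sonnet 4.6", "Claude Sonnet 5")
speed_ratio = ratio(avg_time, "Claude Sonnet 4.6", "Claude Sonnet 5")
pass_gap = pass_rate["Claude Sonnet 5"] - pass_rate["Claude Sonnet 4.6"]

print(f"Cost ratio:  {cost_ratio:.2f}x")    # 2.58x, i.e. "about 2.6x"
print(f"Speed ratio: {speed_ratio:.2f}x")   # 1.72x
print(f"Pass-rate gap: {pass_gap:.1f} percentage points")  # 14.3
```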

Last updated at: 2026-06-30

| Metric | Claude Sonnet 4.6 (medium) | Claude Sonnet 5 (medium) |
|---|---|---|
| Release | 2026-02-17 | 2026-06-30 |
| Score | 7.8 | 7.9 |
| Rank | #32 | #30 |
| Reliability | 10.0 | 10.0 |
| Consistency | 9.1 | 9.0 |
| Tests Correct | | |
| Attempt pass rate | 65.1% | 79.4% |
| Flaky tests | 2 | 3 |
| Total Runs | 63 | 63 |
| Cost per result | 10.904 | 3.662 |
| Total Cost | $1.418 | $0.550 |
| Input Price | $3.000 / 1M | $2.000 / 1M |
| Output Price | $15.000 / 1M | $10.000 / 1M |
| Total Input Tokens | 49,112 | 67,416 |
| Output Tokens | 54,703 | 34,012 |
| Reasoning Tokens | 29,970 | 7,673 |
| Response Time (avg) | 17.06s | 9.94s |
| Response Time (max) | 46.35s | 56.94s |
| Response Time (total) | 221.83s | 208.71s |
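
As a rough cross-check, the total cost can be approximated from the token counts and per-million prices listed above. The sketch assumes that reasoning tokens are billed at the output price and are counted separately from output tokens, and that the attempt pass rate is taken over all 63 runs; the page states neither, and the `Usage` class and function names are hypothetical. Under those assumptions the estimates land within a fraction of a cent of the reported totals without matching them exactly.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_price: float      # USD per 1M input tokens
    output_price: float     # USD per 1M output tokens
    input_tokens: int
    output_tokens: int
    reasoning_tokens: int
    total_runs: int
    pass_rate: float        # attempt pass rate, percent


def estimated_cost(u: Usage) -> float:
    # Assumption: reasoning tokens are billed like output tokens.
    billed_output = u.output_tokens + u.reasoning_tokens
    return (u.input_tokens * u.input_price + billed_output * u.output_price) / 1_000_000


def estimated_passes(u: Usage) -> int:
    # Assumption: the pass rate is computed over total runs.
    return round(u.total_runs * u.pass_rate / 100)


sonnet_46 = Usage(3.0, 15.0, 49_112, 54_703, 29_970, 63, 65.1)
sonnet_5 = Usage(2.0, 10.0, 67_416, 34_012, 7_673, 63, 79.4)

print(f"{estimated_cost(sonnet_46):.3f}")  # 1.417 (reported: $1.418)
print(f"{estimated_cost(sonnet_5):.3f}")   # 0.552 (reported: $0.550)
print(estimated_passes(sonnet_46), estimated_passes(sonnet_5))  # 41 50
```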

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#32 Claude Sonnet 4.6 (medium): Invalid SVG
Cost: $0.000 | Time: 300.0s | Tokens: 0 tok

#30 Claude Sonnet 5 (medium)
Cost: $0.007 | Time: 6.4s | Tokens: 832 tok
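
The page does not say how an output is judged to be "Invalid SVG". One plausible check, assumed here purely for illustration, is that the response must parse as well-formed XML with an `svg` root element; an empty response (0 tokens) would fail such a check immediately. The function name `is_valid_svg` is hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_valid_svg(text: str) -> bool:
    """Return True if text is well-formed XML whose root element is <svg>."""
    if not text.strip():
        return False  # empty output cannot be an SVG document
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False  # not well-formed XML
    # Accept both namespaced and bare <svg> roots.
    return root.tag in ("svg", f"{{{SVG_NS}}}svg")


print(is_valid_svg('<svg xmlns="http://www.w3.org/2000/svg"><circle r="1"/></svg>'))  # True
print(is_valid_svg(""))  # False
```

A stricter validator could also check for required attributes or attempt to render the image; this sketch only tests well-formedness and the root tag.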

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|---|---|---|
| Anti-AI Tricks | Claude Sonnet 4.6 | 6.5 | 10.0 | 50.0% | 0 | | 2.98s | 789 | 1,046 | 1,093 |
| Anti-AI Tricks | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 3.80s | 834 | 1,220 | 446 |
| Coding | Claude Sonnet 4.6 | 5.7 | 6.6 | 44.4% | 1 | | 33.29s | 6,995 | 16,089 | 3,686 |
| Coding | Claude Sonnet 5 | 9.0 | 7.9 | 88.9% | 1 | | 17.28s | 10,590 | 13,153 | 2,379 |
| Combined | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 46.35s | 18,351 | 5,871 | 3,962 |
| Combined | Claude Sonnet 5 | 4.5 | 2.1 | 66.7% | 1 | | 37.01s | 29,394 | 4,848 | 2,170 |
| Data parsing and extraction | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 13.90s | 8,676 | 649 | 742 |
| Data parsing and extraction | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 3.16s | 10,503 | 312 | 0 |
| Domain specific | Claude Sonnet 4.6 | 2.9 | 7.2 | 11.1% | 1 | | 0ms | 471 | 25,790 | 16,919 |
| Domain specific | Claude Sonnet 5 | 7.7 | 10.0 | 66.7% | 0 | | 20.38s | 975 | 12,140 | 1,994 |
| General Intelligence | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 4.94s | 564 | 256 | 433 |
| General Intelligence | Claude Sonnet 5 | 4.8 | 3.2 | 33.3% | 1 | | 4.32s | 708 | 264 | 0 |
| Instructions following | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 2.61s | 792 | 318 | 552 |
| Instructions following | Claude Sonnet 5 | 9.9 | 10.0 | 100.0% | 0 | | 3.10s | 909 | 318 | 269 |
| Puzzle Solving | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 5.31s | 816 | 592 | 646 |
| Puzzle Solving | Claude Sonnet 5 | 7.7 | 10.0 | 66.7% | 0 | | 2.98s | 894 | 407 | 121 |
| Tool Calling | Claude Sonnet 4.6 | 10.0 | 10.0 | 100.0% | 0 | | 7.48s | 11,454 | 655 | 351 |
| Tool Calling | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 10.70s | 12,351 | 433 | 90 |
| Trivia | Claude Sonnet 4.6 | 3.0 | 10.0 | 0.0% | 0 | | 30.09s | 204 | 3,437 | 1,586 |
| Trivia | Claude Sonnet 5 | 3.0 | 10.0 | 0.0% | 0 | | 7.06s | 258 | 917 | 204 |
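
Although the overall scores are nearly tied, the per-category scores diverge sharply. The sketch below tallies which model leads in each category and finds the widest gap, using only the Score values from the breakdown above; the data layout and the `compare` helper are assumptions of this example.

```python
# (Claude Sonnet 4.6 score, Claude Sonnet 5 score) per category, from the breakdown above.
scores = {
    "Anti-AI Tricks": (6.5, 10.0),
    "Coding": (5.7, 9.0),
    "Combined": (10.0, 4.5),
    "Data parsing and extraction": (10.0, 10.0),
    "Domain specific": (2.9, 7.7),
    "General Intelligence": (10.0, 4.8),
    "Instructions following": (10.0, 9.9),
    "Puzzle Solving": (10.0, 7.7),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}


def compare(scores):
    """Group categories by which model has the higher score (hypothetical helper)."""
    leads = {"Claude Sonnet 4.6": [], "Claude Sonnet 5": [], "tie": []}
    for category, (s46, s5) in scores.items():
        if s46 > s5:
            leads["Claude Sonnet 4.6"].append(category)
        elif s5 > s46:
            leads["Claude Sonnet 5"].append(category)
        else:
            leads["tie"].append(category)
    return leads


leads = compare(scores)
widest = max(scores, key=lambda c: abs(scores[c][0] - scores[c][1]))

for model, categories in leads.items():
    print(f"{model}: {len(categories)} -> {', '.join(categories)}")
print(f"Widest gap: {widest}")  # Combined (10.0 vs 4.5)
```

With these figures Claude Sonnet 4.6 leads in four categories, Claude Sonnet 5 in three, and three are tied, so the better choice depends on which categories matter for a given workload.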
