AI BENCHY Compare

Anthropic: Claude Sonnet 5 vs DeepSeek: DeepSeek V4 Pro

Summary

Claude Sonnet 5 vs DeepSeek V4 Pro benchmark comparison: Claude Sonnet 5 leads on average score, 7.9 vs 7.2, and on attempt pass rate, 79.4% vs 52.4%. DeepSeek V4 Pro has the lower total benchmark cost, $0.034 vs $0.550, and the faster average response time, 6.41s vs 9.94s.

Recommended model: DeepSeek V4 Pro - Its score stays close to the best score here (7.2 vs 7.9), while costing about 16.5x less than Claude Sonnet 5.
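
The trade-off behind this recommendation can be checked with a little arithmetic. The sketch below is a minimal Python example using the rounded totals shown on this page; the function name `cost_ratio` is illustrative and not part of AI BENCHY. With the rounded figures the ratio comes out near 16.2x rather than 16.5x, which would be consistent with the site computing it from unrounded costs (an assumption, since the page does not say).

```python
def cost_ratio(expensive: float, cheap: float) -> float:
    """Return how many times more the expensive run cost than the cheap one."""
    if cheap <= 0:
        raise ValueError("cheap cost must be positive")
    return expensive / cheap


# Rounded totals and scores as displayed on the page.
claude_total = 0.550    # USD, Claude Sonnet 5
deepseek_total = 0.034  # USD, DeepSeek V4 Pro
claude_score = 7.9
deepseek_score = 7.2

ratio = cost_ratio(claude_total, deepseek_total)
score_gap = claude_score - deepseek_score

print(f"cost ratio: {ratio:.1f}x, score gap: {score_gap:.1f}")
```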

Last updated at: 2026-06-30

Metric | Claude Sonnet 5 (medium) | DeepSeek V4 Pro (none)
Release | 2026-06-30 | 2026-04-24
Score | 7.9 | 7.2
Rank | #30 | #60
Reliability | 10.0 | 9.9
Consistency | 9.0 | 8.8
Tests Correct | |
Attempt pass rate | 79.4% | 52.4%
Flaky tests | 3 | 3
Total Runs | 63 | 63
Cost per result | 3.662 | 0.333
Total Cost | $0.550 | $0.034
Input Price | $2.000 / 1M | $0.435 / 1M
Output Price | $10.000 / 1M | $0.870 / 1M
Total Input Tokens | 67,416 | 53,558
Output Tokens | 34,012 | 11,424
Reasoning Tokens | 7,673 | 0
Response Time (avg) | 9.94s | 6.41s
Response Time (max) | 56.94s | 30.09s
Response Time (total) | 208.71s | 134.66s
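
The token counts and per-million prices above allow a rough cross-check of the reported totals. The following Python sketch assumes cost is simply tokens times list price, and treats reasoning tokens as optionally billed at the output rate on top of output tokens; the page does not state how AI BENCHY bills them, so both the formula and the helper name `estimate_cost` are assumptions.

```python
PER_MILLION = 1_000_000


def estimate_cost(input_tokens, output_tokens, input_price, output_price,
                  reasoning_tokens=0):
    """Estimate USD cost from token counts and per-1M-token prices.

    Assumption: reasoning tokens, if passed, are billed at the output price
    in addition to the output tokens.
    """
    billed_output = output_tokens + reasoning_tokens
    return (input_tokens * input_price + billed_output * output_price) / PER_MILLION


# Figures copied from the comparison table.
claude = estimate_cost(67_416, 34_012, 2.000, 10.000)
claude_with_reasoning = estimate_cost(67_416, 34_012, 2.000, 10.000,
                                      reasoning_tokens=7_673)
deepseek = estimate_cost(53_558, 11_424, 0.435, 0.870)

print(f"Claude Sonnet 5: ${claude:.3f} (with reasoning: ${claude_with_reasoning:.3f})")
print(f"DeepSeek V4 Pro: ${deepseek:.3f}")
```

Under these assumptions the Claude Sonnet 5 estimate is about $0.475 without reasoning tokens and about $0.552 with them, against a reported $0.550; the DeepSeek V4 Pro estimate is about $0.033 against a reported $0.034. The small gaps may come from rounding or from billing details the page does not show.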

Generation showcase

Hamster playing table tennis

Prompt: Create a detailed SVG illustration of a hamster playing table tennis.

#30 Claude Sonnet 5 (medium)
Cost: $0.007
Time: 6.4s
Tokens: 832 tok

#60 DeepSeek V4 Pro (none)
Result: Invalid SVG
Cost: $0.000
Time: 300.0s
Tokens: 0 tok

Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens
Anti-AI Tricks | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 3.80s | 834 | 1,220 | 446
Anti-AI Tricks | DeepSeek V4 Pro | 3.2 | 6.1 | 16.7% | 2 | | 4.02s | 540 | 1,168 | 0
Coding | Claude Sonnet 5 | 9.0 | 7.9 | 88.9% | 1 | | 17.28s | 10,590 | 13,153 | 2,379
Coding | DeepSeek V4 Pro | 5.6 | 10.0 | 33.3% | 0 | | 13.38s | 7,275 | 5,500 | 0
Combined | Claude Sonnet 5 | 4.5 | 2.1 | 66.7% | 1 | | 37.01s | 29,394 | 4,848 | 2,170
Combined | DeepSeek V4 Pro | 9.5 | 10.0 | 100.0% | 0 | | 23.74s | 27,529 | 2,235 | 0
Data parsing and extraction | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 3.16s | 10,503 | 312 | 0
Data parsing and extraction | DeepSeek V4 Pro | 10.0 | 10.0 | 100.0% | 0 | | 4.61s | 7,568 | 200 | 0
Domain specific | Claude Sonnet 5 | 7.7 | 10.0 | 66.7% | 0 | | 20.38s | 975 | 12,140 | 1,994
Domain specific | DeepSeek V4 Pro | 5.3 | 10.0 | 33.3% | 0 | | 3.72s | 666 | 24 | 0
General Intelligence | Claude Sonnet 5 | 4.8 | 3.2 | 33.3% | 1 | | 4.32s | 708 | 264 | 0
General Intelligence | DeepSeek V4 Pro | 5.0 | 10.0 | 0.0% | 0 | | 2.05s | 471 | 126 | 0
Instructions following | Claude Sonnet 5 | 9.9 | 10.0 | 100.0% | 0 | | 3.10s | 909 | 318 | 269
Instructions following | DeepSeek V4 Pro | 6.3 | 5.8 | 66.7% | 1 | | 4.12s | 627 | 713 | 0
Puzzle Solving | Claude Sonnet 5 | 7.7 | 10.0 | 66.7% | 0 | | 2.98s | 894 | 407 | 121
Puzzle Solving | DeepSeek V4 Pro | 10.0 | 10.0 | 100.0% | 0 | | 3.61s | 594 | 442 | 0
Tool Calling | Claude Sonnet 5 | 10.0 | 10.0 | 100.0% | 0 | | 10.70s | 12,351 | 433 | 90
Tool Calling | DeepSeek V4 Pro | 10.0 | 10.0 | 100.0% | 0 | | 7.40s | 8,105 | 328 | 0
Trivia | Claude Sonnet 5 | 3.0 | 10.0 | 0.0% | 0 | | 7.06s | 258 | 917 | 204
Trivia | DeepSeek V4 Pro | 3.0 | 10.0 | 0.0% | 0 | | 5.76s | 183 | 688 | 0
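
The category scores above can be summarized as a simple head-to-head tally. This Python sketch counts category wins and ties and computes an unweighted mean per model; the helper names `tally` and `unweighted_mean` are illustrative, and the unweighted mean is only a rough aggregate, not the formula AI BENCHY uses for its overall score (which the page does not describe).

```python
# (Claude Sonnet 5 score, DeepSeek V4 Pro score) per category, from the breakdown.
CATEGORY_SCORES = {
    "Anti-AI Tricks": (10.0, 3.2),
    "Coding": (9.0, 5.6),
    "Combined": (4.5, 9.5),
    "Data parsing and extraction": (10.0, 10.0),
    "Domain specific": (7.7, 5.3),
    "General Intelligence": (4.8, 5.0),
    "Instructions following": (9.9, 6.3),
    "Puzzle Solving": (7.7, 10.0),
    "Tool Calling": (10.0, 10.0),
    "Trivia": (3.0, 3.0),
}


def tally(scores):
    """Count categories won by each model, plus ties."""
    result = {"claude": 0, "deepseek": 0, "tie": 0}
    for claude, deepseek in scores.values():
        if claude > deepseek:
            result["claude"] += 1
        elif deepseek > claude:
            result["deepseek"] += 1
        else:
            result["tie"] += 1
    return result


def unweighted_mean(scores, index):
    """Plain average of one model's category scores (0 = Claude, 1 = DeepSeek)."""
    values = [pair[index] for pair in scores.values()]
    return sum(values) / len(values)


wins = tally(CATEGORY_SCORES)
claude_mean = unweighted_mean(CATEGORY_SCORES, 0)
deepseek_mean = unweighted_mean(CATEGORY_SCORES, 1)

print(wins)
print(f"unweighted means: {claude_mean:.2f} vs {deepseek_mean:.2f}")
```

Claude Sonnet 5 wins four categories, DeepSeek V4 Pro wins three, and three are tied. The unweighted means (about 7.66 and 6.79) differ from the overall scores of 7.9 and 7.2, so the overall score is apparently not a plain average of the category scores.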
