AI BENCHY Compare

Cobuddy vs DeepSeek: DeepSeek V4 Flash

Last updated at: 2026-05-19

Cobuddy: Cobuddy medium, Release: 2026-05-06, Free, Available
DeepSeek V4 Flash: DeepSeek V4 Flash none, Release: 2026-04-24, Free, Available

| Metric | Cobuddy | DeepSeek V4 Flash |
| --- | --- | --- |
| Score | 5.8 | 5.2 |
| Rank | #102 | #127 |
| Reliability | 9.9 | 10.0 |
| Consistency | 6.9 | 9.2 |
| Tests Correct | | |
| Attempt pass rate | 54.4% | 31.6% |
| Flaky tests | 7 | 2 |
| Total Runs | 57 | 57 |
| Cost per result | 0.000 | 0.147 |
| Total Cost | $0.000 | $0.008 |
| Input Price | $0.000 / 1M | $0.112 / 1M |
| Output Price | $0.000 / 1M | $0.224 / 1M |
| Output Tokens | 1,648 | 4,464 |
| Reasoning Tokens | 96,062 | 0 |
| Response Time (avg) | 36.50s | 28.01s |
| Response Time (max) | 309.02s | 111.96s |
| Response Time (total) | 693.45s | 532.17s |

[Charts: Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens]

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Cobuddy | 8.7 | 7.9 | 91.7% | 1 | | 10.00s | 98 | 4,666 |
| Anti-AI Tricks | DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 | | 20.18s | 174 | 0 |
| Coding | Cobuddy | 4.3 | 1.1 | 66.7% | 1 | | 53.59s | 343 | 9,678 |
| Coding | DeepSeek V4 Flash | 6.3 | 10.0 | 0.0% | 0 | | 24.04s | 471 | 0 |
| Combined | Cobuddy | 3.0 | 10.0 | 0.0% | 0 | | 47.38s | 465 | 7,265 |
| Combined | DeepSeek V4 Flash | 4.5 | 2.1 | 66.7% | 1 | | 111.96s | 2,664 | 0 |
| Data parsing and extraction | Cobuddy | 6.3 | 5.8 | 66.7% | 1 | | 17.36s | 275 | 5,591 |
| Data parsing and extraction | DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 | | 23.79s | 195 | 0 |
| Domain specific | Cobuddy | 2.9 | 4.4 | 22.2% | 2 | | 128.15s | 10 | 49,454 |
| Domain specific | DeepSeek V4 Flash | 5.3 | 10.0 | 33.3% | 0 | | 19.73s | 18 | 0 |
| General Intelligence | Cobuddy | 4.2 | 9.9 | 0.0% | 0 | | 23.23s | 76 | 3,782 |
| General Intelligence | DeepSeek V4 Flash | 4.2 | 9.9 | 0.0% | 0 | | 23.74s | 67 | 0 |
| Instructions following | Cobuddy | 9.8 | 10.0 | 100.0% | 0 | | 11.60s | 64 | 2,842 |
| Instructions following | DeepSeek V4 Flash | 6.5 | 10.0 | 50.0% | 0 | | 17.54s | 321 | 0 |
| Puzzle Solving | Cobuddy | 3.5 | 4.4 | 33.3% | 2 | | 12.91s | 175 | 5,627 |
| Puzzle Solving | DeepSeek V4 Flash | 3.1 | 7.3 | 11.1% | 1 | | 22.96s | 207 | 0 |
| Tool Calling | Cobuddy | 10.0 | 10.0 | 100.0% | 0 | | 11.19s | 133 | 294 |
| Tool Calling | DeepSeek V4 Flash | 10.0 | 10.0 | 100.0% | 0 | | 77.93s | 327 | 0 |
| Trivia | Cobuddy | 3.0 | 10.0 | 0.0% | 0 | | 36.98s | 9 | 6,863 |
| Trivia | DeepSeek V4 Flash | 3.0 | 10.0 | 0.0% | 0 | | 3.07s | 20 | 0 |
