
AI BENCHY Compare

Trinity Large Preview vs Cobuddy

Last updated at: 2026-06-03

| Metric | Trinity Large Preview (none) | Cobuddy (medium) |
| --- | --- | --- |
| Release | 2026-01-27 | 2026-05-06 |
| Score | 4.7 | 5.8 |
| Rank | #148 | #116 |
| Reliability | 10.0 | 10.0 |
| Consistency | 9.3 | 7.4 |
| Tests Correct | | |
| Attempt pass rate | 23.3% | 50.0% |
| Flaky tests | 2 | 6 |
| Total Runs | 60 | 60 |
| Cost per result | 0.017 | 0.000 |
| Total Cost | $0.008 | $0.000 |
| Input Price | $0.243 / 1M | $0.000 / 1M |
| Output Price | $0.243 / 1M | $0.000 / 1M |
| Total Input Tokens | 29,828 | 37,449 |
| Output Tokens | 2,169 | 1,677 |
| Reasoning Tokens | 0 | 116,703 |
| Response Time (avg) | 2.98s | 39.90s |
| Response Time (max) | 14.34s | 309.02s |
| Response Time (total) | 56.57s | 797.98s |
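
The Total Cost figures appear consistent with a simple per-token pricing calculation. The sketch below is an assumption about how the number might be derived (token counts multiplied by the listed per-million prices), not the site's documented method; the helper name `token_cost_usd` is hypothetical.

```python
def token_cost_usd(input_tokens: int, output_tokens: int,
                   input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate cost in USD from token counts and per-1M-token prices.

    Assumption: cost = input tokens * input price + output tokens * output price,
    with prices quoted per one million tokens.
    """
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Token counts and prices copied from the comparison table.
trinity_cost = token_cost_usd(29_828, 2_169, 0.243, 0.243)
cobuddy_cost = token_cost_usd(37_449, 1_677, 0.0, 0.0)

print(f"Trinity Large Preview: ${trinity_cost:.3f}")  # $0.008
print(f"Cobuddy: ${cobuddy_cost:.3f}")                # $0.000
```

Under this assumption the estimate rounds to the listed $0.008 for Trinity Large Preview. Cobuddy's cost is zero at the listed prices regardless of how its reasoning tokens are billed.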

Charts (not reproduced here): Top Models by Score; Score vs Total Cost; Response Time (avg); Score vs Response Time (avg); Total Output Tokens; Score vs Total Output Tokens.

Category Breakdown

| Category | Model | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Response Time (avg) | Input Tokens | Output Tokens | Reasoning Tokens |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Anti-AI Tricks | Trinity Large Preview | 3.1 | 10.0 | 0.0% | 0 | | 2.07s | 651 | 550 | 0 |
| Anti-AI Tricks | Cobuddy | 8.7 | 7.9 | 91.7% | 1 | | 10.00s | 453 | 98 | 4,666 |
| Coding | Trinity Large Preview | 4.0 | 6.6 | 16.7% | 1 | | 14.34s | 738 | 397 | 0 |
| Coding | Cobuddy | 4.1 | 5.1 | 33.3% | 1 | | 79.17s | 4,726 | 358 | 30,138 |
| Combined | Trinity Large Preview | 3.0 | 10.0 | 0.0% | 0 | | 8.91s | 12,053 | 294 | 0 |
| Combined | Cobuddy | 3.0 | 10.0 | 0.0% | 0 | | 47.38s | 18,324 | 465 | 7,265 |
| Data parsing and extraction | Trinity Large Preview | 10.0 | 10.0 | 100.0% | 0 | | 3.26s | 6,900 | 186 | 0 |
| Data parsing and extraction | Cobuddy | 6.3 | 5.8 | 66.7% | 1 | | 17.36s | 8,181 | 275 | 5,591 |
| Domain specific | Trinity Large Preview | 5.3 | 10.0 | 33.3% | 0 | | 877ms | 738 | 25 | 0 |
| Domain specific | Cobuddy | 2.9 | 4.4 | 22.2% | 2 | | 128.15s | 540 | 10 | 49,454 |
| General Intelligence | Trinity Large Preview | 4.5 | 10.0 | 0.0% | 0 | | 873ms | 498 | 104 | 0 |
| General Intelligence | Cobuddy | 4.2 | 9.9 | 0.0% | 0 | | 23.23s | 498 | 76 | 3,782 |
| Instructions following | Trinity Large Preview | 3.5 | 10.0 | 0.0% | 0 | | 822ms | 678 | 63 | 0 |
| Instructions following | Cobuddy | 9.8 | 10.0 | 100.0% | 0 | | 11.60s | 508 | 64 | 2,842 |
| Puzzle Solving | Trinity Large Preview | 3.6 | 7.7 | 11.1% | 1 | | 1.97s | 669 | 265 | 0 |
| Puzzle Solving | Cobuddy | 3.6 | 7.2 | 22.2% | 1 | | 12.83s | 561 | 189 | 5,808 |
| Tool Calling | Trinity Large Preview | 10.0 | 10.0 | 100.0% | 0 | | 6.67s | 6,699 | 267 | 0 |
| Tool Calling | Cobuddy | 10.0 | 10.0 | 100.0% | 0 | | 11.19s | 3,505 | 133 | 294 |
| Trivia | Trinity Large Preview | 3.0 | 10.0 | 0.0% | 0 | | 777ms | 204 | 18 | 0 |
| Trivia | Cobuddy | 3.0 | 10.0 | 0.0% | 0 | | 36.98s | 153 | 9 | 6,863 |
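
As a sanity check on the breakdown, the per-category token counts can be summed and compared against the overall totals reported at the top of the page. This is a minimal sketch, assuming each category's figure is a sum over that category's runs; the variable names are illustrative, and the categories are listed in the same order as in the breakdown (Anti-AI Tricks through Trivia).

```python
# Per-category token counts copied from the Category Breakdown.
trinity = {
    "input":     [651, 738, 12_053, 6_900, 738, 498, 678, 669, 6_699, 204],
    "output":    [550, 397, 294, 186, 25, 104, 63, 265, 267, 18],
    "reasoning": [0] * 10,
}
cobuddy = {
    "input":     [453, 4_726, 18_324, 8_181, 540, 498, 508, 561, 3_505, 153],
    "output":    [98, 358, 465, 275, 10, 76, 64, 189, 133, 9],
    "reasoning": [4_666, 30_138, 7_265, 5_591, 49_454, 3_782, 2_842, 5_808, 294, 6_863],
}

# Sum each token type across all ten categories.
trinity_totals = {kind: sum(values) for kind, values in trinity.items()}
cobuddy_totals = {kind: sum(values) for kind, values in cobuddy.items()}

print(trinity_totals)  # {'input': 29828, 'output': 2169, 'reasoning': 0}
print(cobuddy_totals)  # {'input': 37449, 'output': 1677, 'reasoning': 116703}
```

Both sets of sums match the headline Total Input Tokens, Output Tokens, and Reasoning Tokens figures, which suggests the ten categories cover all of the reported runs.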
