#45
Arcee AI · Release: 2026-01-27 · arcee-ai/trinity-large-preview::none
Flaky tests: 1
Flaky tests had mixed outcomes across runs (at least one pass and one fail).
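The definition above can be sketched as a small check. This is a minimal illustration only: the function name `is_flaky` and the representation of each run as a boolean (True for pass, False for fail) are assumptions, not the benchmark's actual implementation.

```python
from typing import Iterable


def is_flaky(outcomes: Iterable[bool]) -> bool:
    """Return True if the runs include at least one pass and at least one fail.

    `outcomes` is a hypothetical per-run record: True = pass, False = fail.
    """
    results = list(outcomes)
    # any(...) means at least one pass; not all(...) means at least one fail.
    return any(results) and not all(results)


if __name__ == "__main__":
    print(is_flaky([True, False, True]))  # mixed outcomes
    print(is_flaky([True, True, True]))   # passed every run
    print(is_flaky([False, False]))       # failed every run
```

Under this sketch, a test that fails on every run counts as a consistent failure, not a flaky one.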
Wrong answer: 9 · Did not follow instructions: 2
Category Breakdown
| Category | Avg Score | Consistency | Tests Correct |
|---|---|---|---|
| Anti-AI Tricks | 10.0 | 10.0 | |
| Combined | 10.0 | 10.0 | |
| Data parsing and extraction | 9.9 | 10.0 | |
| Domain specific | 4.0 | 10.0 | |
| General Intelligence | 3.0 | 9.9 | |
| Instructions following | 3.5 | 6.7 | |
| Puzzle Solving | 4.0 | 10.0 | |
| Tool Calling | 10.0 | 10.0 | |