AI BENCHY Compare
Arcee AI: Trinity Large Preview (free) vs MoonshotAI: Kimi K2.5
Last updated at: 2026-03-03
| Metric | Arcee AI: Trinity Large Preview (free) | MoonshotAI: Kimi K2.5 |
|---|---|---|
| Reasoning effort | none | none |
| Release | 2026-01-27 | 2026-01-27 |
| Free available | Yes | No |
| Rank | #33 | #35 |
| Avg Score | 4.34 | 4.07 |
| Consistency | 9.97 | 8.92 |
| Cost per result | 0.000 | 0.232 |
| Total Cost | $0.000 | $0.010 |
| Tests Correct | 5/14 | 4/14 |
| Attempt pass rate | 35.7% | 35.7% |
| Flaky tests | 0 | 2 |
| Output Tokens | 1,415 | 1,915 |
| Reasoning Tokens | 0 | 0 |
Category Breakdown
| Anti-AI Tricks | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 1.00 | 10.00 | 0.0% | 0 | | 587 | 0 |
| MoonshotAI: Kimi K2.5 | 2.67 | 7.86 | 11.1% | 1 | | 363 | 0 |
| Data parsing and extraction | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 9.88 | 10.00 | 100.0% | 0 | | 186 | 0 |
| MoonshotAI: Kimi K2.5 | 5.50 | 5.81 | 83.3% | 1 | | 995 | 0 |
| Domain specific | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 4.00 | 10.00 | 33.3% | 0 | | 21 | 0 |
| MoonshotAI: Kimi K2.5 | 4.00 | 10.00 | 33.3% | 0 | | 29 | 0 |
| Instructions following | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 2.00 | 9.79 | 0.0% | 0 | | 63 | 0 |
| MoonshotAI: Kimi K2.5 | 5.00 | 9.99 | 50.0% | 0 | | 61 | 0 |
| Puzzle Solving | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 4.00 | 9.99 | 33.3% | 0 | | 291 | 0 |
| MoonshotAI: Kimi K2.5 | 2.00 | 9.92 | 0.0% | 0 | | 247 | 0 |
| Tool Calling | Score | Consistency | Attempt pass rate | Flaky tests | Tests Correct | Output Tokens | Reasoning Tokens |
|---|---|---|---|---|---|---|---|
| Arcee AI: Trinity Large Preview (free) | 10.00 | 10.00 | 100.0% | 0 | | 267 | 0 |
| MoonshotAI: Kimi K2.5 | 10.00 | 10.00 | 100.0% | 0 | | 220 | 0 |
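
The per-category token counts in the six tables above can be cross-checked against the overall Output Tokens figures in the summary table. The following is a minimal sketch of that arithmetic, with the numbers copied by hand from this page. The dictionary layout and the `token_totals` helper are illustrative names, not part of any AI BENCHY tooling.

```python
# Per-category output-token counts copied from the category tables, in page order:
# Anti-AI Tricks, Data parsing and extraction, Domain specific,
# Instructions following, Puzzle Solving, Tool Calling.
category_output_tokens = {
    "Arcee AI: Trinity Large Preview (free)": [587, 186, 21, 63, 291, 267],
    "MoonshotAI: Kimi K2.5": [363, 995, 29, 61, 247, 220],
}

# Overall "Output Tokens" values from the summary table.
reported_totals = {
    "Arcee AI: Trinity Large Preview (free)": 1415,
    "MoonshotAI: Kimi K2.5": 1915,
}


def token_totals(per_category):
    """Sum each model's per-category output tokens (hypothetical helper)."""
    return {model: sum(counts) for model, counts in per_category.items()}


totals = token_totals(category_output_tokens)
for model, total in totals.items():
    status = "matches" if total == reported_totals[model] else "differs from"
    print(f"{model}: {total} {status} reported {reported_totals[model]}")
```

Both sums agree with the summary table, which suggests the 587, 363, and similar figures in the category rows are output-token counts rather than tests-correct counts.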