AI BENCHY Compare
Inception: Mercury 2 vs MoonshotAI: Kimi K2.5
Benchmarks generated from the AI BENCHY test suites on: 2026-06-03
| Metric | Mercury 2 (medium) | Kimi K2.5 (medium) |
|---|---|---|
| Score | 6.5 | 6.7 |
| Rank | #89 | #81 |
| Reliability | 10.0 | 10.0 |
| Consistency | 8.8 | 6.8 |
| Correct tests | | |
| Pass rate per test | 51.7% | 66.7% |
| Flaky tests | 3 | 8 |
| Total runs | 60 | 60 |
| Cost per result | 0.611 | 3.486 |
| Total cost | $0.055 | $0.272 |
| Input price | $0.250 / 1M | $0.400 / 1M |
| Output price | $0.750 / 1M | $1.900 / 1M |
| Total input tokens | 32,570 | 31,717 |
| Output tokens | 4,022 | 48,374 |
| Reasoning tokens | 58,405 | 128,473 |
| Response time (average) | 2.27s | 89.02s |
| Response time (max) | 14.63s | 281.00s |
| Response time (total) | 43.20s | 1157.32s |
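As a rough cross-check of the cost figures in the table, the sketch below recomputes a total cost from the token counts and per-million prices. It assumes, hypothetically, that reasoning tokens are billed at the output price and that no caching or discounts apply; the page does not state how its total cost is derived. Under that assumption the estimate matches the reported total for Mercury 2 but comes out higher than the reported total for Kimi K2.5, so the site's actual calculation may differ. The names `MODELS` and `estimate_cost` are illustrative only.

```python
# Figures copied from the comparison table; prices are USD per 1M tokens.
MODELS = {
    "Mercury 2": {
        "input_tokens": 32_570,
        "output_tokens": 4_022,
        "reasoning_tokens": 58_405,
        "input_price": 0.250,
        "output_price": 0.750,
        "reported_total": 0.055,
    },
    "Kimi K2.5": {
        "input_tokens": 31_717,
        "output_tokens": 48_374,
        "reasoning_tokens": 128_473,
        "input_price": 0.400,
        "output_price": 1.900,
        "reported_total": 0.272,
    },
}


def estimate_cost(m):
    """Estimate total cost in USD.

    Assumption (not confirmed by the source): reasoning tokens are billed
    at the output price, with no caching or discounts.
    """
    billed_output = m["output_tokens"] + m["reasoning_tokens"]
    micro_usd = m["input_tokens"] * m["input_price"] + billed_output * m["output_price"]
    return micro_usd / 1_000_000


for name, m in MODELS.items():
    print(f"{name}: estimated ${estimate_cost(m):.3f}, reported ${m['reported_total']:.3f}")
```

The gap for Kimi K2.5 suggests the assumption does not hold for every model, so treat the estimate as a sanity check rather than a reconstruction of the site's method.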
[Charts: Score vs. total cost; Response time (average); Score vs. response time (average); Total output tokens; Score vs. total output tokens]
Category breakdown
| Data parsing and extraction | Score | Consistency | Pass rate per test | Flaky tests | Correct tests | Response time (average) | Input tokens | Output tokens | Reasoning tokens |
|---|---|---|---|---|---|---|---|---|---|
| Mercury 2 | 7.3 | 5.9 | 83.3% | 1 | | 1.11s | 6,234 | 183 | 1,656 |
| Kimi K2.5 | 10.0 | 10.0 | 100.0% | 0 | | 49.78s | 7,020 | 563 | 7,940 |