AI BENCHY Category
Tool Calling Leaderboard
See which AI models perform best at tool calling, which stay consistent, and where the biggest gaps are. Sort by: Response time (average) ↓.
| Rank | Model | Company | Tool Calling Score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #18 | Qwen3.7 Plus medium | Qwen | 10.0 | 8.2 | 1/1 | 15.0s |
| #36 | Qwen3.5 Plus 2026-04-20 medium | Qwen | 10.0 | 7.6 | 1/1 | 14.7s |
| #135 | Kimi K2.5 none | Moonshot AI | 10.0 | 5.2 | 1/1 | 14.0s |
| #80 | Mimo V2 Omni medium | Xiaomi | 10.0 | 6.7 | 1/1 | 14.0s |
| #65 | Grok 4.20 medium | X AI | 3.0 | 7.1 | 0/1 | 13.7s |
| #21 | GPT-5.4 medium | OpenAI | 10.0 | 8.0 | 1/1 | 13.3s |
| #47 | Grok Build 0.1 medium | X AI | 10.0 | 7.4 | 1/1 | 13.1s |
| #1 | Gemini 3 Flash Preview medium | | 10.0 | 9.8 | 1/1 | 12.6s |
| #59 | GLM 5V Turbo medium | Z.ai | 7.0 | 7.2 | 0/1 | 12.5s |
| #13 | Grok 4.20 Beta medium | X AI | 3.0 | 8.5 | 0/1 | 12.4s |
| #19 | Seed-2.0-Lite medium | Bytedance Seed | 10.0 | 8.2 | 1/1 | 12.4s |
| #130 | MiniMax M2.7 medium | Minimax | 4.7 | 5.3 | 0/1 | 12.0s |
| #35 | Gemini 3 PRO Preview medium | | 10.0 | 7.6 | 1/1 | 12.0s |
| #62 | Step 3.5 Flash medium | Stepfun | 10.0 | 7.2 | 1/1 | 11.9s |
| #67 | MiniMax M3 medium | Minimax | 10.0 | 7.1 | 1/1 | 11.9s |