AI BENCHY Failures
Timeout Failures
See which AI models hit timeouts most often so you can spot reliability risks before choosing one. Sort by: Response time (average) ↑.
| Rank | Model | Company | Timeout count | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #11 | Claude Opus 4.7 medium | Anthropic | 1 | 8.7 | 17/21 | 4.73s |
| #102 | Gemma 4 26B A4B none | | 1 | 6.0 | 8/21 | 5.91s |
| #150 | Qwen3 Coder Next medium | Qwen | 1 | 4.6 | 4/21 | 8.58s |
| #79 | Hunter Alpha medium | OpenRouter | 2 | 6.7 | 8/18 | 10.3s |
| #42 | GPT-5.2 medium | OpenAI | 1 | 7.5 | 13/21 | 16.9s |
| #52 | Claude Sonnet 4.6 medium | Anthropic | 1 | 7.4 | 13/21 | 17.1s |
| #64 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.2 | 12/21 | 20.1s |
| #51 | Mimo V2 PRO medium | Xiaomi | 1 | 7.4 | 12/21 | 22.2s |
| #23 | GLM 5 Turbo medium | Z.ai | 1 | 8.0 | 14/21 | 23.0s |
| #54 | GPT-5 Mini medium | OpenAI | 1 | 7.3 | 12/21 | 23.6s |
| #86 | Grok 4.1 Fast medium | X AI | 1 | 6.5 | 9/19 | 23.8s |
| #105 | Nemotron 3 Super medium | NVIDIA | 1 | 5.8 | 8/21 | 32.0s |
| #17 | GLM 5 medium | Z.ai | 1 | 8.3 | 15/21 | 33.5s |
| #55 | GLM 5.1 medium | Z.ai | 2 | 7.3 | 12/21 | 33.7s |
| #158 | GLM 4.7 Flash medium | Z.ai | 2 | 4.4 | 4/21 | 35.1s |