AI BENCHY category failures
Coding: Timeout
See which AI models are most likely to hit a Timeout in the Coding category, so you can spot weaknesses quickly.
Failure reasons
| Rank | Model | Company | Timeout count | Category score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #23 | Gemma 4 31B medium | | 1 | 3.8 | 0/2 | 110.9s |
| #38 | Qwen3.5-122B-A10B medium | Qwen | 1 | 4.1 | 0/2 | 119.6s |
| #47 | Gemma 4 26B A4B medium | | 1 | 2.9 | 0/2 | 258.4s |
| #51 | GLM 5.1 medium | Z.ai | 1 | 4.7 | 0/2 | 145.6s |
| #56 | Qwen3.5-Flash medium | Qwen | 1 | 4.1 | 0/2 | 54.2s |
| #67 | MiMo-V2-Flash medium | Xiaomi | 1 | 4.1 | 0/2 | 7.20s |
| #71 | DeepSeek V3.2 medium | DeepSeek | 1 | 3.9 | 0/2 | 185.0s |
| #79 | Kimi K2.5 medium | Moonshot AI | 1 | 4.1 | 0/2 | 215.9s |
| #91 | Gemma 4 26B A4B none | | 1 | 4.1 | 0/2 | 3.83s |
| #119 | MiniMax M2.5 medium | Minimax | 1 | 3.5 | 0/2 | 125.8s |
| #141 | Qwen3 Coder Next medium | Qwen | 1 | 4.1 | 0/2 | 1.17s |
| #148 | GLM 4.7 Flash medium | Z.ai | 1 | 3.4 | 0/2 | 55.3s |