Eșecuri pe categorii AI BENCHY
Specific domeniului: Timp expirat
Specific domeniului
Timp expirat
Vezi ce modele AI au cele mai mari șanse să întâmpine Timp expirat la Specific domeniului, ca să găsești mai repede punctele slabe.
Motive de eșec
| Rang | Model | Companie | Număr de Timp expirat | Scor de categorie | Teste corecte | Timp de răspuns (mediu) |
|---|---|---|---|---|---|---|
| #73 | Seed-2.0-Mini medium | Bytedance Seed | 3 | 3.0 | 0/3 | 0ms |
| #161 | Qwen3.5-9B medium | Qwen | 3 | 3.6 | 0/3 | 137.7s |
| #60 | Kimi K2.6 medium | Moonshot AI | 2 | 5.3 | 1/3 | 202.4s |
| #66 | Qwen3.5-35B-A3B medium | Qwen | 2 | 4.1 | 0/3 | 88.3s |
| #67 | MiniMax M3 medium | Minimax | 2 | 5.5 | 1/3 | 233.1s |
| #130 | MiniMax M2.7 medium | Minimax | 2 | 3.0 | 0/3 | 19.0s |
| #11 | Claude Opus 4.7 medium | Anthropic | 1 | 7.7 | 2/3 | 1.17s |
| #17 | GLM 5 medium | Z.ai | 1 | 3.5 | 0/3 | 0ms |
| #23 | GLM 5 Turbo medium | Z.ai | 1 | 2.9 | 0/3 | 71.1s |
| #25 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 1 | 5.3 | 1/3 | 17.5s |
| #30 | Qwen3.5-27B medium | Qwen | 1 | 5.3 | 1/3 | 79.5s |
| #37 | Gemma 4 26B A4B medium | 1 | 2.9 | 0/3 | 23.6s | |
| #42 | GPT-5.2 medium | OpenAI | 1 | 5.9 | 1/3 | 77.8s |
| #49 | Qwen3.5-Flash medium | Qwen | 1 | 5.3 | 1/3 | 146.5s |
| #51 | Mimo V2 PRO medium | Xiaomi | 1 | 5.3 | 1/3 | 8.82s |