AI BENCHY: Failures by Category
Domain-specific: Timeout
See which AI models are most likely to hit a Timeout in the Domain-specific category, so you can find weak spots faster. Sorted by: Tests passed ↓.
| Rank | Model | Company | Timeout count | Category score | Tests passed | Response time (avg) |
|---|---|---|---|---|---|---|
| #3 | Claude Opus 4.7 medium | Anthropic | 1 | 7.7 | 2/3 | 1.17s |
| #8 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 1 | 5.3 | 1/3 | 17.5s |
| #10 | Qwen3.5-27B medium | Qwen | 1 | 5.3 | 1/3 | 79.5s |
| #23 | MiMo-V2-Pro medium | Xiaomi | 1 | 5.3 | 1/3 | 6.00s |
| #27 | DeepSeek V3.2 medium | DeepSeek | 1 | 5.3 | 1/3 | 39.3s |
| #32 | Qwen3.5-Flash medium | Qwen | 1 | 5.3 | 1/3 | 146.5s |
| #33 | GLM 5.1 medium | Z.ai | 1 | 5.3 | 1/3 | 29.8s |
| #34 | Kimi K2.6 medium | Moonshot AI | 2 | 5.3 | 1/3 | 202.4s |
| #40 | GPT-5.2 medium | OpenAI | 1 | 5.9 | 1/3 | 77.8s |
| #52 | Grok 4.1 Fast medium | X AI | 1 | 5.8 | 1/3 | 121.8s |
| #57 | GPT-5 Nano medium | OpenAI | 1 | 5.2 | 1/3 | 204.0s |
| #13 | GLM 5 medium | Z.ai | 1 | 3.5 | 0/3 | 0ms |
| #18 | GLM 5 Turbo medium | Z.ai | 1 | 2.9 | 0/3 | 71.1s |
| #24 | Gemma 4 26B A4B medium | | 1 | 2.9 | 0/3 | 23.6s |
| #26 | Claude Sonnet 4.6 medium | Anthropic | 1 | 2.9 | 0/3 | 0ms |