AI BENCHY Category Failures
Domain specific
Timed out
See which AI models most often time out on Domain specific tests, so you can spot weak points faster.
| Rank | Model | Company | Timed out Count | Category Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 3 | 10.0 | 0/3 | 0ms |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 2 | 10.0 | 0/3 | 88.3s |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 1 | 4.0 | 1/3 | 17.5s |
| #7 | Qwen3.5-27B medium | Qwen | 1 | 4.0 | 1/3 | 79.5s |
| #11 | Claude Sonnet 4.6 medium | Anthropic | 1 | 10.0 | 0/3 | 0ms |
| #14 | GLM 5 medium | Z.ai | 1 | 10.0 | 0/3 | 0ms |
| #18 | DeepSeek V3.2 medium | DeepSeek | 1 | 4.0 | 1/3 | 39.3s |
| #24 | Qwen3.5-Flash medium | Qwen | 1 | 4.0 | 1/3 | 146.5s |
| #27 | GPT-5.2 medium | OpenAI | 1 | 4.0 | 1/3 | 77.8s |
| #28 | Kimi K2.5 medium | Moonshot AI | 1 | 10.0 | 0/3 | 137.3s |
| #30 | Grok 4.1 Fast medium | X AI | 1 | 4.0 | 1/3 | 121.8s |
| #32 | GPT-5 Mini medium | OpenAI | 1 | 10.0 | 0/3 | 44.6s |
| #34 | GPT-5 Nano medium | OpenAI | 1 | 4.0 | 1/3 | 204.0s |
| #43 | MiniMax M2.5 medium | Minimax | 1 | 10.0 | 0/3 | 237.3s |