AI BENCHY Category Failures
Domain specific
Timed out
See which AI models most often time out on Domain specific tests, so you can spot weak points faster. Rows are sorted by average response time, ascending.
| Rank | Model | Company | Timed out Count | Category Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #11 | Claude Sonnet 4.6 medium | Anthropic | 1 | 10.0 | 0/3 | 0ms |
| #14 | GLM 5 medium | Z.ai | 1 | 10.0 | 0/3 | 0ms |
| #23 | Seed-2.0-Mini medium | Bytedance Seed | 3 | 10.0 | 0/3 | 0ms |
| #4 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 1 | 4.0 | 1/3 | 17.5s |
| #18 | DeepSeek V3.2 medium | DeepSeek | 1 | 4.0 | 1/3 | 39.3s |
| #32 | GPT-5 Mini medium | OpenAI | 1 | 10.0 | 0/3 | 44.6s |
| #27 | GPT-5.2 medium | OpenAI | 1 | 4.0 | 1/3 | 77.8s |
| #7 | Qwen3.5-27B medium | Qwen | 1 | 4.0 | 1/3 | 79.5s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 2 | 10.0 | 0/3 | 88.3s |
| #30 | Grok 4.1 Fast medium | X AI | 1 | 4.0 | 1/3 | 121.8s |
| #28 | Kimi K2.5 medium | Moonshot AI | 1 | 10.0 | 0/3 | 137.3s |
| #24 | Qwen3.5-Flash medium | Qwen | 1 | 4.0 | 1/3 | 146.5s |
| #34 | GPT-5 Nano medium | OpenAI | 1 | 4.0 | 1/3 | 204.0s |
| #43 | MiniMax M2.5 medium | Minimax | 1 | 10.0 | 0/3 | 237.3s |
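The Response Time (avg) column mixes units (`0ms` alongside values such as `17.5s`), so reproducing the ascending sort outside the page requires converting every value to one unit first. A minimal sketch follows, assuming only the two suffixes seen in the table (`ms` and `s`); the helper name `parse_duration_seconds` is hypothetical and not part of AI BENCHY. The table does not say whether `0ms` means an unrecorded response time, so the sketch treats it as a literal zero.

```python
import re

def parse_duration_seconds(text: str) -> float:
    """Convert a string such as '0ms' or '17.5s' to seconds.

    Assumption: only the 'ms' and 's' suffixes occur, as in the table above.
    """
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value

# A few (model, response time) pairs copied from the table, deliberately unordered.
rows = [
    ("GPT-5.2 medium", "77.8s"),
    ("Claude Sonnet 4.6 medium", "0ms"),
    ("Qwen3.5 Plus 2026-02-15 medium", "17.5s"),
]

# Sort ascending by the normalized duration, matching the page's sort order.
sorted_rows = sorted(rows, key=lambda row: parse_duration_seconds(row[1]))

for model, duration in sorted_rows:
    print(f"{model}: {parse_duration_seconds(duration):.1f}s")
```

Sorting on the parsed value rather than the raw string avoids lexicographic surprises, such as `"137.3s"` ordering before `"17.5s"`.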