AI BENCHY


Timed out Failures

See which AI models time out most often, so you can spot reliability risks before choosing one. Sorted by average response time, slowest first.

Models Shown: 15
Total Failures: 61
Most Affected Model: Qwen3.5-9B (11 timeouts)

| Rank | Model | Company | Timed out Count | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|-----------------|-------|---------------|---------------------|
| #97 | Qwen3.5-9B medium | Qwen | 11 | 4.4 | 3/18 | 73.6s |
| #46 | Kimi K2.5 medium | Moonshot AI | 2 | 7.0 | 9/18 | 72.4s |
| #39 | Seed-2.0-Mini medium | Bytedance Seed | 4 | 7.5 | 11/18 | 69.7s |
| #32 | Qwen3.5-Flash medium | Qwen | 4 | 7.8 | 11/18 | 66.7s |
| #10 | Qwen3.5-27B medium | Qwen | 1 | 8.4 | 13/18 | 53.0s |
| #8 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 2 | 8.5 | 14/18 | 46.6s |
| #27 | DeepSeek V3.2 medium | DeepSeek | 2 | 8.0 | 12/18 | 46.4s |
| #34 | Kimi K2.6 medium | Moonshot AI | 2 | 7.7 | 11/18 | 45.2s |
| #43 | Qwen3.5-35B-A3B medium | Qwen | 4 | 7.4 | 10/18 | 44.5s |
| #57 | GPT-5 Nano medium | OpenAI | 1 | 6.3 | 7/18 | 44.1s |
| #71 | MiniMax M2.5 medium | Minimax | 4 | 5.7 | 5/18 | 39.6s |
| #93 | GLM 4.7 Flash medium | Z.ai | 1 | 4.6 | 4/18 | 32.3s |
| #19 | Qwen3.5-122B-A10B medium | Qwen | 2 | 8.1 | 13/18 | 31.4s |
| #80 | MiniMax M2.7 medium | Minimax | 2 | 5.3 | 4/18 | 31.1s |
| #24 | Gemma 4 26B A4B medium | Google | 2 | 8.0 | 13/18 | 25.0s |
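
The rows above can be re-sorted or summarized offline. The following is a minimal Python sketch, not part of the site: the `ROWS` list is hand-copied from the table, and the helper names (`most_affected`, `sort_by_response_time`, `sort_by_timeouts`) are hypothetical. It reproduces the default ordering by average response time and offers an alternative ordering by timed out count.

```python
# Rows hand-copied from the table: (model, timed_out_count, avg_response_time_s).
# The tuple layout and all helper names below are illustrative assumptions.
ROWS = [
    ("Qwen3.5-9B medium", 11, 73.6),
    ("Kimi K2.5 medium", 2, 72.4),
    ("Seed-2.0-Mini medium", 4, 69.7),
    ("Qwen3.5-Flash medium", 4, 66.7),
    ("Qwen3.5-27B medium", 1, 53.0),
    ("Qwen3.5 Plus 2026-02-15 medium", 2, 46.6),
    ("DeepSeek V3.2 medium", 2, 46.4),
    ("Kimi K2.6 medium", 2, 45.2),
    ("Qwen3.5-35B-A3B medium", 4, 44.5),
    ("GPT-5 Nano medium", 1, 44.1),
    ("MiniMax M2.5 medium", 4, 39.6),
    ("GLM 4.7 Flash medium", 1, 32.3),
    ("Qwen3.5-122B-A10B medium", 2, 31.4),
    ("MiniMax M2.7 medium", 2, 31.1),
    ("Gemma 4 26B A4B medium", 2, 25.0),
]


def most_affected(rows):
    """Return the row with the highest timed out count."""
    return max(rows, key=lambda r: r[1])


def sort_by_response_time(rows):
    """Slowest average response time first (the page's default order)."""
    return sorted(rows, key=lambda r: r[2], reverse=True)


def sort_by_timeouts(rows):
    """Most timeouts first; ties go to the slower average response time."""
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)


if __name__ == "__main__":
    model, count, _ = most_affected(ROWS)
    print(f"Most affected: {model} ({count} timeouts)")
    print(f"Timeouts across listed rows: {sum(r[1] for r in ROWS)}")
    for model, count, seconds in sort_by_timeouts(ROWS)[:5]:
        print(f"{model:32} {count:3d} {seconds:6.1f}s")
```

The listed rows sum to 44 timeouts, fewer than the 61 reported as Total Failures; presumably the total covers models that are not among the 15 shown, but the page does not say so.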

[Charts: Top Models by Timed out Count; Timed out Count vs Score; Top Models by Response Time (avg)]