AI BENCHY

AI BENCHY Failures

Timed out Failures

See which AI models time out most often, so you can spot reliability risks before choosing one. Sorted by average response time, slowest first.

Models Shown: 15
Total Failures: 61
Most Affected Model: Qwen3.5-9B (11)

Rank  Model                     Company     Timed out Count  Score  Tests Correct  Response Time (avg)
#14   Gemma 4 31B medium        Google      1                8.3    13/18          24.9s
#33   GLM 5.1 medium            Z.ai        2                7.8    12/18          24.1s
#45   GPT-5 Mini medium         OpenAI      1                7.0    9/18           24.0s
#52   Grok 4.1 Fast medium      X AI        1                6.7    9/18           23.9s
#41   MiMo-V2-Flash medium      Xiaomi      1                7.5    11/18          23.4s
#13   GLM 5 medium              Z.ai        1                8.4    13/18          23.3s
#51   Nemotron 3 Super medium   NVIDIA      1                6.7    9/18           19.1s
#18   GLM 5 Turbo medium        Z.ai        1                8.1    12/18          17.7s
#40   GPT-5.2 medium            OpenAI      1                7.5    11/18          14.0s
#26   Claude Sonnet 4.6 medium  Anthropic   1                8.0    13/18          12.7s
#23   MiMo-V2-Pro medium        Xiaomi      1                8.1    12/18          12.3s
#92   Qwen3 Coder Next medium   Qwen        1                4.7    3/18           10.8s
#50   Hunter Alpha medium       OpenRouter  2                6.7    8/18           10.3s
#60   Gemma 4 26B A4B none      Google      1                6.2    7/18           6.59s
#3    Claude Opus 4.7 medium    Anthropic   1                9.2    16/18          3.53s
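
The table is ordered by average response time, but the same rows can be re-ranked by timed-out count to surface the riskiest models first. The following is a minimal sketch, not part of the site: the Row type, the field names, and the by_timeout_risk helper are hypothetical names chosen for illustration, and the data is copied by hand from the rows above.

```python
from typing import NamedTuple


class Row(NamedTuple):
    # Hypothetical record type; field names are illustrative, not from the site.
    rank: int
    model: str
    company: str
    timed_out: int
    score: float
    correct: int
    total: int
    avg_seconds: float


# Rows copied by hand from the table above.
ROWS = [
    Row(14, "Gemma 4 31B medium", "Google", 1, 8.3, 13, 18, 24.9),
    Row(33, "GLM 5.1 medium", "Z.ai", 2, 7.8, 12, 18, 24.1),
    Row(45, "GPT-5 Mini medium", "OpenAI", 1, 7.0, 9, 18, 24.0),
    Row(52, "Grok 4.1 Fast medium", "X AI", 1, 6.7, 9, 18, 23.9),
    Row(41, "MiMo-V2-Flash medium", "Xiaomi", 1, 7.5, 11, 18, 23.4),
    Row(13, "GLM 5 medium", "Z.ai", 1, 8.4, 13, 18, 23.3),
    Row(51, "Nemotron 3 Super medium", "NVIDIA", 1, 6.7, 9, 18, 19.1),
    Row(18, "GLM 5 Turbo medium", "Z.ai", 1, 8.1, 12, 18, 17.7),
    Row(40, "GPT-5.2 medium", "OpenAI", 1, 7.5, 11, 18, 14.0),
    Row(26, "Claude Sonnet 4.6 medium", "Anthropic", 1, 8.0, 13, 18, 12.7),
    Row(23, "MiMo-V2-Pro medium", "Xiaomi", 1, 8.1, 12, 18, 12.3),
    Row(92, "Qwen3 Coder Next medium", "Qwen", 1, 4.7, 3, 18, 10.8),
    Row(50, "Hunter Alpha medium", "OpenRouter", 2, 6.7, 8, 18, 10.3),
    Row(60, "Gemma 4 26B A4B none", "Google", 1, 6.2, 7, 18, 6.59),
    Row(3, "Claude Opus 4.7 medium", "Anthropic", 1, 9.2, 16, 18, 3.53),
]


def by_timeout_risk(rows):
    """Sort by timed-out count (highest first), then by average response time (slowest first)."""
    return sorted(rows, key=lambda r: (-r.timed_out, -r.avg_seconds))


ranked = by_timeout_risk(ROWS)
for r in ranked[:3]:
    print(f"#{r.rank} {r.model} ({r.company}): {r.timed_out} timed out, {r.avg_seconds}s avg")
```

Among the models shown, only GLM 5.1 and Hunter Alpha have two timeouts, so they move to the top under this ordering; every other row has one.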

Charts: Top Models by Timed out Count; Timed out Count vs Score; Top Models by Response Time (avg)