AI BENCHY

Timed out Failures

See which AI models time out most often, so you can spot reliability risks before choosing one.

Models Shown: 15

Total Failures: 73

Most Affected Model: Qwen3.5-9B (11)

| Rank | Model | Company | Timed out Count | Score | Tests Correct | Response Time (avg) |
|------|-------|---------|-----------------|-------|---------------|---------------------|
| #79 | Hunter Alpha medium | OpenRouter | 2 | 6.7 | 8/18 | 10.3s |
| #130 | MiniMax M2.7 medium | Minimax | 2 | 5.3 | 5/21 | 38.2s |
| #158 | GLM 4.7 Flash medium | Z.ai | 2 | 4.4 | 4/21 | 35.1s |
| #11 | Claude Opus 4.7 medium | Anthropic | 1 | 8.7 | 17/21 | 4.73s |
| #17 | GLM 5 medium | Z.ai | 1 | 8.3 | 15/21 | 33.5s |
| #18 | Qwen3.7 Plus medium | Qwen | 1 | 8.2 | 15/21 | 38.9s |
| #23 | GLM 5 Turbo medium | Z.ai | 1 | 8.0 | 14/21 | 23.0s |
| #30 | Qwen3.5-27B medium | Qwen | 1 | 7.8 | 13/21 | 68.4s |
| #42 | GPT-5.2 medium | OpenAI | 1 | 7.5 | 13/21 | 16.9s |
| #51 | Mimo V2 PRO medium | Xiaomi | 1 | 7.4 | 12/21 | 22.2s |
| #52 | Claude Sonnet 4.6 medium | Anthropic | 1 | 7.4 | 13/21 | 17.1s |
| #54 | GPT-5 Mini medium | OpenAI | 1 | 7.3 | 12/21 | 23.6s |
| #62 | Step 3.5 Flash medium | Stepfun | 1 | 7.2 | 11/20 | 72.5s |
| #64 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.2 | 12/21 | 20.1s |
| #86 | Grok 4.1 Fast medium | X AI | 1 | 6.5 | 9/19 | 23.8s |
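
Raw timeout counts are hard to compare when models were run on different numbers of tests (18 to 21 in the rows above). The sketch below shows one possible way to normalize them. It assumes that the denominator of "Tests Correct" is the number of tests attempted and that timed-out tests are included in it; the page does not confirm either point. The `Row` class and both helper functions are hypothetical names introduced for illustration, and only three rows from the table are included.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """One leaderboard row (hypothetical structure, not the site's own schema)."""
    model: str
    timed_out: int  # "Timed out Count" column
    correct: int    # numerator of "Tests Correct"
    total: int      # denominator of "Tests Correct" (assumed: tests attempted)


def timeout_rate(row: Row) -> float:
    """Share of attempted tests that timed out, under the assumption above."""
    return row.timed_out / row.total


def rank_by_timeout_rate(rows: list[Row]) -> list[Row]:
    """Return rows sorted from highest to lowest timeout rate."""
    return sorted(rows, key=timeout_rate, reverse=True)


# Three rows copied from the table; the remaining rows follow the same shape.
ROWS = [
    Row("Hunter Alpha", timed_out=2, correct=8, total=18),
    Row("MiniMax M2.7", timed_out=2, correct=5, total=21),
    Row("Claude Opus 4.7", timed_out=1, correct=17, total=21),
]

for row in rank_by_timeout_rate(ROWS):
    print(f"{row.model}: {timeout_rate(row):.1%} timed out, "
          f"{row.correct}/{row.total} correct")
```

Under these assumptions, two models with the same timeout count can still rank differently once the number of tests is taken into account.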

Charts: Top Models by Timed out Count; Timed out Count vs Score; Top Models by Response Time (avg)