AI BENCHY
❤️ Made by XCS

AI BENCHY Failures

No answer Failures

See which AI models most often fail with No answer, so you can spot reliability risks before choosing one. Results are sorted by Avg Score, highest first.

Models Shown: 6

Total Failures: 7

Most Affected Model: GLM 5 (1)

Rank  Model                    Company      No answer Count  Avg Score  Tests Correct  Response Time (avg)
#14   GLM 5 medium             Z.ai         1                7.4        11/16          16.2s
#27   GPT-5.2 medium           OpenAI       1                6.5        10/16          15.3s
#28   Kimi K2.5 medium         Moonshot AI  1                6.4        9/16           69.8s
#30   Grok 4.1 Fast medium     X AI         1                6.2        9/16           26.3s
#35   Qwen3.5-35B-A3B medium   Qwen         1                5.5        8/16           43.9s
#52   GLM 4.7 Flash medium     Z.ai         2                3.1        4/16           36.8s
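
The raw counts in the table are easier to compare as rates. The sketch below is illustrative only: it copies the No answer counts from the table and assumes each count is out of the same 16 tests that appear as the denominator of Tests Correct, which the page does not state explicitly. The names `ROWS` and `no_answer_rate` are hypothetical and not part of AI BENCHY.

```python
# Rows copied from the table: (model, no_answer_count, tests_total).
# ASSUMPTION: tests_total is the 16 from the "Tests Correct" column.
ROWS = [
    ("GLM 5 medium", 1, 16),
    ("GPT-5.2 medium", 1, 16),
    ("Kimi K2.5 medium", 1, 16),
    ("Grok 4.1 Fast medium", 1, 16),
    ("Qwen3.5-35B-A3B medium", 1, 16),
    ("GLM 4.7 Flash medium", 2, 16),
]


def no_answer_rate(count: int, total: int) -> float:
    """Share of tests that ended in No answer (hypothetical helper)."""
    return count / total


# Sum of the No answer Count column.
total_failures = sum(count for _, count, _ in ROWS)

# Per-model No answer rate, keyed by model name.
rates = {model: no_answer_rate(count, total) for model, count, total in ROWS}

if __name__ == "__main__":
    print(f"Total failures: {total_failures}")
    for model, rate in rates.items():
        print(f"{model}: {rate:.2%}")
```

Under that assumption, a count of 1 corresponds to a 6.25% No answer rate and a count of 2 to 12.5%.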

[Chart: Top Models by No answer Count]

[Chart: No answer Count vs Avg Score]

[Chart: Top Models by Response Time (avg)]