AI BENCHY
❤️ Made by XCS

API error Failures

See which AI models run into API errors most often, so you can spot reliability risks before choosing one. Models are sorted by Avg Score, highest first.

Models Shown: 5
Total Failures: 8
Most Affected Model: MiMo-V2-Flash 1

Rank  Model                   Company  API error Count  Avg Score  Tests Correct  Response Time (avg)
#21   MiMo-V2-Flash medium    Xiaomi   1                7.2        11/16          25.3s
#24   Qwen3.5-Flash medium    Qwen     1                6.9        10/16          70.8s
#35   Qwen3.5-35B-A3B medium  Qwen     1                5.5        8/16           43.9s
#54   MiMo-V2-Flash none      Xiaomi   1                2.9        3/16           2.97s
#55   LFM2-24B-A2B none       Liquid   4                2.6        1/16           811ms
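
As a cross-check, the "Models Shown" and "Total Failures" figures can be recomputed from the table rows, and the stated sort order can be verified. The sketch below is illustrative only: the tuple layout and the variable names are assumptions made for this example and are not part of AI BENCHY.

```python
# Rows transcribed from the table above.
# Assumed layout: (rank, model, company, api_error_count, avg_score)
rows = [
    (21, "MiMo-V2-Flash medium", "Xiaomi", 1, 7.2),
    (24, "Qwen3.5-Flash medium", "Qwen", 1, 6.9),
    (35, "Qwen3.5-35B-A3B medium", "Qwen", 1, 5.5),
    (54, "MiMo-V2-Flash none", "Xiaomi", 1, 2.9),
    (55, "LFM2-24B-A2B none", "Liquid", 4, 2.6),
]

# Number of listed models and the sum of their API error counts.
models_shown = len(rows)
total_failures = sum(row[3] for row in rows)

# Re-sort by Avg Score, highest first, to compare with the listed order.
sorted_by_score = sorted(rows, key=lambda row: row[4], reverse=True)

print(models_shown, total_failures)
print(sorted_by_score == rows)
```

Both summary values agree with the figures shown on the page (5 models, 8 failures), and the listed order matches a descending sort on Avg Score.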

Charts (not reproduced here): Top Models by API error Count; API error Count vs Avg Score; Top Models by Response Time (avg).