AI BENCHY
❤️ Made by XCS

AI BENCHY Failures

API error Failures

See which AI models run into API errors most often, so you can spot reliability risks before choosing one. Rows are sorted by Avg Score, ascending.

Models Shown: 5
Total Failures: 8
Most Affected Model: LFM2-24B-A2B (4)

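| Rank | Model | Company | API error Count | Avg Score | Tests Correct | Response Time (avg) |
|------|-------|---------|-----------------|-----------|---------------|---------------------|
| #55 | LFM2-24B-A2B none | Liquid | 4 | 2.6 | 1/16 | 811ms |
| #54 | MiMo-V2-Flash none | Xiaomi | 1 | 2.9 | 3/16 | 2.97s |
| #35 | Qwen3.5-35B-A3B medium | Qwen | 1 | 5.5 | 8/16 | 43.9s |
| #24 | Qwen3.5-Flash medium | Qwen | 1 | 6.9 | 10/16 | 70.8s |
| #21 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.2 | 11/16 | 25.3s |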
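
The summary figures (models shown, total failures, most affected model) follow directly from the table rows. As a minimal sketch, assuming the rows are copied by hand into a small Python structure, they can be recomputed as below. The `Row` class, the field names, and the `summarize` and `sort_by_avg_score` helpers are hypothetical names introduced for illustration and are not part of AI BENCHY. The `variant` field simply holds the token that follows each model name on the page ("none" or "medium"), and the sketch makes no claim about what that token means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    rank: int
    model: str
    variant: str      # token printed after the model name on the page (assumed to be a variant label)
    company: str
    api_errors: int
    avg_score: float


# Values copied from the table; nothing here is fetched from the site.
ROWS = [
    Row(55, "LFM2-24B-A2B", "none", "Liquid", 4, 2.6),
    Row(54, "MiMo-V2-Flash", "none", "Xiaomi", 1, 2.9),
    Row(35, "Qwen3.5-35B-A3B", "medium", "Qwen", 1, 5.5),
    Row(24, "Qwen3.5-Flash", "medium", "Qwen", 1, 6.9),
    Row(21, "MiMo-V2-Flash", "medium", "Xiaomi", 1, 7.2),
]


def summarize(rows):
    """Recompute the three summary figures from the table rows."""
    worst = max(rows, key=lambda r: r.api_errors)
    return {
        "models_shown": len(rows),
        "total_failures": sum(r.api_errors for r in rows),
        "most_affected": (worst.model, worst.api_errors),
    }


def sort_by_avg_score(rows):
    """Return rows ordered by average score, lowest first."""
    return sorted(rows, key=lambda r: r.avg_score)


if __name__ == "__main__":
    print(summarize(ROWS))
    for row in sort_by_avg_score(ROWS):
        print(f"#{row.rank} {row.model} {row.variant}: {row.api_errors} API error(s)")
```

Summing the API error Count column gives 4 + 1 + 1 + 1 + 1 = 8, which matches the Total Failures figure, and the ascending sort reproduces the row order shown in the table.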

[Chart: Top Models by API error Count]

[Chart: API error Count vs Avg Score]

[Chart: Top Models by Response Time (avg)]