AI BENCHY

AI BENCHY Failures

API error Failures

See which AI models run into API errors most often, so you can spot reliability risks before choosing one. Sorted by failure count, ascending.

Models Shown: 3
Total Failures: 27
Most Affected Model: Gemini 3 PRO Preview 1

Rank  Model                   Company  API error Count  Score  Tests Correct  Response Time (avg)
#73   Mistral Small 4 medium  Mistral  2                5.7    5/18           5.64s
#84   gpt-oss-120b none       OpenAI   3                5.2    4/18           12.0s
#98   LFM2-24B-A2B none       Liquid   4                4.1    1/16           811ms

Charts: Top Models by API error Count; API error Count vs Score; Top Models by Response Time (avg)