AI BENCHY Failures
API error Failures
See which AI models run into API errors most often, so you can spot reliability risks before choosing one. Sorted by Tests Correct, ascending.
| Rank | Model | Company | API error Count | Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #14 | Gemma 4 31B medium | | 2 | 8.3 | 13/18 | 24.9s |
| #20 | Qwen3.6 Plus medium | Qwen | 1 | 8.1 | 13/18 | 15.3s |
| #12 | Gemini 3 PRO Preview medium | | 1 | 8.4 | 14/18 | 9.06s |