AI BENCHY Failures
Failures by API Error
See which AI models hit API errors most often, so you can spot reliability risks before choosing one. Sorted by: average response time (descending).
| Rank | Model | Company | API error count | Score | Tests passed | Response time (avg) |
|---|---|---|---|---|---|---|
| #32 | Qwen3.5-Flash medium | Qwen | 1 | 7.8 | 11/18 | 66.7s |
| #43 | Qwen3.5-35B-A3B medium | Qwen | 1 | 7.4 | 10/18 | 44.5s |
| #14 | Gemma 4 31B medium |  | 2 | 8.3 | 13/18 | 24.9s |
| #33 | GLM 5.1 medium | Z.ai | 1 | 7.8 | 12/18 | 24.1s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.5 | 11/18 | 23.4s |
| #51 | Nemotron 3 Super medium | NVIDIA | 1 | 6.7 | 9/18 | 19.1s |
| #20 | Qwen3.6 Plus medium | Qwen | 1 | 8.1 | 13/18 | 15.3s |
| #84 | gpt-oss-120b none | OpenAI | 3 | 5.2 | 4/18 | 12.0s |
| #50 | Hunter Alpha medium | OpenRouter | 1 | 6.7 | 8/18 | 10.3s |
| #47 | Grok 4.20 medium | X AI | 1 | 7.0 | 9/18 | 10.3s |
| #56 | Grok 4.20 Multi Agent Beta medium | X AI | 2 | 6.4 | 7/18 | 9.80s |
| #12 | Gemini 3 PRO Preview medium |  | 1 | 8.4 | 14/18 | 9.06s |
| #73 | Mistral Small 4 medium | Mistral | 2 | 5.7 | 5/18 | 5.64s |
| #72 | Hunter Alpha none | OpenRouter | 1 | 5.7 | 6/18 | 4.58s |
| #48 | Gemma 4 31B none |  | 2 | 6.9 | 10/18 | 4.02s |
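
The rows shown here are ordered by response time, not by how often each model failed. To compare reliability directly, the same rows can be re-sorted by API error count. The following is a minimal sketch, not part of AI BENCHY itself: the `rows` list is hand-copied from the table, and `rank_by_errors` is a hypothetical helper name chosen for illustration.

```python
# Rows hand-copied from the table: (model, API error count, avg response time in seconds).
rows = [
    ("Qwen3.5-Flash medium", 1, 66.7),
    ("Qwen3.5-35B-A3B medium", 1, 44.5),
    ("Gemma 4 31B medium", 2, 24.9),
    ("GLM 5.1 medium", 1, 24.1),
    ("MiMo-V2-Flash medium", 1, 23.4),
    ("Nemotron 3 Super medium", 1, 19.1),
    ("Qwen3.6 Plus medium", 1, 15.3),
    ("gpt-oss-120b none", 3, 12.0),
    ("Hunter Alpha medium", 1, 10.3),
    ("Grok 4.20 medium", 1, 10.3),
    ("Grok 4.20 Multi Agent Beta medium", 2, 9.80),
    ("Gemini 3 PRO Preview medium", 1, 9.06),
    ("Mistral Small 4 medium", 2, 5.64),
    ("Hunter Alpha none", 1, 4.58),
    ("Gemma 4 31B none", 2, 4.02),
]


def rank_by_errors(rows):
    """Hypothetical helper: most API errors first, slower models first on ties."""
    return sorted(rows, key=lambda r: (-r[1], -r[2]))


ranked = rank_by_errors(rows)

# Print the five entries with the most API errors.
for model, errors, seconds in ranked[:5]:
    print(f"{errors} errors  {seconds:>5.2f}s  {model}")
```

Sorting on a tuple key keeps the tie-break explicit: models with the same error count stay in the table's original slow-to-fast order.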