Fallos AI BENCHY
Fallos por Error de API
Mira qué modelos de IA se encuentran con Error de API con más frecuencia para detectar riesgos de fiabilidad antes de elegir. Ordenar por: Tiempo de respuesta (promedio) ↑.
| Rango | Modelo | Empresa | Cantidad de Error de API | Puntuación | Pruebas correctas | Tiempo de respuesta (promedio) |
|---|---|---|---|---|---|---|
| #99 | Step 3.5 Flash none | Stepfun | 1 | 3.0 | 0/1 | 0ms |
| #98 | LFM2-24B-A2B none | Liquid | 4 | 4.1 | 1/16 | 811ms |
| #94 | MiMo-V2-Flash none | Xiaomi | 1 | 4.5 | 3/18 | 2.79s |
| #48 | Gemma 4 31B none | 2 | 6.9 | 10/18 | 4.02s | |
| #72 | Hunter Alpha none | OpenRouter | 1 | 5.7 | 6/18 | 4.58s |
| #73 | Mistral Small 4 medium | Mistral | 2 | 5.7 | 5/18 | 5.64s |
| #12 | Gemini 3 PRO Preview medium | 1 | 8.4 | 14/18 | 9.06s | |
| #56 | Grok 4.20 Multi Agent Beta medium | X AI | 2 | 6.4 | 7/18 | 9.80s |
| #47 | Grok 4.20 medium | X AI | 1 | 7.0 | 9/18 | 10.3s |
| #50 | Hunter Alpha medium | OpenRouter | 1 | 6.7 | 8/18 | 10.3s |
| #84 | gpt-oss-120b none | OpenAI | 3 | 5.2 | 4/18 | 12.0s |
| #20 | Qwen3.6 Plus medium | Qwen | 1 | 8.1 | 13/18 | 15.3s |
| #51 | Nemotron 3 Super medium | NVIDIA | 1 | 6.7 | 9/18 | 19.1s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.5 | 11/18 | 23.4s |
| #33 | GLM 5.1 medium | Z.ai | 1 | 7.8 | 12/18 | 24.1s |