Fallos AI BENCHY
Fallos por Error de API
Mira qué modelos de IA se encuentran con Error de API con más frecuencia para detectar riesgos de fiabilidad antes de elegir. Ordenar por: Puntuación ↑.
| Rango | Modelo | Empresa | Cantidad de Error de API | Puntuación | Pruebas correctas | Tiempo de respuesta (promedio) |
|---|---|---|---|---|---|---|
| #99 | Step 3.5 Flash none | Stepfun | 1 | 3.0 | 0/1 | 0ms |
| #98 | LFM2-24B-A2B none | Liquid | 4 | 4.1 | 1/16 | 811ms |
| #94 | MiMo-V2-Flash none | Xiaomi | 1 | 4.5 | 3/18 | 2.79s |
| #84 | gpt-oss-120b none | OpenAI | 3 | 5.2 | 4/18 | 12.0s |
| #73 | Mistral Small 4 medium | Mistral | 2 | 5.7 | 5/18 | 5.64s |
| #72 | Hunter Alpha none | OpenRouter | 1 | 5.7 | 6/18 | 4.58s |
| #56 | Grok 4.20 Multi Agent Beta medium | X AI | 2 | 6.4 | 7/18 | 9.80s |
| #51 | Nemotron 3 Super medium | NVIDIA | 1 | 6.7 | 9/18 | 19.1s |
| #50 | Hunter Alpha medium | OpenRouter | 1 | 6.7 | 8/18 | 10.3s |
| #48 | Gemma 4 31B none | 2 | 6.9 | 10/18 | 4.02s | |
| #47 | Grok 4.20 medium | X AI | 1 | 7.0 | 9/18 | 10.3s |
| #43 | Qwen3.5-35B-A3B medium | Qwen | 1 | 7.4 | 10/18 | 44.5s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 1 | 7.5 | 11/18 | 23.4s |
| #33 | GLM 5.1 medium | Z.ai | 1 | 7.8 | 12/18 | 24.1s |
| #32 | Qwen3.5-Flash medium | Qwen | 1 | 7.8 | 11/18 | 66.7s |