AI BENCHY category
Anti-AI Tricks Ranking
See which AI models perform best on Anti-AI Tricks, which ones stay reliable, and where the biggest gaps appear. Sorted by: Tests passed ↓.
Models shown
15
Average Anti-AI Tricks score
6.7
Best model
Gemini 3 Flash Preview 10.0

| Rank | Model | Company | Anti-AI Tricks score | Score | Tests passed | Response time (average) |
|---|---|---|---|---|---|---|
| #60 | Hunter Alpha medium | OpenRouter | 7.3 | 6.7 | 2/4 | 4.75s |
| #65 | Mercury 2 medium | Inception | 6.9 | 6.5 | 2/4 | 1.12s |
| #67 | Grok 4.20 Multi Agent Beta medium | X AI | 6.9 | 6.4 | 2/4 | 3.46s |
| #68 | GPT-5 Nano medium | OpenAI | 6.5 | 6.3 | 2/4 | 25.5s |
| #79 | gpt-oss-120b medium | OpenAI | 6.7 | 5.8 | 2/4 | 10.2s |
| #83 | MiniMax M2.5 medium | MiniMax | 7.9 | 5.7 | 2/4 | 20.8s |
| #90 | Ling 2.6 Flash none | Inclusionai | 6.5 | 5.4 | 2/4 | 12.3s |
| #94 | MiniMax M2.7 medium | MiniMax | 7.9 | 5.3 | 2/4 | 40.3s |
| #95 | Elephant Alpha medium | OpenRouter | 6.6 | 5.2 | 2/4 | 1.19s |
| #98 | gpt-oss-120b none | OpenAI | 6.6 | 5.2 | 2/4 | 6.03s |
| #99 | Elephant Alpha none | OpenRouter | 6.6 | 5.2 | 2/4 | 0.963s |
| #50 | Claude Sonnet 4.6 none | Anthropic | 4.8 | 7.4 | 1/4 | 2.94s |
| #58 | Qwen3.5 Plus 2026-02-15 none | Qwen | 4.8 | 6.8 | 1/4 | 1.91s |
| #62 | DeepSeek V4 Pro none | DeepSeek | 4.8 | 6.7 | 1/4 | 36.1s |
| #64 | GLM 5 none | Z.ai | 4.8 | 6.6 | 1/4 | 2.37s |