AI BENCHY Category
Anti-AI Tricks Ranking
See which AI models perform best on Anti-AI Tricks, which ones stay reliable, and where the biggest gaps appear. Rows are sorted by average response time, slowest first.
| Rank | Model | Company | Anti-AI Tricks Score | Overall Score | Tests Correct | Response Time (avg) |
|---|---|---|---|---|---|---|
| #133 | DeepSeek V3.2 none | DeepSeek | 3.2 | 5.2 | 0/4 | 9.35s |
| #89 | Hy3 preview low | Tencent | 8.3 | 6.4 | 3/4 | 9.32s |
| #38 | Grok 4.3 medium | X AI | 10.0 | 7.6 | 4/4 | 8.83s |
| #150 | Qwen3 Coder Next medium | Qwen | 3.5 | 4.6 | 0/4 | 8.64s |
| #41 | Nemotron 3 Ultra 550b A55b medium | NVIDIA | 10.0 | 7.5 | 4/4 | 8.62s |
| #18 | Qwen3.7 Plus medium | Qwen | 10.0 | 8.2 | 4/4 | 8.58s |
| #55 | GLM 5.1 medium | Z.ai | 10.0 | 7.3 | 4/4 | 8.31s |
| #4 | Gemini 3.1 Pro Preview medium | Google | 10.0 | 9.4 | 4/4 | 7.90s |
| #105 | Nemotron 3 Super medium | NVIDIA | 8.3 | 5.8 | 3/4 | 7.85s |
| #42 | GPT-5.2 medium | OpenAI | 6.5 | 7.5 | 2/4 | 7.81s |
| #69 | Claude Opus 4.6 medium | Anthropic | 6.4 | 7.0 | 2/4 | 7.45s |
| #47 | Grok Build 0.1 medium | X AI | 8.3 | 7.4 | 3/4 | 7.43s |
| #33 | Hy3 preview medium | Tencent | 10.0 | 7.7 | 4/4 | 6.59s |
| #159 | Ling-2.6-1T none | Inclusionai | 3.4 | 4.3 | 0/4 | 6.55s |
| #5 | Qwen3.7 Max medium | Qwen | 10.0 | 9.1 | 4/4 | 6.36s |