AI BENCHY category
Anti-AI Tricks ranking
See which AI models perform best on Anti-AI Tricks, which remain reliable, and where the largest gaps appear.
| Rank | Model | Company | Anti-AI Tricks score | Score | Tests passed | Response time (average) |
|---|---|---|---|---|---|---|
| #33 | GLM 5.1 medium | Z.ai | 10.0 | 7.8 | 4/4 | 8.31s |
| #35 | MiMo-V2-Omni medium | Xiaomi | 10.0 | 7.7 | 4/4 | 2.11s |
| #43 | Qwen3.5-35B-A3B medium | Qwen | 10.0 | 7.4 | 4/4 | 21.1s |
| #51 | Nemotron 3 Super medium | NVIDIA | 10.0 | 6.7 | 4/4 | 10.1s |
| #17 | Gemini 3.1 Flash Lite Preview medium | Google | 9.1 | 8.2 | 3/4 | 2.33s |
| #7 | GPT-5.3-Codex medium | OpenAI | 8.7 | 8.6 | 3/4 | 4.16s |
| #10 | Qwen3.5-27B medium | Qwen | 8.7 | 8.4 | 3/4 | 19.8s |
| #25 | Grok 4.20 Beta medium | X AI | 8.7 | 8.0 | 3/4 | 3.16s |
| #28 | GPT-5.2 Chat none | OpenAI | 8.7 | 7.9 | 3/4 | 3.40s |
| #52 | Grok 4.1 Fast medium | X AI | 8.7 | 6.7 | 3/4 | 3.81s |
| #44 | GPT-5.4 Mini medium | OpenAI | 8.6 | 7.3 | 3/4 | 4.05s |
| #15 | Gemini 2.5 Flash medium | Google | 8.4 | 8.2 | 3/4 | 6.30s |
| #27 | DeepSeek V3.2 medium | DeepSeek | 8.4 | 8.0 | 3/4 | 30.7s |
| #3 | Claude Opus 4.7 medium | Anthropic | 8.3 | 9.2 | 3/4 | 1.85s |
| #4 | Claude Opus 4.7 none | Anthropic | 8.3 | 9.2 | 3/4 | 2.12s |