AI BENCHY Category
Domain-Specific Rankings
See which AI models perform best on domain-specific tasks, which remain reliable, and where the biggest gaps appear.
Models shown
15
Average domain-specific score
4.8
Top model
Gemini 3 Flash Preview 10.0

| Rank | Model | Company | Domain-specific score | Score | Tests passed | Avg. response time |
|---|---|---|---|---|---|---|
| #50 | Gemini 3.1 Flash Lite Preview low | | 5.3 | 7.4 | 1/3 | 2.36s |
| #51 | Mimo V2 PRO medium | Xiaomi | 5.3 | 7.4 | 1/3 | 8.82s |
| #55 | GLM 5.1 medium | Z.ai | 5.3 | 7.3 | 1/3 | 29.8s |
| #56 | MiMo-V2.5 medium | Xiaomi | 5.3 | 7.3 | 1/3 | 34.5s |
| #58 | Gemini 3.1 Flash Lite Preview none | | 5.3 | 7.2 | 1/3 | 942ms |
| #61 | Gemini 3.1 Flash Lite low | | 5.3 | 7.2 | 1/3 | 1.52s |
| #65 | Grok 4.20 medium | X AI | 5.3 | 7.1 | 1/3 | 27.0s |
| #95 | Qwen3.5 Plus 2026-02-15 none | Qwen | 5.3 | 6.3 | 1/3 | 1.17s |
| #101 | Mimo V2 Omni none | Xiaomi | 5.3 | 6.0 | 1/3 | 2.10s |
| #104 | Nemotron 3 Ultra 550b A55b none | NVIDIA | 5.3 | 6.0 | 1/3 | 698ms |
| #109 | GLM 5V Turbo none | Z.ai | 5.3 | 5.8 | 1/3 | 2.09s |
| #111 | Owl Alpha medium | OpenRouter | 5.3 | 5.7 | 1/3 | 8.58s |
| #113 | DeepSeek V4 Pro none | DeepSeek | 5.3 | 5.7 | 1/3 | 3.17s |
| #114 | Qwen3.5 Plus 2026-04-20 none | Qwen | 5.3 | 5.7 | 1/3 | 4.43s |
| #116 | Hunter Alpha none | OpenRouter | 5.3 | 5.7 | 1/3 | 2.33s |