AI BENCHY Category
Puzzle Solving Rankings
See which AI models perform best at puzzle solving, which remain reliable, and where the biggest gaps appear. Sorted by: Tests correct ↓.
Models shown: 15
Average puzzle-solving score: 6.7
Best model: Gemini 3 Flash Preview (10.0)

| Rank | Model | Company | Puzzle-solving score | Score | Tests correct | Response time (avg) |
|---|---|---|---|---|---|---|
| #20 | Gemini 3.5 Flash none | | 10.0 | 8.1 | 3/3 | 3.13s |
| #25 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 10.0 | 7.9 | 3/3 | 32.5s |
| #26 | Qwen3.6 Plus medium | Qwen | 10.0 | 7.9 | 3/3 | 6.34s |
| #27 | Gemma 4 31B medium | | 9.9 | 7.8 | 3/3 | 26.9s |
| #29 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.8 | 3/3 | 17.9s |
| #32 | Gemini 3.5 Flash minimal | | 10.0 | 7.7 | 3/3 | 1.45s |
| #34 | Qwen3.7 Max none | Qwen | 10.0 | 7.7 | 3/3 | 1.13s |
| #35 | Gemini 3 PRO Preview medium | | 10.0 | 7.6 | 3/3 | 3.88s |
| #37 | Gemma 4 26B A4B medium | | 10.0 | 7.6 | 3/3 | 5.79s |
| #50 | Gemini 3.1 Flash Lite Preview low | | 10.0 | 7.4 | 3/3 | 1.69s |
| #52 | Claude Sonnet 4.6 medium | Anthropic | 10.0 | 7.4 | 3/3 | 5.31s |
| #58 | Gemini 3.1 Flash Lite Preview none | | 10.0 | 7.2 | 3/3 | 900ms |
| #61 | Gemini 3.1 Flash Lite low | | 10.0 | 7.2 | 3/3 | 1.40s |
| #63 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.2 | 3/3 | 2.99s |
| #74 | Qwen3.6 Max Preview none | Qwen | 10.0 | 6.9 | 3/3 | 2.65s |