AI BENCHY Category
Puzzle Solving Leaderboard
See which AI models perform best at Puzzle Solving, which remain reliable, and where the biggest differences appear. Sorted by: Response time (average), ascending.
Models shown: 15
Average Puzzle Solving score: 6.7
Best model: Mistral Small 4 3.1

| Rank | Model | Company | Puzzle Solving score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #77 | Claude Sonnet 4.6 none | Anthropic | 7.7 | 6.8 | 2/3 | 2.53s |
| #74 | Qwen3.6 Max Preview none | Qwen | 10.0 | 6.9 | 3/3 | 2.65s |
| #134 | GLM 5 Turbo none | Z.ai | 5.5 | 5.2 | 1/3 | 2.65s |
| #95 | Qwen3.5 Plus 2026-02-15 none | Qwen | 7.7 | 6.3 | 2/3 | 2.71s |
| #68 | Claude Opus 4.8 none | Anthropic | 7.7 | 7.0 | 2/3 | 2.74s |
| #110 | Seed-2.0-Lite none | Bytedance Seed | 5.3 | 5.8 | 1/3 | 2.78s |
| #63 | GPT-5.3 Chat none | OpenAI | 10.0 | 7.2 | 3/3 | 2.99s |
| #20 | Gemini 3.5 Flash none | Google | 10.0 | 8.1 | 3/3 | 3.13s |
| #105 | Nemotron 3 Super medium | NVIDIA | 3.0 | 5.8 | 0/3 | 3.15s |
| #28 | Gemini 2.5 Flash medium | Google | 7.7 | 7.8 | 2/3 | 3.18s |
| #2 | Gemini 3.5 Flash high | Google | 10.0 | 9.6 | 3/3 | 3.23s |
| #111 | Owl Alpha medium | OpenRouter | 5.3 | 5.7 | 1/3 | 3.40s |
| #13 | Grok 4.20 Beta medium | X AI | 10.0 | 8.5 | 3/3 | 3.52s |
| #41 | Nemotron 3 Ultra 550b A55b medium | NVIDIA | 5.5 | 7.5 | 1/3 | 3.54s |
| #116 | Hunter Alpha none | OpenRouter | 5.8 | 5.7 | 1/3 | 3.71s |