AI BENCHY Category
Domain-Specific Leaderboard
See which AI models perform best on Domain-Specific tests, which ones stay reliable, and where the largest gaps appear.
| Rank | Model | Company | Domain-Specific Score | Score | Tests Passed | Response Time (avg) |
|---|---|---|---|---|---|---|
| #155 | Mercury 2 none | Inception | 5.3 | 4.5 | 1/3 | 534ms |
| #6 | GPT-5.5 low | OpenAI | 5.3 | 9.0 | 1/3 | 28.1s |
| #10 | Claude Opus 4.8 medium | Anthropic | 5.3 | 8.7 | 1/3 | 14.2s |
| #12 | Gemini 3.1 Flash Lite Preview high | | 5.3 | 8.6 | 1/3 | 127.6s |
| #13 | Grok 4.20 Beta medium | X AI | 5.3 | 8.5 | 1/3 | 21.3s |
| #24 | GPT-5.2 Chat none | OpenAI | 5.3 | 7.9 | 1/3 | 17.8s |
| #25 | Qwen3.5 Plus 2026-02-15 medium | Qwen | 5.3 | 7.9 | 1/3 | 17.5s |
| #30 | Qwen3.5-27B medium | Qwen | 5.3 | 7.8 | 1/3 | 79.5s |
| #33 | Hy3 preview medium | Tencent | 5.3 | 7.7 | 1/3 | 22.3s |
| #35 | Gemini 3 PRO Preview medium | | 5.3 | 7.6 | 1/3 | 7.01s |
| #43 | MiMo-V2.5-Pro medium | Xiaomi | 5.3 | 7.5 | 1/3 | 37.9s |
| #47 | Grok Build 0.1 medium | X AI | 5.3 | 7.4 | 1/3 | 158.0s |
| #50 | Gemini 3.1 Flash Lite Preview low | | 5.3 | 7.4 | 1/3 | 2.36s |
| #51 | Mimo V2 PRO medium | Xiaomi | 5.3 | 7.4 | 1/3 | 8.82s |
| #55 | GLM 5.1 medium | Z.ai | 5.3 | 7.3 | 1/3 | 29.8s |