AI BENCHY category
Domain-Specific Leaderboard
See which AI models perform best on Domain-Specific tasks, which stay reliable, and where the largest gaps appear.
Models shown
15
Average Domain-Specific score
4.8
Best model
Gemini 3 Flash Preview 10.0

| Rank | Model | Company | Domain-Specific score | Score | Tests correct | Response time (avg) |
|---|---|---|---|---|---|---|
| #1 | Gemini 3 Flash Preview medium | | 10.0 | 9.8 | 3/3 | 15.3s |
| #32 | Gemini 3.5 Flash minimal | | 10.0 | 7.7 | 3/3 | 899ms |
| #83 | Step 3.5 Flash none | Stepfun | 10.0 | 6.6 | 1/1 | 34.5s |
| #3 | Gemini 3.5 Flash low | | 7.7 | 9.4 | 2/3 | 3.39s |
| #4 | Gemini 3.1 Pro Preview medium | | 7.7 | 9.4 | 2/3 | 32.7s |
| #7 | Gemini 3.5 Flash medium | | 7.7 | 9.0 | 2/3 | 5.24s |
| #8 | Claude Opus 4.7 none | Anthropic | 7.7 | 8.9 | 2/3 | 1.19s |
| #11 | Claude Opus 4.7 medium | Anthropic | 7.7 | 8.7 | 2/3 | 1.17s |
| #22 | Step 3.7 Flash medium | Stepfun | 7.7 | 8.0 | 2/3 | 48.3s |
| #27 | Gemma 4 31B medium | | 7.7 | 7.8 | 2/3 | 38.5s |
| #34 | Qwen3.7 Max none | Qwen | 7.7 | 7.7 | 2/3 | 975ms |
| #48 | Gemini 3 Flash Preview none | | 7.7 | 7.4 | 2/3 | 963ms |
| #74 | Qwen3.6 Max Preview none | Qwen | 7.7 | 6.9 | 2/3 | 1.22s |
| #77 | Claude Sonnet 4.6 none | Anthropic | 7.7 | 6.8 | 2/3 | 3.54s |
| #85 | Gemma 4 31B none | | 7.7 | 6.5 | 2/3 | 3.22s |