AI BENCHY Category
Coding Leaderboard
See which AI models perform best at coding, which stay consistent, and where the biggest gaps are. Sorted by: Response time (average) ↓.
| Rank | Model | Company | Coding score | Overall score | Correct attempts | Response time (average) |
|---|---|---|---|---|---|---|
| #56 | Qwen3.5-Flash none | Qwen | 10.0 | 6.2 | 1/1 | 1.29s |
| #78 | Mistral Small 4 none | Mistral | 4.5 | 5.2 | 0/1 | 1.28s |
| #77 | Grok 4.20 none | X AI | 3.4 | 5.2 | 0/1 | 1.22s |
| #80 | GPT-5.4 Mini none | OpenAI | 10.0 | 5.1 | 1/1 | 1.19s |
| #59 | Gemini 2.5 Flash none | Google | 10.0 | 6.2 | 1/1 | 1.16s |
| #75 | Grok 4.20 Beta none | X AI | 5.5 | 5.3 | 0/1 | 1.14s |
| #85 | Mercury 2 none | Inception | 3.6 | 4.8 | 0/1 | 969ms |
| #10 | Gemini 3 PRO Preview medium | Google | 3.0 | 8.4 | 0/1 | 0ms |
| #18 | Qwen3.6 Plus medium | Qwen | 3.0 | 8.1 | 0/1 | 0ms |
| #47 | Hunter Alpha medium | OpenRouter | 3.0 | 6.7 | 0/1 | 0ms |
| #48 | Nemotron 3 Super medium | NVIDIA | 3.0 | 6.7 | 0/1 | 0ms |
| #67 | MiniMax M2.5 medium | Minimax | 3.0 | 5.7 | 0/1 | 0ms |
| #68 | Hunter Alpha none | OpenRouter | 3.0 | 5.7 | 0/1 | 0ms |
| #93 | Step 3.5 Flash none | Stepfun | 3.0 | 3.0 | 0/1 | 0ms |