AI BENCHY Category
Coding Leaderboard
See which AI models perform best at coding, which ones stay consistent, and where the biggest gaps are. Sorted by: response time (average) ↓.
| Rank | Model | Company | Coding score | Overall score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #105 | Grok 4.20 Beta none | X AI | 5.5 | 5.8 | 0/1 | 1.14s |
| #85 | Gemini 3.1 Flash Lite none | | 6.8 | 6.6 | 1/2 | 1.13s |
| #141 | GPT-5.4 Nano none | OpenAI | 5.4 | 4.8 | 0/2 | 1.09s |
| #52 | Gemini 3.1 Flash Lite Preview none | | 6.8 | 7.5 | 1/2 | 1.06s |
| #135 | Mistral Small 4 none | Mistral | 4.0 | 5.0 | 0/2 | 1.03s |
| #137 | GPT-5.4 Mini none | OpenAI | 6.8 | 4.9 | 1/2 | 1.01s |
| #98 | Qwen3.5-Flash none | Qwen | 6.8 | 5.9 | 1/2 | 993ms |
| #78 | Gemini 3.1 Flash Lite minimal | | 6.8 | 6.7 | 1/2 | 951ms |
| #146 | Mercury 2 none | Inception | 3.5 | 4.6 | 0/2 | 831ms |
| #90 | Gemini 2.5 Flash none | | 6.8 | 6.4 | 1/2 | 810ms |
| #153 | Granite 4.1 8B none | IBM Granite | 5.2 | 4.1 | 0/2 | 706ms |
| #17 | Qwen3.6 Plus Preview medium | Qwen | 0.0 | 8.2 | 0/0 | 0ms |
| #20 | Gemini 3 PRO Preview medium | | 3.0 | 8.1 | 0/2 | 0ms |
| #34 | Step 3.5 Flash none | Stepfun | 3.0 | 7.8 | 0/1 | 0ms |
| #76 | Hunter Alpha medium | OpenRouter | 3.0 | 6.7 | 0/1 | 0ms |
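
The table mixes seconds and milliseconds in the response-time column, so a plain string sort would not reproduce the order shown. As a minimal sketch, assuming the only units that appear are `s` and `ms`, the descending sort could be reproduced like this. The helper name `parse_duration_ms` and the `rows` list are hypothetical and not part of the leaderboard itself; the sample rows are copied from the table above.

```python
import re


def parse_duration_ms(text: str) -> float:
    """Convert a duration string such as '1.14s' or '993ms' to milliseconds.

    Assumption: only the units 's' and 'ms' occur, as in the table above.
    """
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(ms|s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value = float(match.group(1))
    unit = match.group(2)
    # Seconds are scaled to milliseconds; milliseconds pass through unchanged.
    return value * 1000.0 if unit == "s" else value


# A few (model, response time) pairs copied from the table above.
rows = [
    ("Qwen3.5-Flash none", "993ms"),
    ("Step 3.5 Flash none", "0ms"),
    ("Grok 4.20 Beta none", "1.14s"),
    ("Granite 4.1 8B none", "706ms"),
]

# Sort by response time, slowest first, matching the "↓" order of the table.
ranked = sorted(rows, key=lambda row: parse_duration_ms(row[1]), reverse=True)

for model, duration in ranked:
    print(f"{model}: {duration}")
```

Rows that report `0ms` sort last under this scheme. The source does not say whether `0ms` means an actual measurement or a missing one, so the sketch treats it as a literal zero.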