AI BENCHY category
Instruction following leaderboard
See which AI models perform best at instruction following, which ones stay consistent, and where the biggest gaps are. Sorted by: response time (average) ↓.
| Rank | Model | Company | Instruction following score | Overall score | Tests passed | Response time (average) |
|---|---|---|---|---|---|---|
| #111 | Owl Alpha medium | Openrouter | 6.5 | 5.7 | 1/2 | 10.2s |
| #29 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 7.8 | 2/2 | 9.88s |
| #4 | Gemini 3.1 Pro Preview medium | | 10.0 | 9.4 | 2/2 | 9.56s |
| #83 | Step 3.5 Flash none | Stepfun | 10.0 | 6.6 | 1/1 | 9.30s |
| #108 | Qwen3.5-Flash none | Qwen | 6.3 | 5.8 | 1/2 | 8.81s |
| #113 | DeepSeek V4 Pro none | DeepSeek | 6.3 | 5.7 | 1/2 | 8.23s |
| #140 | Qwen3 Coder Next none | Qwen | 6.3 | 4.9 | 1/2 | 7.78s |
| #99 | gpt-oss-120b medium | OpenAI | 9.9 | 6.1 | 2/2 | 7.63s |
| #26 | Qwen3.6 Plus medium | Qwen | 10.0 | 7.9 | 2/2 | 7.54s |
| #46 | Qwen3.6 35B A3B medium | Qwen | 10.0 | 7.4 | 2/2 | 7.50s |
| #150 | Qwen3 Coder Next medium | Qwen | 6.3 | 4.6 | 1/2 | 7.49s |
| #55 | GLM 5.1 medium | Z.ai | 6.4 | 7.3 | 1/2 | 7.47s |
| #5 | Qwen3.7 Max medium | Qwen | 10.0 | 9.1 | 2/2 | 7.46s |
| #100 | Grok Build 0.1 none | X AI | 9.8 | 6.0 | 2/2 | 7.36s |
| #19 | Seed-2.0-Lite medium | Bytedance Seed | 10.0 | 8.2 | 2/2 | 7.26s |