AI BENCHY category
Instruction following leaderboard
See which AI models perform best at instruction following, which stay consistent, and where the biggest gaps are. Sorted by: Correct tests ↑.
| Rank | Model | Company | Instruction following score | Score | Correct tests | Response time (avg) |
|---|---|---|---|---|---|---|
| #91 | GPT-5.5 none | OpenAI | 6.2 | 6.4 | 1/2 | 1.15s |
| #93 | Qwen3.6 Plus Preview medium | Qwen | 6.5 | 6.3 | 1/2 | 3.40s |
| #101 | Mimo V2 Omni none | Xiaomi | 6.5 | 6.0 | 1/2 | 4.26s |
| #102 | Gemma 4 26B A4B none | | 6.3 | 6.0 | 1/2 | 690ms |
| #105 | Nemotron 3 Super medium | NVIDIA | 7.3 | 5.8 | 1/2 | 6.97s |
| #106 | Grok 4.20 Beta none | xAI | 6.3 | 5.8 | 1/2 | 649ms |
| #108 | Qwen3.5-Flash none | Qwen | 6.3 | 5.8 | 1/2 | 8.81s |
| #109 | GLM 5V Turbo none | Z.ai | 6.5 | 5.8 | 1/2 | 1.97s |
| #111 | Owl Alpha medium | OpenRouter | 6.5 | 5.7 | 1/2 | 10.2s |
| #113 | DeepSeek V4 Pro none | DeepSeek | 6.3 | 5.7 | 1/2 | 8.23s |
| #114 | Qwen3.5 Plus 2026-04-20 none | Qwen | 6.2 | 5.7 | 1/2 | 1.17s |
| #115 | Qwen3.5-27B none | Qwen | 6.3 | 5.7 | 1/2 | 1.03s |
| #116 | Hunter Alpha none | OpenRouter | 6.4 | 5.7 | 1/2 | 2.82s |
| #117 | Qwen3.5-35B-A3B none | Qwen | 6.3 | 5.6 | 1/2 | 809ms |
| #118 | Qwen3.6 27B none | Qwen | 6.2 | 5.6 | 1/2 | 1.92s |