AI BENCHY Category
Instruction Following Leaderboard
See which AI models perform best at instruction following, which stay consistent, and where the biggest gaps are. Sorted by: Response time (average) ↓.
| Rank | Model | Company | Instruction following score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #19 | Qwen3.5-122B-A10B medium | Qwen | 10.0 | 8.1 | 2/2 | 9.88s |
| #2 | Gemini 3.1 Pro Preview medium | | 10.0 | 9.6 | 2/2 | 9.56s |
| #59 | Qwen3.5-Flash none | Qwen | 6.3 | 6.2 | 1/2 | 8.81s |
| #51 | Nemotron 3 Super medium | NVIDIA | 7.2 | 6.7 | 1/2 | 7.72s |
| #87 | Qwen3 Coder Next none | Qwen | 4.8 | 5.1 | 0/2 | 7.71s |
| #68 | gpt-oss-120b medium | OpenAI | 9.9 | 5.8 | 2/2 | 7.63s |
| #9 | Qwen3.6 Plus Preview medium | Qwen | 10.0 | 8.5 | 2/2 | 7.54s |
| #20 | Qwen3.6 Plus medium | Qwen | 10.0 | 8.1 | 2/2 | 7.54s |
| #33 | GLM 5.1 medium | Z.ai | 6.4 | 7.8 | 1/2 | 7.47s |
| #92 | Qwen3 Coder Next medium | Qwen | 4.8 | 4.7 | 0/2 | 7.34s |
| #6 | Seed-2.0-Lite medium | Bytedance Seed | 10.0 | 8.6 | 2/2 | 7.26s |
| #13 | GLM 5 medium | Z.ai | 10.0 | 8.4 | 2/2 | 7.25s |
| #5 | Gemini 3 Flash Preview low | | 9.9 | 8.8 | 2/2 | 7.02s |
| #1 | Gemini 3 Flash Preview medium | | 10.0 | 10.0 | 2/2 | 6.10s |
| #28 | GPT-5.2 Chat none | OpenAI | 7.5 | 7.9 | 1/2 | 5.46s |