AI BENCHY Category
Mixed Leaderboard
See which AI models perform best in the Mixed category, which stay consistent, and where the biggest gaps are. Sorted by: Response time (average) ↑.
| Rank | Model | Company | Mixed score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #14 | Gemma 4 31B medium | Google | 3.0 | 8.3 | 0/1 | 0ms |
| #48 | Gemma 4 31B none | Google | 3.0 | 6.9 | 0/1 | 0ms |
| #56 | Grok 4.20 Multi Agent Beta medium | X AI | 3.0 | 6.4 | 0/1 | 0ms |
| #84 | gpt-oss-120b none | OpenAI | 3.0 | 5.2 | 0/1 | 0ms |
| #97 | Qwen3.5-9B medium | Qwen | 3.0 | 4.4 | 0/1 | 0ms |
| #98 | LFM2-24B-A2B none | Liquid | 3.0 | 4.1 | 0/1 | 0ms |
| #91 | Mercury 2 none | Inception | 3.0 | 4.8 | 0/1 | 606ms |
| #83 | Mistral Small 4 none | Mistral | 3.0 | 5.2 | 0/1 | 1.72s |
| #55 | MiMo-V2-Omni none | Xiaomi | 3.0 | 6.5 | 0/1 | 2.47s |
| #86 | GPT-5.4 Mini none | OpenAI | 3.0 | 5.1 | 0/1 | 2.52s |
| #94 | MiMo-V2-Flash none | Xiaomi | 3.0 | 4.5 | 0/1 | 2.87s |
| #66 | GPT-5.4 none | OpenAI | 3.0 | 5.9 | 0/1 | 2.89s |
| #29 | Gemini 3.1 Flash Lite Preview none | Google | 3.0 | 7.9 | 0/1 | 3.20s |
| #74 | GLM 4.7 Flash none | Z.ai | 3.0 | 5.6 | 0/1 | 3.22s |
| #5 | Gemini 3 Flash Preview low | Google | 3.0 | 8.8 | 0/1 | 3.27s |