AI BENCHY Category
Mixed Leaderboard
See which AI models perform best in the Mixed category, which ones stay consistent, and where the biggest gaps are.
Models shown
15
Average Mixed score
6.2
Top model
Gemini 3 Flash Preview 10.0

| Rank | Model | Company | Mixed score | Score | Correct tests | Response time (avg) |
|---|---|---|---|---|---|---|
| #47 | Grok 4.20 medium | X AI | 10.0 | 7.0 | 1/1 | 17.4s |
| #51 | Nemotron 3 Super medium | NVIDIA | 10.0 | 6.7 | 1/1 | 87.8s |
| #52 | Grok 4.1 Fast medium | X AI | 10.0 | 6.7 | 1/1 | 37.6s |
| #54 | Mercury 2 medium | Inception | 10.0 | 6.5 | 1/1 | 3.28s |
| #57 | GPT-5 Nano medium | OpenAI | 10.0 | 6.3 | 1/1 | 66.0s |
| #68 | gpt-oss-120b medium | OpenAI | 10.0 | 5.8 | 1/1 | 31.2s |
| #38 | GPT-5.4 Nano medium | OpenAI | 9.8 | 7.6 | 1/1 | 24.1s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 9.8 | 7.5 | 1/1 | 75.7s |
| #24 | Gemma 4 26B A4B medium | | 9.6 | 8.0 | 1/1 | 73.5s |
| #2 | Gemini 3.1 Pro Preview medium | | 9.5 | 9.6 | 1/1 | 40.6s |
| #4 | Claude Opus 4.7 none | Anthropic | 9.5 | 9.2 | 1/1 | 18.3s |
| #33 | GLM 5.1 medium | Z.ai | 9.5 | 7.8 | 1/1 | 43.1s |
| #42 | Claude Sonnet 4.6 none | Anthropic | 9.5 | 7.4 | 1/1 | 23.8s |
| #31 | GLM 5V Turbo medium | Z.ai | 6.9 | 7.8 | 0/1 | 15.1s |
| #64 | DeepSeek V3.2 none | DeepSeek | 6.5 | 6.1 | 0/1 | 115.9s |