AI BENCHY Category
General Intelligence Leaderboard
See which AI models perform best in General Intelligence, which stay consistent, and where the biggest gaps are. Sorted by: Response time (average) ↓.
| Rank | Model | Company | General Intelligence score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #59 | GLM 5V Turbo medium | Z.ai | 10.0 | 7.2 | 1/1 | 11.1s |
| #126 | gpt-oss-120b none | OpenAI | 4.8 | 5.4 | 0/1 | 10.8s |
| #23 | GLM 5 Turbo medium | Z.ai | 6.1 | 8.0 | 0/1 | 10.1s |
| #39 | Qwen3.6 Flash medium | Qwen | 4.8 | 7.5 | 0/1 | 9.88s |
| #27 | Gemma 4 31B medium | Google | 10.0 | 7.8 | 1/1 | 9.57s |
| #35 | Gemini 3 PRO Preview medium | Google | 10.0 | 7.6 | 1/1 | 9.34s |
| #133 | DeepSeek V3.2 none | DeepSeek | 4.7 | 5.2 | 0/1 | 9.32s |
| #46 | Qwen3.6 35B A3B medium | Qwen | 4.4 | 7.4 | 0/1 | 8.66s |
| #99 | gpt-oss-120b medium | OpenAI | 4.3 | 6.1 | 0/1 | 7.90s |
| #57 | Step 3.7 Flash low | Stepfun | 3.4 | 7.3 | 0/1 | 7.00s |
| #105 | Nemotron 3 Super medium | NVIDIA | 4.1 | 5.8 | 0/1 | 6.91s |
| #143 | MiMo-V2.5 none | Xiaomi | 4.4 | 4.9 | 0/1 | 6.86s |
| #22 | Step 3.7 Flash medium | Stepfun | 4.0 | 8.0 | 0/1 | 6.85s |
| #129 | MiniMax M2.5 medium | Minimax | 3.8 | 5.3 | 0/1 | 6.63s |
| #79 | Hunter Alpha medium | OpenRouter | 7.0 | 6.7 | 0/1 | 6.44s |