AI BENCHY Category
Anti-AI Tricks Leaderboard
See which AI models perform best on Anti-AI Tricks, which stay consistent, and where the biggest gaps are. Sorted by: Response time (average) ↓.
| Rank | Model | Company | Anti-AI Tricks score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #72 | DeepSeek V3.2 medium | DeepSeek | 8.2 | 7.0 | 3/4 | 24.2s |
| #17 | GLM 5 medium | Z.ai | 10.0 | 8.3 | 4/4 | 23.7s |
| #14 | Qwen3.6 Max Preview medium | Qwen | 10.0 | 8.5 | 4/4 | 22.1s |
| #66 | Qwen3.5-35B-A3B medium | Qwen | 10.0 | 7.1 | 4/4 | 21.1s |
| #129 | MiniMax M2.5 medium | Minimax | 7.9 | 5.3 | 2/4 | 20.8s |
| #139 | DeepSeek V4 Flash none | DeepSeek | 3.0 | 5.0 | 0/4 | 20.2s |
| #30 | Qwen3.5-27B medium | Qwen | 8.7 | 7.8 | 3/4 | 19.8s |
| #19 | Seed-2.0-Lite medium | Bytedance Seed | 8.3 | 8.2 | 3/4 | 18.0s |
| #103 | DeepSeek V4 Pro high | DeepSeek | 6.4 | 6.0 | 2/4 | 16.5s |
| #64 | MiMo-V2-Flash medium | Xiaomi | 8.1 | 7.2 | 3/4 | 15.8s |
| #82 | Hy3 preview high | Tencent | 6.4 | 6.6 | 2/4 | 15.1s |
| #35 | Gemini 3 PRO Preview medium | Google | 10.0 | 7.6 | 4/4 | 15.0s |
| #158 | GLM 4.7 Flash medium | Z.ai | 4.7 | 4.4 | 1/4 | 15.0s |
| #67 | MiniMax M3 medium | Minimax | 5.5 | 7.1 | 1/4 | 14.9s |
| #113 | DeepSeek V4 Pro none | DeepSeek | 3.5 | 5.7 | 0/4 | 14.0s |