AI BENCHY Category
Anti-AI Tricks Leaderboard
See which AI models perform best on Anti-AI Tricks, which ones stay consistent, and where the biggest gaps are. Sort by: Metric ↑.
| Rank | Model | Company | Anti-AI Tricks score | Score | Correct tests | Response time (average) |
|---|---|---|---|---|---|---|
| #127 | Grok 4.20 none | X AI | 4.8 | 5.4 | 1/4 | 501ms |
| #131 | Qwen3.5-122B-A10B none | Qwen | 4.8 | 5.3 | 1/4 | 1.59s |
| #147 | GPT-4o-mini none | OpenAI | 4.8 | 4.8 | 1/4 | 1.34s |
| #156 | Hy3 preview none | Tencent | 4.8 | 4.4 | 1/4 | 11.1s |
| #162 | Nemotron 3 Nano Omni 30b A3b Reasoning none | NVIDIA | 4.8 | 4.1 | 1/4 | 584ms |
| #141 | Nemotron 3 Super none | NVIDIA | 4.8 | 4.9 | 1/4 | 4.46s |
| #163 | Granite 4.1 8B none | IBM Granite | 4.9 | 4.0 | 1/4 | 844ms |
| #161 | Qwen3.5-9B medium | Qwen | 5.1 | 4.2 | 1/4 | 34.4s |
| #74 | Qwen3.6 Max Preview none | Qwen | 5.2 | 6.9 | 1/4 | 2.63s |
| #122 | GLM 4.7 Flash none | Z.ai | 5.2 | 5.5 | 1/4 | 5.51s |
| #67 | MiniMax M3 medium | Minimax | 5.5 | 7.1 | 1/4 | 14.9s |
| #132 | Mistral Small 4 medium | Mistral | 5.6 | 5.3 | 1/4 | 2.67s |
| #69 | Claude Opus 4.6 medium | Anthropic | 6.4 | 7.0 | 2/4 | 7.45s |
| #82 | Hy3 preview high | Tencent | 6.4 | 6.6 | 2/4 | 15.1s |
| #103 | DeepSeek V4 Pro high | DeepSeek | 6.4 | 6.0 | 2/4 | 16.5s |