AI BENCHY Category
Anti-AI Tricks Leaderboard
See which AI models perform best on Anti-AI Tricks, which ones stay consistent, and where the biggest gaps are. Sorted by: Response time (avg) ↓.
| Rank | Model | Company | Anti-AI Tricks score | Score | Correct tests | Response time (avg) |
|---|---|---|---|---|---|---|
| #95 | Qwen3.5 Plus 2026-02-15 none | Qwen | 4.8 | 6.3 | 1/4 | 1.91s |
| #114 | Qwen3.5 Plus 2026-04-20 none | Qwen | 4.8 | 5.7 | 1/4 | 1.88s |
| #85 | Gemma 4 31B none | | 6.5 | 6.5 | 2/4 | 1.85s |
| #11 | Claude Opus 4.7 medium | Anthropic | 8.3 | 8.7 | 3/4 | 1.85s |
| #61 | Gemini 3.1 Flash Lite low | | 7.3 | 7.2 | 2/4 | 1.84s |
| #120 | Mimo V2 PRO none | Xiaomi | 3.5 | 5.6 | 0/4 | 1.80s |
| #154 | Qwen3.5-9B none | Qwen | 3.1 | 4.6 | 0/4 | 1.71s |
| #101 | Mimo V2 Omni none | Xiaomi | 3.6 | 6.0 | 0/4 | 1.63s |
| #128 | Qwen3.6 Flash none | Qwen | 3.1 | 5.4 | 0/4 | 1.63s |
| #131 | Qwen3.5-122B-A10B none | Qwen | 4.8 | 5.3 | 1/4 | 1.59s |
| #117 | Qwen3.5-35B-A3B none | Qwen | 3.4 | 5.6 | 0/4 | 1.43s |
| #124 | Kimi K2.6 none | Moonshot AI | 4.6 | 5.5 | 1/4 | 1.39s |
| #88 | Qwen3.7 Plus none | Qwen | 6.5 | 6.4 | 2/4 | 1.38s |
| #147 | GPT-4o-mini none | OpenAI | 4.8 | 4.8 | 1/4 | 1.34s |
| #108 | Qwen3.5-Flash none | Qwen | 3.5 | 5.8 | 0/4 | 1.32s |