AI BENCHY category
Domain-specific leaderboard
See which AI models perform best in the Domain-specific category, which stay consistent, and where the biggest gaps are. Sorted by: Correct tests ↓.
Models shown
15
Average Domain-specific score
4.8
Top model
Gemini 3 Flash Preview 10.0

| Rank | Model | Company | Domain-specific score | Score | Correct tests | Response time (avg) |
|---|---|---|---|---|---|---|
| #80 | Mimo V2 Omni medium | Xiaomi | 3.0 | 6.7 | 0/3 | 47.9s |
| #81 | Mercury 2 medium | Inception | 2.9 | 6.6 | 0/3 | 6.48s |
| #84 | Grok 4.20 Multi Agent Beta medium | X AI | 2.9 | 6.6 | 0/3 | 24.7s |
| #87 | Gemini 3.1 Flash Lite minimal | | 2.9 | 6.4 | 0/3 | 1.02s |
| #88 | Qwen3.7 Plus none | Qwen | 3.0 | 6.4 | 0/3 | 868ms |
| #90 | Gemini 3.1 Flash Lite none | | 2.9 | 6.4 | 0/3 | 762ms |
| #91 | GPT-5.5 none | OpenAI | 2.9 | 6.4 | 0/3 | 1.31s |
| #93 | Qwen3.6 Plus Preview medium | Qwen | 3.0 | 6.3 | 0/3 | 22.1s |
| #98 | GLM 5 none | Z.ai | 3.0 | 6.1 | 0/3 | 2.24s |
| #99 | gpt-oss-120b medium | OpenAI | 2.9 | 6.1 | 0/3 | 50.9s |
| #100 | Grok Build 0.1 none | X AI | 3.6 | 6.0 | 0/3 | 103.7s |
| #102 | Gemma 4 26B A4B none | | 3.6 | 6.0 | 0/3 | 2.49s |
| #103 | DeepSeek V4 Pro high | DeepSeek | 2.9 | 6.0 | 0/3 | 205.7s |
| #105 | Nemotron 3 Super medium | NVIDIA | 2.9 | 5.8 | 0/3 | 16.2s |
| #106 | Grok 4.20 Beta none | X AI | 3.0 | 5.8 | 0/3 | 611ms |