AI BENCHY Category
Domain-Specific Leaderboard
See which AI models perform best in the Domain-Specific category, which stay consistent, and where the biggest gaps are. Sort by: Metric ↑.
| Rank | Model | Company | Domain-Specific score | Overall score | Correct tests | Response time (avg) |
|---|---|---|---|---|---|---|
| #87 | Qwen3 Coder Next none | Qwen | 5.3 | 5.1 | 1/3 | 962ms |
| #92 | Qwen3 Coder Next medium | Qwen | 5.3 | 4.7 | 1/3 | 638ms |
| #52 | Grok 4.1 Fast medium | X AI | 5.8 | 6.7 | 1/3 | 121.8s |
| #6 | Seed-2.0-Lite medium | Bytedance Seed | 5.9 | 8.6 | 1/3 | 88.7s |
| #7 | GPT-5.3-Codex medium | OpenAI | 5.9 | 8.6 | 1/3 | 64.3s |
| #15 | Gemini 2.5 Flash medium | Google | 5.9 | 8.2 | 1/3 | 37.3s |
| #38 | GPT-5.4 Nano medium | OpenAI | 5.9 | 7.6 | 1/3 | 38.2s |
| #40 | GPT-5.2 medium | OpenAI | 5.9 | 7.5 | 1/3 | 77.8s |
| #41 | MiMo-V2-Flash medium | Xiaomi | 5.9 | 7.5 | 1/3 | 96.0s |
| #62 | Gemini 2.5 Flash none | Google | 5.9 | 6.2 | 1/3 | 495ms |
| #95 | Grok 4.1 Fast none | X AI | 5.9 | 4.5 | 1/3 | 1.06s |
| #98 | LFM2-24B-A2B none | Liquid | 5.9 | 4.1 | 1/3 | 287ms |
| #2 | Gemini 3.1 Pro Preview medium | Google | 7.7 | 9.6 | 2/3 | 32.7s |
| #3 | Claude Opus 4.7 medium | Anthropic | 7.7 | 9.2 | 2/3 | 1.17s |
| #4 | Claude Opus 4.7 none | Anthropic | 7.7 | 9.2 | 2/3 | 1.19s |